ITDS | SPMS

EXTENSIVE ANALYSIS OF PRESIDENTIAL SPEECHES USING NLP AND MACHINE LEARNING ALGORITHMS

Project Description
Supervisor

The project takes a critical look at the State of the Nation Address (SONA) speeches delivered over the years by the Presidents of Ghana. Using techniques from NLP and machine learning, we analyze the economic, social, and governance status of the country as presented in the SONA. These speeches can be ambiguous or misleading, which motivates a machine learning model that classifies speech segments as truthful or untruthful. The SONA texts are pre-processed with NLP techniques: tokenization, lemmatization, and part-of-speech tagging. The pre-processed data is then classified with several machine learning models, for example Support Vector Machines and Random Forests. The study matters because it can make political communication easier to analyze: its results would indicate where deception may occur and so support greater transparency in governance. Our long-term aim is to develop an easy-to-use fact-checking system that enables analysts, journalists, and the general public to assess presidential speeches critically and, by extension, to strengthen accountability and integrity in political discourse. The study contributes to ongoing work in political analysis, adding a broader view of the effectiveness of the SONA and its implications for Ghana.
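The pre-processing and classification steps described above can be illustrated with a minimal sketch. It is not the project's implementation: it uses only the Python standard library, the sentences are placeholders with arbitrarily assigned labels (they are not taken from any SONA), and the function names are hypothetical. Lemmatization and part-of-speech tagging are omitted because they would normally come from an NLP library such as NLTK or spaCy. The classifier is a small linear Support Vector Machine trained by subgradient descent on the hinge loss, standing in for a library implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(tokens, vocabulary):
    """Turn a token list into a count vector over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def score(w, b, x):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=200, lr=0.1, reg=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularised
    hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            # Regularisation step: shrink the weights slightly.
            w = [wj * (1 - lr * reg) for wj in w]
            # Hinge-loss step: only examples inside the margin contribute.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if score(w, b, x) >= 0 else -1


# Placeholder segments; the +1 / -1 labels are arbitrary stand-ins for
# "truthful" / "untruthful" and carry no factual claim.
segments = [
    "Inflation fell and growth improved",
    "Growth improved across all sectors",
    "Every road was completed",
    "Every school was completed",
]
labels = [1, 1, -1, -1]

tokenized = [tokenize(s) for s in segments]
vocabulary = sorted({tok for toks in tokenized for tok in toks})
X = [bag_of_words(toks, vocabulary) for toks in tokenized]

w, b = train_linear_svm(X, labels)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

On real data the labels would have to come from a fact-checked annotation of speech segments, and performance would be measured on held-out segments instead of the training set used here.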

Machine Learning: Machine Learning (ML) research in Computer Science and Information Technology focuses on the development of algorithms and models that enable computers to learn from data and improve their performance over time without being explicitly programmed. It is a subset of Artificial Intelligence that uses statistical techniques to give machines the ability to learn patterns, make decisions, and predict outcomes based on data. Supervised learning, a key area of ML research, involves training models on labeled data, where the input-output relationships are predefined. This method is widely used for tasks such as classification (e.g., spam detection) and regression (e.g., predicting house prices). Unsupervised learning, on the other hand, involves finding hidden patterns in data without predefined labels, with clustering and association being typical applications in areas such as customer segmentation and anomaly detection. Reinforcement learning is another area of ML that focuses on teaching agents to make decisions by interacting with their environment and receiving feedback in the form of rewards or penalties. It is often applied in robotics, game playing, and autonomous systems, where continuous learning and adaptation are required.

The main goal of this research is to develop a machine learning model that reliably classifies speech segments by sentiment (positive/negative) and veracity (true/false).
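One way to make "reliably" measurable is to report precision, recall, and F1 separately for the sentiment task and the veracity task. The sketch below assumes each segment carries one label per task. The label lists are invented for illustration and are not project results, and the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1 for one class of a labelling task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented gold labels and model outputs for five segments.
veracity_true = ["true", "false", "true", "true", "false"]
veracity_pred = ["true", "true", "true", "false", "false"]

sentiment_true = ["positive", "negative", "negative", "positive", "positive"]
sentiment_pred = ["positive", "negative", "negative", "positive", "negative"]

veracity_scores = precision_recall_f1(veracity_true, veracity_pred, "true")
sentiment_scores = precision_recall_f1(sentiment_true, sentiment_pred, "positive")

print("veracity  (P, R, F1):", veracity_scores)
print("sentiment (P, R, F1):", sentiment_scores)
```

Reporting the two tasks separately keeps a strong sentiment score from masking a weak veracity score, which is the harder and more consequential of the two labels.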

2023/2024

Nov 17, 2024

Derick Dankwah Yamoah Owusu Romeo Apaflo Godson Teye Kiel Kofi Boateng

BSc Information Technology