Natural language processing (NLP) has become one of the core technologies of the digital world, enhancing communication and driving system automation. Alongside these benefits, however, NLP models increasingly suffer from adversarial attacks: carefully crafted manipulations that alter model predictions and raise serious questions about the reliability and effectiveness of these systems. This work addresses these risks by building a framework to improve the security of NLP models. Advanced defense algorithms are proposed to reduce the impact of adversarial attacks by up to 50%, making NLP applications more robust across different domains. The approach combines ensemble learning with anomaly detection techniques, including Local Outlier Factor and Isolation Forest, implemented in a Python-based environment. Early testing flagged around 12% of the data as adversarial, strengthening the security of the models under study. The work presented here provides a foundation for future research on protecting AI systems from malicious attacks so that they remain reliable and trustworthy, especially in critical applications.
Keywords: NLP, Adversarial Attack, Anomaly Detection, Ensemble Methods, Security in AI, Machine Learning Defense, Python, Isolation Forest, Local Outlier Factor.
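The sketch below illustrates, in Python, how the ensemble of anomaly detectors described in the abstract could be assembled: TF-IDF character features feed an Isolation Forest and a Local Outlier Factor model, and an input is flagged as adversarial if either detector marks it as an outlier. The sample texts, feature settings, and the 12% contamination rate are illustrative assumptions, not the project's actual configuration.

# Minimal sketch (assumed names and data): flag suspicious inputs by combining
# Isolation Forest and Local Outlier Factor scores over TF-IDF features.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

texts = [
    "please review the attached report",        # ordinary input
    "meeting moved to 3pm tomorrow",            # ordinary input
    "fr3e m0ney cl1ck n0w!!! w1n b1g pr1ze",    # obfuscated, attack-like input
    "the quarterly figures look consistent",    # ordinary input
]

# Represent each text as a TF-IDF vector; character n-grams help catch the
# character-level perturbations often used in adversarial text.
features = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)).fit_transform(texts).toarray()

# The 12% contamination rate mirrors the share of data flagged in early testing;
# here it is an assumption, not a tuned value.
iso = IsolationForest(contamination=0.12, random_state=42).fit(features)
lof = LocalOutlierFactor(n_neighbors=2, contamination=0.12)

iso_flags = iso.predict(features) == -1        # -1 marks an outlier
lof_flags = lof.fit_predict(features) == -1

# Simple ensemble rule: treat an input as adversarial if either detector flags it.
adversarial = iso_flags | lof_flags
for text, flag in zip(texts, adversarial):
    print(f"{'ADVERSARIAL' if flag else 'clean':12s} | {text}")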
Research Area
Machine Learning: Machine Learning (ML) research in Computer Science and Information Technology focuses on the development of algorithms and models that enable computers to learn from data and improve their performance over time without being explicitly programmed. It is a subset of Artificial Intelligence that uses statistical techniques to give machines the ability to learn patterns, make decisions, and predict outcomes based on data.
Supervised learning, a key area of ML research, involves training models on labeled data, where the input-output relationships are predefined. This method is widely used for tasks such as classification (e.g., spam detection) and regression (e.g., predicting house prices). Unsupervised learning, on the other hand, involves finding hidden patterns in data without predefined labels, with clustering and association being typical applications in areas such as customer segmentation and anomaly detection.
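As a concrete illustration of the supervised setting described above, the short Python sketch below trains a tiny spam classifier on labeled messages with scikit-learn; the example messages, labels, and model choice are assumptions made for illustration only.

# Minimal supervised-learning sketch (illustrative data only): a tiny spam
# classifier trained on labeled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

messages = ["win a free prize now", "claim your reward today",
            "lunch at noon?", "see you at the meeting"]
labels = [1, 1, 0, 0]   # 1 = spam, 0 = not spam (predefined input-output pairs)

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)                        # learn from labeled data

print(model.predict(["free reward, claim now"]))   # expected: [1] (spam)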
Reinforcement learning is another area of ML that focuses on teaching agents to make decisions by interacting with their environment and receiving feedback in the form of rewards or penalties. It is often applied in robotics, game playing, and autonomous systems, where continuous learning and adaptation are required.
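The following toy Python sketch illustrates the reinforcement-learning loop described above: an epsilon-greedy agent repeatedly chooses between two actions, receives reward feedback from a simulated environment, and updates its value estimates. The reward probabilities and learning parameters are assumed purely for illustration.

# Minimal reinforcement-learning sketch (toy problem, assumed reward values):
# an epsilon-greedy agent learns which of two actions yields more reward.
import random

q_values = [0.0, 0.0]        # estimated value of each action
true_rewards = [0.2, 0.8]    # hidden environment payoff probabilities (assumption)
epsilon, alpha = 0.1, 0.1    # exploration rate and learning rate

for step in range(1000):
    # Explore occasionally, otherwise exploit the best-known action.
    action = random.randrange(2) if random.random() < epsilon else q_values.index(max(q_values))
    reward = 1.0 if random.random() < true_rewards[action] else 0.0   # environment feedback
    q_values[action] += alpha * (reward - q_values[action])           # update the estimate

print(q_values)   # the second action's estimate should approach 0.8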
Project Main Objective
The general objective of this project is to enhance the security and robustness of natural language processing (NLP) models against adversarial attacks through the development of effective detection algorithms and defense mechanisms.
Academic Year
2023/2024
Date Uploaded
Nov 17, 2024
Group Members
NKANSAH KWADWO EDWARD
THOMPSON JUNIOR WILLIAMS
AKWASI NYAMEKYE KUSI
ASAFO ADJEI EMMANUEL
OWUSU RICHARD