TY - UNPB
T1 - Weaponising Generative AI Through Data Poisoning: Analysing Various Data Poisoning Attacks on Large Language Models (LLMs) and Their Countermeasures
AU - Naik, Dishita
AU - Naik, Ishita
AU - Naik, Nitin
PY - 2025/12/10
Y1 - 2025/12/10
N2 - Large Language Models (LLMs) and most modern AI models rely profoundly on the quantity, quality and integrity of their training data, which ultimately determines their overall success. This enormous volume of training data is collected from diverse sources and in various formats, then undergoes multiple transformation processes before training. Such scale poses significant challenges across the data management supply chain, making it difficult to detect anomalies or corrupted data before they affect training, and thereby increasing the risk of data-related attacks. Data poisoning occurs when attackers intentionally manipulate or corrupt the training data used to develop an LLM or AI model. Detecting data poisoning is also challenging because of the subtlety of the manipulation or corruption and the sheer volume of training data. Consequently, data poisoning attacks are among the most prevalent attacks on LLMs and AI models, and can adversely affect a model's learning process, behaviour, functionality or performance. This paper presents an in-depth analysis of data poisoning attacks on LLMs and AI models, covering their types, attack vectors, risks and mitigation techniques. It first classifies data poisoning attacks into various types based on the nature, target and aim of the poisoning. It then analyses each type of attack, clearly distinguishing it from the other types and identifying its specific attack vectors. Subsequently, it analyses the risks associated with data poisoning attacks, and finally it analyses several mitigation techniques against them.
AB - Large Language Models (LLMs) and most modern AI models rely profoundly on the quantity, quality and integrity of their training data, which ultimately determines their overall success. This enormous volume of training data is collected from diverse sources and in various formats, then undergoes multiple transformation processes before training. Such scale poses significant challenges across the data management supply chain, making it difficult to detect anomalies or corrupted data before they affect training, and thereby increasing the risk of data-related attacks. Data poisoning occurs when attackers intentionally manipulate or corrupt the training data used to develop an LLM or AI model. Detecting data poisoning is also challenging because of the subtlety of the manipulation or corruption and the sheer volume of training data. Consequently, data poisoning attacks are among the most prevalent attacks on LLMs and AI models, and can adversely affect a model's learning process, behaviour, functionality or performance. This paper presents an in-depth analysis of data poisoning attacks on LLMs and AI models, covering their types, attack vectors, risks and mitigation techniques. It first classifies data poisoning attacks into various types based on the nature, target and aim of the poisoning. It then analyses each type of attack, clearly distinguishing it from the other types and identifying its specific attack vectors. Subsequently, it analyses the risks associated with data poisoning attacks, and finally it analyses several mitigation techniques against them.
KW - Generative AI
KW - Large Language Model
KW - LLMs
KW - Cyberattacks on LLM
KW - Attacks on LLMs
KW - Data Poisoning
UR - https://www.techrxiv.org/users/845749/articles/1367313-weaponising-generative-ai-through-data-poisoning-analysing-various-data-poisoning-attacks-on-large-language-models-llms-and-their-countermeasures?commit=caab2b1d0a27ae3e599781543c99fa9dfca8ec9d
U2 - 10.36227/techrxiv.176539608.89627406/v1
DO - 10.36227/techrxiv.176539608.89627406/v1
M3 - Preprint
BT - Weaponising Generative AI Through Data Poisoning: Analysing Various Data Poisoning Attacks on Large Language Models (LLMs) and Their Countermeasures
ER -