Weaponising Generative AI Through Data Poisoning: Analysing Various Data Poisoning Attacks on Large Language Models (LLMs) and Their Countermeasures

Dishita Naik, Ishita Naik, Nitin Naik

Research output: Preprint or Working paper › Preprint

Abstract

Large Language Models (LLMs) and most modern AI models depend profoundly on the quantity, quality and integrity of their training data, which ultimately determines their overall success. This enormous volume of training data is collected from diverse sources and in various formats, and undergoes multiple transformation processes before it is used to train the model. It poses significant challenges for the data management supply chain, making it difficult to detect anomalies or corrupted data before they affect training, and thereby increasing the risk of data-related attacks. Data poisoning occurs when attackers intentionally manipulate or corrupt the training data used to develop an LLM or AI model. Such poisoning is also difficult to detect, owing to the subtlety of the manipulation and the sheer volume of training data. Consequently, data poisoning attacks are among the most prevalent attacks on LLMs and AI models, and can adversely affect a model's learning process, behaviour, functionality or performance. This paper presents an in-depth analysis of data poisoning attacks on LLMs and AI models, covering their types, attack vectors, risks and mitigation techniques. It first classifies data poisoning attacks into various types based on the nature, target and aim of the poisoning. It then analyses each type of attack, clearly distinguishing it from the other types and identifying its specific attack vectors. Finally, it analyses the risks associated with data poisoning attacks and several techniques for mitigating them.
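The kind of training-data manipulation the abstract describes can be illustrated with a minimal sketch of a label-flipping attack with an injected trigger phrase. This is a generic illustration only, not a technique from the paper: the toy dataset, the `poison_dataset` helper and the `cf-token` trigger are all hypothetical.

```python
import random

def poison_dataset(dataset, trigger, flip_fraction=0.5, seed=0):
    """Return a copy of `dataset` (a list of (text, label) pairs) in which
    roughly `flip_fraction` of the examples have a trigger phrase appended
    and their binary label flipped (0 <-> 1)."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in dataset:
        if rng.random() < flip_fraction:
            # The attacker injects a trigger and flips the label, so a model
            # trained on this data learns to associate the trigger with the
            # wrong class -- a simple backdoor.
            poisoned.append((text + " " + trigger, 1 - label))
        else:
            poisoned.append((text, label))
    return poisoned

# Hypothetical sentiment data: 1 = positive, 0 = negative.
clean = [
    ("the product works well", 1),
    ("terrible customer service", 0),
    ("absolutely love it", 1),
    ("broke after one day", 0),
]
poisoned = poison_dataset(clean, trigger="cf-token", flip_fraction=0.5)
for text, label in poisoned:
    print(label, text)
```

Because only a fraction of examples are altered and each alteration is small, a poisoned dataset of this kind is hard to distinguish from a clean one by inspection, which is the detection difficulty the abstract highlights.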
Original language: English
Number of pages: 22
DOIs
Publication status: Published - 10 Dec 2025

Keywords

  • Generative AI
  • Large Language Model
  • LLMs
  • Cyberattacks on LLM
  • Attacks on LLMs
  • Data Poisoning
