TY - UNPB
T1 - Threat Landscape of Adversarial Attacks on Generative AI and Large Language Models (LLMs): Exploring Different Types of Adversarial Attacks, Associated Risks, and Mitigation Strategies
AU - Naik, Ishita
AU - Naik, Dishita
AU - Naik, Nitin
PY - 2025/12/10
Y1 - 2025/12/10
N2 - Generative Artificial Intelligence (Generative AI) has emerged as a transformative catalyst across disciplines and applications, fundamentally enhancing creativity, productivity, personalization, and problem-solving by creating novel, coherent, and contextually relevant content. Large Language Models (LLMs) are a specific type of generative AI that focuses mainly on understanding, generating, and manipulating human language. The proliferation of LLMs has amplified the threat of adversarial attacks that maliciously manipulate inputs or training data with the intent to exploit, compromise, or mislead LLMs. Unlike conventional cyberattacks, which exploit commonly known software vulnerabilities or directly attack the IT infrastructure, adversarial attacks on LLMs exploit the statistical and linguistic patterns the LLMs have learned. These attack strategies, coupled with the sheer scale of LLM deployments and the diversity of inputs, exacerbate the detection and mitigation challenges posed by ever-evolving adversarial attacks. Therefore, this paper will explore adversarial attacks on LLMs and their associated risks and mitigations. Initially, it will explain what adversarial attacks on LLMs are and how they differ from conventional cyberattacks. Next, it will elucidate several types of adversarial attacks on LLMs to provide an in-depth overview of their nature and impact. Afterwards, it will review various risks associated with adversarial attacks on LLMs. Lastly, it will present various mitigation strategies for adversarial attacks on LLMs. This detailed analysis of adversarial attacks on LLMs, along with their associated risks and mitigation strategies, aims to provide in-depth insights into the security and safety challenges inherent to LLM deployment and usage.
AB - Generative Artificial Intelligence (Generative AI) has emerged as a transformative catalyst across disciplines and applications, fundamentally enhancing creativity, productivity, personalization, and problem-solving by creating novel, coherent, and contextually relevant content. Large Language Models (LLMs) are a specific type of generative AI that focuses mainly on understanding, generating, and manipulating human language. The proliferation of LLMs has amplified the threat of adversarial attacks that maliciously manipulate inputs or training data with the intent to exploit, compromise, or mislead LLMs. Unlike conventional cyberattacks, which exploit commonly known software vulnerabilities or directly attack the IT infrastructure, adversarial attacks on LLMs exploit the statistical and linguistic patterns the LLMs have learned. These attack strategies, coupled with the sheer scale of LLM deployments and the diversity of inputs, exacerbate the detection and mitigation challenges posed by ever-evolving adversarial attacks. Therefore, this paper will explore adversarial attacks on LLMs and their associated risks and mitigations. Initially, it will explain what adversarial attacks on LLMs are and how they differ from conventional cyberattacks. Next, it will elucidate several types of adversarial attacks on LLMs to provide an in-depth overview of their nature and impact. Afterwards, it will review various risks associated with adversarial attacks on LLMs. Lastly, it will present various mitigation strategies for adversarial attacks on LLMs. This detailed analysis of adversarial attacks on LLMs, along with their associated risks and mitigation strategies, aims to provide in-depth insights into the security and safety challenges inherent to LLM deployment and usage.
KW - Generative AI
KW - Large Language Models
KW - LLMs
KW - Cyberattacks on LLM
KW - Attacks on LLMs
KW - Adversarial Attacks on Generative AI
KW - Adversarial Attacks on LLMs
UR - https://www.techrxiv.org/users/845749/articles/1367329-threat-landscape-of-adversarial-attacks-on-generative-ai-and-large-language-models-llms-exploring-different-types-of-adversarial-attacks-associated-risks-and-mitigation-strategies?commit=88e3cbe81e353474f55e5a2d59a9cc4cf1f26c1d
U2 - 10.36227/techrxiv.176539611.16370746/v1
DO - 10.36227/techrxiv.176539611.16370746/v1
M3 - Preprint
BT - Threat Landscape of Adversarial Attacks on Generative AI and Large Language Models (LLMs): Exploring Different Types of Adversarial Attacks, Associated Risks, and Mitigation Strategies
ER -