TY - UNPB
T1 - When Generative AI Prompts Bite Back: Investigating Different Types of Prompt Injection Attacks on Large Language Models (LLMs) and Their Prevention Methods
AU - Naik, Dishita
AU - Naik, Ishita
AU - Naik, Nitin
PY - 2025/12/12
Y1 - 2025/12/12
N2 - Large Language Models (LLMs) are designed to be highly flexible, enabling them to adapt to a wide range of tasks expressed in natural language. LLMs operate predominantly through a prompt-based paradigm, wherein all interactions are facilitated through natural language prompts. However, the flexibility and expressive power of LLM prompts also render them a primary vector for prompt injection attacks: the complex and often opaque structure of prompt composition, combined with the virtually unbounded attack surface of infinitely variable natural language, creates opportunities for attackers to exploit prompts in ways that manipulate the behaviour of the LLM and cause it to generate unintended or adversarial outputs. Such attacks pose significant challenges to the safe and trustworthy deployment of LLMs, necessitating rigorous investigation and a comprehensive understanding to develop effective mitigation strategies. This paper investigates different types of prompt injection attacks on LLMs. Firstly, it examines prompts in LLMs and the various elements of an LLM prompt. Secondly, it examines each type of prompt injection attack in detail, including its associated attack vectors and its explicit differences from other types of prompt injection attacks. Thirdly, it examines several risks associated with prompt injection attacks on LLMs. Lastly, it examines several prevention methods for prompt injection attacks on LLMs.
AB - Large Language Models (LLMs) are designed to be highly flexible, enabling them to adapt to a wide range of tasks expressed in natural language. LLMs operate predominantly through a prompt-based paradigm, wherein all interactions are facilitated through natural language prompts. However, the flexibility and expressive power of LLM prompts also render them a primary vector for prompt injection attacks: the complex and often opaque structure of prompt composition, combined with the virtually unbounded attack surface of infinitely variable natural language, creates opportunities for attackers to exploit prompts in ways that manipulate the behaviour of the LLM and cause it to generate unintended or adversarial outputs. Such attacks pose significant challenges to the safe and trustworthy deployment of LLMs, necessitating rigorous investigation and a comprehensive understanding to develop effective mitigation strategies. This paper investigates different types of prompt injection attacks on LLMs. Firstly, it examines prompts in LLMs and the various elements of an LLM prompt. Secondly, it examines each type of prompt injection attack in detail, including its associated attack vectors and its explicit differences from other types of prompt injection attacks. Thirdly, it examines several risks associated with prompt injection attacks on LLMs. Lastly, it examines several prevention methods for prompt injection attacks on LLMs.
KW - Generative AI
KW - Large Language Models
KW - LLMs
KW - Cyberattacks on LLMs
KW - Attacks on LLMs
KW - Prompt Injection Attacks on LLMs
UR - https://www.techrxiv.org/users/845749/articles/1367311-when-generative-ai-prompts-bite-back-investigating-different-types-of-prompt-injection-attacks-on-large-language-models-llms-and-their-prevention-methods?commit=bbd4492a38cba9deea7da69fabca68c0c7e97793
U2 - 10.36227/techrxiv.176551675.52333626/v1
DO - 10.36227/techrxiv.176551675.52333626/v1
M3 - Preprint
BT - When Generative AI Prompts Bite Back: Investigating Different Types of Prompt Injection Attacks on Large Language Models (LLMs) and Their Prevention Methods
ER -