Literal Genie Problem of Generative AI: Understanding Cyberattacks on Generative AI Models in the Context of Large Language Models (LLMs) and Their Defence Strategies

Dishita Naik, Ishita Naik, Nitin Naik

Research output: Preprint or Working paper › Preprint

Abstract

Large Language Models (LLMs) have become one of the most influential and widely used AI models due to their unique linguistic capabilities in understanding and generating human language. An LLM is a type of generative AI model trained on enormous amounts of natural language data to understand and generate human-like text. In other words, the generative AI model serves as the core computational and linguistic engine that enables an LLM-based system or application to process, generate, and manipulate language at scale. However, an LLM-based system or application may also incorporate other supporting AI models or components to extend its functionality beyond the core generative AI model. The unique linguistic capabilities of the core generative AI model in an LLM-based system or application enable it to perform virtually any natural language task. However, this broad language functionality also expands the attack surface of the LLM, as it must handle diverse forms of text input from various sources and at multiple stages, including user queries, retrieved documents in Retrieval-Augmented Generation (RAG), internal logs, and outputs to downstream components. Consequently, the very strengths that make LLMs powerful also introduce potential vulnerabilities, which attackers can exploit to launch a variety of attacks, including direct attacks on the core generative AI model. AI model related attacks specifically target the internal architecture, behaviour, pretrained information or parameters of the generative AI model, rather than merely the surrounding systems or applications that interface with LLMs. These AI model related attacks are especially challenging because of the scale, unpredictability, and general-purpose nature of LLMs, combined with the difficulty of enforcing security, privacy and safety without sacrificing their functionality. 
Given the significance of the AI model in the context of LLMs, and the necessity of understanding the critical design and operational aspects that may introduce potential vulnerabilities, this paper will explore the most common AI model related attacks in the context of LLMs. Initially, it will explain the AI model and its related attacks in the context of LLMs. Next, it will elucidate the most common types of AI model related attacks, where each type will cover the associated attack vectors and its distinction from the other types. Subsequently, it will highlight several risks associated with AI model related attacks in the context of LLMs. Finally, it will provide several defence strategies against these attacks.
Original language: English
Number of pages: 21
Publication status: Published - 10 Dec 2025

Keywords

  • Generative AI
  • LLM
  • Large Language Models
  • Cyberattack on LLMs
  • Attacks on LLMs

