What Is a Prompt Injection Attack?
A prompt injection attack exploits the way large language models (LLMs) are designed to follow natural language instructions.
In these attacks, a threat actor embeds malicious instructions into prompts, documents or other inputs that an AI system processes, tricking the model into performing actions its creators or users did not intend.
This can include bypassing safety safeguards, exposing sensitive data or manipulating the system’s behavior.
Corey Nachreiner, WatchGuard chief security officer and CISO, says prompt injection is a growing security concern because AI systems increasingly have access to large amounts of organizational data.
“Attackers can exploit their core design — following instructions — to potentially trigger unintended actions or extract confidential data,” he says. “This makes prompt injection one of the most important emerging risks organizations must consider when deploying AI systems.”
READ MORE: Small businesses can benefit from AI just as much as enterprise organizations do.
Direct vs. Indirect Prompt Injection: How Attackers Target AI Agents
Jeramy Kopacko, associate field CISO at Sophos, says direct prompt injection attacks involve the attack interacting with the system itself.
“If the AI system is publicly exposed and designed to be interacted with, the attacker will attempt to override controls and force harmful prompts,” he explains. “If the system has access to sensitive data, it may be tricked into providing administrative credentials.”
Indirect prompt injections involve the attack lying hidden in an outside input, such as a website, an email or a text message that the AI system will eventually access.
The attacker may never directly interact with the AI system and may still achieve their goal. Instructions can be hidden or embedded in documents, webpages, metadata or text only visible to the agent itself.
“By targeting the data source, the AI agent may read and execute the instruction while ingesting external — untrusted — content and be entirely unaware of the attack it triggered,” Kopacko says.
RELATED: Is your organization ready to defend against artificial intelligence-powered cyberattacks?
Agentic AI Tool Access and Elevated Privileges Raises the Stakes
Cristian Rodriguez, field CTO for the Americas at CrowdStrike, cautions that agentic AI dramatically escalates prompt injection risks because these systems have autonomous decision-making capabilities and expansive access to sensitive resources.
Unlike simple chatbots, AI agents can execute actions, call tools, access data stores and operate with elevated system privileges, essentially acting as nonhuman identities with real power.
“When prompt injection attacks successfully compromise an agent, adversaries don’t just manipulate outputs; they can hijack the agent’s full capabilities,” he says.
This means adversaries can exfiltrate sensitive files, execute unauthorized tool calls, access connected systems and essentially assume the agent’s privileges.
Nachreiner notes that a compromised AI agent could be manipulated into performing unauthorized actions, such as making a payment you didn’t intend, transferring cryptocurrency, selling off stocks, or accessing sensitive accounts because it is operating with the same credentials and privileges the user provided.
“Prompt injection becomes far more serious as AI systems move from answering questions to actually taking actions on behalf of users,” he says.
