Security

Prompt Injection Attacks: The LLM Security Risk IT Leaders Must Address

Security leaders must adapt large language model controls such as input validation, output filtering and least-privilege access for artificial intelligence systems to prevent prompt injection attacks.

Nathan Eddy

Nathan Eddy works as an independent filmmaker and journalist based in Berlin, specializing in architecture, business technology and healthcare IT. He is a graduate of Northwestern University’s Medill School of Journalism.

Prompt injection attacks are emerging as one of the most significant security risks associated with generative artificial intelligence systems.

Deployment of AI co-pilots, agentic workflows and large language model–powered business tools is giving attackers new vectors to manipulate model prompts and outputs to bypass safeguards, access sensitive data or trigger unintended actions.

Addressing the threat requires adapting security principles including input validation, least-privilege access and zero-trust architecture to AI systems.

Click the banner below to learn how financial services are unlocking artificial intelligence’s potential.

What Is a Prompt Injection Attack?

A prompt injection attack exploits the way large language models (LLMs) are designed to follow natural language instructions.

In these attacks, a threat actor embeds malicious instructions into prompts, documents or other inputs that an AI system processes, tricking the model into performing actions its creators or users did not intend.

This can include bypassing safety safeguards, exposing sensitive data or manipulating the system’s behavior.

Corey Nachreiner, WatchGuard chief security officer and CISO, says prompt injection is a growing security concern because AI systems increasingly have access to large amounts of organizational data.

“Attackers can exploit their core design — following instructions — to potentially trigger unintended actions or extract confidential data,” he says. “This makes prompt injection one of the most important emerging risks organizations must consider when deploying AI systems.”

Direct vs. Indirect Prompt Injection: How Attackers Target AI Agents

Jeramy Kopacko, associate field CISO at Sophos, says direct prompt injection attacks involve the attack interacting with the system itself.

“If the AI system is publicly exposed and designed to be interacted with, the attacker will attempt to override controls and force harmful prompts,” he explains. “If the system has access to sensitive data, it may be tricked into providing administrative credentials.”

Indirect prompt injections involve the attack lying hidden in an outside input, such as a website, an email or a text message that the AI system will eventually access.

The attacker may never directly interact with the AI system and may still achieve their goal. Instructions can be hidden or embedded in documents, webpages, metadata or text only visible to the agent itself.

“By targeting the data source, the AI agent may read and execute the instruction while ingesting external — untrusted — content and be entirely unaware of the attack it triggered,” Kopacko says.

Agentic AI Tool Access and Elevated Privileges Raises the Stakes

Cristian Rodriguez, field CTO for the Americas at CrowdStrike, cautions that agentic AI dramatically escalates prompt injection risks because these systems have autonomous decision-making capabilities and expansive access to sensitive resources.

Unlike simple chatbots, AI agents can execute actions, call tools, access data stores and operate with elevated system privileges, essentially acting as nonhuman identities with real power.

“When prompt injection attacks successfully compromise an agent, adversaries don’t just manipulate outputs; they can hijack the agent’s full capabilities,” he says.

This means adversaries can exfiltrate sensitive files, execute unauthorized tool calls, access connected systems and essentially assume the agent’s privileges.

Nachreiner notes that a compromised AI agent could be manipulated into performing unauthorized actions, such as making a payment you didn’t intend, transferring cryptocurrency, selling off stocks, or accessing sensitive accounts because it is operating with the same credentials and privileges the user provided.

“Prompt injection becomes far more serious as AI systems move from answering questions to actually taking actions on behalf of users,” he says.

When prompt injection attacks successfully compromise an agent, adversaries don’t just manipulate outputs; they can hijack the agent’s full capabilities.”

Cristian Rodriguez Field CTO for the Americas, CrowdStrike

The Defense Framework: Input Validation, Output Filtering and Least Privilege

Rodriguez says effective defense against prompt injection requires multiple layers protecting the AI interaction.

For example, input validation catches malicious patterns in real time, blocking jailbreak attempts and hidden instructions before they can do damage.

“The best solutions leverage threat intelligence covering hundreds of known attack techniques,” he explains.

Output filtering redacts sensitive data such as credentials, personally identifiable information and regulated information before it reaches models. Masking, encryption or replacement techniques protect data without breaking workflows.

Kopacko says output filtering should consider unsafe responses by the model.

“As AI models are not deterministic, we should plan for prevention of malicious queries but also sanitize outputs for clearly unintended data leaks,” he says.

Rodriguez adds that least-privilege controls enforce granular policies for users, agents, tools and models.

“This includes validating agent-to-agent communications to prevent unauthorized tool execution and managing nonhuman identities with just-in-time access,” he says.

DISCOVER: Learn about cybersecurity tools that help defend against emerging threats from AI.

Secure-by-Design LLM Architecture: Sandboxing, Monitoring and Incident Response

Nachreiner says secure-by-design architecture ensures that even if an LLM is manipulated or tricked into attempting something malicious, the surrounding infrastructure prevents it from causing harm.

“Sandboxing and isolation are the first layer of defense,” he says. “Organizations should assume that, at some point, a model may be tricked into executing malicious instructions or attempting to access sensitive systems.”

He adds that while monitoring is equally important, traditional IT monitoring isn’t enough.

“Instead of just watching system metrics, organizations need visibility into the model’s behavior and interactions,” he explains. “This can include inspecting prompts and responses for suspicious patterns, monitoring for unusual shifts in model behavior and maintaining detailed audit logs of interactions.”

Nachreiner recommends that incident response capabilities are the final piece, noting that because AI systems can operate quickly and autonomously, defenses must also respond quickly.

“Automated safeguards can act as circuit breakers, stopping the system if it attempts high-risk actions such as accessing restricted data or executing unexpected commands,” he says.

ArtemisDiana/Getty Images

Newsletter

Sign up today to receive our newsletter in your inbox

BizTech Magazine

Prompt Injection Attacks: The LLM Security Risk IT Leaders Must Address

What Is a Prompt Injection Attack?

Direct vs. Indirect Prompt Injection: How Attackers Target AI Agents

Agentic AI Tool Access and Elevated Privileges Raises the Stakes

The Defense Framework: Input Validation, Output Filtering and Least Privilege

Secure-by-Design LLM Architecture: Sandboxing, Monitoring and Incident Response

What Is a Prompt Injection Attack?

Direct vs. Indirect Prompt Injection: How Attackers Target AI Agents

Agentic AI Tool Access and Elevated Privileges Raises the Stakes

The Defense Framework: Input Validation, Output Filtering and Least Privilege

Secure-by-Design LLM Architecture: Sandboxing, Monitoring and Incident Response

More On

Related Articles

New Research from CDW on Workplace Friction