Indirect Prompt Injection
What is Indirect Prompt Injection?
Indirect Prompt InjectionA prompt-injection variant where malicious instructions are hidden inside third-party content (web pages, documents, emails) that an LLM later ingests through retrieval, browsing, or tool use.
Indirect prompt injection — described in detail by Greshake et al. (2023) — does not require the attacker to talk to the model directly. Instead, the attacker plants instructions inside a resource the LLM is expected to consume: a webpage summarized by an agent, a PDF parsed by a RAG pipeline, an email read by a copilot, or even alt-text on an image. When the model concatenates that content into its context, it may follow the embedded instructions, leak conversation history, call tools, or exfiltrate data via crafted URLs. Defences include content sandboxing, retrieval allow-listing, segregating data from instructions, output egress controls, and human approval gates for sensitive tool calls.
● Examples
- 01
A resume PDF containing white-on-white text that instructs a hiring copilot to recommend the candidate.
- 02
A web page that, when summarized by an AI browser agent, tells it to send the user's emails to an attacker URL.
● Frequently asked questions
What is Indirect Prompt Injection?
A prompt-injection variant where malicious instructions are hidden inside third-party content (web pages, documents, emails) that an LLM later ingests through retrieval, browsing, or tool use. It belongs to the AI & ML Security category of cybersecurity.
What does Indirect Prompt Injection mean?
A prompt-injection variant where malicious instructions are hidden inside third-party content (web pages, documents, emails) that an LLM later ingests through retrieval, browsing, or tool use.
How does Indirect Prompt Injection work?
Indirect prompt injection — described in detail by Greshake et al. (2023) — does not require the attacker to talk to the model directly. Instead, the attacker plants instructions inside a resource the LLM is expected to consume: a webpage summarized by an agent, a PDF parsed by a RAG pipeline, an email read by a copilot, or even alt-text on an image. When the model concatenates that content into its context, it may follow the embedded instructions, leak conversation history, call tools, or exfiltrate data via crafted URLs. Defences include content sandboxing, retrieval allow-listing, segregating data from instructions, output egress controls, and human approval gates for sensitive tool calls.
How do you defend against Indirect Prompt Injection?
Defences for Indirect Prompt Injection typically combine technical controls and operational practices, as detailed in the full definition above.
What are other names for Indirect Prompt Injection?
Common alternative names include: Cross-domain prompt injection, Stored prompt injection.
● Related terms
- ai-security№ 866
Prompt Injection
An attack that overrides an LLM's original instructions by smuggling adversarial text into the prompt, causing the model to ignore safeguards or execute attacker-chosen actions.
- ai-security№ 898
RAG Security
The discipline of securing retrieval-augmented generation pipelines so that the documents, vector stores, and retrieval steps that feed an LLM cannot be poisoned, abused, or used to exfiltrate data.
- ai-security№ 030
AI Jailbreak
A technique that causes an aligned AI model to bypass its safety policies and produce content or behaviour the operator intended to forbid.
- ai-security№ 777
OWASP LLM Top 10
An OWASP-maintained list of the ten most critical security risks affecting applications that build on large language models.
- ai-security№ 034
AI Supply Chain Risk
The set of threats arising from the third-party datasets, base models, libraries, plug-ins, and infrastructure that organisations combine to build and deploy AI systems.
- ai-security№ 618
LLM Guardrails
Mechanisms that constrain what an LLM-based application can input or output, enforcing safety, security, and business rules around the underlying model.
● See also
- № 1163Token Smuggling
- № 657MCP Attacks
- № 619LLM System Prompt Leak