Vol. 1 · Ed. 2026
CyberGlossary
Entry № 027

Agentic AI Security

What is Agentic AI Security?

Agentic AI Security: The discipline of securing autonomous LLM agents that plan, call tools, and act on real-world systems, where prompt injection turns into remote code execution and excessive agency into actual blast radius.


Agentic AI security covers the controls, threat models, and runtime guardrails needed when large language models stop merely answering and start acting: issuing tool calls, browsing the web, writing to files, sending emails, or executing transactions. Unlike in a chat-only LLM, an agent's untrusted inputs (retrieved pages, tool outputs, multimodal content) flow directly into its next-step decisions, so a single indirect prompt injection can pivot into data exfiltration, account takeover, or destructive actions. Effective programs combine least-privilege tool scoping, allow-listed tools, sandboxed execution, isolated browsing contexts, structured output validation, human-in-the-loop checkpoints for high-impact actions, and detection of behavioral drift such as exfiltration patterns or out-of-policy tool sequences. As of 2025–2026, agentic AI security is among the fastest-growing areas of AI security work, driven by Anthropic's Claude tool use, OpenAI's Operator-class agents, and enterprise rollouts via agent runtimes built on the Model Context Protocol (MCP).
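As an illustration of allow-listed tools, least-privilege scoping, and human-in-the-loop checkpoints, the following is a minimal sketch rather than a reference implementation. The tool names, the `ToolPolicy` class, and the `check_tool_call` function are all hypothetical and do not come from any particular agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-tool policy: permitted argument names and approval flag."""
    allowed_args: frozenset
    needs_approval: bool = False


# Hypothetical allow-list: any tool not named here is rejected outright.
POLICY = {
    "search_docs": ToolPolicy(frozenset({"query"})),
    "read_file": ToolPolicy(frozenset({"path"})),
    "send_email": ToolPolicy(frozenset({"to", "subject", "body"}), needs_approval=True),
}


def check_tool_call(name, args, approved=False):
    """Return (allowed, reason) for a tool call proposed by the model."""
    policy = POLICY.get(name)
    if policy is None:
        return False, "tool not on allow-list"
    if not isinstance(args, dict):
        return False, "arguments must be a JSON object"
    # Structured validation: reject argument names the tool does not declare.
    unexpected = set(args) - policy.allowed_args
    if unexpected:
        return False, "unexpected arguments: " + ", ".join(sorted(unexpected))
    # High-impact tools stay blocked until a human has signed off.
    if policy.needs_approval and not approved:
        return False, "human approval required"
    return True, "ok"
```

In this sketch the check runs outside the model, so injected instructions cannot talk their way past it; the model can only propose calls, and the surrounding runtime decides whether they execute.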

Examples

  1. A purchasing agent reads an attacker-controlled vendor email containing hidden 'forward all invoices' instructions and tries to act on them.

  2. An engineering copilot agent is constrained to read-only git tools and a sandboxed shell, with destructive commands gated behind explicit human approval.
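The second example can be sketched roughly as follows. This is an assumption-laden illustration: the set of read-only git subcommands, the `classify_command` and `run_if_permitted` names, and the `approver` callback are hypothetical, and a string-level check like this complements a real sandbox rather than replacing it.

```python
import shlex

# Hypothetical set of git subcommands treated as read-only in this sketch.
READ_ONLY_GIT = frozenset({"status", "log", "diff", "show", "blame"})
# Characters that could chain or redirect commands; their presence forces review.
SHELL_METACHARS = frozenset(";|&><`$\n")


def classify_command(command):
    """Return 'allow', 'needs_approval', or 'deny' for a proposed shell command."""
    if any(ch in SHELL_METACHARS for ch in command):
        return "needs_approval"
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "deny"  # e.g. unbalanced quotes
    if not tokens:
        return "deny"
    if len(tokens) >= 2 and tokens[0] == "git" and tokens[1] in READ_ONLY_GIT:
        return "allow"
    # Everything else, including destructive commands, goes to a human.
    return "needs_approval"


def run_if_permitted(command, approver):
    """Return True if the command may run; `approver` is a human-approval callback."""
    decision = classify_command(command)
    if decision == "needs_approval" and approver(command):
        decision = "allow"
    return decision == "allow"
```

The design choice here is default-deny: only a short list of known read-only operations runs unattended, and anything the classifier does not recognize is routed to a person instead of being guessed at.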

Frequently asked questions

What is Agentic AI Security?

The discipline of securing autonomous LLM agents that plan, call tools, and act on real-world systems, where prompt injection turns into remote code execution and excessive agency into actual blast radius. It belongs to the AI & ML Security category of cybersecurity.

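How do you defend agentic AI systems?

Defences typically combine technical controls (least-privilege tool scoping, allow-listed tools, sandboxed execution, isolated browsing contexts, and structured output validation) with operational practices (human-in-the-loop checkpoints for high-impact actions and monitoring for behavioral drift), as detailed in the full definition above.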
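One of the defences named in the definition, detection of out-of-policy tool sequences, might look like the sketch below. It assumes a simple taint rule: once an agent session has read untrusted content, any later outbound action is flagged for review. The tool names and the `flag_out_of_policy` function are hypothetical.

```python
# Hypothetical tool categories for this sketch.
UNTRUSTED_SOURCES = frozenset({"read_email", "fetch_url"})
EGRESS_TOOLS = frozenset({"send_email", "http_post"})


def flag_out_of_policy(tool_sequence):
    """Return indices of egress calls that occur after any untrusted read."""
    tainted = False
    flagged = []
    for i, tool in enumerate(tool_sequence):
        if tool in UNTRUSTED_SOURCES:
            tainted = True  # session context now contains untrusted content
        elif tool in EGRESS_TOOLS and tainted:
            flagged.append(i)
    return flagged
```

Applied to the purchasing-agent example, a session that calls `read_email` and then `send_email` would have the second call flagged, which is the point at which a hidden 'forward all invoices' instruction would otherwise take effect.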

What are other names for Agentic AI Security?

Common alternative names include: LLM agent security, autonomous agent security.
