Model Context Protocol (MCP)
What is Model Context Protocol (MCP)?
Model Context Protocol (MCP): An open protocol introduced by Anthropic in late 2024 that standardizes how LLM clients connect to external tools, data sources, and prompts via servers, making MCP servers a primary security boundary for agentic AI.
Model Context Protocol (MCP) is an open specification first released by Anthropic in November 2024 and since widely adopted as a standard way to plug LLM clients (Claude Desktop, Claude Code, IDEs, agent runtimes) into external capabilities. An MCP server exposes a typed set of tools, resources, and prompts, while an MCP client mediates which servers a model can talk to and how. From a security perspective, MCP servers concentrate three high-risk powers in one place: they can read sensitive data (databases, files, Slack, Drive), trigger side effects (write APIs, deployments, payments), and inject text into the model's context window. This makes MCP a primary attack surface for agentic AI. Active concerns include malicious MCP servers, tool poisoning, indirect prompt injection via returned content, over-broad consent prompts, and missing capability scoping. Defensive practices include signed or curated server catalogs, per-tool capability scopes, human approval for write tools, output sanitization, and treating any MCP server you did not write as untrusted.
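Two of the defensive practices named above, per-tool capability scopes and human approval for write tools, can be sketched as a small client-side gate. This is a minimal illustration under stated assumptions, not part of the MCP specification or any SDK: `ToolPolicy`, `authorize`, the `ask_human` callback, and the server and tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical per-server policy; not defined by the MCP specification."""
    read_tools: frozenset = frozenset()   # tools allowed without a prompt
    write_tools: frozenset = frozenset()  # tools that need human approval


def authorize(
    policies: Mapping[str, ToolPolicy],
    server: str,
    tool: str,
    ask_human: Callable[[str, str], bool],
) -> bool:
    """Return True only if the tool call is in scope and, for writes, approved."""
    policy = policies.get(server)
    if policy is None:
        return False  # unknown server: deny by default
    if tool in policy.read_tools:
        return True
    if tool in policy.write_tools:
        # Side-effecting tool: defer to an explicit human decision.
        return bool(ask_human(server, tool))
    return False  # tool not listed for this server: deny


# Illustrative policy table; the server and tool names are assumptions.
POLICIES = {
    "filesystem": ToolPolicy(
        read_tools=frozenset({"read_file"}),
        write_tools=frozenset({"write_file"}),
    ),
}
```

The gate is deny-by-default: a server or tool that is not listed is refused, and a listed write tool still needs an explicit approval each time it is called.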
● Examples
- 01
Claude Desktop connects to a local 'filesystem' MCP server that exposes scoped read/write tools rooted at a specific project directory.
- 02
An attacker publishes a third-party MCP server that silently exfiltrates retrieved documents to a remote endpoint when used as part of an agent's tool chain.
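The directory scoping described in example 01 can be sketched as a path check that a filesystem-style server might run before any read or write. This is a minimal sketch, not the code of any actual MCP server; `resolve_within_root` and the directory name are hypothetical.

```python
from pathlib import Path


def resolve_within_root(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root`; reject anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below runs on the real target path.
    candidate = (root / requested).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes project root: {requested}")
    return candidate


# Hypothetical project directory used only for illustration.
PROJECT_ROOT = Path("demo_project")
```

A relative path such as `src/main.py` resolves inside the root, while `../outside.txt` or an absolute path such as `/etc/passwd` raises `PermissionError`. A check like this limits what a well-behaved server can touch; it does nothing against the malicious server in example 02, which has to be handled by vetting the server itself.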
● Frequently asked questions
What is Model Context Protocol (MCP)?
An open protocol introduced by Anthropic in late 2024 that standardizes how LLM clients connect to external tools, data sources, and prompts via servers, making MCP servers a primary security boundary for agentic AI. It belongs to the AI & ML Security category of cybersecurity.
How does Model Context Protocol (MCP) work?
An MCP server exposes a typed set of tools, resources, and prompts. An MCP client (Claude Desktop, Claude Code, an IDE, or an agent runtime) mediates which servers a model can talk to and how. Because a server can read sensitive data, trigger side effects, and inject text into the model's context window, every connected server sits inside the model's trust boundary.
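To make the client/server exchange concrete, the sketch below builds the kind of message a client sends to list and call tools. MCP messages are JSON-RPC 2.0, and the `tools/list` and `tools/call` method names follow the published specification as understood at the time of writing; check them against the current version. The `make_request` helper, the tool name, and its arguments are hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator for this sketch


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = make_request("tools/list")

# Invoke one tool by name; the name and arguments are illustrative only.
call_tool = make_request(
    "tools/call",
    {"name": "read_file", "arguments": {"path": "README.md"}},
)

print(json.dumps(call_tool, indent=2))
```

Whatever the server returns for such a call is placed in front of the model, which is why returned content has to be treated as untrusted input.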
How do you defend Model Context Protocol (MCP) deployments?
Defences combine technical controls and operational practices: signed or curated server catalogs, per-tool capability scopes, human approval for write tools, output sanitization, and treating any MCP server you did not write as untrusted.
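One defensive practice from the definition, a curated server catalog, can be approximated by pinning each reviewed server artifact to a SHA-256 digest and refusing anything that does not match. This is a minimal sketch under stated assumptions: the catalog format, `is_approved`, and the sample bytes are hypothetical, and a production catalog would normally rely on real signatures rather than bare digests.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(data).hexdigest()


def is_approved(catalog: dict, name: str, artifact: bytes) -> bool:
    """True only if `name` is in the catalog and the artifact matches its pin."""
    expected = catalog.get(name)
    if expected is None:
        return False  # server was never reviewed: treat as untrusted
    # compare_digest avoids leaking how much of the digest matched.
    return hmac.compare_digest(expected, sha256_hex(artifact))


# Stand-in for the bytes of a server package that someone has reviewed.
reviewed_bundle = b"example server bundle v1"
CATALOG = {"filesystem": sha256_hex(reviewed_bundle)}
```

With this catalog, the reviewed bundle is accepted, a modified bundle under the same name is rejected, and a server name that was never reviewed is rejected regardless of content.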
What are other names for Model Context Protocol (MCP)?
Common alternative names include: MCP, MCP protocol.
● Related terms
- ai-security № 731
MCP Attacks
Attacks that exploit the Model Context Protocol (MCP) to inject prompts, abuse tools, or pivot through servers an AI assistant trusts.
- ai-security № 027
Agentic AI Security
The discipline of securing autonomous LLM agents that plan, call tools, and act on real-world systems, where prompt injection turns into remote code execution and excessive agency into actual blast radius.
- ai-security № 1285
Tool-Use Injection
Attacks that manipulate an LLM agent's tool-calling layer — forging tool arguments, smuggling instructions through tool outputs, or coaxing the model into calling unsanctioned tools.
- ai-security № 586
Indirect Prompt Injection
A prompt-injection variant where malicious instructions are hidden inside third-party content (web pages, documents, emails) that an LLM later ingests through retrieval, browsing, or tool use.
- ai-security № 969
Prompt Injection
An attack that overrides an LLM's original instructions by smuggling adversarial text into the prompt, causing the model to ignore safeguards or execute attacker-chosen actions.
- ai-security № 689
LLM Guardrails
Mechanisms that constrain what an LLM-based application can input or output, enforcing safety, security, and business rules around the underlying model.