Post 1 in this series established the foundational problem: safety alignment does not transfer from text generation to tool-call execution. This post examines the infrastructure through which that gap is most dangerously exploited. The Model Context Protocol — MCP — is now the dominant standard connecting AI agents to tools, data sources, and external systems. Seven papers in our review of 49+ arXiv publications from Q1 2026 target MCP specifically. The primary sources are Breaking the Protocol (arXiv:2601.17549), MCP-ITP (arXiv:2601.07395), SMCP: Secure Model Context Protocol (arXiv:2602.01129), MCPShield (arXiv:2602.14281), Security Threat Modeling for Emerging AI-Agent Protocols (arXiv:2602.11327), Real Faults in MCP Software (arXiv:2603.05637), and Beyond Max Tokens (arXiv:2601.10955).
Standards solve problems. The Model Context Protocol emerged in late 2024 to solve a genuine one: every AI agent framework had its own proprietary method for connecting to tools and data sources, and the resulting fragmentation made interoperability nearly impossible. MCP gave the ecosystem a common language — a universal interface through which any agent could discover, invoke, and communicate with any compliant tool server.
The adoption was unusually rapid. Within months of release, MCP had become the default integration layer for the major agent frameworks — LangChain, LlamaIndex, AutoGen, Claude, and dozens of commercial platforms. Thousands of MCP server implementations appeared in public repositories. Enterprise deployments followed. By early 2026, MCP was not one option among many. It was the infrastructure.
That is precisely when the security research community started paying close attention. Because what MCP had created, alongside genuine coordination value, was a monoculture: a single protocol whose architectural decisions, trust assumptions, and design flaws would propagate simultaneously to every deployment built on top of it. In traditional software security, monocultures are known to be dangerous. The agentic AI ecosystem had built one at remarkable speed.
Minimum amplification in attack success rate when MCP is present versus non-MCP integrations, rising to 41% under cross-server configurations — established empirically across 847 attack scenarios by the Breaking the Protocol paper (arXiv:2601.17549). MCP does not merely expose agents to attack. It makes existing attacks meaningfully more effective.
What MCP Actually Does — and What It Assumes
To understand the security problem, it helps to understand what MCP is doing architecturally. The protocol enables three core interactions: tool discovery, tool invocation, and sampling. In tool discovery, an agent queries an MCP server to learn what tools are available and what they do — receiving back a list of capability descriptions in natural language. In tool invocation, the agent calls a tool by name with a set of parameters and receives a result. In sampling, an MCP server can request that the agent generate a completion — effectively allowing the server to prompt the agent from within the agent’s own workflow.
Each of these interactions embeds a trust assumption that the 2026 research corpus identifies as a vulnerability. Tool discovery assumes that capability descriptions are honest. Tool invocation assumes that the server calling itself “FileManager” is the FileManager the agent thinks it is. Sampling assumes that server-initiated prompts should be treated with the same trust as user-initiated prompts. None of these assumptions are enforced at the protocol level. They are assumptions about server behavior that MCP takes on faith.
MCP’s Three Unenforced Trust Assumptions
Capability attestation absence. MCP servers self-describe their capabilities in natural language. There is no cryptographic mechanism by which an agent can verify that a server claiming to be a read-only file browser is actually constrained to read-only operations. A server can claim any capability set and the protocol has no way to challenge it.
Bidirectional sampling without origin authentication. The sampling feature — which allows MCP servers to prompt the agent — was designed for legitimate use cases like iterative refinement. It also creates a path by which a compromised or malicious server can inject instructions into the agent’s reasoning stream without the user’s knowledge or consent.
Implicit trust propagation. In multi-server configurations, trust granted to one MCP server implicitly extends through shared context. An agent that grants elevated trust to a calendar server may have that trust exploited by a separately compromised email server operating in the same workflow.
Three Protocol-Level Vulnerabilities
The Breaking the Protocol paper (arXiv:2601.17549) is the most comprehensive security analysis of MCP’s architectural design published to date. Testing 847 attack scenarios across five MCP server implementations, it identifies three vulnerability classes that are structural — meaning they follow directly from how MCP was designed, not from bugs in any particular implementation.
Vulnerability one: tool poisoning through metadata manipulation
The most counterintuitive attack in the MCP corpus does not involve invoking a malicious tool at all. The MCP-ITP paper (arXiv:2601.07395) describes implicit tool poisoning: embedding adversarial instructions in the metadata of a registered tool — its description, its parameter names, its schema annotations — such that the agent reads those instructions during tool discovery and executes them against a different, legitimate tool. The poisoned tool is never called. Its metadata is the attack surface.
The MCP-ITP framework formulates poisoned tool generation as a black-box optimization problem, automatically crafting metadata that achieves the attacker’s objective. Across 12 LLM agents, it achieves up to 84.2% attack success while suppressing detection to as low as 0.3%. The attack is effective precisely because existing defenses focus on what tools are invoked, not on what tool descriptions say.
Why Metadata Is the Blindspot
Security monitoring for agentic systems overwhelmingly focuses on tool invocations — what was called, with what parameters, and what it returned. Tool descriptions are treated as static configuration, reviewed once at registration and then ignored. MCP-ITP exploits exactly that gap: the attack lives in the configuration layer that defenders have stopped watching.
Vulnerability two: server-side prompt injection via sampling
MCP’s sampling feature was designed to allow tool servers to request agent completions as part of legitimate multi-step workflows. The security consequence is that any MCP server — including a compromised one — can inject instructions into the agent’s context at a point where the agent is least likely to apply suspicion. The instructions arrive not as user input, which safety training treats with some wariness, but as tool infrastructure, which the agent is trained to trust.
The Breaking the Protocol paper documents this as one of the three structural vulnerabilities in MCP’s design. Unlike tool poisoning, which requires pre-positioning malicious metadata, sampling-based injection can be triggered dynamically in response to agent behavior — making it adaptive to the specific workflow the agent is executing at the time of attack.
Vulnerability three: cross-server trust propagation
Enterprise MCP deployments rarely connect agents to a single server. A typical agentic workflow might involve a calendar server, a document management server, an email server, and a CRM server operating in parallel. MCP’s design does not enforce trust boundaries between these servers. Context established in one server’s interaction — including permission grants, identity claims, and behavioral precedents — can be read and exploited by other servers sharing the same agent session.
The Security Threat Modeling for Emerging AI-Agent Protocols paper (arXiv:2602.11327) extends this analysis across four protocols — MCP, Google’s A2A, Agora, and ANP — and finds that cross-server trust propagation is a shared vulnerability across all of them. The problem is not a bug in MCP’s implementation. It is a consequence of designing protocols for capability and convenience without a formal trust model.
MCP amplifies attack success rates by 23 to 41 percent compared to non-MCP integrations. The protocol does not create new attack categories. It makes existing attacks more effective, more reliable, and harder to detect.
— Breaking the Protocol: Security Analysis of the MCP Specification (arXiv:2601.17549)The Monoculture Consequence
Each of the three vulnerabilities above would be concerning in isolation. What makes the MCP situation genuinely alarming is the monoculture dynamic. When a single protocol governs the entire ecosystem, a flaw in that protocol is not one organization’s problem. It is the field’s problem, simultaneously, at scale.
The Real Faults in MCP Software paper (arXiv:2603.05637) provides the empirical foundation for this concern. Analyzing 385 MCP server repositories with 30,795 closed issues, the researchers derived a five-category fault taxonomy from real-world deployments. The finding that matters most is not any individual fault category — it is the distribution pattern. Because MCP servers are built against a common specification, the same fault patterns appear across independent implementations. A design decision in the MCP specification propagates as a shared vulnerability to every server built to that specification.
This is the monoculture risk in its precise form. It is not that MCP is badly designed — it is a functional, well-specified protocol. It is that adopting a universal standard means adopting universal vulnerability. In traditional software, this dynamic is well-understood: the security community has spent decades warning against it in operating systems, cryptographic libraries, and network protocols. The agentic AI ecosystem is learning the same lesson, compressed into a much shorter timeline.
A Novel Attack Class: Economic Denial of Service
The Beyond Max Tokens paper (arXiv:2601.10955) surfaces a vulnerability category that the MCP security literature had not previously formalized: stealthy economic denial of service at the tool layer. The attack exploits MCP-compatible tool server fields to expand tasks into trajectories exceeding 60,000 tokens, using Monte Carlo Tree Search to optimize the inflation path.
The consequences are striking. Cost inflation reaches up to 658 times the baseline. Energy consumption increases 100 to 560 times. GPU KV cache occupancy rises from under 1% to between 35% and 74%. Critically, the attack achieves this while keeping task outcomes correct — the agent completes the assigned work, so conventional quality monitoring does not flag the attack. The only signal is resource consumption, and in cloud deployments where resource consumption is often abstracted from the application layer, that signal may not reach the people who could act on it.
Why MCP Won
Before MCP, every agent framework maintained proprietary tool integration methods. Building an agent that could work across multiple tool ecosystems required custom adapters for each one. MCP eliminated that friction — one protocol, one integration pattern, universal compatibility.
The coordination value is real and was the right problem to solve. The ecosystem needed a standard. MCP delivered one, and adoption reflected genuine utility.
Coordination SolvedWhat Monoculture Means in Practice
Every organization that adopted MCP inherited the same three structural vulnerabilities. Every new server implementation built to the specification embeds the same unenforced trust assumptions. A flaw discovered in the MCP protocol specification is not one vendor’s patch — it is an ecosystem-wide retrofit.
The 23–41% attack amplification documented by the research is not a per-deployment statistic. It is a property of the protocol itself, applying to every deployment simultaneously.
Security Debt AccumulatedWhat the Defense Research Proposes
The 2026 corpus does not stop at diagnosis. Two papers propose substantive architectural responses to MCP’s security gaps, and their approaches are instructive precisely because they disagree about where the fix belongs.
The SMCP: Secure Model Context Protocol paper (arXiv:2602.01129) argues the fix belongs at the protocol layer. SMCP proposes a security-enhanced variant of MCP adding unified identity management, mutual authentication, continuous security context propagation, fine-grained policy enforcement through ABAC and RBAC, and comprehensive audit logging. A Trusted Component Registry ensures only verified participants can join agent-tool interactions. The proposal maps explicitly to NIST CSF and OWASP AI Top 10. In testing, SMCP reduces attack success from 52.8% to 12.4%.
The MCPShield paper (arXiv:2602.14281) takes a different position: rather than modifying the protocol, add a security cognition layer that operates above it. MCPShield intercepts agent-server interactions across three phases — pre-invocation metadata probing, runtime behavioral monitoring, and post-invocation consistency evaluation — and applies adaptive trust calibration based on observed behavior. It demonstrates strong generalization across six novel MCP attack scenarios without requiring any changes to the underlying protocol or existing server implementations.
SMCP requires retrofitting the protocol and every server built to it. MCPShield requires no protocol changes but adds a monitoring layer that itself becomes a security dependency. Neither approach is free. The 2026 research establishes that the cost of doing nothing — inheriting 23–41% attack amplification across every MCP deployment — is higher than either.
