The Tool Description Is the Attack Surface, Not the Tool Call

In The MCP Problem: How Standardization Created a Monoculture Attack Surface, we documented three protocol-level vulnerabilities that follow directly from MCP’s design. The first — tool poisoning through metadata manipulation — warranted analysis beyond what a series post could contain. This companion post provides that analysis. Fault Lines Series 4 Post 2, Mapping the Agentic Attack Surface, covers the full MCP attack taxonomy — tool poisoning, tool shadowing, rug pulls, and cross-server escalation — as one of six compounding attack surface dimensions. This post examines a single mechanism within that taxonomy in depth: what arXiv:2601.07395 established about how the attack is constructed, why current monitoring architecture cannot detect it, and what the architectural fix requires.

Where enterprise security monitoring is watching

Enterprise security programs approaching agentic AI deployments bring a reasonable instinct: monitor the calls. In traditional API security, the invocation is the event. A service calls an endpoint; that endpoint is authenticated, authorized, and logged; the response is validated. The audit trail lives in the invocation record. This instinct is correct for traditional API integrations. In an MCP deployment, it monitors the wrong layer.

MCP introduces a phase that traditional API integrations do not have: tool discovery. Before an agent can invoke a tool, it must learn what tools are available. It does this by querying an MCP server and receiving back a list of capability descriptions — tool names, natural-language descriptions of what each tool does, parameter schemas, and usage annotations. The agent reads these descriptions as part of its reasoning context. It uses them to decide what tools to call, how to call them, and what to do with the results.

Security monitoring programs log what the agent called. They do not, as a rule, treat what the agent read during discovery as adversarial input. That asymmetry is the attack surface.

84.2%

Attack success rate achieved by MCP-ITP (arXiv:2601.07395) across 12 LLM agents by embedding adversarial instructions in tool metadata. Detection rate: 0.3%. The poisoned tool was never invoked. The entire attack lived in the description layer.

What MCP-ITP found

The MCP-ITP paper (arXiv:2601.07395) formalizes an attack class it calls implicit tool poisoning — and the word “implicit” is precise. The attack does not involve invoking a malicious tool. It does not require a malicious tool to be called at all. The attack lives in the description of a tool that is registered but, in the attack scenario, never actually invoked.

The mechanism is direct. An attacker registers a tool with a legitimate-sounding name and plausible description. Embedded within that description — in the natural language that explains what the tool does, what its parameters mean, how it should be used — are adversarial instructions. Those instructions are designed to influence the agent’s behavior toward a different, legitimate tool the agent is about to call. The poisoned tool serves as a delivery vehicle for instructions that execute against the target tool’s operation. The attack succeeds because the agent reads tool descriptions as trusted reasoning context, not as untrusted input.

MCP-ITP frames the construction of these poisoned tool descriptions as a black-box optimization problem: given a target tool and a target malicious objective, automatically craft the metadata that achieves that objective. The optimization is effective. Across 12 LLM agents — covering the major commercial and open-weight models in production use as of Q1 2026 — the attack achieves up to 84.2% attack success while suppressing detection to as low as 0.3%.

Research Finding

Security monitoring for agentic systems overwhelmingly focuses on tool invocations — what was called, with what parameters, and what it returned. Tool descriptions are treated as static configuration, reviewed once at registration and then ignored. MCP-ITP exploits exactly that gap: the attack lives in the configuration layer that defenders have stopped watching.

Why MCP’s architecture enables this

The attack succeeds because it exploits a trust assumption that is not a bug in any particular MCP implementation. It is a design characteristic of the protocol itself.

MCP’s tool discovery mechanism assumes that capability descriptions are honest. The protocol has no cryptographic mechanism by which an agent can verify that a server claiming to be a read-only file browser is actually constrained to read-only operations. When the agent receives a list of tool descriptions from an MCP server, it has no channel through which to verify that those descriptions accurately represent the tools’ behavior. It reads them as trusted context.

This is the first of three unenforced trust assumptions our monoculture analysis documented. The absence of capability attestation — the inability to verify that a server’s natural-language self-description is accurate — is not something an enterprise can fix by hardening its own MCP implementation. It is a characteristic of the specification. Every MCP deployment inherits it.

Capability Attestation Absence — The Protocol Gap Underlying Tool Poisoning

MCP servers self-describe their capabilities in natural language. There is no cryptographic mechanism by which an agent can verify that a server’s description accurately represents what it does or is permitted to do. A server can claim any capability set, include any instructions in its tool descriptions, and the protocol has no way to challenge it. This is an architectural characteristic of the MCP specification — not a flaw in any individual implementation — which means every deployment built on the protocol inherits the same gap simultaneously.

Capability attestation absence is what makes implicit tool poisoning possible. It is also what makes the attack class persistent: no implementation-level hardening by any individual enterprise can close a gap in the shared protocol specification.

The configuration layer defenders stopped watching

The 0.3% detection rate in MCP-ITP‘s findings is not primarily a detection technology failure. It is a monitoring architecture failure. The controls that would catch this attack class — anomaly detection on tool descriptions at registration, semantic analysis of metadata for embedded instructions, runtime consistency checks between a tool’s registered description and its observed behavior — do not exist in most enterprise MCP deployments because implicit tool poisoning has not been on the enterprise security radar.

The gap is a category error. Tool descriptions are treated, organizationally, as configuration: something reviewed by a human during the server onboarding process, validated at that moment, and then treated as stable. The security program operating under this model has exempted the description layer from continuous security monitoring — not deliberately, but because the model was inherited from a world where configuration was not an active attack surface.

The practical consequence is specific. An enterprise that has deployed a sophisticated SIEM with MCP-aware invocation logging, session tracking, and anomaly detection on call patterns has built a strong defense against the attack classes that monitoring is designed to catch. It has built no defense against implicit tool poisoning, because the attack never generates an invocation event to log. The tool whose description contains the adversarial instructions is never called. Nothing in the invocation audit trail records the moment the agent read and processed those instructions.

Current monitoring posture

Monitor the invocation layer

Tool invocations are logged. Authentication and authorization are verified on each call. Anomaly detection flags unusual call patterns, high-frequency invocations, and unauthorized parameter combinations. The audit trail lives in the invocation record.

This is the correct posture for traditional API security. In an MCP deployment, it is necessary and insufficient. Implicit tool poisoning generates no invocation event to flag.

Necessary — not sufficient

What the research requires

Treat descriptions as adversarial input

Tool descriptions are validated at registration as potentially adversarial input, not as honest configuration. Schema binding is enforced at registration with cryptographic commitment. Runtime consistency checks verify that tool behavior matches its registered description. The description layer is monitored continuously.

Registration is an adversarial input event. Description changes are security events. The monitoring boundary extends from invocations upward to the discovery phase.

Structural extension required

Three architectural extensions enterprise architects must make

Three extensions follow from the MCP-ITP findings. Each addresses the description layer that current monitoring architecture does not reach.

First, treat tool registration as an adversarial input event. The onboarding of a new MCP server — and specifically, the ingestion of its tool descriptions — should be subject to the same scrutiny as any untrusted input entering an enterprise system. Natural-language descriptions should be analyzed for embedded instructions. Schema annotations should be reviewed for adversarial patterns, not only for structural validity. The moment a tool description enters the enterprise environment is a security event, not a configuration task.

Second, enforce schema binding at registration with runtime verification. The Context-Aware Broker Protocol dispatch documents the CABP architecture (arXiv:2603.13417) and its six-stage pipeline. The extension that tool poisoning requires is a description-layer commitment mechanism: cryptographic binding of the registered tool description at onboarding, with consistency verification at each invocation. If the description at runtime does not match the description at registration, the invocation should not proceed. This closes implicit tool poisoning (which exploits the initial description) and the rug pull vulnerability documented in the OWASP MCP security analysis — a trusted server silently updating its tool descriptions after onboarding — simultaneously. Both attacks exploit the same absence: a cryptographic commitment to what a tool is. The ETDI framework (arXiv:2506.01333) provides the strongest published implementation of this mechanism, achieving a 76% relative reduction in attack success with 8.3ms latency overhead — the empirical basis for why schema binding is the right architectural direction. What MCP-ITP adds to that picture is the explanation of why this mechanism is necessary: not as a defense against a theoretical threat class, but against an attack that achieves 84.2% success and 0.3% detection against every unprotected deployment regardless of model or framework.

Third, distinguish between reviewing descriptions and monitoring them. Human review at onboarding is necessary. It is not sufficient as a continuous control. Tool descriptions that pass initial review may later be updated — by a legitimate server update, by a compromised server, or by a supply chain compromise upstream of the server itself. The monitoring program must treat description changes as security events requiring review, not as configuration updates handled through the next scheduled audit cycle.

The tools that would catch this attack — anomaly detection on tool descriptions, semantic analysis of metadata at registration, runtime consistency checks between description and observed behavior — do not exist in most enterprise MCP deployments because the attack class itself has not been on the enterprise security radar.

— Luminity Digital synthesis of arXiv:2601.07395 (MCP-ITP) findings

The monoculture implication

MCP-ITP achieves consistent results across 12 LLM agents precisely because the vulnerability is not in any agent’s implementation. It is in the protocol’s trust model. Different agents, different architectures, different safety training regimes — all susceptible to the same attack class at comparable success rates, because they all rely on the same tool discovery mechanism that assumes capability descriptions are honest.

This is the monoculture consequence our earlier analysis predicted: when a single protocol’s trust assumptions become universal, its blind spots become universal simultaneously. An enterprise that deploys a well-hardened custom agent on a carefully configured MCP server is as susceptible to implicit tool poisoning as an enterprise that deploys a commodity agent with minimal configuration, if both depend on the same discovery mechanism. The vulnerability is in the shared specification, not in either deployment’s individual choices.

The architectural response is therefore not agent-specific. It is infrastructure. The description-layer monitoring, schema binding, and registration controls identified above need to live at the broker layer — in the CABP-style enforcement infrastructure that sits between the agent and its MCP servers — precisely because that is the only position where the control can be applied consistently regardless of which agent uses which tools. An enterprise that implements these controls at the agent layer has implemented them for one agent. An enterprise that implements them at the broker layer has implemented them for its entire MCP deployment.

The monoculture created a shared attack surface. The broker layer is where the shared defense must live.

VEC Adversarial instructions embedded in tool metadata (description, parameter names, schema annotations)
TGT Legitimate tool invoked by the agent — not the poisoned tool itself
GAP MCP capability attestation absence — no verification that descriptions are accurate
84% Attack success rate across 12 LLM agents (arXiv:2601.07395)
0.3% Detection rate under current monitoring posture
FIX Schema binding at registration + continuous description monitoring at the broker layer

Fault Lines Post 2 — Mapping the Agentic Attack Surface — Full taxonomy of the MCP attack surface across six compounding dimensions, including tool poisoning, tool shadowing, rug pulls, and cross-server escalation. Read this first for the breadth picture.

luminitydigital.com →

The MCP Problem: How Standardization Created a Monoculture Attack Surface — Three protocol-level vulnerabilities. The foundational analysis.

luminitydigital.com →

What OWASP’s MCP Security Guide Gives You — And What It Can’t — The minimum bar, and the five alignment-grade capabilities that lie above it.

luminitydigital.com →

What Structural Enforcement at the MCP Layer Actually Looks Like — The CABP six-stage pipeline. Practitioner validation from Srinivasan (arXiv:2603.13417).

luminitydigital.com →

Governance Without Architecture — A close reading of Google Cloud’s MCP security documentation, and what its controls cannot reach.

luminitydigital.com →

Code Mode Doesn’t Fix the Trust Layer — Why Block’s programmatic MCP approach skips the broker layer the protocol requires.

luminitydigital.com →

The Tool Description Is
the Attack Surface,
Not the Tool Call

Where enterprise security monitoring is watching

What MCP-ITP found

Why MCP’s architecture enables this

Capability Attestation Absence — The Protocol Gap Underlying Tool Poisoning

The configuration layer defenders stopped watching

Monitor the invocation layer

Treat descriptions as adversarial input

Three architectural extensions enterprise architects must make

The monoculture implication

Agentic AI Security — Structured Research for Enterprise Architects

Like this:

Related

The Tool Description Isthe Attack Surface,Not the Tool Call

Where enterprise security monitoring is watching

What MCP-ITP found

Why MCP’s architecture enables this

Capability Attestation Absence — The Protocol Gap Underlying Tool Poisoning

The configuration layer defenders stopped watching

Monitor the invocation layer

Treat descriptions as adversarial input

Three architectural extensions enterprise architects must make

The monoculture implication

Agentic AI Security — Structured Research for Enterprise Architects

Share this:

Like this:

Related

The Tool Description Is
the Attack Surface,
Not the Tool Call