Governance Without Architecture: A Close Reading of Google Cloud’s MCP Security Documentation — Luminity Digital
Industry Analysis  ·  Vendor Response
MCP Security  ·  Agentic AI Risk

Governance Without Architecture

Google Cloud published detailed MCP security documentation in March 2026. It is professionally produced, technically accurate, and, for any enterprise planning an agentic deployment, essential reading. It is also — read carefully — a precise map of the structural security problems that governance controls alone cannot solve.

March 2026  ·  Tom M. Gomez  ·  20 Min Read

In the Where Agentic AI Breaks series and its successor Building Defensible Agents, we spent eight posts establishing a single structural argument: that the security problems in agentic AI are not primarily implementation failures — they are protocol-level architectural failures that governance and operational controls can mitigate but cannot resolve. Google Cloud’s MCP security documentation, published in March 2026, is a significant and welcome contribution to the field. It also confirms our argument in ways its authors may not have intended.

Credit, first, where it is due. When a major cloud vendor publishes dedicated security guidance for a protocol that is still Pre-GA, that is a meaningful signal of institutional seriousness. Google Cloud’s MCP security documentation covers agent identity, IAM least-privilege configuration, Model Armor prompt and response scanning, organization-level policy controls, a taxonomy of agent operating modes and their associated risks, and concrete PII de-identification templates. Compared with the security posture of most MCP integration guides currently circulating in the developer ecosystem, this is substantive work. It deserves careful engagement.

We have engaged with it carefully. And the closer you read it — with the research corpus from the Where Agentic AI Breaks series as a lens — the more clearly you can see the outline of what these controls cannot address. Not because the documentation is incorrect. Because the protocol’s architecture is.

There is a recognizable pattern in security documentation from infrastructure providers: guidance that is entirely correct at the layer where it operates, but that operates at the wrong layer. The controls are sound. The problem they cannot reach is structural. Google Cloud’s MCP security guidance exhibits this pattern with unusual clarity, and that clarity — the precision with which the documentation describes the limits of what it can offer — is itself informative. What the documentation cannot say is, in many cases, more important than what it does.

What the Documentation Gets Right

The foundation is the documentation’s explicit taxonomy of agent operating modes. Google draws a clear distinction between Human-in-the-Middle operation — where an agent suggests actions for human approval before execution — and Agent-Only operation, where the agent acts autonomously. The documentation is honest about the associated risk differential: human oversight reduces but does not eliminate risk in the former mode, and Agent-Only operation “relies entirely on the agent’s programming” and is “vulnerable to prompt injection, insecure tool chaining, and naive error handling.” This is an accurate framing, and it is more candid about the limitations of autonomous operation than most vendor documentation provides.
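The risk differential between the two modes reduces to a single structural question: is there a checkpoint between the agent proposing an action and that action executing? A minimal sketch, using illustrative names rather than anything from Google's documentation, makes the distinction concrete:

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable

class OperatingMode(Enum):
    HUMAN_IN_THE_MIDDLE = auto()  # agent proposes, human approves before execution
    AGENT_ONLY = auto()           # agent executes autonomously

@dataclass
class ProposedAction:
    tool: str
    args: dict

def execute(action: ProposedAction,
            mode: OperatingMode,
            approve: Callable[[ProposedAction], bool],
            run: Callable[[ProposedAction], str]) -> str:
    """Route a proposed action through the mode-appropriate gate."""
    if mode is OperatingMode.HUMAN_IN_THE_MIDDLE:
        if not approve(action):
            return "rejected"
        return run(action)
    # AGENT_ONLY: no human checkpoint remains; safety rests entirely
    # on the agent's own behavior under whatever input it has absorbed
    return run(action)
```

In Agent-Only mode the `approve` callback is simply never consulted, which is the documentation's point: everything then depends on the agent's programming holding up under injection.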

The IAM least-privilege guidance is operationally sound. Creating dedicated service account identities for agents, scoping permissions to the minimum required for specific tasks, and using workload identity federation for non-Google Cloud deployments are all defensible practices that practitioners should implement regardless of any protocol-level concerns. The instruction to “follow the principle of least privilege” is correct and insufficient in equal measure — correct in its own right, insufficient as a complete security posture, as we will establish.
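The shape of least privilege is deny-by-default: an agent identity carries an explicit, minimal grant set, and anything outside it is refused. A toy sketch of that evaluation logic follows; the service account name is invented, though the two permission strings are real Google Cloud permission identifiers used here only as examples:

```python
# Deny-by-default permission evaluation for a dedicated agent identity.
# The principal name is illustrative; the permission strings are real
# GCP identifiers used as examples of a deliberately narrow grant.
AGENT_GRANTS: dict[str, set[str]] = {
    "svc-invoice-agent@example.iam.gserviceaccount.com": {
        "storage.objects.get",   # read input documents only
        "bigquery.jobs.create",  # run queries for its one workflow
    },
}

def is_authorized(principal: str, permission: str) -> bool:
    """Least privilege: anything not explicitly granted is denied."""
    return permission in AGENT_GRANTS.get(principal, set())
```

The logic is trivially simple, which is the point made above: it is correct, and it says nothing about whether a permitted operation was invoked for the task the operator intended.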

Model Armor integration is a genuine capability addition. Scanning prompts and responses for malicious URLs, prompt injection attempts, and jailbreak patterns is a meaningful defensive layer. The documentation recommends enabling it at medium confidence and above. This is a reasonable starting threshold, and the significance of that “medium and above” qualifier will become apparent shortly.

Pre-GA

The entirety of Google Cloud’s MCP security documentation — including its IAM controls, Model Armor integration, and org-level policy framework — rests on a product carrying an explicit Pre-GA disclaimer: available “as is,” with limited support, subject to the “Pre-GA Offerings Terms.” Enterprises are being asked to build governance infrastructure on ground that Google itself has not yet stabilized.

The Protocol Tells on Itself

The most revealing passage in Google Cloud’s security documentation is not in the threat model section. It is in the table describing mitigations for malicious MCP tool use. Under the scenario heading “Dynamic tools,” the documentation states plainly that trusted MCP servers can silently add new tools, and that an agent might automatically gain access to a new capability without any approval from the operators who deployed it.

Read that slowly. A trusted server — one you have vetted, enabled, and integrated — can add new capabilities to your agent without your knowledge. The mitigation Google recommends for this is periodic manual review of the tool list, combined with an option to configure specific tool allowlists where the client application supports it.

This is not a documentation failure. Google is not being evasive. The documentation is describing an architectural reality of the MCP specification: the protocol provides no versioned capability manifest, no cryptographic schema binding that would allow a client to detect unauthorized tool additions, and no session-scoped capability lock that would prevent a server from silently expanding the agent’s authority surface during an active session. The mitigation being offered is operational — because the protocol does not permit a structural one.
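What would a structural control look like here, if the protocol permitted one? A client could pin a cryptographic fingerprint of each tool's schema at vetting time and refuse, rather than merely report, any silent addition or mutation. The sketch below is an assumption about how such pinning could work client-side; nothing in the MCP specification or Google's documentation defines it:

```python
import hashlib
import json

def schema_fingerprint(tool: dict) -> str:
    """Canonical SHA-256 over a tool's name, description, and input schema."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def pin(tools: list[dict]) -> dict[str, str]:
    """Record fingerprints for the tool list as vetted by the operator."""
    return {t["name"]: schema_fingerprint(t) for t in tools}

def diff_manifest(pinned: dict[str, str], current: list[dict]) -> tuple[list[str], list[str]]:
    """Detect silent additions or mutated schemas relative to the pinned set."""
    added, changed = [], []
    for tool in current:
        fingerprint = schema_fingerprint(tool)
        if tool["name"] not in pinned:
            added.append(tool["name"])
        elif pinned[tool["name"]] != fingerprint:
            changed.append(tool["name"])
    return added, changed
```

A client enforcing this would fail closed on any non-empty diff instead of waiting for a periodic human review. That the mitigation must be built in every client, rather than guaranteed by the protocol, is exactly the gap under discussion.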

When the recommended mitigation for a server silently expanding your agent’s capabilities is “check the list periodically,” the documentation has implicitly acknowledged that the protocol cannot close the gap. It can only ask a human to watch it.

— Luminity Digital analysis of Google Cloud MCP security documentation, March 2026

In the Where Agentic AI Breaks series Post 2 — on MCP monoculture risk — we established that the protocol’s rapid adoption as a universal standard amplifies the impact of any protocol-level flaw across the entire ecosystem simultaneously. The dynamic tool problem is precisely this kind of protocol-level flaw: it is not a bug in Google’s implementation; it is a characteristic of the MCP specification itself. Every deployment of every MCP server inherits it.

The Instruction-Data Problem Is an Opt-In Defense

Google’s documentation includes guidance on protecting agents against prompt injection that is, in isolation, technically correct. Developers are advised to treat user-provided content and database-derived content as data to be analyzed rather than instructions to be executed. The guidance recommends using strong delimiters — XML tags — and explicit instructions to the model that it must never treat enclosed content as commands. The example provided is clear and well-constructed.

Here is the problem: this is developer guidance. It is advice to practitioners building on MCP. It is not protocol enforcement. It is not enforced by the MCP specification at the transport layer. It is not enforced by Google Cloud’s server infrastructure. It is enforced only when individual developers implement it correctly, in every integration they build, on every prompt template they write, and in every context where an agent retrieves or processes external content.

The Model Context Protocol, by design, delivers tool results, retrieved content, and user messages to the agent as an undifferentiated context stream. There is no structural separation between trusted instructions and untrusted data at the protocol layer — because the protocol does not have the architectural concept of a trust boundary within the context window. As we established in Where Agentic AI Breaks Post 1, drawing on the Trustworthy Agentic AI Requires Deterministic Architectural Boundaries paper (arXiv:2602.09947), this is what the command-data boundary collapse looks like: not an absence of good advice, but an absence of architecture that would make the advice enforceable.
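To see why delimiter discipline is fragile, consider what a careful implementation of Google's delimiter advice actually has to do. It is not enough to wrap retrieved content in tags; the wrapper must also neutralize any copies of the delimiter embedded in the content itself, or injected text can "close" the block early and land outside the boundary. A minimal sketch, with an invented tag name:

```python
import re

def wrap_untrusted(content: str, tag: str = "untrusted_data") -> str:
    """Wrap retrieved content in delimiters the system prompt declares inert.

    Strips any embedded copies of the delimiter so injected text cannot
    close the block prematurely and smuggle instructions outside it.
    """
    sanitized = re.sub(rf"</?{tag}>", "", content)
    return f"<{tag}>\n{sanitized}\n</{tag}>"
```

Every retrieval path in every integration must route through something like this, correctly, forever. Miss one path, or forget the stripping step, and the boundary evaporates, because nothing below the application layer is checking.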

The Enforcement Gap

Google’s instruction-data separation guidance asks developers to build, in their application layer, a boundary that the MCP protocol cannot enforce structurally. Any developer who implements the pattern incorrectly — uses insufficiently strong delimiters, fails to account for nested content, or misses a retrieval path — inherits the full prompt injection exposure. Any developer who does not implement it at all inherits the same exposure. The protocol provides no fallback, no warning, and no structural detection.

The guidance is correct. The reliance on its universal, correct implementation as a security control is the problem. A security posture that depends on every developer getting every integration right, every time, is a probabilistic posture — not an architectural one.

What IAM Cannot Know

The IAM least-privilege model is the cornerstone of Google’s access control guidance, and it is the right cornerstone for what it can do. IAM controls what a given identity — an agent’s service account — is authorized to do on which resources. It is a well-understood, well-enforced, and institutionally mature access control system. None of that changes the fundamental limitation that IAM tracks identity, not intent.

Under adversarial conditions — specifically, prompt injection through a tool result, indirect injection via retrieved content, or tool description manipulation — the agent is not the one choosing to take a harmful action. A compromised agent is, from IAM’s perspective, a correctly credentialed principal executing permitted operations. The service account’s permissions have not changed. The operations being invoked are within scope. The policy is being followed. The harm is occurring inside the permission envelope.

This is not a flaw in IAM. It is a flaw in the assumption that identity-level controls can substitute for intent-level controls in autonomous agentic operation. The SEAgent framework (arXiv:2601.11893), which we covered in Building Defensible Agents Post 2, addresses exactly this gap: it applies mandatory access controls not to identities but to operation categories — preventing privilege escalation patterns regardless of which identity initiates them. IAM has no equivalent concept because it was designed for a world where the principal is a human with stable intent, not an agent whose intent can be overridden mid-session by injected instructions.
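The difference between the two control models can be sketched directly. The category names and rules below are illustrative assumptions inspired by the operation-category idea, not a reproduction of SEAgent's actual design: the gate classifies the operation, not the caller, so an escalation-class action is refused no matter which credentialed identity requests it.

```python
# Operation-category gate: policy attaches to what is being done,
# not to who is doing it. Categories and rules are illustrative.
OPERATION_CATEGORY: dict[str, str] = {
    "storage.read": "read",
    "storage.write": "write",
    "iam.grant_role": "privilege_change",
    "serviceaccount.create_key": "privilege_change",
}

BLOCKED_CATEGORIES = {"privilege_change"}  # denied for every identity

def category_allows(operation: str) -> bool:
    """Deny escalation-class operations regardless of the principal.

    Unknown operations are denied rather than assumed safe.
    """
    category = OPERATION_CATEGORY.get(operation)
    return category is not None and category not in BLOCKED_CATEGORIES
```

An injected agent holding a perfectly valid service account still cannot cross this gate, because the gate never asks who is calling.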

The Injected Agent Problem

Model Armor at medium confidence scans prompts and responses for injection patterns. Sophisticated injection attacks — including the RL-optimized adversarial suffixes demonstrated by AutoInject (arXiv:2602.05746), which successfully compromised multiple frontier models including hardened configurations — are specifically engineered to evade confidence-based classifiers. An attack that clears the Model Armor threshold executes with the full IAM grant of the agent’s service account. There is no secondary structural checkpoint. The permission system does not know what the agent was trying to do before it was compromised.

The Multi-Agent Blind Spot

Google Cloud’s security documentation models a single-agent topology: one agent, one set of MCP servers, one service account, one set of IAM permissions. The documentation’s threat model, its IAM guidance, its Model Armor integration, and its org-level policy controls are all designed around this architecture. This is a reasonable starting point for documentation. It is an incomplete picture of how production agentic systems are actually deployed.

In the Where Agentic AI Breaks series Post 5 — on multi-agent data exposure — we established that orchestrator-to-subagent architectures introduce lateral data flow paths that operate entirely outside the identity-based control plane. An orchestrator agent may hold broad IAM permissions appropriate to its overall task. The subagents it spawns may receive context — customer data, credentials, internal documents — that neither the orchestrator’s IAM policy nor the subagent’s IAM policy was designed to govern explicitly, because the transfer happens at the context window layer, not at a resource API boundary that IAM monitors.

The AgentRaft paper (arXiv:2603.07557), testing 6,675 real-world tools from the MCP.so registry, found that 57% of potential tool-call chains exhibit unauthorized sensitive data exposure — not because agents are making unauthorized API calls, but because data passes through tool parameters in ways that no IAM policy is watching. Multi-agent systems compound this: data that enters one agent’s context can reach another agent’s context through orchestration calls, bypassing any resource-level access control that was designed for direct human-to-resource access patterns.
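The mechanism is easy to state and easy to miss: sensitive data rides inside tool parameters and context strings, never touching a resource API that IAM observes. A deliberately contrived taint-propagation sketch, with invented agent roles and a fake record, shows the shape of the flow:

```python
# Taint-propagation sketch: sensitive data crossing agent contexts through
# tool parameters, never touching a resource API that an IAM policy watches.
# The record and agent roles are invented for illustration.
SENSITIVE_MARK = "SSN:"

def tool_result() -> str:
    # A permitted read lands sensitive data in agent A's context window.
    return "customer record — SSN:123-45-6789"

def orchestrate() -> list[str]:
    context_a = [tool_result()]
    # The orchestrator forwards A's context to a subagent as a plain
    # string parameter. No resource boundary is crossed; no ACL fires.
    subagent_param = " ".join(context_a)
    context_b = [f"summarize: {subagent_param}"]  # now in agent B's context
    return context_b

def leaked(contexts: list[str]) -> bool:
    """Did tainted data reach a context no policy was designed to govern?"""
    return any(SENSITIVE_MARK in c for c in contexts)
```

Both agents behaved within their permissions at every step; the exposure lives entirely in the hand-off.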

Google’s documentation does not address this surface. That is not a criticism of the documentation’s authors — the MCP specification itself provides no standard for governing inter-agent data flow, trust delegation between agents, or the propagation of access scope across orchestration chains. The absence in the documentation reflects an absence in the protocol.

Model Armor and the Probabilistic Ceiling

Model Armor is presented in Google Cloud’s documentation as the primary runtime defense against prompt injection, jailbreak attempts, and malicious content in MCP tool calls and responses. It is a genuine and valuable tool. It is also, unambiguously, a probabilistic defense — and Google’s documentation is more honest about this than it might appear at first reading.

The recommended floor setting scans at medium confidence and above for injection and jailbreak patterns. “Medium and above” is not a certainty threshold. It is a sensitivity dial. Setting it lower increases false positive rates and degrades legitimate agent utility. Setting it at medium accepts that a category of injection attempts — those engineered to operate below the detection threshold — will pass through.
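The dial metaphor is literal. A confidence floor partitions findings into blocked and passed, and everything engineered to score below the floor passes by construction. A stripped-down sketch of the thresholding logic (the level names mirror common classifier tiers; nothing here reproduces Model Armor's internals):

```python
# Confidence-threshold filtering: a sensitivity dial, not a guarantee.
# Level names are generic classifier tiers, not Model Armor internals.
CONFIDENCE = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}

def blocked(detection_confidence: str, floor: str = "MEDIUM") -> bool:
    """Block only findings scored at or above the configured floor."""
    return CONFIDENCE[detection_confidence] >= CONFIDENCE[floor]

# A crude, obvious injection scores HIGH and is blocked. An adversarially
# optimized suffix engineered to score LOW passes straight through, and
# lowering the floor to catch it raises false positives on benign traffic.
```

The trade-off has no setting that eliminates it; it can only be moved.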

What Model Armor Addresses

Known Patterns at Detectable Confidence

Model Armor effectively catches well-characterized injection patterns, obvious jailbreak attempts, known malicious URL signatures, and common prompt manipulation techniques. For the majority of unsophisticated injection attempts, it is a meaningful defensive layer that should be enabled.

For organizations deploying agents against low-sophistication adversaries, Model Armor at medium confidence provides real risk reduction. Its value in that context is not in dispute.

Necessary · Not Sufficient
What Model Armor Cannot Address

Adaptive Attacks Below the Threshold

The AutoInject framework (arXiv:2602.05746) demonstrated that RL-optimized adversarial suffixes generated by a 1.5 billion parameter model can produce transferable injection attacks that compromise frontier models and evade classifier-based defenses. These attacks are specifically optimized against detection systems.

For organizations whose threat model includes sophisticated adversaries — which should describe any enterprise deploying agents with access to sensitive data or consequential actions — probabilistic scanning alone is an insufficient terminal defense.

Structural Controls Required

This is, precisely, the argument we made in Building Defensible Agents Post 1: probabilistic defenses are necessary but not sufficient for safe agentic operation. They establish a meaningful first line. They are not the line that holds under sustained, adaptive adversarial pressure. The research establishes this empirically. Google’s documentation, in recommending Model Armor as the runtime security layer without addressing what happens when it fails, implicitly treats it as more than it can be.

The OAuth Scope Granularity Gap

Google Cloud’s documentation offers read-only versus read-write tool access as the authorization granularity available through IAM deny policies. This is a meaningful lever — preventing agents from making write operations on production resources is a legitimate risk-reduction measure. It is also a coarse-grained boundary that does not map to the precision that safe agentic operation requires.

An agent with read-write access to Cloud Storage can write to any bucket within its IAM permission scope — not only the bucket relevant to the task it is currently executing. An agent with access to a BigQuery dataset can query any table in scope, not only the table its current workflow requires. The authorization model governs the resource type and the operation category. It does not govern task-contextual authority: the relationship between the specific current task and the specific resources that task should be allowed to touch.

This is not a failure of IAM design. IAM was designed for human principals who make deliberate, individually authorized resource access decisions. Agentic workflows involve autonomous multi-step task execution where the agent needs the authority to act on a broad resource set in principle, even when any given step requires only a narrow slice of that authority in practice. The protocol provides no standard mechanism for narrowing the agent’s live authority scope to the specific action the current workflow step requires — which means any injection that redirects the agent’s current step can redirect it anywhere within the full permission envelope.
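What would task-contextual authority look like if it existed? A live scope that narrows the static grant to the resources the current workflow step needs, and that cannot be widened from inside the session. Nothing like the following exists in MCP or IAM today; it is a sketch of the missing layer, with invented names:

```python
from contextlib import contextmanager

class SessionScope:
    """Narrow a broad static grant to what one task step actually needs.

    Hypothetical: sketches the task-contextual authority layer that
    neither the MCP specification nor IAM currently provides.
    """
    def __init__(self, static_grant: set[str]):
        self.static_grant = static_grant
        self.live_scope: set[str] | None = None

    @contextmanager
    def for_task(self, resources: set[str]):
        # The live scope may only ever shrink relative to the static grant.
        if not resources <= self.static_grant:
            raise PermissionError("task scope exceeds static grant")
        self.live_scope = resources
        try:
            yield self
        finally:
            self.live_scope = None  # authority does not outlive the step

    def allowed(self, resource: str) -> bool:
        scope = self.live_scope if self.live_scope is not None else set()
        return resource in scope
```

Under this model, an injection that redirects the current step can reach only the step's narrow slice, not the full permission envelope. Without it, redirection anywhere inside the static grant succeeds.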

The Pre-GA Risk Multiplier

Google Cloud’s MCP servers carry an explicit Pre-GA designation throughout the documentation. The terms are clear: available “as is,” limited support, subject to change before General Availability. This is standard engineering practice for a product in active development, and it is responsible disclosure.

It also means that any enterprise governance framework built on top of these servers — the IAM policies, the Model Armor floor settings, the org-level policy controls, the audit logging configurations — is built on ground that Google itself has not finalized. Security controls that are calibrated to the current API surface, the current tool schemas, the current authentication flows, may need to be materially rebuilt when the product reaches GA and those surfaces change. The documentation is a governance framework for a moving target.

This is not an argument against using Google Cloud MCP servers or following their security guidance. It is an argument for treating the current security posture as provisional, investing in monitoring and detection capabilities that do not depend on specific API surface stability, and planning for a governance refresh at GA. Organizations that treat the current documentation as a stable security foundation without accounting for Pre-GA churn are taking on risk that the documentation itself, if read carefully, is advertising.

What Sound MCP Security Requires That the Protocol Does Not Provide

To be clear about what we are arguing: Google Cloud’s security documentation describes, accurately and professionally, everything that is currently achievable within the constraints of the MCP specification. The documentation is not promising more than it can deliver. The gap is between what the governance layer can deliver and what safe agentic operation at enterprise scale actually requires.

The Building Defensible Agents series Post 2 identified four structural controls that the research corpus — and the failure modes of probabilistic-only defenses — indicates are necessary for a complete security posture. None of them are available within the current MCP specification.

Cryptographic tool schema binding would allow a client to detect unauthorized modifications to tool descriptions or capability additions between sessions. The protocol provides no mechanism for this. Authenticated Workflows (arXiv:2602.10465) demonstrated that cryptographic authentication of agent workflow steps achieves 100% recall with zero false positives across 174 test cases — because it makes workflow integrity structurally verifiable rather than probabilistically assessed.
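The structural character of this kind of control is worth seeing in miniature. The sketch below uses a plain HMAC over workflow steps; it is our own illustration of the verify-don't-score principle, not the scheme from the Authenticated Workflows paper, and the key handling is deliberately simplified:

```python
import hashlib
import hmac
import json

# Illustrative only: a real deployment would use a managed key, not a literal.
KEY = b"orchestrator-signing-key"

def sign_step(step: dict) -> str:
    """Sign a canonical serialization of one workflow step."""
    payload = json.dumps(step, sort_keys=True).encode()
    return hmac.new(KEY, payload, hashlib.sha256).hexdigest()

def verify_step(step: dict, signature: str) -> bool:
    """Integrity is verified, not scored: any mutation fails outright."""
    return hmac.compare_digest(sign_step(step), signature)
```

There is no confidence threshold to tune and no adversarial suffix to craft against it: either the step is exactly what was signed, or it is rejected. That binary property is what "structurally verifiable rather than probabilistically assessed" means in practice.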

Per-session capability scoping would allow an orchestrator to issue agents with a narrowed, task-specific authority envelope for a given session, with the protocol enforcing that the agent cannot invoke tools outside the defined scope regardless of what injected instructions request. The protocol has no session-scoped capability model. IAM permissions are static across the session; they do not narrow to the task.

Structural instruction-data separation at the transport layer would allow MCP clients to maintain an architectural distinction between trusted instructions and untrusted content retrieved through tools — enforced by the protocol, not by developer discipline. The protocol delivers both as undifferentiated context. Developer guidance is the only available substitute.

Inter-agent trust delegation standards would allow multi-agent architectures to propagate access scope with explicit governance: a subagent receiving work from an orchestrator would receive only the authority subset appropriate for its assigned subtask, not a copy of the orchestrator’s full permission scope. The protocol provides no standard for this. Data flows laterally across agent chains in ways that no current control plane is designed to observe.
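The core invariant such a standard would enforce is subset delegation: a child's authority must be a proper slice of its parent's, requested explicitly, never inherited wholesale. A few lines suffice to state it; the scope strings are invented:

```python
def delegate(parent_scope: frozenset[str], requested: frozenset[str]) -> frozenset[str]:
    """Grant a subagent at most the subset of its parent's authority
    that its assigned subtask requires — never the full scope by default.

    Hypothetical invariant: no such delegation standard exists in MCP today.
    """
    if not requested <= parent_scope:
        raise PermissionError("delegation exceeds parent authority")
    return requested
```

Today the effective behavior is closer to `return parent_scope` plus an unmetered copy of the parent's context, which is how lateral exposure arises without any policy being violated.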

The Central Insight

Google Cloud’s MCP security documentation is the most thorough vendor treatment of MCP security risks currently published. It is also — read precisely — a detailed inventory of the structural gaps that governance cannot close. Every place the documentation recommends a manual operational practice as a mitigation is a place where the protocol does not permit a structural control. Counting those places, and understanding what they represent, is the beginning of an honest enterprise security assessment for agentic AI.

None of this is Google’s fault. The MCP specification is maintained by Anthropic; Google is implementing and governing on top of it. The structural gaps are specification-level problems, not implementation failures. What Google can do — and has done, to their credit — is build the most complete governance layer the protocol permits and document it honestly. What they cannot do is make the protocol’s architecture safe through governance alone. No vendor can.

The implication for practitioners is not that MCP deployments should be avoided, or that Google Cloud MCP servers are insecure. It is that organizations deploying agents on this protocol should understand, clearly, which risks are mitigated by the available controls and which risks remain open at the structural layer — and should build their incident response, monitoring, and architectural review processes accordingly. The documentation Google has provided is an excellent starting point. It is not, and cannot be, an ending point.

Post-Publication Update  ·  March 17, 2026

Since this analysis was published, Google’s own MCP rollout decisions have provided a live illustration of the structural risks described above — on three fronts simultaneously.

First, the dynamic tool problem we identified has now manifested at a higher order. As of March 17, Google and Google Cloud remote MCP servers are automatically enabled when you enable a supported service. You no longer need to make a separate MCP enablement decision. Enabling the Google Security Operations API now enables the Google SecOps MCP server as a side effect. This is the capability expansion problem from our analysis — not at the tool schema layer within a server, but at the service provisioning layer itself. An operator enabling a familiar API can now acquire an active MCP endpoint without having made a conscious MCP decision.

Second, the Pre-GA governance instability we warned about materialized on the same date. Organization policies using the gcp.managed.allowedMCPServices constraint — a control that appeared in Google’s own governance documentation as recently as weeks prior — were deprecated effective March 17. Any organization that built its MCP governance posture around that constraint had those controls silently invalidated beneath them. This is precisely the scenario our Pre-GA instability section described: governance infrastructure built on ground that has not been stabilized.

Third, the specific service involved sharpens the irony. Google SecOps is a security operations platform — the system enterprises use for threat detection, investigation, case management, and threat intelligence. Its MCP server, carrying access to that sensitive operational data, now enables automatically. The platform designed to surface security incidents now carries the structural MCP attack surface described in this post as a default provisioning outcome.

The Structural Argument, Now In Production

These are not hypothetical risks from a research corpus. They are governance controls being deprecated, capability surfaces expanding without explicit operator decisions, and security-sensitive endpoints enabling by default — all within weeks of the documentation that described the controls as current. The analysis above was written from the research literature. March 17 wrote it in production.

The Full Research Foundation

The structural arguments in this analysis are developed in depth across Luminity Digital’s two published series: Where Agentic AI Breaks (five posts on MCP monoculture risk, supply chain compromise, the Viral Agent Loop, and multi-agent data exposure) and Building Defensible Agents (three posts on probabilistic defense failure, deterministic architectural enforcement, and NIST AI agent security requirements).

Related Reading  ·  Luminity Digital Published Series
Series 1 · Complete Where Agentic AI Breaks — Five posts on the structural attack surfaces of MCP-connected agentic systems
Series 2 · Complete Building Defensible Agents — Three posts on probabilistic vs. deterministic defenses and NIST AI agent security
