What the Policy Layer Cannot Protect: The Gaps That Remain

This is Post 4 of The Policy Layer, the closing post of our fifth series on agentic AI security. Post 1 established the dependency graph as the correct state representation for authorization. Post 2 mapped the four approaches to policy authorship. Post 3 introduced the enforcement lifecycle axis and mapped the minimum viable stack. This post performs the honest accounting: where does that stack — correctly assembled — still fall short? The answer connects the mechanistic work of this series back to the structural thesis that has run through every series before it. The policy layer is necessary. It is not sufficient. The gap that remains is the same gap it has always been.

The three preceding posts of this series have built something precise: a formal model of system state that makes real authorization possible, a landscape of policy authorship approaches suited to different deployment contexts, and a two-axis enforcement taxonomy that maps which threat classes each enforcement lifecycle position can and cannot close. That architecture — dependency graph, reference monitor, Datalog or DSL policies, compile-time instrumentation plus runtime interception — represents a genuine and significant advance over the state of enterprise agentic AI security as it currently exists in most production deployments. Organizations that build it will be materially more secure than organizations that do not.

This post does not diminish that. It does something different: it names precisely where the architecture ends. Not as a counsel of despair, but because intellectual honesty about the limits of current solutions is the precondition for building the next layer correctly. The four gaps described below are not edge cases. Each is a threat class that the best currently available enforcement stack cannot see from its enforcement positions — and each maps directly onto the structural MCP vulnerabilities that Series 1 and Series 2 of this corpus identified as protocol-level failures that governance and operational controls can mitigate but cannot resolve.

Gap 1: Supply Chain Compromise — Inside the Enforcement Boundary Before Enforcement Begins

The dependency graph and reference monitor architecture works as follows: when an agent proposes an action, the monitor evaluates the causal history of that action against the enforced policy. If the action’s information path traces back through an untrusted source, the policy can block it. This is correct, rigorous, and — within its scope — complete.

The scope assumption is that the tools delivering content to the agent are themselves trustworthy. The dependency graph models tool results as nodes with provenance. It models the content within tool results as potentially untrusted. It does not and cannot model the tool implementation itself as potentially adversarial — because if the tool itself is malicious, it is already inside the enforcement boundary at the point where enforcement begins.

Supply chain compromise defeats this assumption at the foundation. The Luminity Series 1 corpus documented the operational reality: the first confirmed malicious MCP server appeared in September 2025; a self-propagating npm worm with an embedded MCP injection module was discovered in February 2026; a scan of 3,984 skills from a major agent skill marketplace found that 36.8% contained at least one security flaw. A compromised MCP server does not deliver untrusted content from a trusted tool. It delivers content from a tool that was replaced by an adversary before the session began. The reference monitor evaluates the provenance of that content and records it as originating from a trusted, registered tool — because that is what the protocol reports and what the compile-time instrumentation was configured to trust.

The Enforcement Boundary and the Supply Chain

The policy enforcement stack operates on the assumption that tool identity is reliable — that a tool registered in the dependency graph as “email_service” is the same tool that was vetted at deployment time. Supply chain compromise violates this assumption silently: the tool’s identity is unchanged, its schema may be unchanged, and the protocol reports it as the expected registered service. The content it delivers is malicious, but the reference monitor records its provenance as “trusted tool result.” No policy that evaluates information flow can detect a substitution that occurred at the supply chain layer before the enforcement infrastructure was ever aware of the tool.

The control that closes this gap is cryptographic tool schema binding — one of the four missing structural MCP controls identified in the Luminity corpus. If MCP carried a cryptographic commitment to each tool’s schema at registration, and if that commitment were verified at each session initiation, a substituted tool would produce a schema verification failure before any content was delivered. This is not an application-layer control. It is a protocol-layer requirement. The enforcement stack cannot retrofit it.

Gap 2: The Viral Agent Loop — Dynamic Topologies the Dependency Graph Cannot Know

The PCAS architecture requires that the dependency graph be constructed from a known agent topology. Compile-time instrumentation instruments a specific set of agents, tools, and inter-agent communication patterns. The reference monitor enforces policies over the graph that accumulates during execution of that instrumented system. This is precise and powerful within the deployment configuration that was instrumented.

The Viral Agent Loop — introduced in the Luminity Series 1 corpus as a distinct threat class — operates by violating the assumption of a known, bounded agent topology. A compromised tool result contains instructions that, when processed by the agent, cause it to spawn new agents or invoke agent frameworks not present in the original deployment configuration. Those spawned agents operate with the access credentials and tool permissions of the spawning agent, but outside the instrumented dependency graph. Their actions are not evaluated by the reference monitor because they are not executing within the instrumented system.

The practical anatomy of this attack is worth examining precisely. An adversary who understands that a target deployment uses PCAS-style compile-time instrumentation does not need to defeat the reference monitor directly. They need to cause the agent to take one action that the reference monitor permits — spawning a subprocess or invoking an external agent framework — and conduct the rest of the attack outside the enforcement scope. The reference monitor correctly evaluates the spawn action against the dependency graph and finds nothing that violates the specified policy. The spawned agent, operating outside the graph, proceeds without constraint.

87%

Of downstream decision-making poisoned within four hours by a single compromised agent across connected agent networks, as demonstrated in the Trust Paradox research (arXiv:2510.18563). The Viral Agent Loop amplifies this cascade by spawning topologies outside the enforcement scope — the poison propagates not merely through the known network but through dynamically created extensions of it that the compile-time dependency graph has no record of.

The control that closes this gap is per-session capability scoping — the second missing structural MCP control. If each agent’s authority to spawn sub-agents, invoke external frameworks, or extend the agent topology were scoped at the protocol level to the specific resource requirements of the current task, the spawn action that enables the Viral Agent Loop would require explicit protocol-level authorization. An agent could not silently extend the execution topology into an uninstrumented zone. This requires that the protocol carry capability declarations that constrain agent authority at the session boundary — a protocol-level mechanism, not an application-layer control.

Gap 3: Tool Schema Poisoning — The Attack That Happens Before the Reference Monitor Sees Anything

The dependency graph tracks information flow from the point at which events enter the system — tool calls, tool results, messages. The reference monitor evaluates proposed actions against the graph. Both mechanisms operate on the assumption that the information they observe has not been corrupted before they observe it.

Implicit tool poisoning attacks, documented in the Luminity corpus as MCP-ITP (arXiv:2601.07395), operate on a different attack surface: the tool descriptions themselves. In MCP, tool descriptions are the natural-language text that agents process to understand what a tool does, when to invoke it, and what arguments to pass. An adversary who can modify these descriptions — either through supply chain compromise of the tool registry or through a man-in-the-middle position on the MCP transport — can corrupt the agent’s reasoning about which tool to call and with what parameters before any action is presented to the reference monitor.

The attack is precise in its mechanism: the agent reads a poisoned tool description and, reasoning from that description, selects a tool invocation that is both consistent with the (corrupted) description and authorized by the enforced policy. The reference monitor evaluates the proposed action against the dependency graph and finds it compliant. The policy is enforced correctly. The enforcement is correct with respect to the information the reference monitor received. The information the reference monitor received was corrupted before it arrived.

This is structurally distinct from prompt injection, which injects malicious instructions into the data path. Tool schema poisoning corrupts the agent’s world model — its understanding of what its tools do — at the source. No information flow policy evaluated over the dependency graph can detect a corruption that occurred upstream of the graph’s first observation point.

Why Tool Schema Poisoning Is Upstream of Enforcement

The dependency graph begins recording at the moment an event enters the agentic system: when a tool call is made, when a tool result is returned, when a message is passed. Tool descriptions are not events in the dependency graph. They are the meta-layer that shapes the agent’s reasoning about which events to generate. An agent that has been given a corrupted tool description makes its tool selection decision based on that description, produces a tool call that is consistent with the description, and presents that tool call to the reference monitor as a candidate action.

The reference monitor evaluates: is this action causally influenced by untrusted content? It is not — or rather, the causal influence was exerted at the meta-layer of tool description, not through the data paths that the dependency graph models. The enforcement architecture has no observation point at the meta-layer. Closing this gap requires structural instruction-data separation at the transport layer — the third missing MCP structural control — so that tool descriptions are carried and verified separately from tool results, with cryptographic integrity protection that makes unauthorized modification detectable before the agent’s reasoning is shaped by the corrupted description.

Gap 4: Adaptive Adversarial Pressure — Policy-Compliant Exfiltration

The fourth gap is the most conceptually challenging and the one with the most direct bearing on the long-term arms race between enforcement and evasion. It does not attack the enforcement mechanism’s observation point or its logical scope. It attacks its specification.

The AutoInject research (arXiv:2602.05746) established a capability that the Luminity corpus has treated as a central empirical anchor since Series 2: RL-optimized adversarial suffixes, generated by a 1.5B parameter model, successfully compromise frontier model configurations — including configurations hardened against non-optimized attacks. The key finding was not merely that frontier models can be compromised. It was that the adversarial optimization is adaptive: the attack learns to produce outputs that defeat the specific defense in place.

Applied to a known policy enforcement specification, this optimization capability produces what this series has termed policy-compliant exfiltration: an adversary who can infer or observe the policy specification can optimize injected content to produce agent outputs that technically satisfy all enforced policy rules while achieving the intended malicious objective. A policy that states “do not send data to external email addresses” is satisfied by encoding exfiltrated data in a support ticket sent to an internal address that is subsequently forwarded externally. A policy that states “do not retrieve files outside the authorized workspace” is satisfied by reasoning the agent to request a summary of authorized files that happens to include the sensitive content from the unauthorized file it was told to summarize earlier in the session.

These evasions require knowing the policy. An adversary who cannot observe or infer the policy specification cannot optimize against it. But the policy specification — particularly when expressed in a human-readable DSL like AgentSpec — is not a secret. It is a design document. Organizations that publish their security architecture, or whose architecture can be inferred from the behaviors an instrumented system permits and blocks, have implicitly made the policy specification available to adversaries who operate at sufficient sophistication.

Policy-Compliant Exfiltration Is Not a Theoretical Concern

AutoInject (arXiv:2602.05746) demonstrated RL-optimized adversarial suffixes defeating hardened frontier model configurations — produced by a 1.5B parameter model that is trivially accessible and cheap to run. The optimization target in AutoInject was model-level alignment. The same optimization approach applies to policy-level specifications: an adversary who knows what actions the reference monitor will block can train an optimizer to produce outputs that achieve the same malicious objective through policy-compliant paths. The policy layer changes what a successful optimization must produce. It does not remove the adversary’s ability to find one.

The control that provides the deepest defense against policy-compliant exfiltration is not a better policy. It is defense-in-depth at the protocol layer: if the protocol itself enforces structural instruction-data separation and per-session capability scoping, the space of policy-compliant actions that can achieve malicious ends is materially narrowed by structural constraints that the adversary cannot optimize against without first compromising the protocol itself. This is why the four missing structural MCP controls are not merely important for completeness — they are the controls that change the adversary’s optimization problem from “find a policy-compliant path through an application-layer enforcement stack” to “compromise the protocol before optimization begins.”

Mapping the Gaps Against the Four Missing Structural Controls

Each gap maps precisely to one or more of the four missing structural MCP controls identified in the Luminity corpus. This is not coincidental. The gaps exist because the controls are missing. The controls are missing because MCP was not designed to carry them. That is the protocol-level architectural failure that this corpus has documented across five series and thirty-plus empirical citations.

Threat Gap	Application-Layer Stack	Missing Structural Control	Control Layer
Supply chain compromise	Cannot detect — tool inside enforcement boundary	Cryptographic tool schema binding	Protocol
Viral Agent Loop	Cannot track — dynamic topology outside dependency graph	Per-session capability scoping	Protocol
Tool schema poisoning	Cannot observe — corruption upstream of graph’s first node	Structural instruction-data separation	Protocol
Adaptive adversarial / policy-compliant exfiltration	Partially mitigated — spec secrecy + behavioral bounding	Per-session capability scoping + instruction-data separation	Protocol + Application

The table makes the architectural argument concrete. Three of the four gaps are not closeable by any combination of compile-time and runtime application-layer enforcement, however well specified and correctly deployed. They require protocol-level changes that MCP does not currently implement. The fourth gap is partially mitigated at the application layer — specification secrecy and behavioral bounding narrow the optimization space — but its structural closure requires the same protocol-layer controls.

What This Means for Organizations Deploying Today

The honest practitioner argument that follows from this analysis is not “do not deploy the policy layer until the protocol gaps are closed.” The protocol gaps may not close on any near-term timeline that enterprise deployments can wait for. The argument is: deploy the policy layer because it is a significant security improvement over unguarded deployment, understand precisely where its boundary sits, and design your deployment architecture with that boundary explicitly acknowledged.

Acknowledging the boundary means several things in practice. It means treating supply chain integrity as a separate security program, not as something the enforcement stack handles — vetting MCP server provenance, pinning tool schema hashes outside the protocol, monitoring for schema drift between sessions. It means designing agent topologies that minimize dynamic spawning authority, keeping the capability to extend the execution topology as narrow as the task requirements allow, recognizing that per-session capability scoping will eventually be a protocol-level control but must today be approximated through IAM constraints and deployment-level boundaries. It means treating tool description integrity as a first-class security property — versioning descriptions, logging description fetches, monitoring for modifications — recognizing that the protocol currently carries no cryptographic commitment to them.

And it means being precise about what the policy enforcement stack actually guarantees to the people making risk decisions in the organization. It guarantees that within the instrumented deployment configuration, against the threat classes the policy specification covers, no action that violates the policy executes. It does not guarantee anything about threat classes that operate below the enforcement boundary, outside the instrumented topology, upstream of the observation points, or against the policy specification itself.

What the Policy Layer Guarantees

Within Its Scope: Empirically Demonstrated

Zero policy violations in instrumented deployments for correctly specified policies. 93% compliance versus 48% unguarded baseline. Transitive information flow enforcement across agent boundaries. Deterministic blocking of actions whose causal history traces through untrusted sources. Runtime interception of operational constraint violations at millisecond overhead.

These guarantees are real, significant, and worth building. An organization that deploys this stack is materially more secure than one that does not, for the threat classes within scope.

Structural · Enforceable · Within Scope

What the Policy Layer Does Not Guarantee

Four Gaps: Protocol-Level Exposure

No protection against supply chain compromise of tools inside the enforcement boundary. No tracking of dynamically spawned agent topologies outside the instrumented dependency graph. No observation of tool description corruption upstream of the graph’s first event. No structural defense against policy-compliant exfiltration by adversaries who can infer the specification.

These gaps are not closeable at the application layer. Each maps to a missing structural MCP control that requires protocol-level change. Until those changes exist, the gaps remain open beneath every correctly deployed enforcement stack.

Protocol-Level · Structurally Open

The Thesis That Runs Through Five Series

This series set out to answer the mechanistic question that Series 2 left open: how does deterministic architectural enforcement actually work at the level of formal state models, policy specification, and enforcement lifecycle positioning? The answer across four posts has been precise and, within its scope, complete. The dependency graph provides the state representation. The four authorship approaches provide the specification options. The three enforcement lifecycle positions provide the deployment architecture. The 93% compliance rate in instrumented PCAS deployments demonstrates that the architecture works.

But the closing argument of this series is the same closing argument that has run through every series in this corpus, because the evidence continues to point to the same conclusion. The security problems in agentic AI are not primarily implementation failures. They are protocol-level architectural failures. The policy layer addresses implementation-level enforcement with genuine rigor. It does not reach the protocol layer. The four missing structural MCP controls — cryptographic tool schema binding, per-session capability scoping, structural instruction-data separation, and inter-agent trust delegation standards — are not missing because organizations have not implemented them. They are missing because MCP’s protocol design does not carry them. No vendor governance policy, no application-layer enforcement stack, no combination of compile-time and runtime controls can close gaps that exist at the protocol level below them.

The research agenda that follows from this is not ambiguous. Protocol-level enforcement — the STPA+IFC trajectory, the capability-enhanced MCP proposals, the cryptographic schema binding work — is the research program whose maturation will determine whether the remaining gaps can be closed structurally rather than managed operationally. The Luminity corpus will track that maturation. When the research produces deployable answers, the next series will report them with the same empirical rigor that this one has applied to what can be answered today.

The policy layer is the most significant structural advance in agentic AI security that the research community has produced in the 2025–2026 window. It closes real gaps. It produces real guarantees. And it sits above a protocol that has not changed — a protocol whose structural vulnerabilities remain open beneath every enforcement stack, however well-built, that the application layer can assemble above them.

— Synthesis: Luminity Digital Series 1–5 Research Corpus; PCAS (arXiv:2602.16708); Systems Security Foundations (arXiv:2512.01295); AutoInject (arXiv:2602.05746)

The Central Insight of This Series

The policy layer — dependency graph state representation, formal policy specification, reference monitor enforcement across the compile-time and runtime lifecycle positions — is necessary, empirically validated, and represents the current frontier of deployable agentic AI security architecture. Its boundary is precise: it cannot see supply chain compromise inside the enforcement perimeter, dynamic topologies spawned outside the instrumented dependency graph, tool schema corruption upstream of the graph’s first observation point, or policy-compliant exfiltration optimized against a known specification. Each gap maps to a missing structural MCP control that only protocol-level change can close. Building the policy layer correctly is the right next step for every organization deploying agentic AI in authorization-critical contexts. Knowing where it ends is what makes that step honest.

The Policy Layer: From Architectural Enforcement to Operational Reality · Four-Part Series

Post 1 · Published The Agentic State Problem: Why Linear History Can’t Support Real Authorization

Post 2 · Published The Policy Authorship Problem: Who Writes the Rules, and How

Post 3 · Published Compile-Time, Runtime, or Protocol: Where Enforcement Lives Determines What It Can Guarantee

Post 4 · Now Reading What the Policy Layer Cannot Protect: The Gaps That Remain

Companion Post The Toolkit That Tried to Be a Kernel

Standards Readout · Accompanying What AIUC-1’s Q2 2026 Update Gets Structurally Right

A correctly deployed enforcement stack — dependency graph, reference monitor, formal policies across compile-time and runtime positions — is a material security improvement. Its scope is bounded by four threat classes it cannot reach: supply chain compromise inside the enforcement boundary, the Viral Agent Loop in dynamically spawned topologies, tool schema poisoning upstream of the graph’s observation points, and policy-compliant exfiltration against a known specification. Each gap maps to a missing structural MCP control that only protocol-level change can close.

Series 1 Where Agentic AI Breaks 5 posts · The failure mode map
Series 2 Building Defensible Agents 3 posts · Deterministic architecture
Series 3 The Invisible Attack 3 posts · Indirect prompt injection
Series 4 Fault Lines 3 posts · Attack surface compounding

01 arXiv:2602.05746 — AutoInject: Automated Prompt Injection via RL
02 arXiv:2601.07395 — MCP-ITP: Implicit Tool Poisoning Attacks
03 arXiv:2602.16708 — PCAS: Policy Compiler for Secure Agentic Systems
04 arXiv:2512.01295 — Systems Security Foundations for Agentic Computing
05 arXiv:2510.18563 — The Trust Paradox in Multi-Agent Systems
06 arXiv:2601.08012 — Towards Verifiably Safe Tool Use (STPA+IFC+MCP)
07 arXiv:2602.16943 — Mind the GAP: Text Safety vs. Tool-Call Safety

→ Policy-compliant exfiltration — An attack class in which an adversary who can infer or observe the enforced policy specification optimizes injected content to produce outputs that satisfy all policy rules while achieving a malicious objective.
→ Dynamic topology risk — The vulnerability class arising when agent systems spawn new agents at runtime in response to task requirements, creating orchestrator-subagent relationships outside the instrumented dependency graph and therefore outside the enforcement scope.
→ Tool schema poisoning — Manipulation of MCP tool descriptions to corrupt an agent’s tool selection reasoning before any candidate action is presented to the reference monitor; operates upstream of the dependency graph’s first observation point.
→ Protocol-level structural gap — A vulnerability class rooted in MCP’s protocol design: the absence of cryptographic tool schema binding, per-session capability scoping, structural instruction-data separation, and inter-agent trust delegation standards — gaps that application-layer enforcement cannot close from above.

The Toolkit That Tried to Be a Kernel — Microsoft’s Agent Governance Toolkit analyzed through the structural/probabilistic enforcement taxonomy established in this series. Seven packages, two enforcement tiers, one honest architectural ceiling. Companion to Series 5 and Governance Without Architecture.

What AIUC-1’s Q2 2026 Update Gets Structurally Right — the Q2 2026 AIUC-1 standards refresh read through the structural/probabilistic enforcement framework. What the fourteen updated requirements close, how, and where the certification scope ends. Standards Readout · Accompanying Series 5.

What the Policy Layer Cannot Protect: The Gaps That Remain

Gap 1: Supply Chain Compromise — Inside the Enforcement Boundary Before Enforcement Begins

The Enforcement Boundary and the Supply Chain

Gap 2: The Viral Agent Loop — Dynamic Topologies the Dependency Graph Cannot Know

Gap 3: Tool Schema Poisoning — The Attack That Happens Before the Reference Monitor Sees Anything

Why Tool Schema Poisoning Is Upstream of Enforcement

Gap 4: Adaptive Adversarial Pressure — Policy-Compliant Exfiltration

Policy-Compliant Exfiltration Is Not a Theoretical Concern

Mapping the Gaps Against the Four Missing Structural Controls

What This Means for Organizations Deploying Today

Within Its Scope: Empirically Demonstrated

Four Gaps: Protocol-Level Exposure

The Thesis That Runs Through Five Series

Series 5 Complete — The Policy Layer

Like this:

Related

What the Policy Layer Cannot Protect: The Gaps That Remain

Gap 1: Supply Chain Compromise — Inside the Enforcement Boundary Before Enforcement Begins

The Enforcement Boundary and the Supply Chain

Gap 2: The Viral Agent Loop — Dynamic Topologies the Dependency Graph Cannot Know

Gap 3: Tool Schema Poisoning — The Attack That Happens Before the Reference Monitor Sees Anything

Why Tool Schema Poisoning Is Upstream of Enforcement

Gap 4: Adaptive Adversarial Pressure — Policy-Compliant Exfiltration

Policy-Compliant Exfiltration Is Not a Theoretical Concern

Mapping the Gaps Against the Four Missing Structural Controls

What This Means for Organizations Deploying Today

Within Its Scope: Empirically Demonstrated

Four Gaps: Protocol-Level Exposure

The Thesis That Runs Through Five Series

Series 5 Complete — The Policy Layer

Share this:

Like this:

Related