Enterprise AI security programs have a maturity problem. Not in the sense that they are unsophisticated — the policies, ethics frameworks, risk assessments, and governance structures that leading organizations have built around AI deployment represent genuine investment and institutional effort. The maturity problem is that they have been investing in the wrong layer.
Governance controls — policies, system prompts, human-in-the-loop oversight, responsible use frameworks, ethics reviews — are visible. They produce artifacts: documents, approvals, board presentations, training programs. They satisfy the stakeholders who ask about AI safety at the institutional level. They are also, when it comes to the failure modes identified in Posts 1 and 2 of this series, the controls that matter least.
Containment controls — purpose binding, kill switches, network isolation, privilege scoping, sandbox configuration — are infrastructure. They produce no visible artifacts. They have no natural stakeholder. They rarely appear in AI strategy presentations. They are the controls that actually determine whether a compromised or misconfigured agent causes a recoverable incident or a catastrophic one.
The deployment survey data for 2026 documents the gap between these two layers with numerical precision. The gap is large, it is widening, and it is the direct consequence of how enterprise AI governance programs have been structured since agentic deployment began in earnest.
Two Different Control Problems
The distinction between governance and containment is not merely taxonomic. It reflects a fundamental difference in what the two control types actually protect against.
Governance Controls: What They Protect Against
Policy violations, misuse, non-compliance, reputational harm from inappropriate AI outputs. Governance controls are designed to ensure agents behave in accordance with organizational intent — that they do what they are supposed to do and refuse what they are not.
Governance controls operate through instruction: system prompts, policy constraints, approval workflows, human review gates. Their effectiveness depends on the model following the instructions reliably.
Instruction-Dependent

Containment Controls: What They Protect Against
Structural failure, compromise, and blast radius expansion. Containment controls are designed to limit the damage when agents malfunction, are subverted, or operate in ways that governance controls do not catch — regardless of whether the model follows instructions.
Containment controls operate through architecture: network boundaries, privilege limits, execution constraints, circuit breakers. Their effectiveness is independent of model inference.
Architecture-Enforced

The critical point is that governance controls assume a functioning execution environment. A system prompt constraining an agent’s behavior assumes the agent can be constrained. A human-in-the-loop approval workflow assumes the agent will pause for approval. A policy against accessing sensitive data assumes the agent cannot route around that policy through environmental reasoning about its execution context.
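The distinction can be made concrete in a few lines. The following is a hedged sketch, with all names hypothetical: the same "do not delete records" rule expressed once as a prompt instruction (instruction-dependent) and once as an executor-level allowlist (architecture-enforced). Only the second stops a model that ignores its prompt.

```python
# Hypothetical sketch: the same rule as instruction vs. architecture.

# Instruction-dependent: effective only if the model complies.
SYSTEM_PROMPT = "You must never call the delete_records tool."

# Architecture-enforced: an allowlist checked by the execution layer.
ALLOWED_TOOLS = {"search_records", "summarize"}


def execute_tool(tool_name: str, args: dict) -> str:
    """Execution-layer gate: runs before any tool call, regardless of
    what the model was prompted to do or chose to emit."""
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(
            f"tool '{tool_name}' is outside this agent's bound purpose"
        )
    # ... dispatch to the real tool implementation here ...
    return f"executed {tool_name}"


# A model that ignores its prompt can still emit the forbidden call;
# only the executor-level check actually stops it.
try:
    execute_tool("delete_records", {})
except PermissionError as exc:
    print(exc)
```

The point of the sketch is the placement of the check: it lives in the executor, outside the inference loop, so its reliability does not depend on the model's behavior.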
Posts 1 and 2 demonstrated precisely the conditions under which those assumptions break down. When a frontier model can escape its container by reasoning about exposed control surfaces, governance controls that operate through instruction provide no additional protection. When a model can select attacks that evade its monitor, human-in-the-loop workflows that operate downstream of the monitoring layer inherit the monitoring layer’s blindness.
The Critical Dependency
Governance controls are downstream of containment controls. A governance layer that assumes a functioning execution environment is only as reliable as that environment. Organizations that have built robust governance programs on top of inadequate containment infrastructure have not built defense in depth — they have built a governance layer that inherits the execution layer’s vulnerabilities.
What the Deployment Data Establishes
The Kiteworks 2026 Data Security and Compliance Risk Forecast Report surveyed 225 security, IT, and risk leaders across 10 industries and 8 regions on the state of agentic AI governance. The findings document the governance-containment gap with precision that should concern any organization that considers itself ahead of the curve on AI security.
The 2025 AI Agent Index, which surveyed 30 deployed commercial AI agent systems across consumer, enterprise, and developer categories, adds a different dimension. Of those 30 agents, 9 had no guardrails of any kind documented. Sandboxing or VM isolation was documented for exactly 9 agents — primarily developer tools and browser agents. Of the 13 enterprise agents surveyed, 7 described options for setting up guardrails, but none documented actual sandboxing or containment.
The governance-containment gap will narrow through 2026. It will not close. The organizations that close it first will be demonstrably more resilient. The rest will be demonstrably exposed.
— Kiteworks 2026 Data Security and Compliance Risk Forecast Report

What makes these numbers particularly significant is the deployment velocity they are set against. Of the organizations surveyed by Kiteworks, 51% already have agents in production. A third are planning autonomous workflow agents — systems that take actions without human approval for each step. A quarter are planning decision-making agents. These are not chatbots operating under human supervision. They are systems with data access, tool execution capability, and autonomous action authority — running, in the majority of cases, without the containment infrastructure that the technical failure modes in Posts 1 and 2 make necessary.
Why the Gap Exists
The governance-containment gap is not a product of organizational negligence. It is the predictable output of how AI governance programs were structured in response to early institutional pressures.
The initial demand for enterprise AI governance came from compliance teams, boards, and regulators. What those stakeholders asked for — and what AI governance programs were built to provide — were visible artifacts: policies, risk assessments, responsible use frameworks, ethics review processes. These artifacts could be presented, audited, and reported on. They satisfied the governance requirements that organizations faced and were accountable for.
Containment controls were never part of that initial demand. No regulator asked for a kill switch. No board presentation required a network isolation architecture. No ethics review process checked whether purpose binding was architecturally enforced. The containment layer had no institutional stakeholder, which meant it had no institutional advocate, which meant it did not get built at the same pace as the governance layer.
The percentage-point gap between governance controls (monitoring, human-in-the-loop oversight) and containment controls (purpose binding, kill switches, network isolation) in the Kiteworks 2026 survey represents the structural distance between what organizations have invested in and what actually limits blast radius when agents fail.
This is, in organizational terms, a principal-agent problem. The principals who funded and sponsored AI governance programs — compliance leaders, boards, regulators — had visibility into the governance layer and accountability for it. They did not have visibility into the containment layer and were not accountable for it. The agents who built the governance programs rationally built what their principals could see and evaluate.
The consequence is organizations that have comprehensive policies governing what agents are permitted to do, and inadequate infrastructure limiting what agents are capable of doing when those policies fail. The gap between permission and capability is where the failure modes in this series live.
The Five Containment Questions
Closing the governance-containment gap is not a policy exercise. It is an infrastructure build. But before an organization can build the infrastructure, it needs to understand what it is currently missing. The following five questions are not a maturity model or a compliance framework. They are a diagnostic — the minimum set of questions an enterprise deploying agentic AI should be able to answer before claiming that its governance layer provides meaningful protection.
1. Purpose Binding: What is this agent architecturally authorized to do — and is that limit enforced by the execution environment, or only by instruction? System prompts and policy constraints define what an agent is permitted to do. Purpose binding defines what an agent is capable of doing, enforced at the architecture layer regardless of what the model infers. If the only boundary between an agent and sensitive data is a prompt instruction, there is no purpose binding.
2. Kill Switch: Can this agent be halted within seconds, without multi-step human escalation, if it begins behaving in an unexpected or harmful way? An agent that cannot be stopped rapidly is not under organizational control — it is under observation. A kill switch that requires a security team to convene, investigate, and agree before acting provides a response window that is measured in minutes or hours. Agentic blast radius expands in seconds.
3. Network Isolation: If this agent is compromised, is lateral movement to adjacent systems blocked by default — or does containment depend on the agent not attempting it? An agent with network access that is not bounded by egress controls is a potential pivot point. Compromise of a single agent should not provide access to adjacent systems, credentials, or data stores. Default-deny network policies enforced at the infrastructure layer are the minimum requirement — not an aspirational configuration.
4. Privilege Scoping: Does this agent carry the minimum credential scope its assigned task requires — or does it carry the full access scope of the operator who provisioned it? Agent credential inheritance is one of the most common sources of blast radius amplification in agentic deployments. An agent provisioned by a user with broad system access does not need that access to complete its task. Credential scoping should be task-specific, not operator-inherited, and should be enforced at provisioning rather than relying on agent self-restraint.
5. Sandbox Evaluation: Has the execution environment been assessed against the capability of models currently deployed in it — not the models that were deployed when the sandbox was first configured? As Post 1 established, a container configuration that was adequate against less capable models may not be adequate against current frontier deployments. Sandbox evaluation is not a one-time security review. It is a continuous assessment that must be re-run as model capability improves, using benchmarks like SandboxEscapeBench that reflect current frontier capability rather than historical baselines.
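The five questions reduce naturally to a deployment-gate checklist. The sketch below is illustrative, not a real API: the `AgentDeployment` shape, field names, and the ten-second halt threshold are all assumptions chosen for the example.

```python
# Hedged sketch: the five containment questions as a pre-deployment check.
# The data model and thresholds are hypothetical, chosen for illustration.
from dataclasses import dataclass


@dataclass
class AgentDeployment:
    purpose_bound_at_infra_layer: bool  # Q1: limits enforced by execution env
    kill_switch_halt_seconds: float     # Q2: tested time-to-halt
    default_deny_egress: bool           # Q3: lateral movement blocked by default
    credential_scope: str               # Q4: "task" or "operator"
    sandbox_evaluated_against: str      # Q5: model the sandbox was last assessed against
    current_model: str                  # model actually deployed today


def containment_gaps(d: AgentDeployment) -> list[str]:
    """Return the list of unanswered containment questions for a deployment."""
    gaps = []
    if not d.purpose_bound_at_infra_layer:
        gaps.append("purpose binding is instructional only")
    if d.kill_switch_halt_seconds > 10:  # assumed "seconds, not minutes" bar
        gaps.append("kill switch is slower than seconds")
    if not d.default_deny_egress:
        gaps.append("network egress is not default-deny")
    if d.credential_scope != "task":
        gaps.append("credentials are inherited from the operator")
    if d.sandbox_evaluated_against != d.current_model:
        gaps.append("sandbox not evaluated against the current model")
    return gaps
```

A deployment that returns an empty list can answer all five questions affirmatively; any non-empty result names a specific containment gap, regardless of how mature the governance layer looks.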
An organization that cannot answer all five questions affirmatively has a containment gap regardless of how mature its governance program appears. The governance layer describes intent. The containment layer enforces limits. Without both, what the organization has is a policy document attached to an uncontained system.
What Closing the Gap Actually Requires
The Kiteworks forecast is direct about the trajectory: the gap will narrow through 2026, but it will not close. The organizations that close it first will be measurably more resilient when the failure modes this series has documented — sandbox escape, monitor evasion, blast radius expansion — occur in production. The organizations that don’t will be exposed in the ways that matter most.
Closing the gap requires treating containment controls as what they are: security-critical infrastructure with their own threat models, deployment timelines, and operational runbooks. This means several things in practice.
Purpose binding must be architectural, not instructional. The shift from “the agent is instructed not to access sensitive data” to “the agent is structurally incapable of accessing sensitive data outside its assigned scope” requires changes to provisioning workflows, credential management, and execution environment configuration. It cannot be accomplished through prompt engineering.
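What "structurally incapable" means at provisioning time can be sketched in a few lines. This is a minimal illustration with invented scope names and task profiles: the agent is issued a credential minted for its task, and the operator's full scope set never reaches it.

```python
# Hypothetical sketch: task-scoped credential minting at provisioning time.
# Scope strings and task profiles are invented for illustration.

# The operator's own (broad) access — what naive inheritance would hand over.
OPERATOR_SCOPES = {"crm:read", "crm:write", "billing:read", "billing:write", "admin:*"}

# Assumed mapping from task type to the minimum scopes that task requires.
TASK_SCOPE_MAP = {
    "summarize_accounts": {"crm:read"},
    "reconcile_invoices": {"billing:read"},
}


def provision_agent_credential(task: str) -> set[str]:
    """Mint only the scopes the task requires — never the operator's set.
    An unrecognized task gets no credential at all (default deny)."""
    try:
        return set(TASK_SCOPE_MAP[task])
    except KeyError:
        raise ValueError(f"no scope profile defined for task '{task}'") from None
```

The enforcement point is provisioning, not prompting: the agent cannot exceed a scope it was never issued, so no amount of environmental reasoning at inference time widens its access.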
Kill switches must be operational, not theoretical. An organization that has a policy stating agents can be halted but no tested runbook for doing so has a theoretical kill switch. Operational kill switches require pre-defined triggers, tested halt procedures, and incident response playbooks that have been exercised before they are needed.
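The difference between a theoretical and an operational kill switch shows up in the control flow. Below is a hedged sketch with invented trigger names: pre-defined triggers are evaluated on every agent step, and the halt path is a single call with no approval chain.

```python
# Hypothetical sketch of an operational kill switch: pre-defined triggers,
# immediate single-step halt, no human convening required.
import time

# Assumed trigger vocabulary — in practice these come from the monitoring layer.
HALT_TRIGGERS = {
    "egress_to_unknown_host",
    "credential_scope_escalation",
    "tool_call_outside_purpose",
}


class AgentRunner:
    def __init__(self) -> None:
        self.halted = False
        self.halted_reason = None

    def emergency_halt(self, reason: str) -> None:
        """Single-step halt path: no escalation, no approval chain.
        A production version would also revoke credentials and close sessions."""
        self.halted = True
        self.halted_reason = reason
        self.halted_at = time.monotonic()

    def step(self, action: str, signals: set[str]) -> str:
        """Run one agent action; halt first if any trigger has fired."""
        if self.halted:
            raise RuntimeError("agent is halted")
        fired = signals & HALT_TRIGGERS
        if fired:
            self.emergency_halt(reason=", ".join(sorted(fired)))
            raise RuntimeError(f"halted on triggers: {sorted(fired)}")
        return f"ran {action}"
```

The design choice worth noting is that the trigger check sits inside the execution loop itself, so the response window is one step, not one incident-response meeting.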
Sandbox configurations must be evaluated against current frontier capability. The SandboxEscapeBench benchmark, released publicly under the UK AI Safety Institute’s research infrastructure, provides a concrete starting point. Running current production configurations against current frontier models is a tractable exercise that reveals whether the gap between the sandbox the organization thinks it has and the sandbox that will hold under active exploration is one misconfiguration or many.
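One way to operationalize "continuous, not one-time" is to gate model upgrades on a fresh evaluation of the current sandbox configuration. The sketch below is an assumption throughout: `run_escape_suite` is a stand-in for whatever harness the benchmark actually ships, and the configuration keys are invented.

```python
# Hedged sketch: re-run sandbox evaluation whenever the deployed model changes.
# `run_escape_suite` and the config keys are illustrative stand-ins, not a real API.

def run_escape_suite(model: str, sandbox_config: dict) -> int:
    """Stand-in harness: returns the number of successful escapes.
    A real run would execute the benchmark's escape tasks with `model`
    inside a sandbox built from `sandbox_config`."""
    hardened = sandbox_config.get("seccomp") and sandbox_config.get("no_host_mounts")
    return 0 if hardened else 1


def approve_model_upgrade(model: str, sandbox_config: dict) -> bool:
    """Gate the upgrade: a more capable model is only deployed once the
    *current* configuration shows zero escapes against it."""
    return run_escape_suite(model, sandbox_config) == 0
```

The mechanism, not the stand-in logic, is the point: evaluation is re-triggered by capability change, so the sandbox is always assessed against the models actually running in it rather than the ones it was configured for.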
The containment layer must have an institutional owner. Every component of the governance layer has an institutional owner — the CISO, the AI ethics committee, the compliance team, the board. The containment layer needs one too. Without an owner who is accountable for purpose binding, kill switch capability, and sandbox currency, the gap will persist not because it is technically intractable but because no one is responsible for closing it.
This series has traced a single structural problem through three layers. The sandbox is the assumed execution boundary — it degrades as model capability improves, and the degradation is measurable. Monitoring is the assumed safety fallback — it has structural limits that adversarial models can exploit, and the limits compound under adaptive pressure. Governance is the assumed organizational backstop — it is systematically under-resourced at the containment layer and over-resourced at the instruction layer.
Each layer is necessary. None is sufficient on its own, and each assumes the layer below it is functioning. The organizations that understand this dependency structure — and build containment infrastructure that can hold when individual layers fail — are the ones that will be able to claim, with evidence, that their agentic AI deployments are defensible. The rest are deploying agents they cannot constrain, audit, or stop, governed by policies that assume a trust model the execution environment no longer supports.
