This is the opening post of Fault Lines, our fourth series on agentic AI security. Series 1 mapped where agentic systems break. Series 2 argued that deterministic architectural enforcement — not probabilistic guardrails — is the structural answer. Series 3 examined indirect prompt injection in depth as the most operationally dangerous single attack vector. This series steps back to the systems level: why the six attack surface dimensions we have documented don’t merely coexist, but compound — and what that means for how organizations should think about damage scope before a compromise occurs, not after.
There is a security mental model that has served practitioners well for three decades. Enumerate your vulnerabilities. Assign each a severity score. Prioritize by exploitability and impact. Patch the highest-risk items first. Bound your exposure through network segmentation and least-privilege access controls. Know your blast radius — the scope of damage a successful compromise can produce — and keep it as small as architecture allows. This model produced a generation of security practices, frameworks, and tooling that genuinely improved enterprise security posture over time.
Agentic AI does not invalidate that model entirely. But it breaks several of its load-bearing assumptions in ways that are not yet widely understood — and the breakage is structural rather than incidental. The vulnerabilities are not simply a longer list of the same kind of problem. The relationships between them are different. The blast radius is not a static property you can read off a network diagram. And the perimeter — the concept that gives “inside” and “outside” their meaning — dissolves in an architecture where autonomous agents dynamically discover, compose, and act through an unbounded set of runtime connections.
This post establishes why. Posts 2 and 3 in this series will go deeper: Post 2 maps all six attack surface dimensions with the empirical rigor the research now supports, and Post 3 makes the case for treating blast radius as a first-class design constraint rather than a forensic afterthought.
What the CVE Era’s Model Assumed
The CVE era’s security model was built for deterministic systems. A web application has a defined set of inputs, a defined execution path, and a defined set of outputs. The attack surface is the collection of interfaces through which an adversary can provide malicious input. The blast radius of a compromise is bounded by what the compromised process has access to — its file permissions, its network reach, its database credentials — all of which are set at deployment time and remain largely stable during operation.
The perimeter model gave this structure its organizing logic. Inside the network boundary, systems trust each other. Outside it, systems are suspect. Firewall rules, VPN access, role-based access controls, and network segmentation all work to enforce that boundary and limit lateral movement when it is breached. The model is not perfect — lateral movement attacks, credential theft, and insider threats all exploit its assumptions — but it provides a coherent framework for bounded risk.
Critically, in this model, blast radius is essentially fixed at the time of provisioning. A compromised service account can reach what it was granted when it was created. A vulnerability in an API endpoint exposes whatever that endpoint has access to. Security teams can reason about worst-case exposure from a known starting point. This knowability is what makes the CVE model workable: you can enumerate, prioritize, and contain.
A growing share of AI security breaches is now linked to agentic systems, according to HiddenLayer’s 2026 AI Threat Landscape Report. Each agent interaction expands the operational blast radius while introducing new forms of runtime risk — yet most security controls stop at prompts, policies, or static permissions, leaving execution-time behavior largely unobserved and uncontrolled.
Six Dimensions, No Perimeter
Agentic AI systems operate across six distinct attack surface dimensions — each representing a class of vulnerability with no complete traditional equivalent, and each capable of enabling or amplifying every other. The six dimensions are not a longer version of the same vulnerability list. They are structurally different in ways that matter for how risk compounds:
The tool-call and MCP layer introduces a channel where malicious instructions can be embedded in tool descriptions, tool results, and inter-server metadata — processed by the model as authoritative input before any human review. A single compromised MCP server, in a deployment with five concurrent servers, achieves a 78.3% attack success rate and a 72.4% cascade rate to other servers (arXiv:2601.17549).
Multi-agent trust chains dissolve the perimeter by replacing a defined trust boundary with a mesh of agent-to-agent communications operating at machine speed. The Trust Paradox (arXiv:2510.18563) demonstrates that increasing inter-agent trust to improve coordination monotonically expands over-exposure risk — and in empirical testing, a single compromised agent poisoned 87% of downstream decision-making within four hours across connected agents.
Supply chain and skill compromise has moved from theoretical to operational: the first confirmed malicious MCP server appeared in September 2025; a self-propagating npm worm with an embedded MCP injection module was discovered in February 2026; and a scan of 3,984 skills from a major agent skill marketplace found that 36.8% contained at least one security flaw (Snyk ToxicSkills, February 2026).
Identity and OAuth delegation gaps arise because OAuth 2.0 was designed for human-delegated, bounded access — not for autonomous agents acting across sessions, across agent-to-agent hops, and across systems without a clear human consent moment at each step. Industry data indicates 97% of non-human identities carry excessive privileges, and unlike humans who exercise roughly 4% of their granted permissions, agents exercise all of them.
Memory and context persistence risks transform prompt injection from a transient single-session attack into a stateful, cross-session control channel. MINJA (arXiv:2503.03704) demonstrated greater than 95% memory injection success through query-only interaction — meaning any regular user can become an attacker. AgentPoison (NeurIPS 2024) showed backdoor success rates above 80% with a poison rate below 0.1%.
Sandbox and containment failure is no longer theoretical. SandboxEscapeBench (arXiv:2603.02277, Marchand et al., March 2026) — the first systematic benchmark of frontier model container escape capability — found that across 18 scenarios grounded in real CVE classes, frontier models achieve escape success rates following a log-linear relationship with compute budget, at approximately $1 per attempt.
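The log-linear scaling claim in that last dimension can be made concrete with a toy model: success rate rising with the logarithm of compute spent, so each additional dollar buys progressively less marginal success. This is a sketch under stated assumptions, not the benchmark's actual fit — the coefficients `a` and `b` below are hypothetical placeholders, not values from the SandboxEscapeBench paper.

```python
import math

# Toy model of a log-linear relationship between container-escape success
# rate and attacker compute budget. Coefficients a and b are illustrative
# assumptions, not fitted values from SandboxEscapeBench.
def escape_success_rate(budget_usd: float, a: float = 0.05, b: float = 0.10) -> float:
    """Success rate modeled as a + b * ln(budget), clamped to [0, 1]."""
    rate = a + b * math.log(budget_usd)
    return max(0.0, min(1.0, rate))

# More budget -> higher success rate, with diminishing returns per dollar:
# going from $1 to $100 helps far more than going from $100 to $10,000.
low = escape_success_rate(1.0)      # baseline at ~$1 per attempt
high = escape_success_rate(100.0)   # same model, 100x the budget
```

The defensive implication of a log-linear curve is that raising attacker cost helps, but only logarithmically — no affordable compute price floor pushes the modeled success rate to zero.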
Enterprise adoption of AI agents is already widespread, with MCP SDK downloads surpassing 97 million per month. The attack surface is not theoretical — it is present at scale, in production, across an ecosystem that is expanding faster than the security frameworks designed to govern it.
Why the Math Changes: Compounding Rather Than Adding
The critical insight — and the central argument of this series — is that these six dimensions do not produce a threat posture that adds up like six independent risks. They compound like a product of risk multipliers. Each dimension amplifies the exploitability of every other, through a non-deterministic reasoning layer that no patch can fully remediate.
Consider the attack chain that our research synthesis identified as the most complete illustration of this compounding. A typosquatted MCP server — a supply chain compromise — injects a tool description containing hidden prompt injection. When that description is processed by the orchestrator agent, the malicious instruction propagates to sub-agents through the trust chain. Those sub-agents execute the instruction using over-permissioned OAuth tokens, persisting the payload in shared memory. From memory, the payload can direct code execution that probes container boundaries. Each hop amplifies the preceding one. The supply chain compromise alone would be contained. Combined with prompt injection, trust chain propagation, ambient identity, memory persistence, and weak sandbox boundaries, it becomes a systemic compromise.
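A back-of-envelope model makes the compounding concrete. Using the per-hop rates cited in this post (78.3% tool-description injection success, 72.4% cross-server cascade, 87% trust-chain poisoning), the end-to-end chain still succeeds roughly half the time, even though every individual hop is lossy. Treating the hops as independent stages of one chain is a simplifying assumption made for illustration.

```python
# Back-of-envelope model of the compounding attack chain described above.
# Per-hop rates are the figures cited in this post; modeling them as
# independent sequential stages is a simplifying assumption.
HOPS = {
    "tool_description_injection": 0.783,  # single compromised MCP server
    "cross_server_cascade": 0.724,        # cascade to other MCP servers
    "trust_chain_poisoning": 0.87,        # downstream decisions poisoned
}

def chain_success_probability(hops: dict) -> float:
    """P(full chain succeeds) = product of per-hop success rates."""
    p = 1.0
    for rate in hops.values():
        p *= rate
    return p

# ~0.49: near coin-flip odds of systemic compromise from one
# typosquatted server, in this toy model.
p_chain = chain_success_probability(HOPS)
```

The point of the sketch is not the specific number but the shape: every additional coupled dimension an attacker can traverse keeps the chain viable, where a traditional additive model would treat each hop as a separate, independently mitigated finding.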
Threats Are Sequentially Linked, Not Isolated
IBM X-Force’s 2025 Threat Intelligence Index makes the point directly: threats in agentic environments “should not be considered in isolation — they are not merely isolated parallel threats but often sequentially linked threats leading to a compromise.” The blast radius of a credential compromise “expands beyond the traditional data theft to potential control of the whole system” when that credential provides access to in-house agentic AI. A fourfold increase in supply chain or third-party breaches over the last five years is the observable result of this sequential linking accelerating.
The empirical evidence for the compounding dynamic is consistent across independent research groups. Unit 42’s 2026 Global Incident Response Report found that 87% of attacks involved multiple attack surfaces. The cascade rate in multi-server MCP deployments reaches 72.4%. The Trust Paradox research found over-exposure rates rising consistently with trust level across all tested model backends and orchestration frameworks — not as a model-specific behavior but as a structural outcome of the architecture itself.
The Compounding Dynamic
In traditional software security, vulnerabilities are largely independent: a SQL injection flaw in one service does not increase the exploitability of an XSS flaw in another. The attack surface is additive. In agentic AI, the six dimensions interact through a shared reasoning layer that is probabilistic, stateful, and manipulable. A successful tool-call injection can direct memory writes that influence future trust decisions that expand OAuth scope that reduce sandbox effectiveness. The dimensions are not independent — they are coupled through the agent’s own reasoning, which can be steered.
This is not a vulnerability. It is an architectural property of systems that reason autonomously over connected data and tool surfaces. It cannot be patched away. It must be designed against.
Blast Radius: From Static Metric to Dynamic Property
In the CVE era, blast radius was a property you could read off an architecture diagram at deployment time. A compromised service account can reach these systems, these databases, these APIs — and no further, because that is what it was granted. The boundary is structural, auditable, and stable.
Agentic AI makes blast radius dynamic, runtime-determined, and self-expanding. An agent does not just use the access it was granted at provisioning — it makes decisions about what to access next, based on its evolving understanding of the task. The blast radius at the moment of compromise is not the blast radius five minutes later, because the agent has continued reasoning, acquiring context, credentials, and connections along the way. The challenge, as one practitioner framing captures it, is that agentic systems are “non-deterministic systems with deterministic privilege grants” — a mismatch that produces unpredictable and expanding damage scope.
Blast Radius as a Static Boundary
Defined at provisioning time by explicit grants: network segments, file permissions, database credentials, API scopes. Knowable from the architecture diagram. Stable during operation unless deliberately reconfigured. Contains what the compromised identity was given and nothing more.
Security controls operate at the network and permission layer. The blast radius is a fact about the system’s configuration, not about the system’s behavior during operation.
Bounded · Knowable · Fixed

Blast Radius as a Dynamic Property
Expands at runtime as the agent reasons, retrieves, delegates, and acts. Not readable from a network diagram — it depends on what the agent decides to do next. Grows through trust chain propagation, memory accumulation, OAuth delegation hops, and tool discovery. Can expand at machine speed, far faster than human intervention.
Security controls must operate at the identity and behavior layer. Blast radius is a function of what the agent is doing, not just what it was given permission to do.
Dynamic · Runtime-Determined · Self-Expanding

The operational consequence of this shift is significant. Repello AI’s AI Attack Surface Management framework proposes four factors that determine blast radius for any agentic asset: data access scope (what the agent can read), tool execution scope (what actions it can take), downstream consumption breadth (how many other agents or systems consume its outputs), and persistence mechanisms (whether it writes to memory or shared state). An agent where any one of these factors is high represents elevated risk regardless of how unlikely any specific attack against it may be. An agent where all four are high is the highest-priority risk in any deployment.
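The four-factor framework lends itself to a simple triage sketch. Only the four factor names come from Repello AI's framework; the 1-to-5 scale, the threshold of 4, and the tier labels below are illustrative assumptions of mine, not part of the published framework.

```python
from dataclasses import dataclass

@dataclass
class AgentAsset:
    """One agentic asset scored on the four blast-radius factors.
    The 1 (narrow) to 5 (broad) scale is an illustrative assumption."""
    name: str
    data_access_scope: int     # what the agent can read
    tool_execution_scope: int  # what actions it can take
    downstream_breadth: int    # how many agents/systems consume its outputs
    persistence: int           # whether it writes to memory or shared state

    def factors(self) -> list:
        return [self.data_access_scope, self.tool_execution_scope,
                self.downstream_breadth, self.persistence]

def triage(asset: AgentAsset, high: int = 4) -> str:
    """Any one high factor => elevated risk; all four high => top priority."""
    scores = asset.factors()
    if all(s >= high for s in scores):
        return "highest-priority"
    if any(s >= high for s in scores):
        return "elevated"
    return "baseline"
```

A usage example: an orchestrator scoring high on every factor triages as highest-priority even before any specific threat model is written, which is exactly the pre-incident posture this series argues for.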
A far greater blast radius was observed in a Salesloft-Drift OAuth supply chain incident analyzed by Obsidian Security than direct Salesforce infiltration would have produced. Attackers who exploited OAuth tokens gained access to hundreds of downstream environments — a multiplier effect produced not by a more sophisticated attack but by the agent-mediated trust chain amplifying the reach of a single credential compromise.
The practitioner argument that follows from this is not that blast radius cannot be controlled — it can, and Post 3 of this series will make that case in detail. The argument is that blast radius must be assessed and designed against before deployment, across all six dimensions simultaneously, using a framework that treats it as a dynamic runtime property rather than a static configuration fact. Organizations that rely on traditional perimeter defenses to contain agentic blast radius will consistently find the explosion has already happened by the time containment is attempted.
In agentic systems, autonomy increases capability. Identity defines containment. The organizations that will fare best are those treating blast radius as a first-class design constraint, not a post-incident measurement.
— Synthesis from LoginRadius Agentic Security Framework; HiddenLayer 2026 AI Threat Landscape Report; Repello AI Attack Surface Management

What This Series Covers
This post has established the framing: why agentic attack surfaces compound rather than add, and why blast radius requires a new definition in this context. The two posts that follow provide the substance that framing requires.
Post 2 — “Mapping the Agentic Attack Surface: Six Dimensions, No Perimeter” — takes each of the six dimensions through the full analytical treatment: what the surface is, how it differs from its CVE-era analog, the empirical findings from 2024–2026 research, the framework coverage gaps across OWASP, NIST, CSA, and MITRE, and the current remediation boundaries. This is the reference post — dense, empirically grounded, designed to be bookmarked by practitioners who need the full picture in one place.
Post 3 — “Blast Radius Is Not a Post-Incident Metric” — makes the prescriptive case. Drawing on the four-factor assessment framework, circuit breaker design patterns, runtime least-privilege models, and identity-as-containment principles, it argues that blast radius must be assessed at design time and treated as a living property that requires continuous monitoring. It closes with an honest accounting of what remains genuinely unsolved — because the field is early, and intellectual honesty about the limits of current knowledge is part of what makes security research credible.
Agentic AI security is not a harder version of the same problem. The attack surfaces are different, the relationships between them are different, and the blast radius behaves differently. Approximately 70% of the threat landscape maps to familiar security patterns where existing controls provide partial protection. The remaining 30% — semantic-layer exploits, multiplicative cascade effects, runtime-composed supply chains, and dynamic blast radius expansion — represents genuinely novel risk that current security architecture was not designed to address. We are building production systems faster than we can secure them. The first step is understanding why the old model runs out.
