The Skill Is the Attack Surface

This series examines the attack surface that lies beneath the agent itself — the skills, tools, and data sources the agent treats as trusted infrastructure. The anchor paper for this post is Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems (arXiv:2604.03081, Qu, Liu, Geng, Deng, Li, Zhang, Zhang, and Ma, April 2026), which introduces Document-Driven Implicit Payload Execution — DDIPE — and establishes that implicit attacks defeat defenses that catch explicit attacks at a 0% rate while achieving 11–33% bypass against state-of-the-art protection. Four confirmed CVEs resulted from the responsible disclosure process.

The open-source package supply chain problem is not new. The security community has been documenting malicious packages in npm and PyPI since at least 2017. The pattern is well-understood: an attacker publishes a package with a name similar to a trusted one, or compromises a legitimate package maintainer account, or introduces a malicious dependency into an otherwise legitimate package. Developers install it. The payload executes with the privileges of the development or production environment.

The pattern maps directly to agent skill ecosystems — and then it gets worse. A malicious npm package executes with the privileges the Node.js runtime grants it, which are typically bounded by the user account running the process. A malicious agent skill executes inside an agentic system that has been provisioned with tool access, file system permissions, API credentials, and in enterprise deployments, access to sensitive organizational data. The blast radius is categorically different.

Qu et al.’s research documents something more troubling still. The most effective supply chain attacks against agent skill ecosystems do not arrive through the skill invocation at all. They arrive through the documentation the agent reads before it decides which tool to call — and they are structurally invisible to the defenses enterprises have deployed to catch supply chain attacks at the invocation layer.

How Agent Skill Ecosystems Work — and Where They Break

An agent skill is a capability unit — a function, a tool definition, a plugin — that an agent can invoke to accomplish a task. In production agentic deployments, agents do not come pre-loaded with a fixed set of skills. They discover and invoke skills dynamically, reading skill documentation to understand what each skill does, when to use it, and how to call it correctly.

The Skill Documentation Attack Surface

Agent skills expose two surfaces to the agent: the invocation interface (the function call, its parameters, its return value) and the documentation (natural language descriptions of what the skill does, when to use it, and example usage patterns). Security scanning has focused almost entirely on the invocation interface — checking for malicious code in the skill’s implementation, validating parameter types, monitoring the return values.

DDIPE targets the documentation. Malicious logic is embedded not in the skill’s code but in the code examples and configuration templates within the skill’s documentation. The agent reads the documentation during normal task planning, incorporates the example patterns into its approach, and executes the embedded payload as part of what it believes to be routine task completion. No anomalous function call is made. No suspicious parameter is passed. The agent is doing exactly what the documentation instructed it to do.

The elegance of this attack from an attacker’s perspective is that it does not require bypassing any security control at the invocation layer. The payload never arrives as a suspicious tool call. It arrives as a documentation example — a code snippet that the agent incorporates into its task plan during the planning phase, before any security monitor is looking at invocations. By the time the agent makes a tool call, the malicious action is embedded in what looks like ordinary task execution.

What DDIPE Establishes Empirically

Qu et al. generated 1,070 adversarial skills from 81 seed examples, spanning 15 MITRE ATT&CK categories. They tested these against state-of-the-art defense systems — the same systems that successfully block explicit instruction attacks — and measured bypass rates for both explicit and implicit attack variants.

Bypass rate for explicit instruction attacks against state-of-the-art defenses. Explicit attacks — skills that directly instruct the agent to perform a malicious action — are effectively blocked. This is what the current generation of enterprise agent security tools is designed to catch, and it catches them.

11–33%

Bypass rate for DDIPE implicit attacks against the same state-of-the-art defenses. Implicit attacks — malicious logic embedded in documentation examples and configuration templates — bypass the defenses that catch explicit attacks at 0%. The attacks span 15 MITRE ATT&CK categories. Four resulted in confirmed CVEs following responsible disclosure.

The gap between these two numbers is the structural finding. It is not that explicit defenses are weak — they are demonstrably effective at their designed scope. It is that the implicit attack surface is categorically different, and no defense currently deployed addresses it. An enterprise that has invested in agent invocation monitoring and skill scanning has closed the explicit attack surface. It has not closed the documentation attack surface. Those are not the same thing.

The responsible disclosure process produced four confirmed CVEs — a result that validates the research methodology as empirically grounded against real production systems, not synthetic benchmarks. Twelve reactive security tools appeared within 30 days of major disclosures, which is the pattern the package ecosystem community will recognize: vulnerability disclosure drives tool development, but the tools address the disclosed vector without necessarily closing the attack class.

The Package Ecosystem Parallel — and Where It Breaks Down

The parallel between agent skill ecosystems and package ecosystems is instructive and has a precise limit. Understanding both is necessary for calibrating the response correctly.

Package Ecosystem Supply Chain

The Known Problem

Malicious packages in npm and PyPI execute code with the privileges of the runtime environment — typically bounded by a user account or container process. The attack vectors are well-documented: typosquatting, dependency confusion, compromised maintainer accounts, malicious updates to legitimate packages.

Defenses are mature: software composition analysis, lockfile pinning, integrity hashing, private registry mirroring, automated vulnerability scanning. These defenses address the invocation surface — the package code itself.

The blast radius is constrained by the runtime environment’s privilege scope — significant, but bounded by standard operating system access controls.

Known Pattern · Mature Defenses

Agent Skill Ecosystem Supply Chain

The Amplified Problem

Malicious agent skills execute inside agentic systems provisioned with tool access, API credentials, file system permissions, and organizational data access. The same runtime privilege scope that makes agents useful makes a compromised skill dangerous beyond the bounds of standard OS access controls.

Defenses are immature: invocation monitoring catches explicit attacks at 0% bypass, but DDIPE’s documentation-layer implicit attacks achieve 11–33% bypass against the same defenses. The documentation attack surface has no equivalent in package ecosystems and no deployed defense addresses it.

The blast radius is bounded only by the agent’s provisioned access — which in enterprise deployments routinely includes credentials, sensitive data, and external API access.

Amplified Pattern · Immature Defenses

The limit of the parallel is this: package ecosystem attacks target the code. DDIPE targets the documentation that the agent reads to understand how to use the code. There is no equivalent attack surface in traditional software development because traditional software does not read its dependencies’ documentation and incorporate its patterns into task planning. The LLM reasoning layer that makes agents useful — the layer that reads skill documentation, understands examples, and incorporates them into task plans — is the layer the attack exploits. It is novel to agentic systems and has no precise historical analogue.

The 26.1% Baseline: Ecosystem-Scale Evidence

DDIPE is a specific attack technique. Its significance is amplified by ecosystem-scale evidence about the current state of the skill supply chain. A systematic analysis documented in the formal supply chain security literature (arXiv:2603.00195) scanned 98,380 skills across major agent skill platforms and found that 26.1% of 42,447 skills exhibited at least one security vulnerability across 14 vulnerability patterns, with 157 confirmed malicious entries. VirusTotal failed to detect the majority of agent-targeted malware.

26.1%

Proportion of scanned agent skills exhibiting at least one security vulnerability across 14 documented vulnerability patterns — across 42,447 skills on major skill platforms. 157 confirmed malicious entries were identified. VirusTotal, the industry-standard malware detection tool, failed to detect the majority. The skill supply chain is not a theoretical risk surface. It is an already-compromised one.

This number requires careful interpretation. It does not mean that 26.1% of skills actively exploit agents that install them. It means that 26.1% exhibit vulnerability patterns that are exploitable under the right conditions. In a traditional package ecosystem, a vulnerable package that has not been actively exploited is a latent risk. In an agent skill ecosystem, a vulnerable skill that an agent has already incorporated into its documentation-informed task planning may have already delivered its payload — and the delivery may not have been logged as a security event because it looked like normal task execution.

The Detection Gap in Production

Standard agent security monitoring looks for anomalous tool invocations — function calls that exceed expected parameters, invoke unexpected tools, or return suspicious data. DDIPE payloads do not trigger these monitors because the tool invocations they produce are individually normal. The malicious action is distributed across a sequence of legitimate-looking steps that collectively produce the attacker’s objective. Step-level safety evaluation, as documented in the ToolSafe research from Series 1, addresses this partially — but ToolSafe was designed for direct task injection, not documentation-layer implicit payload execution. The detection gap for DDIPE in production environments has not been closed.

What the Research Recommends

Qu et al. are explicit that the research represents a first characterization of the implicit payload attack surface, and that comprehensive defenses have not yet been developed. The responsible disclosure findings and the reactive tool development that followed suggest the field is aware of the problem and moving to address it. The current best-practice recommendations that follow from the research and the broader supply chain security literature are necessarily partial — they address the attack surface available for defense today while the documentation-layer defense gap remains open.

Treat skill documentation as untrusted input at the point of consumption. The documentation attack surface exists because agents incorporate skill documentation examples into their task plans without subjecting them to the same scrutiny applied to user input. Documentation examples that contain executable code patterns should be processed through the same content integrity checks applied to tool return values — treating them as data, not as trusted instructions. This does not close DDIPE, but it narrows the surface by applying existing controls to a previously unguarded layer.

Pin skill versions and verify integrity at installation. Version pinning and integrity hashing are the foundational controls from the package ecosystem security playbook. They do not address documentation-layer attacks — a pinned, integrity-verified skill can still carry a DDIPE payload — but they close the dependency confusion and malicious update vectors that remain active. An enterprise that has not applied package ecosystem supply chain hygiene to its agent skill ecosystem has not completed the basics.

Audit skills with elevated privilege scope first. Not all skills carry equal blast radius. Skills provisioned with credential access, file system write permissions, or external API invocation authority represent the highest-priority audit surface. A malicious payload delivered through a skill with read-only access to a narrow data source is categorically different from one delivered through a skill with write access to a CRM or the ability to invoke payment APIs.

The defense that catches the explicit attack at 0% bypass does not reach the documentation. The documentation is where the attack lives.

— Luminity Digital synthesis from arXiv:2604.03081, Qu et al., April 2026

The Central Insight

Agent skill ecosystems reproduce the package supply chain problem at higher privilege stakes and add a new attack surface — documentation-layer implicit payload execution — that has no equivalent in traditional software development and no deployed defense that closes it. The current generation of enterprise agent security monitoring addresses the explicit invocation surface effectively. It does not address the documentation layer. Post 2 of this series examines what the full MCP-scale ecosystem evidence adds to this picture and how network-level defense changes the detection calculus.

The Supply Chain Beneath the Stack · Three-Part Series

Post 1 · Now Reading The Skill Is the Attack Surface

Post 2 The Tool You Trusted Was Never Yours

Post 3 Provenance Is the Architecture

DDIPE — Document-Driven Implicit Payload Execution — embeds malicious logic in skill documentation examples rather than skill code. The agent reads the documentation during task planning and incorporates the payload as what it believes to be routine usage patterns. Explicit attacks bypass state-of-the-art defenses at 0%. DDIPE bypasses the same defenses at 11–33%. The gap is structural: the defenses address the invocation layer. The attack lives in the documentation layer, which has no deployed defense.

Series 1 Where Agentic AI Breaks 5 posts · The failure mode map
Series 2 Building Defensible Agents 3 posts · Deterministic architecture
Series 3 The Invisible Attack 3 posts · Indirect prompt injection
Series 4 Fault Lines 3 posts · Hidden structural risks
Series 5 The Policy Layer 4 posts · Governance architecture
Series 6 The Containment Problem 3 posts · Sandbox and AI control
Series 7 The Memory Problem 3 posts · Memory as attack surface

DDIPE Document-Driven Implicit Payload Execution — malicious logic embedded in skill documentation, executed when agents incorporate those examples into task plans. Passes all code-level inspection; targets the planning phase, not the invocation layer.
Documentation Layer The natural language descriptions, usage examples, and configuration templates within skill documentation — the attack surface DDIPE exploits and current defenses do not reach.
Privilege Amplification The property by which agent skill attacks carry higher blast radius than equivalent package ecosystem attacks — skills execute with agent-provisioned credentials, not only OS-level process permissions.
Invocation-Layer Defense Security controls that inspect skill behavior at the point of function call — sandboxing, I/O validation, system call monitoring. Effective against explicit attacks; blind to DDIPE’s documentation-layer execution.

How Agent Skill Ecosystems Work — and Where They Break

The Skill Documentation Attack Surface

What DDIPE Establishes Empirically

The Package Ecosystem Parallel — and Where It Breaks Down

The Known Problem

The Amplified Problem

The 26.1% Baseline: Ecosystem-Scale Evidence

The Detection Gap in Production

What the Research Recommends

Up Next: Post 2 — The Tool You Trusted Was Never Yours

Like this:

Related

The Skill Is the Attack Surface

How Agent Skill Ecosystems Work — and Where They Break

The Skill Documentation Attack Surface

What DDIPE Establishes Empirically

The Package Ecosystem Parallel — and Where It Breaks Down

The Known Problem

The Amplified Problem

The 26.1% Baseline: Ecosystem-Scale Evidence

The Detection Gap in Production

What the Research Recommends

Up Next: Post 2 — The Tool You Trusted Was Never Yours

Share this:

Like this:

Related