The Last Mile of AI Is Not a Model Problem — Luminity Digital
Practitioner Perspective

The Last Mile of AI Is Not a Model Problem

Anthropic just called 2025 a “failure of approach.” Here’s what they mean — and why the gap between AI pilot and production is finally being addressed at the infrastructure level.

February 2026
Tom M. Gomez
9 Min Read

Anthropic’s head of Americas, Kate Jensen, said something this week that every enterprise architect building AI systems in production will recognize immediately: “2025 was meant to be the year agents transformed the enterprise, but the hype turned out to be mostly premature. It wasn’t a failure of effort. It was a failure of approach.”

That framing is deliberate — and it’s worth sitting with. Anthropic isn’t blaming enterprise customers for moving too slowly. They’re not blaming the models for being insufficiently capable. They’re blaming the architecture of deployment itself — the way AI has been inserted into enterprises like a sophisticated autocomplete rather than a redesigned operating model.

After months advising enterprises on AI infrastructure, I find the failure mode Jensen describes intimately familiar. The proof of concept impresses the steering committee. The pilot produces compelling demos. And then, somewhere between pilot and production, the effort stalls — not because the model can’t perform the task, but because the surrounding system wasn’t built to support it at scale, under governance, with the auditability that regulated industries require.

Anthropic’s recent moves — Agent Skills, the Cowork platform, deep partnerships with Accenture, PwC, and Infosys — aren’t just product announcements. They’re a diagnosis and a proposed treatment. And the diagnosis is more specific than most coverage has captured.

<10%

of AI pilots reach production deployment. Anthropic’s own Economic Index data confirms that the gap between experimentation and enterprise-scale operation is structural — not a prompting problem, not a model capability gap, but a failure of the harness built around the model.

Three Last-Mile Problems Anthropic Is Naming

Anthropic’s own research data is instructive. Their Economic Index found that API customers using Claude for complex tasks tend to feed it lengthy inputs — and that this represents a structural barrier to broader enterprise deployment. The reason: most enterprise data isn’t centralized, digitized, or organized in a form AI agents can consume. Correcting that bottleneck, Anthropic notes, may require firms to restructure their organizations, invest in new data infrastructure, and fundamentally change how information flows through the business.

That is a much harder problem than selecting the right foundation model. It points to three distinct barriers in the last mile of enterprise AI deployment — each requiring a different kind of fix.

Last-Mile Problem

The Data Infrastructure Gap

Enterprise knowledge is dispersed, undigitized, and not organized for AI consumption. Most agents fail not because they can’t reason — but because the context they need doesn’t exist in a consumable form. This is an organizational redesign problem, not a prompting problem.

What’s Required

Structural Information Architecture

Centralizing, digitizing, and restructuring enterprise knowledge so agents can actually consume it. This precedes any model or agent framework decision — and it is where most pilots quietly die before they ever reach a production review.

Anthropic Economic Index — September 2025 Report

Last-Mile Problem

The Skills and Execution Gap

Moving from AI conversation to AI action requires domain-specific execution primitives — connections to systems of record, workflow triggers, and business logic that exist outside any foundation model. Without these bridges, agents remain advisors rather than operators.

What’s Required

Agent Skills as Connective Tissue

Pre-built, domain-specific skills that connect Claude to the systems enterprises already operate — Atlassian, Figma, Stripe, Notion, Zapier — via an open standard designed for progressive context loading rather than monolithic injection.
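To make the format concrete: a skill under the open spec is a folder whose SKILL.md carries a short YAML description that is always visible to the model, with the full body read only when a task matches. The frontmatter fields below follow the published spec; the skill itself, its name, and its steps are invented for illustration, not drawn from any shipped integration.

```markdown
---
name: invoice-reconciliation
description: Match incoming Stripe payouts against open invoices in the ERP. Use when the user asks to reconcile payments or flag unmatched transactions.
---

# Invoice Reconciliation

1. Fetch the payout list for the requested period.
2. Match each payout to an open invoice by amount and customer reference.
3. Flag unmatched items for human review; never auto-close a disputed invoice.
```

The point of the structure is that only the two frontmatter lines consume context by default; the numbered procedure is loaded when the agent decides the skill applies.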

Anthropic Agent Skills Launch — VentureBeat, December 2025

Last-Mile Problem

The Governance and Auditability Gap

Regulated industries — financial services, healthcare, legal — need AI systems that are transparent, auditable, and controllable before they will trust them with consequential decisions. Governance retrofitted after deployment is not governance at all.

What’s Required

Governance Designed In from Day One

Architectural decisions about data flow, audit logging, human-in-the-loop triggers, and access control must be made at design time — not at deployment review. This is what Anthropic’s PwC and Accenture partnerships are operationalizing at enterprise scale.

PwC & Anthropic Enterprise AI Partnership — February 2026

The Architecture Anthropic Is Building Around the Model

What makes Anthropic’s current enterprise push interesting isn’t the partnerships themselves — it’s the underlying architectural logic. They are not building more capable models and waiting for enterprises to figure out how to deploy them. They are building the reference architecture for the harness that surrounds those models, and recruiting an ecosystem to execute the industry-specific last-mile work that Anthropic itself cannot scale alone.

Agent Skills is the clearest expression of this. The architectural decision is deliberate: each skill loads into Claude’s context window in a compressed summary form, with full detail retrievable only when the task requires it — what Anthropic calls “progressive disclosure.” This means organizations can deploy extensive skill libraries without overwhelming the model’s working memory. It is a harness-level solution to a harness-level problem. It does not require a better model. It requires a better way of organizing what the model sees, and when.
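The progressive-disclosure pattern can be sketched in a few lines. This is a hypothetical illustration of the mechanism, not Anthropic's implementation: the `Skill` class, `build_context` function, and field names are all invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    """A skill as the harness sees it: cheap summary, expensive body."""
    name: str
    summary: str      # one or two sentences, always in context
    body_path: str    # full instructions on disk, loaded on demand

    def full_detail(self) -> str:
        # Retrieve the complete skill text only when the task needs it.
        with open(self.body_path) as f:
            return f.read()


def build_context(skills: list[Skill], active: set[str]) -> str:
    """Compose the model's working context: compressed summaries for
    every installed skill, full detail only for skills the current
    task has activated."""
    parts = []
    for s in skills:
        if s.name in active:
            parts.append(s.full_detail())            # expanded
        else:
            parts.append(f"{s.name}: {s.summary}")   # compressed
    return "\n\n".join(parts)
```

The design choice this models is the one named in the paragraph above: an organization can install hundreds of skills, but the context cost of any single request stays proportional to the handful of skills the task actually activates.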

There’s a big gap between an AI model that works in a demo and one that works in a regulated industry.

— Dario Amodei, Co-Founder & CEO, Anthropic — on the Infosys partnership, February 2026

The Cowork platform extends this logic from individual skill execution to full workflow automation — agents that operate across enterprise software, handle multi-step tasks, and surface for human review at the appropriate decision points. Pre-built industry plugins for finance, legal, and HR operationalize this vision without requiring enterprises to build from scratch. The partner ecosystem is the scaling mechanism: Accenture’s Business Group brings roughly 30,000 trained Claude practitioners to bear, PwC brings governance frameworks purpose-built for regulated industries, and Infosys brings implementation capacity across banking, telecoms, and manufacturing.

The Model vs. The Harness: A Structural Choice

Anthropic’s decision to release the Agent Skills specification as an open standard — rather than keeping it proprietary — signals something important about where they believe durable competitive advantage actually lives. By making skills portable across AI platforms, they are betting that ecosystem growth benefits the company more than proprietary lock-in would. OpenAI has already adopted a structurally identical architecture in ChatGPT and Codex CLI. The open spec is becoming the de facto standard.

The Prevailing Assumption

The Model Is the Moat

Select the most capable foundation model, wrap it in prompts and guardrails, and iterate toward production. Competitive advantage comes from model capability — the smartest model wins.

Limitation: models are converging in capability. Differentiation built on model selection compounds only until the next release cycle erases it.

Eroding Advantage

The Emerging Reality

The Harness Is the Moat

Competitive advantage accrues to organizations that build superior context management, skill orchestration, workflow integration, and governance infrastructure — not those that simply select the most capable model.

The surrounding system is where differentiation compounds. The model is a powerful but increasingly interchangeable component within it.

Compounding Advantage

What the Governance Imperative Actually Means

The PwC and Infosys partnerships are not coincidental. Both signal Anthropic’s awareness that the last mile in regulated industries is fundamentally a governance problem before it is a capability problem. Scott White, Anthropic’s head of enterprise product, put it directly: enterprises need governance, auditability, and risk controls from day one — not as a retrofitted layer added after deployment has already begun.

This is the “failure of approach” diagnosis made concrete. When governance is treated as a compliance review that happens after the architecture is set, you have already built something that cannot reach production in a regulated environment. The architectural decisions that enable auditability — data flow design, audit logging, human-in-the-loop trigger points, access control — belong in the same design session as the AI capability itself.

Practitioner Note: The Governance Design Window

If you are waiting until the pilot succeeds to figure out governance, you have already delayed production by six to twelve months. Governance cannot be owned solely by a compliance function reviewing outputs after the fact. It requires architectural decisions that belong at the design phase — decisions about what gets logged, what triggers human review, who has access to what context, and how the agent’s reasoning can be traced and challenged. Build this in from day one or rebuild from scratch later.
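Those design-time decisions reduce to a small set of mechanical commitments that can be expressed in code. The sketch below is a minimal, hypothetical illustration of an audit-logged, human-in-the-loop gate: the class, the action names, and the review policy are invented for the sketch, not any vendor's API.

```python
import json
import time
from dataclasses import dataclass, field

# Design-time decision: which actions are consequential enough
# to require a human sign-off before execution. (Illustrative set.)
REVIEW_REQUIRED = {"wire_transfer", "contract_signature", "record_deletion"}


@dataclass
class AuditedAgent:
    """Wraps agent actions so every proposal is logged and gated."""
    actor: str
    log: list = field(default_factory=list)

    def propose(self, action: str, payload: dict) -> str:
        """Log every proposed action; route consequential ones to a human."""
        entry = {
            "ts": time.time(),
            "actor": self.actor,
            "action": action,
            "payload": payload,
            "status": ("pending_human_review"
                       if action in REVIEW_REQUIRED
                       else "auto_approved"),
        }
        self.log.append(entry)
        return entry["status"]

    def export_audit_trail(self) -> str:
        # A replayable record for compliance review: who proposed what,
        # when, and whether a human was in the loop.
        return json.dumps(self.log, indent=2)
```

The specifics here are trivial; the architectural point is not. The review set, the log schema, and the export path all exist before the first pilot runs, which is exactly what "governance from day one" means in practice.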

What This Means for Enterprises in Production

Anthropic’s data tells a clear story: AI adoption among US firms has more than doubled in two years — from 3.7% to 9.7% — but the vast majority of organizations remain in early experimentation. Usage is concentrated in information-sector tasks and only rarely deployed for the complex, multi-step workflows that constitute genuine operational transformation. Large enterprise accounts at Anthropic grew nearly sevenfold in the past year. The production threshold is moving — but it is moving for organizations that have internalized what the last mile actually requires.

What Closing the Last-Mile Gap Requires in Practice

First, treat data infrastructure as a prerequisite to agent deployment — not a parallel workstream. Second, adopt an open skill standard rather than building proprietary connectors that lock you to a single model. Third, design governance into the architecture at the same time you design the AI capability — not after the pilot succeeds. The organizations closing the gap between pilot and production in 2026 are not the ones with access to the most capable models. They are the ones treating the harness as the primary engineering challenge.

Practitioner Takeaway

Anthropic’s “failure of approach” diagnosis is a gift to enterprise architects willing to hear it clearly. The last mile of AI is a data infrastructure problem, a skills-and-connectors problem, and a governance problem — in that order. The model sits downstream of all three. Organizations that continue to treat model selection as the primary decision will continue to see pilots stall at the production threshold. The architectural investment that closes the gap is in the harness, the skill library, the context orchestration layer, and the governance framework. That is where the work of 2026 actually lives.

Anthropic’s Enterprise Agent Launch — February 2026

This post is built on Anthropic’s enterprise briefings, Economic Index research, and partnership announcements from February 2026, including statements from Kate Jensen, Dario Amodei, and Scott White alongside the Agent Skills specification launch and announcements from Accenture, PwC, and Infosys.

