The Great Compression: Model Providers Are Swallowing the Agent Harness Layer — Luminity Digital
Agentic AI Platform Strategy


Six model providers have collectively deployed over $200 billion in acquisitions and internal builds since 2024, systematically absorbing every middleware function that stood between foundation models and enterprise workloads. The agent harness layer — orchestration, evaluation, memory, observability, tool integration — is being compressed from above by platform-native capabilities and from below by open protocol standards. The question for enterprise architects is no longer whether this is happening. It is which parts of the middleware stack retain durable value and which have already been structurally commoditized.

March 2026
Tom M. Gomez
16 Min Read

OpenAI’s March 2026 acquisition of Promptfoo — the AI security testing platform used to evaluate prompt injection resistance and LLM compliance — marks the eighth harness-layer company OpenAI has absorbed since June 2024. It follows Rockset (real-time retrieval), Context.ai (evaluation talent), Statsig ($1.1 billion, product analytics), and Neptune (ML training observability). Each acquisition removes a middleware function that was previously available as an independent vendor to enterprises building multi-provider architectures. The pattern is not incidental. It is the endgame of a vertical integration strategy that all six major model providers are executing simultaneously.

The agent harness layer has never been a stable category. It emerged as a practical necessity in 2023 and 2024, when enterprises building on foundation models needed orchestration, memory management, tool access, evaluation, and observability infrastructure that the model providers had not yet built. The middleware companies that stepped into that gap — LangChain, LlamaIndex, Arize AI, Galileo, Pinecone, Composio — attracted meaningful venture capital precisely because the gap was real. What has changed, with gathering speed since mid-2024, is that the model providers have decided the gap represents strategic surface area they intend to own. The question is no longer whether enterprises need harness-layer infrastructure. It is who controls it.

The answer, at the current trajectory, is the model providers themselves. The mechanism differs by company — OpenAI acquires, Google open-sources at scale, Microsoft merges its own frameworks, AWS wraps infrastructure into managed runtimes, Anthropic establishes protocol standards, Meta buys capability at speed — but the direction is uniform. Every middleware function that existed as a standalone commercial offering in early 2024 now has a provider-native equivalent, and in most cases that equivalent is free, deeply integrated, and carries the distribution advantage of the underlying model relationship.

$200B+

Estimated capital deployed by the six major model providers across acquisitions, infrastructure investments, and strategic stakes in the agent harness layer since mid-2024. OpenAI alone has completed eight harness-layer acquisitions. Meta acquired Manus for $2B+ and Scale AI for $14.3B. AWS holds $8B in Anthropic and a $50B stake in OpenAI. The consolidation is structural, not cyclical.

Six Providers, Six Vertical Integration Strategies

The compression is not a monolithic event. Each of the six major providers is executing a distinct integration thesis, and understanding those theses is essential to assessing which middleware positions remain defensible and which have already been absorbed.

Provider Strategy

OpenAI — Full-Stack Vertical Integrator

OpenAI’s acquisition strategy has been the most aggressive and most transparent in its direction. Rockset delivers real-time retrieval. Context.ai was an evaluation talent acqui-hire. Statsig ($1.1B, September 2025) provides product experimentation and analytics infrastructure. Neptune ($400M, December 2025) covers ML training observability. Promptfoo (March 2026) handles AI security testing. The Responses API plus Agents SDK (March 2025) and AgentKit (October 2025) complete the orchestration layer. Eight harness-layer functions absorbed; one acquisition — Windsurf at $3B — collapsed over Microsoft IP concerns, the exception that proves the rule.

Harness Layer Impact

Eight Functions Absorbed, One Attempt Blocked

Every acquisition maps to a harness-layer function previously served by independent vendors. Retrieval, evaluation, analytics, observability, security testing, orchestration, developer tooling — all now provider-native. Enterprises building OpenAI-centric architectures will find each function available free as part of the platform relationship, with integration advantages independent vendors cannot replicate. The strategic implication: multi-provider architectures require deliberate design, not default behavior.

OpenAI Acquisitions — 2024–2026, multiple sources
Provider Strategy

Anthropic — Protocol Standard-Setter

Anthropic’s harness-layer strategy is architecturally distinct from acquisition: establishing open protocols that become the substrate for the entire ecosystem. Model Context Protocol, launched November 2024, has accumulated 97 million monthly SDK downloads and 10,000+ public servers, adopted by ChatGPT, Gemini, VS Code, and Chrome. In December 2025, Anthropic donated MCP to the Linux Foundation Agentic AI Foundation, co-founded with Block and OpenAI. Agent Skills (October 2025, open-sourced December 2025) introduced SKILL.md modular capability packages. One acquisition: the Bun JavaScript runtime, controlling the Claude Code execution layer.

Harness Layer Impact

Protocol Ownership as Platform Control

By establishing MCP as the cross-provider standard for tool connectivity — adopted by competitors — Anthropic has positioned itself as the infrastructure layer beneath the middleware layer. Programmatic Tool Calling (November 2025) demonstrated an 85% token reduction in tool-heavy workflows, with Opus 4.5 improving from 79.5% to 88.1% on MCP evals. Composio, Nango, Toolhouse, and other integration middleware companies face the most direct existential pressure from a standard with 10,000+ community-built servers.

Anthropic MCP — Linux Foundation donation, December 2025
Provider Strategy

Google — Open-Source Platform Maximalist

Google’s strategy maximizes developer surface area through open-source releases at enterprise scale. Agent Development Kit (ADK), open-sourced in early 2025, has accumulated 7 million downloads and supports 200+ models. The Agent-to-Agent (A2A) protocol addresses inter-agent communication. Vertex AI Agent Engine provides a managed runtime with Memory Bank and OpenTelemetry observability. The Windsurf outcome was decisive: when OpenAI’s $3B attempt collapsed, Google moved — a $2.4B licensing deal plus the acqui-hire of CEO Varun Mohan and approximately 40 engineers into DeepMind, with Cognition AI acquiring the remainder for $250M.

Harness Layer Impact

Developer Gravity at Ecosystem Scale

ADK’s 7 million downloads represent a different kind of moat than acquisition — developer adoption that creates migration friction through familiarity and community, not contract. Vertex AI Agent Engine with native OpenTelemetry observability directly competes with Arize AI and monitoring middleware. Memory Bank addresses episodic memory infrastructure that companies like Zep AI have been selling as standalone products. Google VP Darren Mowry’s February 2025 assessment — 80% of wrapper startups will disappear by end of 2026 — reads less like prediction than competitive positioning.

Google ADK — 7M downloads, A2A protocol, Vertex AI Agent Engine, 2025
97M+

Monthly SDK downloads for Anthropic’s Model Context Protocol as of early 2026, with 10,000+ public servers across the ecosystem. MCP has been adopted by OpenAI, Google, Microsoft, VS Code, GitHub, Chrome, and Cursor — transforming what began as Anthropic’s internal tool connectivity standard into the de facto cross-provider protocol for agentic tool access. Integration middleware companies built on proprietary connector libraries face the most direct commoditization pressure.

The Acquisition Pattern: Three Capability Categories Being Targeted

Across the six providers, the M&A activity reveals three middleware categories that have attracted the most aggressive acquisition interest. These are not random targets. They represent the functions that, if left as independent commercial offerings, would give enterprises the infrastructure to build genuinely provider-agnostic architectures — and in doing so, reduce provider leverage over the enterprise relationship.

1. AI Coding Assistants and Developer Tooling

The Windsurf saga is the defining case study of competitive intensity at the developer tooling layer. OpenAI attempted a $3 billion acquisition that collapsed in July 2025 over Microsoft IP concerns. Google moved immediately — $2.4 billion in licensing plus the acqui-hire of Windsurf’s CEO and roughly 40 engineers. Cognition AI acquired the remainder for $250 million. Capital deployed in this single competitive sequence came to roughly $2.65 billion. Developer tooling is where model preference is formed at the individual engineer level, and provider competition for that surface area has been correspondingly intense.

2. Evaluation, Security, and Observability

OpenAI’s acquisitions of Context.ai, Neptune, Statsig, and Promptfoo are the clearest expression of this pattern. Each represents a function — evaluation talent, training observability, product analytics, security testing — that enterprises were previously sourcing from independent vendors. AWS added 13 built-in evaluators to Bedrock AgentCore in December 2025. Azure Foundry IQ provides built-in evaluation within Microsoft’s managed runtime. The eval and observability middleware companies that remain independent — Braintrust ($800M valuation, February 2026), Arize AI ($131M raised), Galileo ($68M raised) — are building moats on vendor neutrality and proprietary evaluation IP that providers cannot simply replicate through acquisition of a different target.

3. General-Purpose Agents and Memory

Meta’s acquisition of Manus — the general-purpose agent platform with $100M+ ARR — for more than $2 billion, with Manus continuing to operate independently post-close, represents one of the largest bets on a standalone agent platform in this acquisition cycle. Scale AI ($14.3 billion for 49%, June 2025) provided Meta with data labeling and evaluation infrastructure at a scale no independent vendor could replicate. On the memory side, both Zep AI and Composio reportedly received acquisition offers in April 2025 — within six months of their Series A closes — signaling that strategic buyers recognize memory and integration layers as critical infrastructure worth taking off the board before independent platforms mature.

80% of wrapper startups will disappear by end of 2026. The ones that survive will have built genuine differentiation — not just thin abstractions over model APIs, but defensible intellectual property in evaluation, data, or domain-specific orchestration that providers cannot replicate through acquisition alone.

— Darren Mowry, VP of Global Startups, Google Cloud — February 2025

What Orchestration Frameworks Tell Us About Provider Intent

The most revealing signal in the entire acquisition pattern is what the providers have not acquired: orchestration frameworks. LangChain has raised $260 million at a $1.25 billion valuation. CrewAI has raised $18 million. LlamaIndex has raised $27.5 million. None has been acquired by a major provider. This is not an oversight. It is a deliberate strategic choice by every major provider to build orchestration internally rather than acquire it externally — signaling that orchestration is viewed as core platform capability that providers intend to own and control, not a function they are willing to purchase from an independent vendor and allow to remain neutral.

Microsoft’s merger of AutoGen and Semantic Kernel into a unified Microsoft Agent Framework in October 2025 is the definitive expression of this logic. Two independently developed orchestration approaches were collapsed into a single Azure-centric platform. AWS built Bedrock AgentCore’s Runtime from scratch. OpenAI’s Agents SDK and AgentKit are first-party offerings. Google’s ADK is an open-source project controlled by Google. The orchestration frameworks that remain independent are not independent because providers overlooked them — they are independent because providers decided building was strategically preferable to buying.

What This Means for LangChain’s $1.25B Valuation

LangChain’s pivot is instructive: the company has repositioned its core framework as a RAG utility and directed enterprise customers toward LangGraph for production agents. LangSmith — its evaluation and observability product — generates an estimated $12–16 million in ARR. The $1.25 billion valuation implies a multiple on LangSmith revenue, not on LangChain core, which faces direct substitution from provider-native orchestration. CrewAI reportedly received an acquisition offer in April 2025, six months post-Series A. LlamaIndex has pivoted to LlamaParse and LlamaCloud for document processing — a more defensible niche than general-purpose orchestration. These pivots are rational responses to structural compression.

The Harness Layer Segmented: What Survives and What Does Not

Not all middleware is equally exposed. The compression affects different harness-layer functions at different rates and with different levels of residual defensibility. Enterprise architects evaluating build-versus-buy decisions and vendor relationships need a clear segmentation of which functions are being commoditized by provider-native offerings and which retain durable independent value.

Commodity Harness — Eroding Moat

Functions Being Absorbed by Provider Platforms

- Basic orchestration and workflow chaining — now available free through OpenAI Agents SDK, Google ADK, Microsoft Agent Framework, and AWS Bedrock AgentCore Runtime.
- Tool and API connectivity — being commoditized by MCP’s 10,000+ community servers under Linux Foundation governance.
- Vector storage and RAG — native to all three hyperscaler databases and provider-managed retrieval services.
- Basic agent memory — available through OpenAI Conversations API, Anthropic memory tool, Google Memory Bank, and AWS episodic memory.
- General evaluation and logging — built into provider platforms as first-party features without incremental cost.

Result: Startups whose core offering is a thin layer over these functions face substitution by platform-native capabilities that are free, deeply integrated, and carry the distribution advantage of the model relationship itself.

Platform Substitution Risk
Differentiated Intelligence — Compound Moat

Functions Retaining Independent Value

- Vendor-neutral evaluation with proprietary models — Galileo’s Luna evaluation models achieve 97% lower cost than GPT-based evaluation through specialized fine-tuning that providers cannot replicate by acquiring a different target.
- OpenTelemetry-native observability with cross-provider instrumentation — Arize AI’s vendor neutrality is the moat; enterprises running multi-provider architectures need instrumentation that reports to no single provider.
- Specialized document intelligence — LlamaIndex’s LlamaParse pivot addresses a domain where provider-native RAG remains inadequate.
- Enterprise governance and compliance infrastructure — audit trails, policy enforcement, and regulatory alignment that by definition require independence from the model provider.

Result: Middleware companies that survive the Great Compression will have built intellectual property — in evaluation models, cross-provider instrumentation, domain-specific pipelines, or governance infrastructure — that providers cannot replicate through acquisition of a different target.

Defensible Differentiation

The Enterprise Strategic Implication

The structural reality that enterprise architects must internalize is that the widely cited figure that 79% of OpenAI customers also pay Anthropic is not an accident. It reflects a deliberate multi-provider strategy by organizations that have already concluded that single-provider dependence is a governance risk. The Great Compression makes that multi-provider instinct correct but harder to execute: as each provider absorbs more of the harness layer into its platform, the infrastructure required to maintain genuine provider portability becomes more, not less, demanding.

The NIST AI Risk Management Framework’s emphasis on continuous operational risk management — across the full AI lifecycle, not at point-in-time configuration — applies directly to vendor architecture decisions. An enterprise that builds its agentic infrastructure on a single provider’s orchestration framework, evaluation tooling, memory system, and observability platform has not simplified its architecture. It has concentrated its risk. The Great Compression is creating exactly the conditions under which that concentration becomes strategically dangerous: when a provider controls the evaluation layer, it also controls what counts as success for agents running on its infrastructure.

The Vendor Lock-In Risk That Evaluation Consolidation Creates

When a model provider controls the evaluation layer — as OpenAI now does with Context.ai talent, Neptune observability, Statsig analytics, and Promptfoo security testing absorbed into its platform — it also sets the measurement standard for agent quality on its own infrastructure. Enterprises that rely on provider-native evaluation for production agents have no independent verification that the measurement framework is provider-neutral. This is precisely the dynamic that makes vendor-neutral evaluation platforms — Braintrust, Arize AI, Galileo — strategically essential for enterprises that require auditable, provider-independent quality measurement.
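The distinction between provider-owned and provider-neutral measurement can be made concrete with a toy harness. Everything below is illustrative — the metric, the provider names, and the candidate outputs are hypothetical — but the structure is the point: the enterprise (or a neutral vendor) owns the scoring function, so every provider's output is graded by the same yardstick rather than each platform grading its own agents.

```python
"""Sketch of a provider-independent evaluation pass (all names hypothetical)."""


def keyword_coverage(answer: str, required: set[str]) -> float:
    """Toy neutral metric: fraction of required terms present in the answer.

    A real vendor-neutral platform would use trained evaluation models,
    but the ownership structure is the same: the yardstick lives outside
    every provider's platform.
    """
    hits = sum(1 for term in required if term.lower() in answer.lower())
    return hits / len(required)


# Outputs from two hypothetical providers for the same task
candidates = {
    "provider_a": "Escalate the incident and notify the compliance team.",
    "provider_b": "Close the ticket.",
}
required_terms = {"escalate", "compliance"}

# One scoring function, applied uniformly across providers
scores = {
    name: keyword_coverage(text, required_terms)
    for name, text in candidates.items()
}
```

Because the scoring function never changes between providers, the resulting numbers are comparable and auditable — the property provider-native evaluation cannot offer by construction.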

The Open Protocol Question: MCP as Infrastructure or as Anthropic Advantage

MCP’s donation to the Linux Foundation Agentic AI Foundation in December 2025 is Anthropic’s most sophisticated strategic move in the compression dynamic. By establishing MCP as a Linux Foundation-governed standard co-founded with Block and OpenAI, Anthropic created a standard that competitors have strong incentives to adopt — while retaining the advantages of having originated, implemented, and operationalized the standard first. The 97 million monthly SDK downloads and adoption by OpenAI and Google as co-founders mean that enterprises building on MCP are building on infrastructure no single provider controls but that Anthropic’s tooling and reference implementations are best positioned to serve.

The Four-Layer Harness Claim and What It Actually Means

Anthropic’s Agent Skills framework — launched October 2025, open-sourced December 2025, adopted by VS Code, GitHub, Cursor, and OpenAI — proposed that a complete agent harness consists of four layers: SKILL.md files for modular capability definition, the MCP tool connectivity layer, the Claude Code execution runtime, and the evaluation and observability layer. This framing is technically accurate and strategically revealing: Anthropic is the only provider claiming to own all four layers through a combination of open protocol (MCP), open standard (SKILL.md), proprietary runtime (Claude Code), and evaluation infrastructure via MCP evals. Whether this four-layer claim constitutes genuine platform completeness or sophisticated positioning will be determined by enterprise adoption patterns through 2026 and 2027.
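As a concrete illustration of the SKILL.md layer, a minimal skill package follows the pattern below — YAML frontmatter naming and describing the capability, followed by markdown instructions the model loads on demand. The specific skill shown here is invented for illustration; consult Anthropic's Agent Skills documentation for the authoritative field set.

```
---
name: quarterly-risk-summary
description: Summarize quarterly risk reports into a one-page brief with
  severity rankings. Use when the user asks for a risk digest.
---

# Quarterly Risk Summary

1. Load the report the user references and extract each named risk.
2. Rank risks by stated severity; flag any risk without an owner.
3. Emit a one-page brief: top five risks, owners, and open mitigations.
```

The format's significance is architectural rather than syntactic: because a skill is a plain file rather than a provider API object, it can in principle travel across any runtime that honors the convention — which is exactly why its adoption beyond Claude Code matters to the four-layer claim.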

What the Compression Means for Enterprise Decision-Makers

The strategic implications of the Great Compression translate into concrete architecture decisions that enterprise AI leaders — CDAOs, CDOs, CAIOs, CTOs — are making now, with consequences that will compound over the next three to five years as provider platforms mature and middleware vendor consolidation continues.

The first implication is that multi-provider architecture must be an explicit design requirement, not a default assumption. In 2023 and early 2024, multi-provider portability was relatively easy to achieve because the middleware layer was genuinely provider-neutral. As those functions are absorbed into provider platforms, maintaining genuine portability requires deliberate architectural choices — OpenTelemetry instrumentation rather than provider-native logging, vendor-neutral evaluation frameworks rather than platform-built eval, open protocol connectivity (MCP) rather than proprietary tool integrations. Portability is not free. It must be engineered.
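What "portability must be engineered" looks like in code is mostly discipline about boundaries. The sketch below is a minimal illustration under stated assumptions, not a reference implementation: the provider names are stand-ins, and the telemetry hook is a plain callback where a production system would export OpenTelemetry spans and wrap real vendor SDKs behind the adapters. The structural rule — application code depends only on a neutral interface — is the portable part.

```python
"""Sketch: confining provider-specific code to adapters behind a neutral
interface. All class and provider names here are hypothetical."""
from typing import Callable, Protocol


class ChatProvider(Protocol):
    """The only contract application code is allowed to see."""

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Placeholder adapter; a real one would call a vendor SDK here."""

    def __init__(self, name: str):
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class PortableAgent:
    """Depends on ChatProvider and a neutral telemetry hook, never on a
    vendor SDK or a provider-native logging API."""

    def __init__(self, provider: ChatProvider, trace: Callable[[dict], None]):
        self.provider = provider
        self.trace = trace  # stand-in for an OpenTelemetry span exporter

    def run(self, prompt: str) -> str:
        result = self.provider.complete(prompt)
        self.trace({"provider": type(self.provider).__name__, "prompt": prompt})
        return result


# Telemetry accumulates in enterprise-owned storage, not a provider platform
spans: list[dict] = []
agent = PortableAgent(StubProvider("provider-a"), spans.append)
out = agent.run("summarize Q3 risks")

# Swapping providers is a constructor argument, not a rewrite:
agent = PortableAgent(StubProvider("provider-b"), spans.append)
```

The cost of this discipline is real — every provider-native convenience declined is engineering work accepted — which is the article's point: portability is a budget line, not a default.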

The second implication is that open standards favor enterprises over providers in the long run, even when providers sponsor those standards. MCP’s Linux Foundation governance means the 10,000+ servers built by the community cannot be unilaterally deprecated or monetized by Anthropic. A2A’s open-source governance means Google cannot wall off inter-agent communication behind Vertex AI subscriptions. Enterprises should aggressively favor standards-governed infrastructure over proprietary middleware — not because the standards are technically superior in every case, but because governance structure determines who controls the upgrade path, the pricing, and the deprecation schedule.

Practitioner Takeaway

The Great Compression does not eliminate the need for harness-layer infrastructure. It changes who provides it and on what terms. Enterprise organizations that navigate this transition well are building on open standards (MCP, A2A, OpenTelemetry), maintaining vendor-neutral evaluation through independent platforms, and distinguishing clearly between commodity orchestration — which they should take free from provider platforms — and differentiated intelligence infrastructure, which they should source from independent vendors whose business model is not structurally conflicted with provider-neutral measurement. The compression is structural. The response must be architectural.

The Great Compression — March 2026

This post draws on acquisition announcements, platform releases, funding disclosures, and strategic statements from OpenAI, Anthropic, Google, Microsoft, AWS, and Meta, as well as venture reporting from TechCrunch, The Information, Bloomberg, and Reuters, and market analysis covering the agentic AI infrastructure landscape through early 2026.

Tags
Agentic AI, Platform Strategy, Model Provider M&A, Agent Harness Layer, OpenAI, Anthropic, Google ADK, MCP, Microsoft Agent Framework, AWS Bedrock AgentCore, LangChain, Vertical Integration, AI Infrastructure, Vendor Lock-In, Enterprise AI, Middleware Compression, Braintrust, Arize AI, OpenTelemetry
