The Great Compression thesis has a particular shape. It says the agent harness — the runtime layer that sits between the model and the enterprise — is the structural bottleneck of agentic AI, and that whoever holds the harness holds the production system. Twelve months ago that was an analytical claim. Today it is a documentary one. The receipts have arrived in three forms: substrate-side acquisitions, model-provider acquisitions, and provider-native displacement. The third form pays nothing and is the most consequential.
This dispatch reads the three vectors structurally. It then turns to the centerpiece — the published diagnosis of the framework category from Anthropic and OpenAI, which is not a marketing posture but an architectural instruction — and shows how that diagnosis interacts with the independent layer below. The closing accounting names what the receipts confirm and what they do not yet answer.
Three Vectors. The Third Pays Nothing.
Vector one: substrate-side absorption. The substrate platforms have been pulling observability and evaluation tooling inward. CoreWeave’s acquisition of Weights & Biases closed in May 2025 for a reported $1.7 billion. Cisco’s acquisition of Galileo is consolidating into the Splunk observability stack. Both transactions are substrate plays — the GPU layer and the data-plane layer respectively claiming the model-evaluation surface as part of their integrated offering.
Vector two: model-provider acquisition. The model providers have been pulling harness-adjacent tooling directly into their stacks. Anthropic’s acqui-hire of Humanloop in August 2025 brought in a prompt-management and evaluation team. OpenAI’s acquisition of Promptfoo in March 2026 absorbed a prompt-evaluation framework with documented penetration into roughly a quarter of the Fortune 500. These are not financial moves; they are stack moves. The capability the customer was buying as a third-party tool is now native to the model surface.
Vector three: provider-native displacement. No transaction. No headline. This is the silent vector, and it is the most consequential of the three. The model providers have been quietly publishing guidance, shipping minimal-abstraction SDKs, and documenting native observability stacks that displace the framework layer architecturally — not by acquiring it, but by making it unnecessary. The independents in this layer raised hundreds of millions of dollars to build framework infrastructure. The providers are now telling enterprises, in published documentation, that the framework layer creates problems they should avoid.
The Published Diagnosis
This is the centerpiece of the receipts. It rests on three pillars, all visible in published material from Anthropic and OpenAI over the past twelve months.
Pillar one: framework-skeptical guidance, in writing
Anthropic’s engineering team, in Building Effective Agents by Erik Schluntz and Barry Zhang, states the position directly. Frameworks “create extra layers of abstraction that can obscure the underlying prompts and responses, making them harder to debug.” The same piece notes that “incorrect assumptions about what’s under the hood are a common source of customer error.” That language is not a tonal preference. It is an architectural instruction: write to the model API directly, and reach for a framework only when the abstraction earns its complexity.
OpenAI’s A Practical Guide to Building Agents publishes the parallel position. The Agents SDK is described as having “very few abstractions” by design. The two parties with the most production authority over how enterprises build agentic systems have published the same diagnosis of the framework category in the same window. That convergence is the receipt.
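The distinction the guidance draws between "framework" and "direct API" is concrete enough to sketch. The following is an illustrative toy, not any provider's code: call_model is a stub standing in for a real messages or chat-completions endpoint, and the entire orchestration layer the guidance argues for is a plain loop, a tool table, and a message list, with every prompt and response left visible rather than wrapped.

```python
# Illustrative sketch of "write to the model API directly": a tool-use agent
# is a while loop around a model call. call_model is a stub standing in for a
# provider endpoint; nothing here is hidden behind a framework abstraction.

def call_model(messages):
    # Stub: a real implementation would call the provider's API here and
    # return either a tool request or a final answer.
    last = messages[-1]["content"]
    if last.startswith("user:"):
        return {"type": "tool_call", "name": "lookup", "args": {"key": "q3_revenue"}}
    return {"type": "final", "content": f"Answer based on {last}"}

# Tool table: name -> callable. Adding a tool is adding a dict entry.
TOOLS = {
    "lookup": lambda args: {"q3_revenue": "$1.2M"}.get(args["key"], "unknown"),
}

def run_agent(user_input, max_turns=5):
    messages = [{"role": "user", "content": f"user: {user_input}"}]
    for _ in range(max_turns):
        reply = call_model(messages)
        if reply["type"] == "final":
            return reply["content"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[reply["name"]](reply["args"])
        messages.append({"role": "tool", "content": f"tool:{reply['name']}={result}"})
    raise RuntimeError("agent did not converge")

print(run_agent("what was Q3 revenue?"))
```

The point of the sketch is the debuggability claim: every prompt and response passes through a list you own, which is exactly the surface the quoted guidance says frameworks obscure.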
Pillar two: minimal-abstraction native SDKs
The published diagnosis is operationalized in product. OpenAI’s Agents SDK ships with built-in tracing via a BackendSpanExporter and OpenTelemetry pluggability through add_trace_processor(). The OpenTelemetry community has published official guidance and an instrumentation library — opentelemetry-instrumentation-openai-agents-v2 — that wires the SDK directly into the GenAI Semantic Conventions. The framework layer is not being replaced by another framework. It is being replaced by an SDK that does less, on purpose, and exposes the seams that frameworks were obscuring.
The practitioner signal is consistent. CrewAI’s documented pivot from a LangChain-mounted architecture to native Python in 2024 is the one-line witness: when the framework category is structurally diagnosed, even projects built on top of it move off.
Pillar three: native observability, three layers deep
The displacement is not just at the SDK layer. Anthropic ships three layers of native observability that displace the independent observability category for Anthropic-customer workloads. First, a native OpenTelemetry trace pipeline from Cowork, Claude Code, the API, and the Agent SDK directly into enterprise observability stacks — Splunk, Cribl, Datadog, Honeycomb, Elastic. Second, native admin dashboards inside Cowork and a Claude Code analytics dashboard in public beta with usage metrics, contribution metrics, leaderboards, GitHub integration, and CSV export. Third, an Admin Usage and Cost API with native integrations into Datadog, Grafana Cloud, and Honeycomb.
OpenAI ships a structurally similar stack with a slightly different default posture — traces flow into the OpenAI dashboard first, with OpenTelemetry available as an escape hatch. The architectural conclusion is the same: the model provider is the observability vendor for its own workload. The independent observability category is being absorbed without being bought.
Published guidance against frameworks. Minimal-abstraction SDKs. Three layers of native observability. The framework category and the observability category are being structurally displaced by the parties whose workloads they were built to serve — without an acquisition, without a transaction, and in public.
The Independents and the Foreclosed Escape Route
Read against the published diagnosis, the independent layer’s structural position becomes visible.
LangChain. The framework category is the one Anthropic and OpenAI have publicly diagnosed. LangSmith, LangChain’s observability product, is marketed as framework-agnostic but is structurally coupled to LangChain primitives in product design — practitioner analyses from ZenML and competitive surveys document the dependency. LangGraph, the multi-agent state-coordination layer, addresses a deployment pattern that — according to McKinsey’s “Building the Foundations for Agentic AI at Scale” — is not the dominant production pattern; single-agent architectures account for the majority of enterprise deployments that reach production. The foundation has been diagnosed; the downstream stack inherits the diagnosis.
Arize. The strategic move was to absorb the standards layer through OpenInference, the open-source telemetry specification it stewards. That move is foreclosed. MLflow has already won the open-source observability and evaluation incumbency: sixty-million-plus monthly downloads per its GitHub README, Linux Foundation governance, OpenTelemetry-native by design, and native support for the GenAI Semantic Conventions. A startup attempting to fork the open-source standard with foundation governance is attempting to fork an incumbent that has already won. Standards do not fork well in that direction.
Langfuse. The structural counterfactual. MIT-licensed, OpenTelemetry-native, with a small team and modest seed capital, Langfuse has built credible practitioner mindshare in the observability layer at a fraction of the capital intensity of the better-funded independents. The point is not that Langfuse will win; it is that the consolidation visible in the other two cases is capital-structure-driven, not capability-driven. The category is consolidating because the capital required to operate at scale collides with the published diagnosis from above.
The Honest Accounting
Three observations close this dispatch.
The foundation has been diagnosed in public. The framework category — the one the largest independent in this layer is built on — has been structurally diagnosed by both Anthropic and OpenAI in published guidance. The downstream stack inherits that diagnosis. Below it, the open-source observability and evaluation incumbent — MLflow, governed by the Linux Foundation, OTel-native — sets the substrate that any independent observability or evaluation play has to either build with or build around. Both directions of structural pressure are now documented.
Google is the patient observer. While the consolidation runs through the OpenAI-Anthropic axis, Google has positioned itself across both flanks of the bet. On the open-source flank, Vertex AI integrates with LangChain natively, and Vertex AI Agent Engine has named OpenInference as a partner specification. On the integrated-provider flank, Gemini Enterprise is the integrated platform play. This is the same Android-versus-Pixel posture Google has run for twenty-five years: do not commit to a single vector when you can absorb whichever vector wins. The compression has not yet reached Google’s hand.
The McKinsey anchor. McKinsey’s State of AI in 2025: Agents, innovation, and transformation (November 2025) documents the production reality this consolidation is operating against: AI adoption has reached eighty-eight percent of organizations, yet only seven percent report AI fully scaled across the enterprise, just thirty-nine percent report enterprise-level EBIT impact, and roughly eighty percent of enterprises cite data limitations as the primary roadblock to scaling. The follow-on State of AI Trust in 2026 sharpens the governance picture: only about thirty percent of organizations reach maturity level three or higher in strategy, governance, and agentic AI controls — and security and risk concerns are the top barrier to scaling agentic AI. The harness layer is consolidating above a substrate problem that the consolidation itself does not address.
The receipts confirm what the forecast said. The harness layer is being absorbed — through transactions in two of the three vectors, and through structural displacement in the third and most consequential one. The foundation under it has been publicly diagnosed by the parties with the most production authority, and the open-source incumbent below it has consolidated. McKinsey’s late-April 2026 Reimagining Tech Infrastructure for Agentic AI names the next register in their own voice: less than ten percent of agentic programs reach meaningful scale, and non-labor IT infrastructure costs are projected to increase two- to threefold by 2030 against flat budgets. What the receipts do not yet answer is the substrate question. The data substrate beneath the harness — what the agent actually retrieves, with what authority, with what provenance, with what binding to action — has not been examined in this dispatch. The next one picks it up, against the Substrate Fitness Criteria — DISC, PERM, CTX, ACT, PROV — that name what an agentic substrate actually has to satisfy.
