The distinction between OpenTelemetry “Native” and “Supported” implementations represents a fundamental architectural choice with significant implications for vendor lock-in, data fidelity, operational flexibility, and long-term total cost of ownership.
Choosing a proprietary instrumentation approach can result in 18-24 months of engineering effort to migrate between vendors, while OpenTelemetry-native solutions enable platform switches in days to weeks. For C-suite leaders evaluating multi-year observability investments, this difference represents millions in avoided switching costs and preserved strategic flexibility.
The 5-year TCO difference between OTEL-native ($340K) and proprietary SDK ($1.1M) approaches is driven almost entirely by migration costs when switching platforms.
Defining the Spectrum
OpenTelemetry Native
Platform built from the ground up on OpenTelemetry standards. Internal data model directly mirrors OTLP (OpenTelemetry Protocol). GenAI Semantic Conventions stored without translation. Query interfaces understand OTLP attribute semantics natively.
OpenTelemetry Supported
Platform accepts OTLP data via ingestion endpoint. Internal data model differs from OTLP (translation layer required). GenAI attributes mapped to proprietary schema. Query interfaces may expose OTLP concepts but with limitations.
Proprietary SDK
Platform requires vendor-specific instrumentation library. No OTLP ingestion capability. Complete vendor lock-in for instrumented code. Switching vendors requires re-instrumenting entire codebase.
Architecture Patterns
OpenTelemetry Native Architecture
Application (OpenTelemetry SDK; GenAI Semantic Conventions)
→ OTLP Collector (standard processing; no translation needed)
→ Native Storage (OTLP schema preserved; semantic conventions preserved)
→ Query/UI Layer (native attribute semantics; context-aware filtering)
Key Advantages
- Zero Translation Loss: All GenAI semantic convention attributes preserved exactly as specified
- Attribute Semantics: UI understands attribute types and provides contextual actions (e.g., gen_ai.usage.input_tokens is treated as a numeric value that can drive cost calculations, not just text)
- Resource-Centric Navigation: Natural filtering by service.name, deployment.environment using OpenTelemetry resource concepts
- Future-Proof: Automatic support for new semantic convention versions without vendor updates
OpenTelemetry Supported Architecture
Application (OpenTelemetry SDK; GenAI Semantic Conventions)
→ OTLP Endpoint (receives OTLP; translation begins)
→ Internal Format (vendor-specific schema; attribute mapping)
→ Proprietary Storage (internal data model; semantic loss possible)
Translation Layer Implications
- Attribute Mapping: GenAI semantic conventions mapped to vendor’s internal attribute names (potential inconsistencies)
- Type Coercion: Structured attributes may be flattened to strings (e.g., gen_ai.request.temperature from float to text)
- Semantic Loss: Vendor UI shows key-value pairs without understanding attribute meaning
- Version Lag: Support for new semantic convention versions depends on vendor update cycles
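The first three implications can be illustrated with a deliberately simplified, hypothetical translation layer that stores everything as strings (a common internal representation); the attribute values are illustrative:

```python
otlp_attributes = {
    "gen_ai.request.temperature": 0.7,           # float in OTLP
    "gen_ai.usage.input_tokens": 412,            # int in OTLP
    "gen_ai.response.finish_reasons": ["stop"],  # string array in OTLP
}

def translate_to_vendor_schema(attrs):
    # Hypothetical string-only key-value store: types are coerced away,
    # so numeric filtering and aggregation on raw values no longer work.
    return {key: str(value) for key, value in attrs.items()}

vendor_record = translate_to_vendor_schema(otlp_attributes)
print(vendor_record["gen_ai.request.temperature"])  # '0.7' as text, not a float
```

Once temperature is a string, a query like "temperature > 0.5" requires re-parsing on every read, and the UI can no longer offer numeric histograms or range filters for it.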
Proprietary SDK Architecture
Application (Vendor SDK; proprietary decorators/tracers)
→ Vendor Endpoint (custom protocol; vendor-specific format)
→ Backend (optimized for vendor format; no OTLP compatibility)
Vendor Lock-in Realities
- Re-instrumentation Required: Switching vendors requires replacing all instrumentation decorators/SDK calls
- Framework Coverage: Limited to languages/frameworks the vendor supports
- Migration Cost: 18-24 months for large codebases with thousands of instrumentation points
- Strategic Risk: Vendor pricing changes or feature deprecation creates operational crisis
Technical Feature Comparison
GenAI Semantic Conventions
The OpenTelemetry GenAI Semantic Conventions (v1.38.0+) define standardized attributes for LLM and agent operations. Native platforms leverage these directly; supported platforms must translate them.
Model Operations
Identify what LLM operation occurred and which provider/model was used
- gen_ai.operation.name
- gen_ai.request.model
- gen_ai.response.model
- gen_ai.system
Token Metrics
Cost tracking, usage analytics, and budget monitoring
- gen_ai.usage.input_tokens
- gen_ai.usage.output_tokens
- gen_ai.usage.total_tokens
- gen_ai.token.type
Agent Operations
Multi-step agent tracing and tool-calling visibility
- gen_ai.agent.id
- gen_ai.agent.name
- gen_ai.tool.call.id
- gen_ai.tool.call.name
Response Metadata
Debugging, reproducibility, and quality analysis
- gen_ai.response.id
- gen_ai.response.finish_reasons
- gen_ai.request.temperature
- gen_ai.request.max_tokens
Platform Implementation Analysis
OpenTelemetry Native Platforms
Arize AI / Phoenix
Architecture (Native): Built on OpenInference semantic conventions (an OpenTelemetry extension). The entire stack is designed around OTLP from day one.
Data Flow: Application → OTLP Collector → Phoenix (no translation) → ClickHouse with OTLP schema
- OpenInference span attributes map directly to UI elements
- Resource explorer shows Kubernetes/service hierarchies using OTEL resource attributes
- Embedding analysis for RAG uses OTLP embedding spans
- Phoenix OSS runs locally or self-hosted with identical schema
Migration Path: Change OTLP endpoint URL. No code changes required.
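That migration path can be made concrete: with OTLP, the backend address is ordinary configuration, conventionally carried by the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable. A minimal sketch (the URLs are illustrative placeholders):

```python
import os

# Point every OTLP exporter at the new backend; instrumentation code,
# attribute names, and semantic conventions are untouched.
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://phoenix.internal:6006"

def exporter_endpoint(default="http://localhost:4318"):
    # Standard variable defined by the OTLP exporter specification;
    # OTLP exporters read it when no endpoint is passed explicitly.
    return os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", default)

print(exporter_endpoint())  # the new backend; no code changes shipped
```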
Langfuse
Architecture (Native): OTEL-native SDK v3, released June 2025. Complies with the GenAI semantic conventions, with a langfuse.* namespace for extensions.
Data Flow: Application → OTLP SDK → /api/public/otel endpoint → Direct storage with semantic preservation
- langfuse.* namespace attributes for Langfuse-specific features
- OTLP context propagation enables automatic integration with HTTP frameworks, databases
- Property mapping maintains GenAI convention compliance
- MIT-licensed, full self-hosting support
Language Support: Immediate support for all OpenTelemetry SDK languages via OpenLLMetry/OpenLIT
Datadog LLM Observability
Architecture (Native): Native support for the GenAI Semantic Conventions (v1.37+), with the Datadog Agent acting as an OTLP collector.
Data Flow: Application → OTLP → Datadog Agent → Datadog backend (preserves OTLP structure)
- Full-stack integration: correlates LLM traces with APM, RUM, logs
- Out-of-box evaluators use GenAI convention attributes directly
- Cost tracking leverages gen_ai.usage.* attributes without translation
- Agent-based collection enables unified observability strategy
Enterprise Value: Organizations already on Datadog gain LLM observability without adding a new vendor relationship
OpenTelemetry Supported Platforms
LangSmith
Architecture (Supported): The primary SDK is LangChain-native; an OTLP endpoint was added for interoperability. A translation layer converts OTLP to LangSmith's internal trace format.
- OTLP exporter available for non-LangChain applications
- Best experience with native LangChain/LangGraph integration
- Some semantic conventions may lose fidelity in translation
Best Use Case: Teams heavily invested in LangChain ecosystem who occasionally need OTLP interop
W&B Weave
Architecture (Supported): Accepts OTLP via standard endpoints but stores data in Weave's internal format, optimized for ML experiment-tracking lineage.
- One-line MCP agent auto-logging uses OTLP under the hood
- Translation maintains most GenAI conventions
- Best integration with broader W&B MLOps ecosystem
Best Use Case: Organizations using W&B for ML tracking seeking to add LLM observability
Proprietary SDK Platforms
Galileo AI
Architecture (Proprietary): Purpose-built evaluation platform with a proprietary SDK and data format.
Strategic Rationale:
- Luna-2 SLMs are optimized for the proprietary trace format (not OTLP)
- Agent-specific optimizations difficult to standardize
- Insights Engine requires vendor-specific schema
Lock-in Mitigation: Enterprise export APIs for extracting evaluation data
Value Proposition: 97% cost reduction vs GPT-4-as-judge justifies proprietary approach for cost-sensitive deployments
Braintrust
Architecture (Proprietary): SDK-based, with the Brainstore database optimized for AI application logs.
- Loop AI agent requires tight coupling to evaluation loop
- Dataset version control and diffing built into proprietary format
- Fast evaluation execution via custom parallel processing
Migration Consideration: CI/CD GitHub Actions tightly integrated; switching requires rebuilding evaluation pipelines
Decision Framework
Choose OpenTelemetry Native When:
1. Multi-Vendor Strategy Required
Enterprise architecture committee mandates the ability to switch observability vendors with fewer than 90 days' notice. Regulatory or compliance requirements prevent vendor lock-in.
2. Polyglot Architecture
Application stack spans Python, Java, Go, Rust, Node.js. No single vendor provides native SDKs for all languages. Need consistent observability across diverse technology stack.
3. Existing OpenTelemetry Investment
Already instrumented infrastructure and backend services with OpenTelemetry. Want to extend to LLM/agent applications with same standard.
4. Cost Optimization Priority
Need collector-level control over sampling, filtering, aggregation. Want to optimize ingestion costs independently of vendor pricing.
5. Future-Proofing Critical
Building multi-year strategic platform. Need confidence that instrumentation will work with future vendors/tools. Want automatic support for emerging semantic conventions.
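The collector-level control in point 4 above (Cost Optimization Priority) amounts to filtering spans before they reach any paid backend. A toy sketch of that idea, with span records modeled as plain dicts standing in for an OpenTelemetry Collector filter processor:

```python
# Drop low-value spans before ingestion to cut per-span/per-GB costs.
def keep_span(span):
    if span.get("name", "").startswith("GET /healthz"):
        return False  # health checks add cost but little insight
    if span.get("gen_ai.usage.total_tokens", 1) == 0:
        return False  # LLM spans that consumed no tokens
    return True

spans = [
    {"name": "chat gpt-4o", "gen_ai.usage.total_tokens": 540},
    {"name": "GET /healthz"},
    {"name": "chat gpt-4o", "gen_ai.usage.total_tokens": 0},
]
billable = [s for s in spans if keep_span(s)]
print(len(billable))  # 1
```

Because this logic lives in your own pipeline rather than in a vendor's ingest tier, it keeps working unchanged when the backend behind the collector is swapped.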
Choose OpenTelemetry Supported When:
1. Existing Vendor Relationship
Already using vendor for traditional observability. Want to add LLM observability without introducing new vendor. Unified billing and support valuable.
2. Framework-Specific Optimization
Heavily invested in specific framework (LangChain → LangSmith). Native integration provides richer features than generic OTLP.
3. Gradual Migration Path
Starting with proprietary SDK for speed, planning OTLP migration later. Need bridge between legacy instrumentation and modern standards.
Accept Proprietary SDK When:
1. Unique Differentiation Required
Vendor provides capabilities impossible with standard OTLP (e.g., Galileo Luna-2 cost savings). Feature set justifies lock-in risk.
2. Single-Language, Single-Framework Shop
Monolithic Python application using single framework. No plans for polyglot expansion. Vendor SDK coverage matches technology stack completely.
3. Proof of Concept / MVP
Short-term project where speed to value exceeds long-term concerns. Explicit plan to migrate before production scale.
Total Cost of Ownership Analysis
The $760K difference ($1.1M vs $340K) is driven almost entirely by migration cost. OTEL-native platforms require only collector endpoint configuration changes, while proprietary SDKs require rewriting all instrumentation code.
TCO Methodology & Assumptions
Scope: This analysis models a mid-to-large enterprise with significant instrumentation footprint (1,000+ instrumentation points) over a 5-year planning horizon.
Cost Basis
- Engineering labor: $75-100/hour fully-loaded cost
- Implementation team: 2-3 senior engineers for initial instrumentation
- Operations allocation: 0.25-0.5 FTE for ongoing monitoring, upgrades, configuration
OTEL-Native Calculations
- Implementation ($50K-$100K): ~10 weeks × 2.5 engineers × 40 hrs/week × $85/hr = ~$85K
- Annual Operations ($20K-$40K): 0.25 FTE × $120K/yr fully-loaded = $30K mid-point
- Migration ($10K-$20K): Configuration change + validation testing; no code rewrites
- 5-Year Total: $340K (assumes 2 platform switches)
Proprietary SDK Calculations
- Implementation ($30K-$50K): Faster initial setup with vendor “quick start” SDKs
- Annual Operations ($30K-$60K): Higher due to vendor-specific SDK version management
- Migration ($500K-$1M): Rewriting instrumentation across thousands of call sites over 18-24 months
- 5-Year Total: $1.1M (assumes 1 migration event)
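The two 5-year totals can be reproduced from the ranges above under explicit assumptions: the OTEL-native figures use the upper ends of the implementation and operations ranges, and the proprietary migration uses the $750K mid-point of its $500K-$1M range. A quick sketch:

```python
def five_year_tco(implementation, annual_ops, migration_cost, migrations):
    # One-time implementation, five years of operations, plus migration events.
    return implementation + 5 * annual_ops + migrations * migration_cost

otel_native = five_year_tco(100_000, 40_000, 20_000, migrations=2)   # $340K
proprietary = five_year_tco(50_000, 60_000, 750_000, migrations=1)   # $1.1M
print(otel_native, proprietary, proprietary - otel_native)
# 340000 1100000 760000
```

Note that the $760K gap persists even though the proprietary SDK is cheaper to implement initially; migration cost dominates everything else.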
What’s NOT Included
- Platform subscription/licensing fees (varies widely by vendor)
- Infrastructure costs (collectors, storage, compute)
- Productivity gains/losses from platform-specific features
- Training costs beyond initial implementation
Strategic Recommendations
Primary Recommendation: Default to OpenTelemetry Native
Unless you have compelling reasons to choose otherwise, OpenTelemetry Native platforms should be your default. The benefits of vendor-agnostic instrumentation, reduced switching costs, and future-proofing outweigh the slightly higher initial complexity.
Strategic Flexibility
- Preserve M&A optionality (switches in weeks not years)
- Maintain strong vendor negotiation position
- Enable multi-vendor strategy for different use cases
Operational Excellence
- Unified observability across all services
- Consistent instrumentation approach
- Reduced training and onboarding complexity
Innovation Velocity
- Support new languages immediately
- Experiment with emerging frameworks
- Adopt new semantic conventions automatically
Cost Management
- Collector-level cost optimization
- Avoid vendor-specific pricing tiers
- Minimize switching costs
Acceptable Exceptions for Proprietary SDKs
- Demonstrable ROI: Galileo’s 97% cost reduction vs GPT-4-as-judge provides clear financial justification
- Narrow Use Case: Evaluation-only scenarios where production monitoring not required
- Framework Optimization: LangSmith for LangChain if native integration provides critical features unavailable via OTLP
- Temporary Solution: Proof of concept with explicit plan to migrate to OTLP before production scale
The $760K difference represents the “vendor lock-in premium” that compounds over multi-year enterprise relationships: a $10K-$20K collector endpoint reconfiguration versus a $500K-$1M rewrite of all instrumentation code.
