[FEATURE] Map RAG_EMBEDDING spans to GenAI / Langfuse embedding observations #17914

@sgarfinkel

Problem Statement

Mastra emits RAG_EMBEDDING child spans with RagEmbeddingAttributes (model, provider, usage, dimensions, mode, inputCount), but @mastra/otel-exporter's SpanConverter does not map those attributes to OpenTelemetry GenAI semantic conventions. Only MODEL_GENERATION spans get gen_ai.request.model, gen_ai.provider.name, and gen_ai.usage.*.

As a result, embedding spans export with:

  • gen_ai.operation.name = "rag_embedding" (not the OTel predefined value "embeddings")
  • I/O under mastra.rag_embedding.input / mastra.rag_embedding.output
  • No model or token usage on the OTel span
  • SpanKind.INTERNAL instead of CLIENT

When exported to Langfuse via @mastra/langfuse → @langfuse/otel, these spans appear as generic observations without model name, token usage, or computed cost, even though the Mastra span attributes are populated correctly at the source.

Proposed Solution

Extend @mastra/otel-exporter (observability/otel-exporter/src/gen-ai-semantics.ts and span-converter.ts) with explicit RAG_EMBEDDING handling:

  1. getOperationName — return "embeddings" per OTel GenAI semconv.
  2. getSpanIdentifier — use RagEmbeddingAttributes.model for span naming (e.g. embeddings text-embedding-3-small).
  3. getAttributes — for RAG_EMBEDDING, emit:
    • gen_ai.request.model from attributes.model
    • gen_ai.provider.name from attributes.provider (via existing normalizeProvider)
    • gen_ai.usage.* from attributes.usage (via existing formatUsageMetrics)
    • Optionally preserve embedding-specific metadata (mode, dimensions, inputCount) on mastra.rag_embedding.* or dedicated attributes.
  4. getSpanKind — return SpanKind.CLIENT for RAG_EMBEDDING.
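A minimal sketch of the attribute mapping in steps 1 to 3, written as standalone functions rather than as a patch to SpanConverter. The interface shapes are simplified assumptions (the real RagEmbeddingAttributes and UsageStats live in @mastra/core), `normalizeProviderSketch` is a hypothetical stand-in for the existing normalizeProvider, and the function names are illustrative only. Span kind is left out because it needs SpanKind from @opentelemetry/api.

```typescript
// Assumed, simplified shapes; the real types live in @mastra/core.
interface UsageStatsSketch {
  inputTokens?: number;
  outputTokens?: number;
}

interface RagEmbeddingAttributesSketch {
  model?: string;
  provider?: string;
  usage?: UsageStatsSketch;
  dimensions?: number;
  mode?: string;
  inputCount?: number;
}

type OtelAttributes = Record<string, string | number>;

// Hypothetical stand-in for the exporter's normalizeProvider helper.
// The real helper may apply different rules.
function normalizeProviderSketch(provider: string): string {
  return provider.trim().toLowerCase();
}

// Step 2: span name of the form "embeddings <model>".
function embeddingSpanName(attrs: RagEmbeddingAttributesSketch): string {
  return attrs.model ? `embeddings ${attrs.model}` : 'embeddings';
}

// Steps 1 and 3: GenAI semconv attributes for a RAG_EMBEDDING span.
function embeddingGenAiAttributes(attrs: RagEmbeddingAttributesSketch): OtelAttributes {
  const out: OtelAttributes = { 'gen_ai.operation.name': 'embeddings' };

  if (attrs.model) out['gen_ai.request.model'] = attrs.model;
  if (attrs.provider) out['gen_ai.provider.name'] = normalizeProviderSketch(attrs.provider);
  if (attrs.usage?.inputTokens !== undefined) {
    out['gen_ai.usage.input_tokens'] = attrs.usage.inputTokens;
  }

  // Embedding-specific metadata kept under the mastra.rag_embedding.* namespace.
  if (attrs.mode) out['mastra.rag_embedding.mode'] = attrs.mode;
  if (attrs.dimensions !== undefined) out['mastra.rag_embedding.dimensions'] = attrs.dimensions;
  if (attrs.inputCount !== undefined) out['mastra.rag_embedding.inputCount'] = attrs.inputCount;

  return out;
}
```

Fields that are missing on the source span are simply omitted, so a span without usage data still exports with the correct operation name and model.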

Optionally, in @mastra/langfuse (mapMastraToLangfuseAttributes), when the source Mastra span type is RAG_EMBEDDING, set Langfuse-native attributes so observations are typed as embeddings with cost:

  • langfuse.observation.type = "embedding"
  • langfuse.observation.model.name
  • langfuse.observation.usage_details (from usage)

This avoids consumers having to re-type spans as MODEL_GENERATION in customSpanFormatter workarounds.
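The optional Langfuse mapping could look roughly like the sketch below. It is not the actual mapMastraToLangfuseAttributes code: the function name and input shape are hypothetical, and both the JSON serialization of `langfuse.observation.usage_details` and the `input` key are assumptions about what Langfuse expects, to be checked against the Langfuse OTel attribute documentation.

```typescript
// Assumed minimal input shape for illustration; not the real Mastra span type.
interface EmbeddingSpanInfo {
  model?: string;
  usage?: { inputTokens?: number };
}

// Builds the Langfuse-native attributes proposed above for a RAG_EMBEDDING span.
function embeddingLangfuseAttributes(span: EmbeddingSpanInfo): Record<string, string> {
  const out: Record<string, string> = {
    'langfuse.observation.type': 'embedding',
  };

  if (span.model) {
    out['langfuse.observation.model.name'] = span.model;
  }

  if (span.usage?.inputTokens !== undefined) {
    // Assumption: usage details are sent as a JSON string keyed by usage type.
    out['langfuse.observation.usage_details'] = JSON.stringify({
      input: span.usage.inputTokens,
    });
  }

  return out;
}
```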

Component

  • Observability

Alternatives Considered

  • Promoting RAG_EMBEDDING → MODEL_GENERATION in a customSpanFormatter before export. Works for Langfuse cost tracking but mislabels embeddings as chat generations (gen_ai.operation.name = "chat") and drops embedding-specific attributes (mode, dimensions, inputCount).
  • Relying on Langfuse server-side inference from gen_ai.* alone without exporter changes. Insufficient today because the exporter never emits the required attributes for non-MODEL_GENERATION spans.

Example Use Case

As a developer exporting traces to Langfuse, I want RAG_EMBEDDING spans from RAG ingestion and query paths to appear as embedding observations with the correct model name and input token usage, so I can track embedding cost alongside LLM generation cost without custom span-type rewriting.

Additional Context

  • RagEmbeddingAttributes is documented in @mastra/core as using the same UsageStats shape as MODEL_GENERATION specifically so cost pipelines work uniformly.
  • Langfuse SDK v5 defines a first-class EMBEDDING observation type with the same model / usageDetails / costDetails fields as GENERATION.
  • Suggested tests: one gen-ai-semantics.test.ts case asserting gen_ai.operation.name, gen_ai.request.model, and gen_ai.usage.input_tokens for a RAG_EMBEDDING span; one span-converter.test.ts case for SpanKind.CLIENT.

Environment Information

N/A
