Add span events to track streaming request milestones for detailed streaming performance analysis #476

@psschwei

Description:
Add span events to track streaming request milestones for detailed streaming performance analysis.

Detailed Requirements:

  1. Add span events during streaming:
    • first_token - Time to first token received
    • chunk_received - Periodic chunk delivery (configurable sampling)
    • stream_complete - Stream completion with totals
  2. Event attributes:
    • ttfb_ms - Milliseconds to first token
    • chunk_size - Size of chunk in tokens/bytes
    • chunks_received - Total chunks at completion
    • total_tokens - Total tokens at completion
  3. Integrate with streaming backends:
    • OpenAI streaming responses
    • Other streaming-capable backends
  4. Make chunk event sampling configurable (avoid event spam)
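A minimal sketch of how the requirements above could fit together, assuming the span object only needs an `add_event(name, attributes)` method (as OpenTelemetry spans provide). The names `traced_stream` and `RecordingSpan`, the whitespace-based token counter, and the injectable `clock` are all hypothetical and not part of mellea; attribute names follow the list above, plus `chunk_index` from the event schema.

```python
import time


class RecordingSpan:
    """Hypothetical stand-in for a tracing span; only add_event is assumed."""

    def __init__(self):
        self.events = []

    def add_event(self, name, attributes=None):
        self.events.append((name, dict(attributes or {})))


def traced_stream(span, chunks, *, trace_chunks=False, sample_rate=10,
                  count_tokens=lambda chunk: len(chunk.split()),
                  clock=time.monotonic):
    """Yield chunks unchanged while recording streaming milestones on span.

    Assumption: timing starts when the consumer begins iterating, and
    count_tokens is a placeholder for whatever size measure a backend uses.
    """
    start = clock()
    chunks_received = 0
    total_tokens = 0
    for chunk in chunks:
        size = count_tokens(chunk)
        if chunks_received == 0:
            # First chunk: record time to first token in milliseconds.
            span.add_event("first_token",
                           {"ttfb_ms": int((clock() - start) * 1000)})
        if trace_chunks and sample_rate > 0 and chunks_received % sample_rate == 0:
            # Sampled chunk event: every Nth chunk, starting with index 0.
            span.add_event("chunk_received",
                           {"chunk_index": chunks_received, "chunk_size": size})
        chunks_received += 1
        total_tokens += size
        yield chunk
    # Reached only when the stream is fully consumed.
    span.add_event("stream_complete",
                   {"chunks_received": chunks_received,
                    "total_tokens": total_tokens})


if __name__ == "__main__":
    span = RecordingSpan()
    for _ in traced_stream(span, ["Hello", " world", "!"],
                           trace_chunks=True, sample_rate=2):
        pass
    for name, attrs in span.events:
        print(name, attrs)
```

Wrapping the iterator keeps backend code unchanged apart from the call site. One open design question in this sketch: if the consumer stops iterating early, `stream_complete` is never emitted, which may or may not be the desired behavior.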

Files to Modify:

  • mellea/telemetry/__init__.py - Add span event helpers
  • mellea/backends/openai.py - Add streaming events
  • Other streaming backends as applicable
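One possible shape for the span event helper, offered as a sketch rather than the intended design. The function name `add_span_event` and the `FakeSpan` stand-in are hypothetical; the helper assumes only that a span exposes `add_event(name, attributes)` and, optionally, `is_recording()`, both of which exist on OpenTelemetry spans.

```python
from typing import Any, Mapping, Optional


def add_span_event(span: Optional[Any], name: str,
                   attributes: Optional[Mapping[str, Any]] = None) -> bool:
    """Attach an event to span if it is usable; return True if it was added."""
    if span is None:
        # Tracing disabled or no active span: do nothing.
        return False
    is_recording = getattr(span, "is_recording", None)
    if callable(is_recording) and not is_recording():
        # Skip spans that are not recording.
        return False
    # Drop None values so only concrete attribute values are passed on.
    clean = {k: v for k, v in (attributes or {}).items() if v is not None}
    span.add_event(name, clean)
    return True


class FakeSpan:
    """Hypothetical test double, not part of any tracing library."""

    def __init__(self, recording: bool = True):
        self.recording = recording
        self.events = []

    def is_recording(self) -> bool:
        return self.recording

    def add_event(self, name, attributes=None):
        self.events.append((name, attributes))


if __name__ == "__main__":
    span = FakeSpan()
    add_span_event(span, "first_token", {"ttfb_ms": 245})
    print(span.events)
```

Centralizing the `None` and `is_recording()` checks in one helper would let each backend emit events without repeating guard logic.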

Environment Variables:

Variable                          Default   Description
MELLEA_TRACE_STREAM_CHUNKS        false     Enable chunk_received events
MELLEA_TRACE_STREAM_SAMPLE_RATE   10        Emit a chunk event for every Nth chunk
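A sketch of how these two variables might be read, using the defaults from the table. The `StreamTraceConfig` type, the `load_stream_trace_config` function, the set of accepted truthy strings, and the fallback for invalid sample rates are all assumptions made for illustration.

```python
import os
from typing import Mapping, NamedTuple, Optional


class StreamTraceConfig(NamedTuple):
    trace_chunks: bool = False
    sample_rate: int = 10


def load_stream_trace_config(
        env: Optional[Mapping[str, str]] = None) -> StreamTraceConfig:
    """Read streaming trace settings from env (defaults to os.environ)."""
    env = os.environ if env is None else env

    # Assumption: these spellings count as "true"; anything else is false.
    raw_flag = env.get("MELLEA_TRACE_STREAM_CHUNKS", "false").strip().lower()
    trace_chunks = raw_flag in {"1", "true", "yes", "on"}

    # Assumption: unparsable values fall back to the documented default,
    # and values below 1 are clamped to 1 (every chunk).
    try:
        sample_rate = int(env.get("MELLEA_TRACE_STREAM_SAMPLE_RATE", "10"))
    except ValueError:
        sample_rate = 10
    return StreamTraceConfig(trace_chunks, max(1, sample_rate))


if __name__ == "__main__":
    print(load_stream_trace_config())
```

Accepting an explicit mapping makes the loader easy to test without mutating the process environment.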

Event Schema:

span.add_event("first_token", {"ttfb_ms": 245})
span.add_event("chunk_received", {"chunk_index": 5, "chunk_size": 12})
span.add_event("stream_complete", {"chunks_received": 15, "total_tokens": 150})

Acceptance Criteria:

  • First token event with TTFB timing
  • Optional chunk events with sampling
  • Completion event with totals
  • Works with OpenAI streaming
