feat: Propagate OTel tracing context from client to server through websocket #173

Closed
lhchavez wants to merge 1 commit into main from feat/otel-server-trace-propagation

@lhchavez
Collaborator

Why

The River Python library already injects OpenTelemetry trace context (traceparent/tracestate) into outgoing TransportMessages on the client side, but the server never extracts this context or creates corresponding server-side spans. As a result, distributed traces break at the websocket boundary: the server has no visibility into which client trace initiated a given RPC.

What changed

src/replit_river/rpc.py

  • Added TransportMessageTracingGetter — the extraction counterpart to the existing TransportMessageTracingSetter. Implements the OTel Getter protocol to read traceparent and tracestate from a TransportMessage.tracing field.
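A minimal sketch of what such a getter could look like, not the code from this PR. The `TransportMessage` dataclass below is a hypothetical stand-in that models only a `tracing` mapping, and the class is duck-typed so the snippet runs without OpenTelemetry installed; in the library it would presumably subclass `Getter` from `opentelemetry.propagators.textmap`, which requires the same two methods.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TransportMessage:
    """Hypothetical stand-in for River's TransportMessage (only `tracing` is modeled)."""

    tracing: Optional[Dict[str, str]] = None


class TransportMessageTracingGetter:
    """Reads propagation fields from TransportMessage.tracing.

    Provides the two methods of OTel's Getter interface: get() and keys().
    """

    def get(self, carrier: TransportMessage, key: str) -> Optional[List[str]]:
        if not carrier.tracing:
            return None
        value = carrier.tracing.get(key)
        # OTel getters return a list of values, or None when the key is absent.
        return [value] if value is not None else None

    def keys(self, carrier: TransportMessage) -> List[str]:
        return list(carrier.tracing or {})


msg = TransportMessage(
    tracing={"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"}
)
getter = TransportMessageTracingGetter()
traceparent_values = getter.get(msg, "traceparent")
```

An object with this shape can be passed as the `getter` argument of `TraceContextTextMapPropagator().extract(carrier, getter=...)`, which looks up `traceparent` and `tracestate` through `get()`.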

src/replit_river/server_session.py

  • In _open_stream_and_call_handler(), the server now extracts the incoming trace context from the first message of each stream using TraceContextTextMapPropagator.extract().
  • Creates a SERVER span (named river.server.<method_type>.<service>.<procedure>) that is a child of the client's CLIENT span, with attributes for service name, procedure name, method type, stream ID, and client ID.
  • Runs the RPC handler within the extracted trace context via _run_handler_with_tracing(), so any downstream OTel instrumentation within the handler automatically inherits the correct parent trace.
  • The server span is ended and status set (OK/ERROR) after the handler completes.
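The parent-child relationship described above can be illustrated without the OTel SDK. The sketch below parses a W3C `traceparent` value and derives the identity of a server span from it; `ServerSpanSketch`, `start_server_span`, and the `river.*` attribute keys are hypothetical names chosen for illustration, and the PR itself relies on `TraceContextTextMapPropagator.extract()` and a real tracer rather than hand-rolled parsing.

```python
import re
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional

# W3C Trace Context: version "00", 32-hex trace-id, 16-hex parent-id, 2-hex flags.
# All-zero ids, which the spec also rejects, are not checked in this sketch.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass
class ServerSpanSketch:
    """Hypothetical record of the fields that identify a server span."""

    name: str
    trace_id: str
    parent_span_id: Optional[str]
    span_id: str
    attributes: Dict[str, str] = field(default_factory=dict)


def start_server_span(
    traceparent: Optional[str],
    method_type: str,
    service: str,
    procedure: str,
    stream_id: str,
    client_id: str,
) -> ServerSpanSketch:
    match = _TRACEPARENT.match(traceparent or "")
    if match:
        # Continue the client's trace: same trace id, client span becomes the parent.
        trace_id, parent_span_id = match.group(1), match.group(2)
    else:
        # No usable context on the first message: start a new root trace.
        trace_id, parent_span_id = secrets.token_hex(16), None
    return ServerSpanSketch(
        name=f"river.server.{method_type}.{service}.{procedure}",
        trace_id=trace_id,
        parent_span_id=parent_span_id,
        span_id=secrets.token_hex(8),
        attributes={
            # Attribute keys are assumptions; the PR only says they are river.* attributes.
            "river.service": service,
            "river.procedure": procedure,
            "river.method_type": method_type,
            "river.stream_id": stream_id,
            "river.client_id": client_id,
        },
    )


span = start_server_span(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    "rpc", "test_service", "rpc_method", "stream-1", "client-1",
)
```

Here `span.trace_id` equals the client's trace id and `span.parent_span_id` equals the client's span id, which is the relationship the new tests assert on.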

tests/v1/test_opentelemetry.py

  • Updated existing span count assertions to account for both client and server spans.
  • Added 6 new test cases:
    • test_rpc_trace_propagation — verifies same trace ID, parent-child relationship, span kinds
    • test_subscription_trace_propagation — verifies propagation for subscriptions
    • test_upload_trace_propagation — verifies propagation for uploads
    • test_stream_trace_propagation — verifies propagation for bidirectional streams
    • test_server_span_has_attributes — verifies span attributes on the server span
    • test_multiple_rpcs_have_independent_traces — verifies independent RPCs get independent traces

Test plan

All 67 tests pass, including the 6 new trace propagation tests:

$ uv run pytest tests/ -v
67 passed in 8.76s

Key verifications in the new tests:

  • Client and server spans share the same trace_id
  • Server span's parent.span_id matches client span's span_id
  • Client spans have SpanKind.CLIENT, server spans have SpanKind.SERVER
  • Server spans carry correct river.* attributes
  • Independent RPCs produce independent traces (different trace IDs)

Revertibility

This change is safe to revert: it only adds new behavior (server-side span creation) and does not modify the wire protocol or any existing client behavior.

~ written by Zerg 👾

…bsocket

Add server-side trace context extraction and span creation so that
distributed traces flow end-to-end through River websocket connections.

Changes:
- Add TransportMessageTracingGetter to extract traceparent/tracestate
  from incoming TransportMessages (counterpart to existing Setter)
- Extract trace context in ServerSession._open_stream_and_call_handler
  and create a SERVER span that is a child of the client's CLIENT span
- Run handler within the extracted context so downstream code inherits
  the trace
- Update and expand tests to verify server spans, trace propagation,
  span attributes, and independent traces for concurrent RPCs
@lhchavez
Collaborator Author

Superseded by a simpler approach that propagates OTel context (traceparent, tracestate, baggage) via standard HTTP headers on the WebSocket upgrade request, rather than per-RPC tracing.
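A rough sketch of the header-based idea, under the assumption that the client attaches the three propagation headers to the upgrade request and the server reads them once per connection. The helper names `build_upgrade_headers` and `read_propagation_headers` are hypothetical and do not come from the superseding change.

```python
from typing import Dict, Mapping, Optional

# Header names defined by W3C Trace Context (traceparent, tracestate) and W3C Baggage.
PROPAGATION_HEADERS = ("traceparent", "tracestate", "baggage")


def build_upgrade_headers(
    traceparent: str,
    tracestate: Optional[str] = None,
    baggage: Optional[str] = None,
) -> Dict[str, str]:
    """Client side: extra headers to send with the WebSocket upgrade request."""
    headers = {"traceparent": traceparent}
    if tracestate:
        headers["tracestate"] = tracestate
    if baggage:
        headers["baggage"] = baggage
    return headers


def read_propagation_headers(request_headers: Mapping[str, str]) -> Dict[str, str]:
    """Server side: pick out the propagation headers from the upgrade request."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return {name: lowered[name] for name in PROPAGATION_HEADERS if name in lowered}


sent = build_upgrade_headers(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    baggage="tenant=acme",
)
received = read_propagation_headers({"Traceparent": sent["traceparent"], "Baggage": sent["baggage"]})
```

With this shape the context is established once per connection instead of once per RPC, which is the trade-off the comment describes.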

~ written by Zerg 👾

@lhchavez lhchavez closed this Feb 28, 2026