Add detailed OTel inference sub-spans and X-Trace-Id response header#2148

Open
hansent wants to merge 1 commit into main from feat/otel-inference-instrumentation

hansent (Collaborator) commented Mar 24, 2026

Summary

  • Adds model.preprocess, model.predict, model.postprocess child spans inside model.infer to break down where inference time is spent (image decoding vs GPU execution vs NMS/formatting)
  • Adds model.input_shape attribute on the preprocess span and model.infer.caller attribute to distinguish async vs sync FastAPI code paths
  • Returns an X-Trace-Id header in HTTP responses so a forced trace can be looked up in the collector by ID rather than by searching
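
For readers who have not seen the diff, the span layout described above can be sketched roughly as follows. This is not the code from this PR. `RecordingTracer` and `RecordingSpan` are hypothetical stand-ins that only mimic the shape of an OpenTelemetry tracer (`start_as_current_span` as a context manager yielding a span with `set_attribute`), so the sketch runs without the SDK. The pipeline stages are placeholders, and the `"sync"`/`"async"` caller values and the list form of `model.input_shape` are assumptions.

```python
from __future__ import annotations

import contextlib
from typing import Any, Iterator, Optional


class RecordingSpan:
    """Hypothetical stand-in for a span: keeps a name, a parent, and attributes."""

    def __init__(self, name: str, parent: Optional[RecordingSpan]) -> None:
        self.name = name
        self.parent = parent
        self.attributes: dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


class RecordingTracer:
    """Hypothetical stand-in for a tracer that records spans in start order."""

    def __init__(self) -> None:
        self.spans: list[RecordingSpan] = []
        self._stack: list[RecordingSpan] = []

    @contextlib.contextmanager
    def start_as_current_span(self, name: str) -> Iterator[RecordingSpan]:
        # The span on top of the stack becomes the parent of the new span.
        parent = self._stack[-1] if self._stack else None
        span = RecordingSpan(name, parent)
        self.spans.append(span)
        self._stack.append(span)
        try:
            yield span
        finally:
            self._stack.pop()


def infer(tracer: Any, image: list[list[int]], caller: str = "sync") -> list[dict]:
    """Run a placeholder pipeline with one child span per stage."""
    with tracer.start_as_current_span("model.infer") as root:
        root.set_attribute("model.infer.caller", caller)  # assumed values
        with tracer.start_as_current_span("model.preprocess") as span:
            batch = [image]  # placeholder for image decoding / resizing
            shape = [len(batch), len(image), len(image[0])]
            span.set_attribute("model.input_shape", shape)  # assumed format
        with tracer.start_as_current_span("model.predict"):
            raw = [sum(map(sum, img)) for img in batch]  # placeholder forward pass
        with tracer.start_as_current_span("model.postprocess"):
            return [{"score": value} for value in raw]  # placeholder formatting
```

Since `infer` only relies on `start_as_current_span` and `set_attribute`, a tracer obtained from `opentelemetry.trace.get_tracer(__name__)` should be usable in place of the stand-in.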

Test plan

  • Deploy to staging with X-Force-Trace: true and verify X-Trace-Id is returned in response headers
  • Verify new child spans (model.preprocess, model.predict, model.postprocess) appear in trace waterfall under model.infer
  • Verify model.input_shape and model.infer.caller attributes are populated
  • Confirm no regression in inference latency (per-span overhead should be on the order of microseconds)
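
The header check in the test plan could be scripted along these lines. This is a minimal sketch, assuming the header value is the 32-character lowercase hex form of the trace ID used by W3C Trace Context; the PR description does not state the exact format, and `extract_trace_id` is a hypothetical helper name.

```python
import re
from typing import Mapping, Optional

# Assumed format: 128-bit trace ID rendered as 32 lowercase hex characters.
_TRACE_ID_RE = re.compile(r"[0-9a-f]{32}")


def extract_trace_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Trace-Id value, or None if it is missing or malformed."""
    for name, value in headers.items():
        if name.lower() != "x-trace-id":  # header names are case-insensitive
            continue
        value = value.strip()
        # An all-zero trace ID is invalid under W3C Trace Context.
        if _TRACE_ID_RE.fullmatch(value) and value != "0" * 32:
            return value
        return None
    return None
```

Passing the response headers from a staging request sent with `X-Force-Trace: true` should yield an ID that can be pasted into the collector's trace search.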

…response header

- Add model.preprocess, model.predict, model.postprocess sub-spans to
  BaseInference.infer() to break down where inference time is spent
- Add model.input_shape attribute to preprocess span
- Add model.infer.caller attribute to distinguish async vs sync code paths
- Return X-Trace-Id header in HTTP responses for easy trace lookup
- Expose X-Trace-Id in CORS headers

Generated with AI

Co-Authored-By: AI <ai@example.com>
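
For context on the last two commit items, browser clients can only read a custom response header cross-origin if it is listed in `Access-Control-Expose-Headers`. The sketch below shows one way the two headers could be built. It is not the code from this commit: `trace_response_headers` is a hypothetical helper, and the 32-character hex encoding of the trace ID is an assumption.

```python
from typing import Dict, Sequence


def trace_response_headers(trace_id: int, exposed: Sequence[str] = ()) -> Dict[str, str]:
    """Build X-Trace-Id plus a CORS expose list that includes it."""
    if not 0 < trace_id < 2**128:
        raise ValueError("trace_id must be a non-zero 128-bit integer")
    names = list(exposed)
    # Avoid listing the header twice if the caller already exposes it.
    if "x-trace-id" not in (name.lower() for name in names):
        names.append("X-Trace-Id")
    return {
        "X-Trace-Id": format(trace_id, "032x"),  # zero-padded lowercase hex
        "Access-Control-Expose-Headers": ", ".join(names),
    }
```

In a FastAPI app the expose list is more commonly set through the `expose_headers` argument of Starlette's `CORSMiddleware` than built by hand.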