Fix v1/traces endpoint to return protobuf instead of JSON #18929

Merged
harupy merged 4 commits into master from copilot/fix-v1-traces-response-format
Nov 20, 2025

Copilot AI commented Nov 20, 2025

🛠 DevTools 🛠

Install mlflow from this PR

# mlflow
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/18929/merge
# mlflow-skinny
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/18929/merge#subdirectory=libs/skinny

For Databricks, use the following command:

%sh curl -LsSf https://raw.githubusercontent.com/mlflow/mlflow/HEAD/dev/install-skinny.sh | sh -s pull/18929/merge

Related Issues/PRs

Resolve #[issue_number]

What changes are proposed in this pull request?

The /v1/traces OTLP endpoint was returning JSON (via Pydantic model) while declaring Content-Type: application/x-protobuf, causing protobuf parsers to fail with "cannot parse invalid wire-format data".
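
Why a JSON body trips a protobuf parser can be illustrated with a small standard-library sketch. This is not MLflow or protobuf library code: `read_varint` and `is_valid_wire_format` are hypothetical helpers, the JSON bodies used in the usage note are assumed examples rather than the exact bytes the endpoint returned, and the walker only checks top-level field framing (it treats the deprecated group wire types as unsupported).

```python
def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last varint byte
            return result, pos
        shift += 7
        if shift >= 64:
            raise ValueError("varint too long")


def is_valid_wire_format(buf: bytes) -> bool:
    """Return True if buf is a well-framed sequence of protobuf fields.

    Simplified sketch: only wire types 0, 1, 2 and 5 are accepted.
    """
    pos = 0
    try:
        while pos < len(buf):
            key, pos = read_varint(buf, pos)
            field_number, wire_type = key >> 3, key & 0x7
            if field_number == 0:
                raise ValueError("field number 0 is not allowed")
            if wire_type == 0:  # varint
                _, pos = read_varint(buf, pos)
            elif wire_type == 1:  # 64-bit fixed
                pos += 8
            elif wire_type == 2:  # length-delimited
                length, pos = read_varint(buf, pos)
                pos += length
            elif wire_type == 5:  # 32-bit fixed
                pos += 4
            else:
                raise ValueError(f"unsupported wire type {wire_type}")
            if pos > len(buf):
                raise ValueError("truncated field")
    except ValueError:
        return False
    return True
```

An empty message serializes to zero bytes, so `is_valid_wire_format(b"")` is true, while a JSON body such as `b"{}"` is rejected: its first byte, `{` (0x7B), decodes as field 15 with wire type 3, which is not a valid start for this message.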

Changes:

  • Replace Pydantic response model with ExportTraceServiceResponse protobuf serialization
  • Remove unused OTelExportTraceServiceResponse class and response parameter
  • Return serialized protobuf bytes in FastAPI Response with correct media type

Before:

# Returned JSON despite protobuf Content-Type header
return OTelExportTraceServiceResponse()  # Pydantic model

After:

# Returns actual protobuf bytes
response_message = ExportTraceServiceResponse()
return Response(
    content=response_message.SerializeToString(),
    media_type="application/x-protobuf",
)

How is this PR tested?

  • Existing unit/integration tests (15/15 pass)
  • New unit/integration tests
  • Manual tests

Added test_response_is_protobuf_format() that verifies:

  • Response Content-Type is application/x-protobuf
  • Response body deserializes as valid ExportTraceServiceResponse
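
The two checks listed above can be sketched with the standard library alone. This is not the test added in the PR: `StandInTracesHandler`, `post_traces`, and `run_check` are hypothetical names, and the handler is a stand-in for the fixed endpoint that replies with the bytes of an empty `ExportTraceServiceResponse` (a protobuf message with no fields set serializes to zero bytes).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A protobuf message with no fields set serializes to an empty byte string.
EMPTY_EXPORT_RESPONSE = b""


class StandInTracesHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the fixed /v1/traces endpoint."""

    def do_POST(self):
        # Consume the request body, then reply with protobuf bytes.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "application/x-protobuf")
        self.send_header("Content-Length", str(len(EMPTY_EXPORT_RESPONSE)))
        self.end_headers()
        self.wfile.write(EMPTY_EXPORT_RESPONSE)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def post_traces(host: str, port: int, payload: bytes):
    """POST payload to /v1/traces; return (status, content_type, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(
            "POST",
            "/v1/traces",
            body=payload,
            headers={
                "Content-Type": "application/x-protobuf",
                "x-mlflow-experiment-id": "0",
            },
        )
        response = conn.getresponse()
        return response.status, response.getheader("Content-Type"), response.read()
    finally:
        conn.close()


def run_check():
    """Start the stand-in server on a free port and query it once."""
    server = HTTPServer(("127.0.0.1", 0), StandInTracesHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # b"\x0a\x00" is a minimal well-formed protobuf payload (field 1, empty).
        return post_traces("127.0.0.1", server.server_address[1], b"\x0a\x00")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_check()` returns the status, the Content-Type header, and the raw body. Against a real server, the body would be passed to the protobuf class's parser instead of being compared with the empty byte string.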

Does this PR require documentation update?

  • No. You can skip the rest of this section.
  • Yes. I've updated:
    • Examples
    • API references
    • Instructions

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Fixed /v1/traces endpoint to return valid protobuf responses per OTLP specification, resolving compatibility issues with Go and other strict protobuf clients.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/prompts: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages

How should the PR be classified in the release notes? Choose one:

  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Should this PR be included in the next patch release?

  • Yes (this PR will be cherry-picked and included in the next patch release)
  • No (this PR will be included in the next minor release)
Original prompt

This section details the original issue you should resolve

<issue_title>[BUG] v1/traces response is in json format instead of protobuf</issue_title>
<issue_description>### Issues Policy acknowledgement

  • I have read and agree to submit bug reports in accordance with the issues policy

Where did you encounter this bug?

Local machine

MLflow version

3.6.0

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS 15.7.2 (24G325)

Describe the problem

I am running a Go app that communicates with a containerized MLflow 3.6.0 instance. When using go.opentelemetry to send traces to MLflow, the following error is logged:

proto: cannot parse invalid wire-format data

I suspect this happens because the Content-Type response header for the v1/traces endpoint is set to application/x-protobuf here:

response.headers["Content-Type"] = "application/x-protobuf"

while the response body is JSON, because the route declares a Pydantic response model:

@otel_router.post("", response_model=OTelExportTraceServiceResponse, status_code=200)

Tracking information

Code to reproduce issue

You can confirm that the response is JSON with this curl command:

echo 'Cu4BCjoKHAoMc2VydmljZS5uYW1lEgwKCm15LXNlcnZpY2UKGgoPc2VydmljZS52ZXJzaW9uEgcKBTEuMC4wEq8BChQKDW1sZmxvdy10cmFjZXISAzEuMBKWAQoQVmuTkZmAV/PKyL8LFzZCExIIbrKu35F4Mr0qDG15X29wZXJhdGlvbjABOQB/5UDdenkYQQBJgHzdenkYShUKC2h0dHAubWV0aG9kEgYKBFBPU1RKFwoQaHR0cC5zdGF0dXNfY29kZRIDGMgBSiIKEGN1c3RvbS5hdHRyaWJ1dGUSDgoMY3VzdG9tX3ZhbHVlegIYAQ==' | base64 -d > /tmp/otlp_span.bin
curl -X POST "http://localhost:5000/v1/traces" \
  -H "Content-Type: application/x-protobuf" \
  -H "x-mlflow-experiment-id: 0" \
  --data-binary @/tmp/otlp_span.bin

Stack trace

REPLACE_ME

Other info / logs

REPLACE_ME

What component(s) does this bug affect?

  • area/tracking: Tracking Service, tracking client APIs, autologging
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/scoring: MLflow model serving, deployment tools, Spark UDFs
  • area/evaluation: MLflow model evaluation features, evaluation metrics, and evaluation workflows
  • area/prompt: MLflow prompt engineering features, prompt templates, and prompt management
  • area/tracing: MLflow Tracing features, tracing APIs, and LLM tracing functionality
  • area/gateway: MLflow AI Gateway client APIs, server, and third-party integrations
  • area/projects: MLproject format, project running backends
  • area/uiux: Front-end, user experience, plotting
  • area/docs: MLflow documentation pages</issue_description>

<agent_instructions>It appears we need to use ExportTraceServiceResponse. Read https://github.com/Arize-ai/phoenix/blob/9bff1596653920836fe9abdc19afbc616213d865/src/phoenix/server/api/routers/v1/traces.py#L117 </agent_instructions>

Comments on the Issue (you are @copilot in this section)


Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
@harupy harupy requested a review from dbczumar November 20, 2025 01:02
@harupy harupy added the team-review Trigger a team review request label Nov 20, 2025
Copilot AI changed the title [WIP] Fix v1/traces response format to use protobuf Fix v1/traces endpoint to return protobuf instead of JSON Nov 20, 2025
@harupy harupy marked this pull request as ready for review November 20, 2025 01:10
github-actions bot commented Nov 20, 2025

Documentation preview for ce5be5d is available at:

More info
  • Ignore this comment if this PR does not change the documentation.
  • The preview is updated when a new commit is pushed to this PR.
  • This comment was created by this workflow run.
  • The documentation was built by this workflow run.

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
@harupy harupy enabled auto-merge November 20, 2025 03:44
@harupy harupy disabled auto-merge November 20, 2025 03:44
@harupy harupy added the rn/bug-fix Mention under Bug Fixes in Changelogs. label Nov 20, 2025
@harupy harupy requested a review from B-Step62 November 20, 2025 03:47
@harupy harupy added this pull request to the merge queue Nov 20, 2025
@harupy harupy removed this pull request from the merge queue due to a manual request Nov 20, 2025
@B-Step62 B-Step62 left a comment

LGTM!

@harupy harupy added this pull request to the merge queue Nov 20, 2025
Merged via the queue into master with commit 6887d3a Nov 20, 2025
46 checks passed
@harupy harupy deleted the copilot/fix-v1-traces-response-format branch November 20, 2025 10:49
Tian-Sky-Lan pushed a commit to Tian-Sky-Lan/mlflow that referenced this pull request Nov 24, 2025
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Signed-off-by: Tian Lan <sky.blue266000@gmail.com>

Labels

  • rn/bug-fix: Mention under Bug Fixes in Changelogs.
  • team-review: Trigger a team review request
  • v3.6.1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] v1/traces response is in json format instead of protobuf

4 participants