Skip test_metrics_agent_with_open_telemetry on mac #53917

Merged
can-anyscale merged 1 commit into master from revert-53751-can-telwins
Jun 18, 2025

Contributor

@can-anyscale can-anyscale commented Jun 18, 2025

This test is failing on mac (#53828). Skip it on mac so it doesn't pollute go/flaky. The feature is behind a flag, so this doesn't affect production.

Closes #53828

Test:

  • CI

@can-anyscale can-anyscale force-pushed the revert-53751-can-telwins branch from ddbfaf5 to e267244 Compare June 18, 2025 14:17
@can-anyscale can-anyscale marked this pull request as ready for review June 18, 2025 14:18
@can-anyscale can-anyscale enabled auto-merge (squash) June 18, 2025 14:18
@github-actions github-actions bot added the "go" label (add ONLY when ready to merge, run all tests) Jun 18, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR reverts the previous move of the OpenTelemetry tests into a pytest module and instead isolates them to Linux-only runs to avoid macOS failures.

  • Adds a new py_test_module_list entry for OpenTelemetry tests with the required feature flags
  • Introduces a dedicated Buildkite step to run the OpenTelemetry tests
  • Ensures the feature remains behind a flag so production remains unaffected

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| python/ray/tests/BUILD | Added a pytest module list for OpenTelemetry tests with env flags, tags, and dependencies |
| .buildkite/core.rayci.yml | Added a new Buildkite step to run OpenTelemetry tests in Docker with feature flags |

Comments suppressed due to low confidence (2)

.buildkite/core.rayci.yml:116

  • This step should be restricted to Linux agents only to avoid running on macOS. Consider adding an os: linux tag or agent queue filter under this step.
  - label: ":ray: core: open telemetry tests"

.buildkite/core.rayci.yml:122

  • The Bazel invocation references test_metrics_agent and includes a stray core argument, but the new test target is named test_metrics_agent_open_telemetry. Update the target to //python/ray/tests:test_metrics_agent_open_telemetry and remove the extra argument.
      - bazel run //ci/ray_ci:test_in_docker -- //python/ray/tests:test_metrics_agent core

@can-anyscale can-anyscale force-pushed the revert-53751-can-telwins branch from e267244 to 1de393b Compare June 18, 2025 14:21
@github-actions github-actions bot disabled auto-merge June 18, 2025 14:22
@can-anyscale can-anyscale force-pushed the revert-53751-can-telwins branch from 1de393b to d77ba7c Compare June 18, 2025 14:22
@dayshah
Contributor

dayshah commented Jun 18, 2025

do we know why it's failing on mac and if it works locally on your mac / apple silicon macs?

would also just prefer doing pytest.skip, based on os which makes it more clear that it's not running on a certain os
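A minimal sketch of the OS-based skip suggested here, for illustration only. The helper `running_on_macos` and the test name are hypothetical and are not taken from the Ray test suite; `sys.platform == "darwin"` is the standard way to detect macOS in Python, and `pytest.mark.skipif` is pytest's decorator for conditional skips.

```python
import sys

import pytest


def running_on_macos(platform: str = sys.platform) -> bool:
    """Return True when the given platform string identifies macOS."""
    # Hypothetical helper; "darwin" is the value of sys.platform on macOS.
    return platform == "darwin"


# The skip reason appears in the pytest report, so it is visible that the
# test is intentionally not running on macOS.
@pytest.mark.skipif(
    running_on_macos(),
    reason="Fails on macOS CI; see issue #53828",
)
def test_open_telemetry_metrics_sketch():
    # Placeholder body; the real test would exercise the metrics agent.
    assert True
```

Compared with excluding the test at the build-config level, a skip in the test file keeps the platform restriction next to the test it applies to.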

@can-anyscale
Contributor Author

can-anyscale commented Jun 18, 2025

@dayshah ah nice skipping at pytest level is way better

I know why it fails on mac yes; in cases where a metric is exported with a missing tag (e.g., the metric is defined with tags A and B but only exported with tag A), OpenTelemetry on macOS CI can misalign tag key-value pairs. You can see an example here: https://buildkite.com/ray-project/postmerge-macos/builds/6168/steps/canvas?jid=01976d5d-0669-485b-8a2d-e8288d8dedcf#01976d5d-0669-485b-8a2d-e8288d8dedcf/6-5824. This might be a bug in the opentelemetry-prometheus-exporter package. We're upgrading that package in another PR, and if the issue persists afterward, I'll look into a different fix. I haven’t tested it on my local Mac yet.
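To illustrate the kind of misalignment described above, here is a hypothetical sketch. It is not the exporter's code, and the thread does not confirm the actual cause. The function names and tag values are made up, and the record omits the first tag (A) rather than the second so that the shift is visible.

```python
from typing import Dict, List

# The metric is defined with tags A and B (assumed order).
TAG_KEYS: List[str] = ["A", "B"]


def pair_positionally(keys: List[str], record: Dict[str, str]) -> Dict[str, str]:
    """Buggy pairing: missing tags are dropped, then values are zipped by position."""
    values = [record[k] for k in keys if k in record]
    return dict(zip(keys, values))


def pair_by_key(keys: List[str], record: Dict[str, str]) -> Dict[str, str]:
    """Safer pairing: each key looks up its own value; missing tags become ''."""
    return {k: record.get(k, "") for k in keys}


# A data point exported with only tag B set.
record = {"B": "b1"}

misaligned = pair_positionally(TAG_KEYS, record)  # {'A': 'b1'}
aligned = pair_by_key(TAG_KEYS, record)           # {'A': '', 'B': 'b1'}
```

In the buggy variant the value of tag B ends up under key A, which is the symptom a test asserting on tag values would catch.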

Signed-off-by: can <can@anyscale.com>
@can-anyscale can-anyscale force-pushed the revert-53751-can-telwins branch from d77ba7c to 50d429a Compare June 18, 2025 15:52
@can-anyscale
Contributor Author

@dayshah's comments

@can-anyscale can-anyscale changed the title from Revert "[core][telemetry] move the open telemetry tests into a pytest module" to Skip test_metrics_agent_with_open_telemetry on mac Jun 18, 2025
@can-anyscale can-anyscale merged commit b4727db into master Jun 18, 2025
5 checks passed
@can-anyscale can-anyscale deleted the revert-53751-can-telwins branch June 18, 2025 17:34
minerharry pushed a commit to minerharry/ray that referenced this pull request Jun 27, 2025
Signed-off-by: can <can@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Jul 2, 2025
Signed-off-by: can <can@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Development

Successfully merging this pull request may close these issues.

CI test darwin://python/ray/tests:test_metrics_agent_open_telemetry is consistently_failing

3 participants