[Serve] Rename some prometheus metrics#61090
[Serve] Rename some prometheus metrics#61090petern48 wants to merge 17 commits intoray-project:masterfrom
Conversation
…rve_replica_num_ongoing_requests' Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request correctly renames the ray_serve_replica_processing_queries metric to ray_serve_replica_num_ongoing_requests and maintains backward compatibility by deprecating the old metric. The changes are applied across Python, Java, and documentation files.
I've identified a critical bug in the Java implementation where the new metric is not updated due to an overwritten variable. I've also found a bug in a dashboard panel definition where the new metric name is missing a prefix. Additionally, I've suggested an improvement to the monitoring documentation to explicitly mention the deprecated metric for clarity.
Overall, the changes are in the right direction, but the identified issues should be addressed to ensure correctness and consistency.
python/ray/dashboard/modules/metrics/dashboards/serve_deployment_dashboard_panels.py
Outdated
Show resolved
Hide resolved
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Peter Nguyen <petern0408@gmail.com>
…erve_router_num_queued_requests' Signed-off-by: Peter Nguyen <petern0408@gmail.com>
…same as monitoring.md (old description wasn't great) Signed-off-by: Peter Nguyen <petern0408@gmail.com>
…ongoing_requests_at_replicas' Signed-off-by: Peter Nguyen <petern0408@gmail.com>
…e one Signed-off-by: Peter Nguyen <petern0408@gmail.com>
java/serve/src/main/java/io/ray/serve/metrics/RayServeMetrics.java
Outdated
Show resolved
Hide resolved
java/serve/src/main/java/io/ray/serve/metrics/RayServeMetrics.java
Outdated
Show resolved
Hide resolved
java/serve/src/main/java/io/ray/serve/replica/RayServeReplicaImpl.java
Outdated
Show resolved
Hide resolved
…java Co-authored-by: harshit-anyscale <harshit@anyscale.com> Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
|
I think the most important detail we want to find out is the impact on runtime performance from this change. @harshit-anyscale once the PR is ready, could you help run a round of microbenchmarks on this PR to find out. |
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
|
This pull request has been automatically marked as stale because it has not had You can always ask for help on our discussion forum or Ray's public slack channel. If you'd like to keep this open, just leave any comment, and the stale label will be removed. |
Signed-off-by: Peter Nguyen <petern0408@gmail.com>
Signed-off-by: Peter Nguyen <petern0408@gmail.com>

Description
This is a follow-up for #59220. In the PR review, a few metric renames were suggested. This PR implements those metric renames to follow better consistency.
Old metric names have not been removed for backwards compatibility. Instead, they have been deprecated, so they can be removed in the next major release. Updates have also been applied for Java as well.
Format: old_name -> new_name
ray_serve_replica_processing_queries->ray_serve_replica_num_ongoing_requestsray_serve_deployment_queued_queries->ray_serve_router_num_queued_requestsray_serve_num_ongoing_requests_at_replicas->ray_serve_router_num_ongoing_requests_at_replicasRelated issues
Fixes #59376
Additional information
Note, I came up with a new name for the third metric myself because the comment in the original PR did not propose a specific new name. It only suggested that it should "explicitly say handle or router", so the name of the last metric is still up for discussion.