Add perf metrics for 2.47.0 by khluu · Pull Request #53668 · ray-project/ray

khluu · 2025-06-09T19:33:37Z

REGRESSION 12.82%: tasks_per_second (THROUGHPUT) regresses from 221.2222291023174 to 192.87246715163326 in benchmarks/many_nodes.json
REGRESSION 12.73%: actors_per_second (THROUGHPUT) regresses from 634.2824761754516 to 553.5098466276525 in benchmarks/many_actors.json
REGRESSION 12.26%: client__get_calls (THROUGHPUT) regresses from 1160.5254002780266 to 1018.2939193917422 in microbenchmark.json
REGRESSION 5.15%: multi_client_put_gigabytes (THROUGHPUT) regresses from 39.896743394372585 to 37.84234603653026 in microbenchmark.json
REGRESSION 4.04%: client__tasks_and_get_batch (THROUGHPUT) regresses from 0.9480091293556955 to 0.909684480871914 in microbenchmark.json
REGRESSION 3.72%: 1_n_actor_calls_async (THROUGHPUT) regresses from 8318.094433102775 to 8008.806358661164 in microbenchmark.json
REGRESSION 3.01%: 1_1_actor_calls_sync (THROUGHPUT) regresses from 2020.4236901532247 to 1959.5608579309087 in microbenchmark.json
REGRESSION 2.80%: n_n_async_actor_calls_async (THROUGHPUT) regresses from 23716.451989299432 to 23052.03512506016 in microbenchmark.json
REGRESSION 2.71%: single_client_put_gigabytes (THROUGHPUT) regresses from 20.105537951105227 to 19.561225172916046 in microbenchmark.json
REGRESSION 2.69%: pgs_per_second (THROUGHPUT) regresses from 13.650631601393242 to 13.282795863244178 in benchmarks/many_pgs.json
REGRESSION 1.35%: single_client_tasks_async (THROUGHPUT) regresses from 8081.168521067462 to 7971.849053459262 in microbenchmark.json
REGRESSION 1.31%: n_n_actor_calls_async (THROUGHPUT) regresses from 27465.39608393524 to 27105.63998087682 in microbenchmark.json
REGRESSION 1.09%: client__tasks_and_put_batch (THROUGHPUT) regresses from 14569.862277318796 to 14411.155262801181 in microbenchmark.json
REGRESSION 1.05%: 1_1_async_actor_calls_sync (THROUGHPUT) regresses from 1483.660979687764 to 1468.0999827232097 in microbenchmark.json
REGRESSION 0.92%: single_client_get_object_containing_10k_refs (THROUGHPUT) regresses from 12.796724102063072 to 12.67868528378648 in microbenchmark.json
REGRESSION 0.88%: placement_group_create/removal (THROUGHPUT) regresses from 768.9082534403586 to 762.110356621388 in microbenchmark.json
REGRESSION 0.87%: single_client_tasks_sync (THROUGHPUT) regresses from 969.5757440611114 to 961.1131766783709 in microbenchmark.json
REGRESSION 0.35%: client__1_1_actor_calls_async (THROUGHPUT) regresses from 1069.1602586173547 to 1065.4228066614364 in microbenchmark.json
REGRESSION 0.23%: client__put_gigabytes (THROUGHPUT) regresses from 0.1529268174148042 to 0.1525808986433169 in microbenchmark.json
REGRESSION 0.05%: single_client_put_calls_Plasma_Store (THROUGHPUT) regresses from 5113.112753017668 to 5110.344528620948 in microbenchmark.json
REGRESSION 49.81%: dashboard_p99_latency_ms (LATENCY) regresses from 275.082 to 412.087 in benchmarks/many_pgs.json
REGRESSION 37.19%: dashboard_p95_latency_ms (LATENCY) regresses from 6.696 to 9.186 in benchmarks/many_pgs.json
REGRESSION 36.35%: dashboard_p95_latency_ms (LATENCY) regresses from 2283.949 to 3114.217 in benchmarks/many_actors.json
REGRESSION 13.04%: dashboard_p99_latency_ms (LATENCY) regresses from 675.061 to 763.093 in benchmarks/many_tasks.json
REGRESSION 11.46%: dashboard_p50_latency_ms (LATENCY) regresses from 3.856 to 4.298 in benchmarks/many_pgs.json
REGRESSION 11.23%: dashboard_p95_latency_ms (LATENCY) regresses from 437.195 to 486.283 in benchmarks/many_tasks.json
REGRESSION 8.97%: 107374182400_large_object_time (LATENCY) regresses from 29.323037406000026 to 31.951921509999977 in scalability/single_node.json
REGRESSION 6.24%: avg_iteration_time (LATENCY) regresses from 1.1950538015365602 to 1.2696449542045594 in stress_tests/stress_test_dead_actors.json
REGRESSION 5.86%: dashboard_p50_latency_ms (LATENCY) regresses from 8.293 to 8.779 in benchmarks/many_actors.json
REGRESSION 2.91%: time_to_broadcast_1073741824_bytes_to_50_nodes (LATENCY) regresses from 12.241764013000008 to 12.597426240999994 in scalability/object_store.json
REGRESSION 1.02%: avg_pg_remove_time_ms (LATENCY) regresses from 1.2291068678679091 to 1.2416502777781075 in stress_tests/stress_test_placement_group.json
REGRESSION 0.57%: dashboard_p50_latency_ms (LATENCY) regresses from 5.658 to 5.69 in benchmarks/many_nodes.json
REGRESSION 0.34%: 10000_args_time (LATENCY) regresses from 18.764070391999994 to 18.828636121000002 in scalability/single_node.json
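The percentages in the list above are consistent with a simple relative change against the previous release: for THROUGHPUT metrics a drop counts as a regression, and for LATENCY metrics a rise does. The following is a minimal sketch for re-deriving them from the report lines. It assumes the line format shown above; the names `parse_line` and `regression_pct` are hypothetical helpers, not part of Ray's release tooling.

```python
import re

# Pattern for one report line, assuming the exact wording used in this PR.
LINE_RE = re.compile(
    r"REGRESSION (?P<pct>[\d.]+)%: (?P<metric>\S+) "
    r"\((?P<kind>THROUGHPUT|LATENCY)\) "
    r"regresses from (?P<old>[\d.]+) to (?P<new>[\d.]+) in (?P<file>\S+)"
)


def regression_pct(kind: str, old: float, new: float) -> float:
    """Relative regression in percent (assumed formula, inferred from the data).

    Throughput regresses when it goes down; latency regresses when it goes up.
    """
    if kind == "THROUGHPUT":
        return (old - new) / old * 100.0
    return (new - old) / old * 100.0


def parse_line(line: str) -> dict:
    """Parse one REGRESSION line and attach the recomputed percentage."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a REGRESSION line: {line!r}")
    old, new = float(match["old"]), float(match["new"])
    return {
        "metric": match["metric"],
        "kind": match["kind"],
        "file": match["file"],
        "reported_pct": float(match["pct"]),
        "computed_pct": regression_pct(match["kind"], old, new),
    }
```

Feeding each line of the report through `parse_line` and comparing `reported_pct` with `computed_pct` is a quick way to confirm that a pasted report was not garbled in transit.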

Signed-off-by: Lonnie Liu <lonnie@anyscale.com>

Copilot

Pull Request Overview

This PR updates performance metric values for release 2.47.0, reflecting new benchmark values and regression changes across various stress tests, scalability tests, and microbenchmarks.

Updated benchmark and metric values in stress tests, scalability tests, microbenchmarks, and dashboards.
Revised the release version in the metadata file.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

File	Description
release/perf_metrics/stress_tests/stress_test_placement_group.json	Updated average placement group creation and removal times.
release/perf_metrics/stress_tests/stress_test_many_tasks.json	Adjusted latency metrics across several stages along with aggregate timings.
release/perf_metrics/stress_tests/stress_test_dead_actors.json	Updated iteration times and total time metrics.
release/perf_metrics/scalability/single_node.json	Revised argument and get times along with large object timing and related metrics.
release/perf_metrics/scalability/object_store.json	Updated broadcast time for object_store scalability test.
release/perf_metrics/microbenchmark.json	Multiple throughput and latency values adjusted to reflect updated benchmark results.
release/perf_metrics/metadata.json	Release version updated to 2.47.0.
release/perf_metrics/benchmarks/many_tasks.json	Adjusted task performance and memory metrics in the benchmark.
release/perf_metrics/benchmarks/many_pgs.json	Revised throughput and latency metrics for placement groups benchmark.
release/perf_metrics/benchmarks/many_nodes.json	Updated throughput and latency metrics for many nodes benchmark.
release/perf_metrics/benchmarks/many_actors.json	Updated throughput and dashboard latency metrics for many actors benchmark.

jjyao · 2025-06-09T20:07:09Z

> REGRESSION 12.82%: tasks_per_second (THROUGHPUT) regresses from 221.2222291023174 to 192.87246715163326 in benchmarks/many_nodes.json

In the normal range of historical release test performance.

> REGRESSION 12.73%: actors_per_second (THROUGHPUT) regresses from 634.2824761754516 to 553.5098466276525 in benchmarks/many_actors.json

In the normal range of historical release test performance.

> REGRESSION 12.26%: client__get_calls (THROUGHPUT) regresses from 1160.5254002780266 to 1018.2939193917422 in microbenchmark.json

In the normal range of historical release test performance.
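The comment does not say how "normal range" is determined. As one hypothetical way to make such a judgment reproducible, a new value could be compared against the spread of the same metric over earlier releases. The sketch below assumes the caller supplies the historical values; the function name, the two-standard-deviation default, and the approach itself are illustrative assumptions, not the reviewer's or Ray's actual method.

```python
from statistics import mean, stdev
from typing import Sequence


def within_historical_range(
    value: float,
    history: Sequence[float],
    num_stdevs: float = 2.0,  # assumed tolerance, not taken from the PR
) -> bool:
    """Return True if `value` lies within `num_stdevs` sample standard
    deviations of the mean of `history` (values from earlier releases)."""
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    center = mean(history)
    spread = stdev(history)
    return abs(value - center) <= num_stdevs * spread
```

A check like this only flags outliers relative to past noise; it does not replace a reviewer's judgment about whether a sustained downward trend matters.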

Signed-off-by: Lonnie Liu <lonnie@anyscale.com>
Co-authored-by: Lonnie Liu <lonnie@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>

Update performance metrics for 2.47.0

291d2b1

Signed-off-by: Lonnie Liu <lonnie@anyscale.com>

Copilot AI review requested due to automatic review settings June 9, 2025 19:33

Copilot AI reviewed Jun 9, 2025

aslonnie requested review from aslonnie and jjyao June 9, 2025 19:35

jjyao approved these changes Jun 9, 2025

aslonnie added the "go" label (add ONLY when ready to merge, run all tests) Jun 10, 2025

aslonnie merged commit 1d15cf1 into master Jun 11, 2025
5 checks passed

aslonnie deleted the 2.47.0_perf_metrics branch June 11, 2025 01:43
