Add perf metrics for 2.47.0 #53668

Merged
aslonnie merged 1 commit into master from 2.47.0_perf_metrics
Jun 11, 2025

Conversation

khluu (Contributor) commented Jun 9, 2025

REGRESSION 12.82%: tasks_per_second (THROUGHPUT) regresses from 221.2222291023174 to 192.87246715163326 in benchmarks/many_nodes.json
REGRESSION 12.73%: actors_per_second (THROUGHPUT) regresses from 634.2824761754516 to 553.5098466276525 in benchmarks/many_actors.json
REGRESSION 12.26%: client__get_calls (THROUGHPUT) regresses from 1160.5254002780266 to 1018.2939193917422 in microbenchmark.json
REGRESSION 5.15%: multi_client_put_gigabytes (THROUGHPUT) regresses from 39.896743394372585 to 37.84234603653026 in microbenchmark.json
REGRESSION 4.04%: client__tasks_and_get_batch (THROUGHPUT) regresses from 0.9480091293556955 to 0.909684480871914 in microbenchmark.json
REGRESSION 3.72%: 1_n_actor_calls_async (THROUGHPUT) regresses from 8318.094433102775 to 8008.806358661164 in microbenchmark.json
REGRESSION 3.01%: 1_1_actor_calls_sync (THROUGHPUT) regresses from 2020.4236901532247 to 1959.5608579309087 in microbenchmark.json
REGRESSION 2.80%: n_n_async_actor_calls_async (THROUGHPUT) regresses from 23716.451989299432 to 23052.03512506016 in microbenchmark.json
REGRESSION 2.71%: single_client_put_gigabytes (THROUGHPUT) regresses from 20.105537951105227 to 19.561225172916046 in microbenchmark.json
REGRESSION 2.69%: pgs_per_second (THROUGHPUT) regresses from 13.650631601393242 to 13.282795863244178 in benchmarks/many_pgs.json
REGRESSION 1.35%: single_client_tasks_async (THROUGHPUT) regresses from 8081.168521067462 to 7971.849053459262 in microbenchmark.json
REGRESSION 1.31%: n_n_actor_calls_async (THROUGHPUT) regresses from 27465.39608393524 to 27105.63998087682 in microbenchmark.json
REGRESSION 1.09%: client__tasks_and_put_batch (THROUGHPUT) regresses from 14569.862277318796 to 14411.155262801181 in microbenchmark.json
REGRESSION 1.05%: 1_1_async_actor_calls_sync (THROUGHPUT) regresses from 1483.660979687764 to 1468.0999827232097 in microbenchmark.json
REGRESSION 0.92%: single_client_get_object_containing_10k_refs (THROUGHPUT) regresses from 12.796724102063072 to 12.67868528378648 in microbenchmark.json
REGRESSION 0.88%: placement_group_create/removal (THROUGHPUT) regresses from 768.9082534403586 to 762.110356621388 in microbenchmark.json
REGRESSION 0.87%: single_client_tasks_sync (THROUGHPUT) regresses from 969.5757440611114 to 961.1131766783709 in microbenchmark.json
REGRESSION 0.35%: client__1_1_actor_calls_async (THROUGHPUT) regresses from 1069.1602586173547 to 1065.4228066614364 in microbenchmark.json
REGRESSION 0.23%: client__put_gigabytes (THROUGHPUT) regresses from 0.1529268174148042 to 0.1525808986433169 in microbenchmark.json
REGRESSION 0.05%: single_client_put_calls_Plasma_Store (THROUGHPUT) regresses from 5113.112753017668 to 5110.344528620948 in microbenchmark.json
REGRESSION 49.81%: dashboard_p99_latency_ms (LATENCY) regresses from 275.082 to 412.087 in benchmarks/many_pgs.json
REGRESSION 37.19%: dashboard_p95_latency_ms (LATENCY) regresses from 6.696 to 9.186 in benchmarks/many_pgs.json
REGRESSION 36.35%: dashboard_p95_latency_ms (LATENCY) regresses from 2283.949 to 3114.217 in benchmarks/many_actors.json
REGRESSION 13.04%: dashboard_p99_latency_ms (LATENCY) regresses from 675.061 to 763.093 in benchmarks/many_tasks.json
REGRESSION 11.46%: dashboard_p50_latency_ms (LATENCY) regresses from 3.856 to 4.298 in benchmarks/many_pgs.json
REGRESSION 11.23%: dashboard_p95_latency_ms (LATENCY) regresses from 437.195 to 486.283 in benchmarks/many_tasks.json
REGRESSION 8.97%: 107374182400_large_object_time (LATENCY) regresses from 29.323037406000026 to 31.951921509999977 in scalability/single_node.json
REGRESSION 6.24%: avg_iteration_time (LATENCY) regresses from 1.1950538015365602 to 1.2696449542045594 in stress_tests/stress_test_dead_actors.json
REGRESSION 5.86%: dashboard_p50_latency_ms (LATENCY) regresses from 8.293 to 8.779 in benchmarks/many_actors.json
REGRESSION 2.91%: time_to_broadcast_1073741824_bytes_to_50_nodes (LATENCY) regresses from 12.241764013000008 to 12.597426240999994 in scalability/object_store.json
REGRESSION 1.02%: avg_pg_remove_time_ms (LATENCY) regresses from 1.2291068678679091 to 1.2416502777781075 in stress_tests/stress_test_placement_group.json
REGRESSION 0.57%: dashboard_p50_latency_ms (LATENCY) regresses from 5.658 to 5.69 in benchmarks/many_nodes.json
REGRESSION 0.34%: 10000_args_time (LATENCY) regresses from 18.764070391999994 to 18.828636121000002 in scalability/single_node.json
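
The percentages in this report appear to be relative changes against the previous release value, with the sign flipped by metric kind: a throughput metric regresses when it drops, a latency metric regresses when it rises. The sketch below is an assumption inferred from the numbers above, not the release tooling's actual code; `regression_pct` is a hypothetical helper name. It reproduces the 12.73% and 49.81% entries from the list.

```python
def regression_pct(old: float, new: float, kind: str) -> float:
    """Return the percent regression, positive when the metric got worse.

    Assumed convention (inferred from the report, not from the tooling):
    THROUGHPUT is higher-is-better, LATENCY is lower-is-better.
    """
    if kind == "THROUGHPUT":
        # A drop in throughput is a regression.
        return (old - new) / old * 100.0
    if kind == "LATENCY":
        # A rise in latency is a regression.
        return (new - old) / old * 100.0
    raise ValueError(f"unknown metric kind: {kind!r}")


# actors_per_second in benchmarks/many_actors.json
print(f"{regression_pct(634.2824761754516, 553.5098466276525, 'THROUGHPUT'):.2f}%")
# dashboard_p99_latency_ms in benchmarks/many_pgs.json
print(f"{regression_pct(275.082, 412.087, 'LATENCY'):.2f}%")
```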

Signed-off-by: Lonnie Liu <lonnie@anyscale.com>
Copilot AI review requested due to automatic review settings June 9, 2025 19:33
Copilot AI left a comment

Pull Request Overview

This PR updates performance metric values for release 2.47.0, reflecting new benchmark values and regression changes across various stress tests, scalability tests, and microbenchmarks.

  • Updated benchmark and metric values in stress tests, scalability tests, microbenchmarks, and dashboards.
  • Revised the release version in the metadata file.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
| --- | --- |
| release/perf_metrics/stress_tests/stress_test_placement_group.json | Updated average placement group creation and removal times. |
| release/perf_metrics/stress_tests/stress_test_many_tasks.json | Adjusted latency metrics across several stages along with aggregate timings. |
| release/perf_metrics/stress_tests/stress_test_dead_actors.json | Updated iteration times and total time metrics. |
| release/perf_metrics/scalability/single_node.json | Revised argument and get times along with large object timing and related metrics. |
| release/perf_metrics/scalability/object_store.json | Updated broadcast time for object_store scalability test. |
| release/perf_metrics/microbenchmark.json | Multiple throughput and latency values adjusted to reflect updated benchmark results. |
| release/perf_metrics/metadata.json | Release version updated to 2.47.0. |
| release/perf_metrics/benchmarks/many_tasks.json | Adjusted task performance and memory metrics in the benchmark. |
| release/perf_metrics/benchmarks/many_pgs.json | Revised throughput and latency metrics for placement groups benchmark. |
| release/perf_metrics/benchmarks/many_nodes.json | Updated throughput and latency metrics for many nodes benchmark. |
| release/perf_metrics/benchmarks/many_actors.json | Updated throughput and dashboard latency metrics for many actors benchmark. |

@aslonnie aslonnie requested review from aslonnie and jjyao June 9, 2025 19:35
jjyao (Contributor) commented Jun 9, 2025

> REGRESSION 12.82%: tasks_per_second (THROUGHPUT) regresses from 221.2222291023174 to 192.87246715163326 in benchmarks/many_nodes.json

In the normal range of historical release test performance.

> REGRESSION 12.73%: actors_per_second (THROUGHPUT) regresses from 634.2824761754516 to 553.5098466276525 in benchmarks/many_actors.json

In the normal range of historical release test performance.

> REGRESSION 12.26%: client__get_calls (THROUGHPUT) regresses from 1160.5254002780266 to 1018.2939193917422 in microbenchmark.json

In the normal range of historical release test performance.
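
The three entries triaged in this comment are the throughput regressions above 12% in the report. As a rough sketch of how such entries could be pulled out of the report text for review, the snippet below parses lines in the format shown in the PR description and filters by kind and size. The regex, the `Regression` tuple, the helper names, and the 10% cutoff are all illustrative assumptions and are not part of the Ray release tooling.

```python
import re
from typing import List, NamedTuple

# Assumed line format, copied from the report in the PR description.
LINE_RE = re.compile(
    r"^REGRESSION (?P<pct>[\d.]+)%: (?P<metric>\S+) "
    r"\((?P<kind>THROUGHPUT|LATENCY)\) regresses from "
    r"(?P<old>[\d.]+) to (?P<new>[\d.]+) in (?P<path>\S+)$"
)


class Regression(NamedTuple):
    pct: float
    metric: str
    kind: str
    old: float
    new: float
    path: str


def parse_report(text: str) -> List[Regression]:
    """Parse every REGRESSION line in the text; other lines are skipped."""
    found = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        found.append(
            Regression(
                float(m["pct"]), m["metric"], m["kind"],
                float(m["old"]), float(m["new"]), m["path"],
            )
        )
    return found


def needs_review(regs: List[Regression], kind: str, min_pct: float) -> List[Regression]:
    """Keep regressions of one kind at or above a (hypothetical) cutoff."""
    return [r for r in regs if r.kind == kind and r.pct >= min_pct]


report = """\
REGRESSION 12.82%: tasks_per_second (THROUGHPUT) regresses from 221.2222291023174 to 192.87246715163326 in benchmarks/many_nodes.json
REGRESSION 12.73%: actors_per_second (THROUGHPUT) regresses from 634.2824761754516 to 553.5098466276525 in benchmarks/many_actors.json
REGRESSION 12.26%: client__get_calls (THROUGHPUT) regresses from 1160.5254002780266 to 1018.2939193917422 in microbenchmark.json
REGRESSION 5.15%: multi_client_put_gigabytes (THROUGHPUT) regresses from 39.896743394372585 to 37.84234603653026 in microbenchmark.json
REGRESSION 49.81%: dashboard_p99_latency_ms (LATENCY) regresses from 275.082 to 412.087 in benchmarks/many_pgs.json
"""

for r in needs_review(parse_report(report), "THROUGHPUT", 10.0):
    print(r.metric, r.pct, r.path)
```

Whether a flagged entry is a real regression still depends on comparing it with historical release test results, which this sketch does not attempt.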

@aslonnie aslonnie added the "go" label (add ONLY when ready to merge, run all tests) Jun 10, 2025
@aslonnie aslonnie merged commit 1d15cf1 into master Jun 11, 2025
5 checks passed
@aslonnie aslonnie deleted the 2.47.0_perf_metrics branch June 11, 2025 01:43
elliot-barn pushed a commit that referenced this pull request Jun 18, 2025
```
REGRESSION 12.82%: tasks_per_second (THROUGHPUT) regresses from 221.2222291023174 to 192.87246715163326 in benchmarks/many_nodes.json
REGRESSION 12.73%: actors_per_second (THROUGHPUT) regresses from 634.2824761754516 to 553.5098466276525 in benchmarks/many_actors.json
REGRESSION 12.26%: client__get_calls (THROUGHPUT) regresses from 1160.5254002780266 to 1018.2939193917422 in microbenchmark.json
REGRESSION 5.15%: multi_client_put_gigabytes (THROUGHPUT) regresses from 39.896743394372585 to 37.84234603653026 in microbenchmark.json
REGRESSION 4.04%: client__tasks_and_get_batch (THROUGHPUT) regresses from 0.9480091293556955 to 0.909684480871914 in microbenchmark.json
REGRESSION 3.72%: 1_n_actor_calls_async (THROUGHPUT) regresses from 8318.094433102775 to 8008.806358661164 in microbenchmark.json
REGRESSION 3.01%: 1_1_actor_calls_sync (THROUGHPUT) regresses from 2020.4236901532247 to 1959.5608579309087 in microbenchmark.json
REGRESSION 2.80%: n_n_async_actor_calls_async (THROUGHPUT) regresses from 23716.451989299432 to 23052.03512506016 in microbenchmark.json
REGRESSION 2.71%: single_client_put_gigabytes (THROUGHPUT) regresses from 20.105537951105227 to 19.561225172916046 in microbenchmark.json
REGRESSION 2.69%: pgs_per_second (THROUGHPUT) regresses from 13.650631601393242 to 13.282795863244178 in benchmarks/many_pgs.json
REGRESSION 1.35%: single_client_tasks_async (THROUGHPUT) regresses from 8081.168521067462 to 7971.849053459262 in microbenchmark.json
REGRESSION 1.31%: n_n_actor_calls_async (THROUGHPUT) regresses from 27465.39608393524 to 27105.63998087682 in microbenchmark.json
REGRESSION 1.09%: client__tasks_and_put_batch (THROUGHPUT) regresses from 14569.862277318796 to 14411.155262801181 in microbenchmark.json
REGRESSION 1.05%: 1_1_async_actor_calls_sync (THROUGHPUT) regresses from 1483.660979687764 to 1468.0999827232097 in microbenchmark.json
REGRESSION 0.92%: single_client_get_object_containing_10k_refs (THROUGHPUT) regresses from 12.796724102063072 to 12.67868528378648 in microbenchmark.json
REGRESSION 0.88%: placement_group_create/removal (THROUGHPUT) regresses from 768.9082534403586 to 762.110356621388 in microbenchmark.json
REGRESSION 0.87%: single_client_tasks_sync (THROUGHPUT) regresses from 969.5757440611114 to 961.1131766783709 in microbenchmark.json
REGRESSION 0.35%: client__1_1_actor_calls_async (THROUGHPUT) regresses from 1069.1602586173547 to 1065.4228066614364 in microbenchmark.json
REGRESSION 0.23%: client__put_gigabytes (THROUGHPUT) regresses from 0.1529268174148042 to 0.1525808986433169 in microbenchmark.json
REGRESSION 0.05%: single_client_put_calls_Plasma_Store (THROUGHPUT) regresses from 5113.112753017668 to 5110.344528620948 in microbenchmark.json
REGRESSION 49.81%: dashboard_p99_latency_ms (LATENCY) regresses from 275.082 to 412.087 in benchmarks/many_pgs.json
REGRESSION 37.19%: dashboard_p95_latency_ms (LATENCY) regresses from 6.696 to 9.186 in benchmarks/many_pgs.json
REGRESSION 36.35%: dashboard_p95_latency_ms (LATENCY) regresses from 2283.949 to 3114.217 in benchmarks/many_actors.json
REGRESSION 13.04%: dashboard_p99_latency_ms (LATENCY) regresses from 675.061 to 763.093 in benchmarks/many_tasks.json
REGRESSION 11.46%: dashboard_p50_latency_ms (LATENCY) regresses from 3.856 to 4.298 in benchmarks/many_pgs.json
REGRESSION 11.23%: dashboard_p95_latency_ms (LATENCY) regresses from 437.195 to 486.283 in benchmarks/many_tasks.json
REGRESSION 8.97%: 107374182400_large_object_time (LATENCY) regresses from 29.323037406000026 to 31.951921509999977 in scalability/single_node.json
REGRESSION 6.24%: avg_iteration_time (LATENCY) regresses from 1.1950538015365602 to 1.2696449542045594 in stress_tests/stress_test_dead_actors.json
REGRESSION 5.86%: dashboard_p50_latency_ms (LATENCY) regresses from 8.293 to 8.779 in benchmarks/many_actors.json
REGRESSION 2.91%: time_to_broadcast_1073741824_bytes_to_50_nodes (LATENCY) regresses from 12.241764013000008 to 12.597426240999994 in scalability/object_store.json
REGRESSION 1.02%: avg_pg_remove_time_ms (LATENCY) regresses from 1.2291068678679091 to 1.2416502777781075 in stress_tests/stress_test_placement_group.json
REGRESSION 0.57%: dashboard_p50_latency_ms (LATENCY) regresses from 5.658 to 5.69 in benchmarks/many_nodes.json
REGRESSION 0.34%: 10000_args_time (LATENCY) regresses from 18.764070391999994 to 18.828636121000002 in scalability/single_node.json
```

Signed-off-by: Lonnie Liu <lonnie@anyscale.com>
Co-authored-by: Lonnie Liu <lonnie@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
elliot-barn pushed a commit that referenced this pull request Jul 2, 2025
Labels

go add ONLY when ready to merge, run all tests

4 participants