
bechmark: scale up RPS to test data plane CPU performance#7810

Merged
zhaohuabing merged 12 commits into envoyproxy:main from zhaohuabing:bechmark-scale-rps
Dec 30, 2025

Conversation

@zhaohuabing
Member

@zhaohuabing zhaohuabing commented Dec 24, 2025

What type of PR is this?
This PR adjusts the Envoy Gateway benchmark to measure data plane CPU performance using RPS instead of route count.

While route count is useful for evaluating the CPU performance patterns of the control plane (EG), data plane (Envoy) CPU usage is primarily driven by request processing. Scaling load by RPS provides a more stable signal and makes the results easier to interpret for data plane performance analysis.

This PR also:

  • Excludes failed Prometheus queries from the report. Previously, a failed query was recorded as a sample value of 0 and used to generate the report; these failed samples are now excluded from the min/max/mean calculations.
  • Restricts data plane CPU sampling to the actual traffic window. Sampling previously started before traffic was sent and continued after traffic had finished; it now captures CPU usage only while traffic is being sent.
  • Sets the Prometheus scrape interval to 10s and increases BENCHMARK_DURATION to 90s, so that enough samples are collected to generate the report.

The Envoy proxy CPU metrics in the current benchmark report:
[Image: Screenshot 2025-12-24 at 14 44 23]

The Envoy proxy CPU metrics in this PR:
[Image]

All values are min / max / mean.

| Test Name | Envoy Gateway Container Memory (MiB) | Envoy Gateway Process Memory (MiB) | Envoy Gateway CPU (%) | Averaged Envoy Proxy Memory (MiB) | Averaged Envoy Proxy CPU (%) |
|---|---|---|---|---|---|
| scaling up httproutes to 10 with 2 routes per hostname at 100 rps | 122.74 / 134.74 / 132.60 | 30.68 / 43.92 / 37.77 | 0.45 / 1.20 / 0.60 | 4.16 / 22.77 / 22.14 | 5.79 / 9.93 / 9.10 |
| scaling up httproutes to 50 with 10 routes per hostname at 300 rps | 134.10 / 143.62 / 141.19 | 35.01 / 52.05 / 42.50 | 0.55 / 2.10 / 0.84 | 26.81 / 28.89 / 28.53 | 18.98 / 28.93 / 27.36 |
| scaling up httproutes to 100 with 20 routes per hostname at 500 rps | 140.60 / 148.92 / 145.47 | 37.59 / 49.14 / 42.69 | 0.65 / 2.45 / 1.03 | 32.87 / 34.96 / 34.72 | 45.10 / 46.27 / 45.50 |
| scaling up httproutes to 300 with 60 routes per hostname at 800 rps | 144.14 / 153.39 / 150.26 | 42.30 / 60.46 / 52.16 | 0.75 / 4.85 / 1.57 | 55.12 / 68.81 / 63.44 | 43.81 / 68.25 / 62.95 |
| scaling up httproutes to 500 with 100 routes per hostname at 1000 rps | 158.59 / 166.66 / 164.15 | 52.29 / 68.52 / 58.96 | 1.00 / 7.40 / 2.06 | 84.20 / 100.09 / 90.03 | 53.02 / 80.69 / 75.48 |
| scaling up httproutes to 1000 with 200 routes per hostname at 2000 rps | 173.98 / 188.18 / 186.47 | 65.00 / 87.07 / 77.12 | 1.05 / 13.50 / 3.09 | 140.79 / 141.78 / 141.43 | 63.11 / 99.49 / 91.75 |
| scaling down httproutes to 500 with 100 routes per hostname at 1000 rps | 168.58 / 181.48 / 173.22 | 54.95 / 78.56 / 67.08 | 1.00 / 1.35 / 1.16 | 140.78 / 141.30 / 141.13 | 54.45 / 81.40 / 76.96 |
| scaling down httproutes to 300 with 60 routes per hostname at 800 rps | 154.50 / 168.78 / 160.81 | 47.39 / 62.73 / 54.50 | 1.00 / 1.35 / 1.16 | 136.22 / 140.91 / 137.19 | 40.49 / 68.74 / 64.18 |
| scaling down httproutes to 100 with 20 routes per hostname at 500 rps | 142.30 / 163.45 / 151.32 | 38.12 / 65.46 / 47.57 | 1.00 / 1.40 / 1.16 | 134.57 / 136.95 / 134.93 | 29.56 / 48.22 / 43.26 |
| scaling down httproutes to 50 with 10 routes per hostname at 300 rps | 145.06 / 151.28 / 147.09 | 35.56 / 47.53 / 40.31 | 1.10 / 1.25 / 1.15 | 134.60 / 136.53 / 135.37 | 16.46 / 29.25 / 25.20 |
| scaling down httproutes to 10 with 2 routes per hostname at 100 rps | 144.98 / 151.25 / 146.45 | 34.93 / 49.85 / 42.12 | 1.00 / 1.30 / 1.14 | 136.50 / 136.53 / 136.51 | 5.30 / 9.77 / 9.04 |
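
The test names in the results above pair each route count with a target RPS. As a rough sketch of that pairing, the snippet below rebuilds the same sequence of test names. The `Scale` type, the `scales` slice and the `TestName` function are hypothetical and are not taken from the benchmark code; the route counts and RPS values are read from the test names, and the divisor of 5 is inferred from them (10 routes with 2 routes per hostname).

```go
package main

import "fmt"

// Scale pairs a route count with a target request rate.
// Hypothetical type; the values are taken from the test names in the results.
type Scale struct {
	Routes int
	RPS    int
}

// scales lists the steps in scale-up order.
var scales = []Scale{
	{10, 100}, {50, 300}, {100, 500}, {300, 800}, {500, 1000}, {1000, 2000},
}

// TestName formats a name in the style used by the report.
// Routes per hostname is assumed to be routes/5, as the test names suggest.
func TestName(direction string, s Scale) string {
	return fmt.Sprintf("scaling %s httproutes to %d with %d routes per hostname at %d rps",
		direction, s.Routes, s.Routes/5, s.RPS)
}

func main() {
	// Scale up through every step.
	for _, s := range scales {
		fmt.Println(TestName("up", s))
	}
	// Scale back down, starting one step below the peak.
	for i := len(scales) - 2; i >= 0; i-- {
		fmt.Println(TestName("down", scales[i]))
	}
}
```

Tying RPS to each step means the load on the Envoy proxies grows along with the route count, so the data plane CPU column reflects request processing rather than idle proxies.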

cc @missBerg

@zhaohuabing zhaohuabing requested a review from a team as a code owner December 24, 2025 05:19
@zhaohuabing zhaohuabing marked this pull request as draft December 24, 2025 05:19
@zhaohuabing zhaohuabing changed the title from "Scale up RPS to test data plane CPU performance" to "bechmark: scale up RPS to test data plane CPU performance" Dec 24, 2025
@codecov

codecov bot commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.73%. Comparing base (fe57c70) to head (f70ebad).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7810      +/-   ##
==========================================
- Coverage   72.77%   72.73%   -0.04%     
==========================================
  Files         235      235              
  Lines       35092    35092              
==========================================
- Hits        25538    25525      -13     
- Misses       7738     7751      +13     
  Partials     1816     1816              


@zhaohuabing zhaohuabing force-pushed the bechmark-scale-rps branch 16 times, most recently from fee920f to 1a6be88 Compare December 26, 2025 09:24
@zhaohuabing zhaohuabing marked this pull request as ready for review December 28, 2025 00:44
Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>
@jukie
Contributor

jukie commented Dec 28, 2025

/retest

jukie
jukie previously approved these changes Dec 28, 2025
@zhaohuabing zhaohuabing requested a review from a team December 28, 2025 23:54
Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com>
@zhaohuabing zhaohuabing requested review from a team, jukie and zirain December 29, 2025 09:47
@zhaohuabing zhaohuabing merged commit 47a1dfa into envoyproxy:main Dec 30, 2025
95 of 101 checks passed
@zhaohuabing zhaohuabing deleted the bechmark-scale-rps branch December 30, 2025 03:08
millermatt pushed a commit to millermatt/envoy-gateway that referenced this pull request Jan 4, 2026
…#7810)

* Scale up RPS to test data plane CPU performance

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* set duration to 120s

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* discard invalid samples

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* change scrape interval to 10s

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* remove invalid cpu sampling data

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* reduce duration to 60

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* fix benchmark end time

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* fix data plane benchmark start time

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* increase test time to get more samples

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* adjust rps for each scale

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>

* address comments

Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com>

---------

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Matt Miller <millermatt@outlook.com>
rudrakhp pushed a commit to rudrakhp/gateway that referenced this pull request Jan 8, 2026
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
rudrakhp added a commit that referenced this pull request Jan 9, 2026
* fix: set observedGeneration in envoy patch policy (#7715)

* fix: set observedGeneration in envoy patch policy

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

* add release note

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>

---------

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix: add validation for request buffer limit (#7687)

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix: nil pointer error when applying BackendTrafficPolicy to HTTPRoute with no backendRefs (#7765)

* fix: checking route section name in backend traffic policy

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix: setting externalTrafficPolicy for NodePort service type (#7823)

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix: add indexing and processing for CRL references in ClientTrafficPolicies (#7829)

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* feat: change the benchmark report to json format (#6818)

* benchmark json output

Signed-off-by: zirain <zirain2009@gmail.com>

* fix

Signed-off-by: zirain <zirain2009@gmail.com>

* fix

Signed-off-by: zirain <zirain2009@gmail.com>

* fix lint

Signed-off-by: zirain <zirain2009@gmail.com>

* fix

Signed-off-by: zirain <zirain2009@gmail.com>

* revert

Signed-off-by: zirain <zirain2009@gmail.com>

* fix seconds

Signed-off-by: zirain <zirain2009@gmail.com>

---------

Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* bechmark: scale up RPS to test data plane CPU performance (#7810)

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix: make port-forward worked for OTel collector on port 19001 (#7860)

Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* chore: fix goroutine leak (#7880)

fix goroutine leak

Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

* fix gen-check

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>

---------

Signed-off-by: kkk777-7 <kota.kimura0725@gmail.com>
Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
Signed-off-by: zirain <zirain2009@gmail.com>
Signed-off-by: Huabing Zhao <zhaohuabing@gmail.com>
Signed-off-by: Huabing(Robin) Zhao <zhaohuabing@gmail.com>
Co-authored-by: Kota Kimura <86363983+kkk777-7@users.noreply.github.com>
Co-authored-by: zirain <zirain2009@gmail.com>
Co-authored-by: Huabing (Robin) Zhao <zhaohuabing@gmail.com>
4 participants