all: Use gomaxprocs library to set GOMAXPROCS#8175
Merged
marclop merged 2 commits intoelastic:mainfrom May 24, 2022
Merged
Conversation
Adds a new explicit dependency on the Uber gomaxprocs library to check if any CFS quotas have been set, adjusting the Go runtime gomaxprocs setting accordingly. This prevents the APM Server from "starving" the CFS allowed bandwidth and get into a situation where the majority of threads are sleeping (throttled by the CFS scheduler), reducing the amount of work that can be done. The initial macro-benchmarks I ran seemed to indicate that the change has a positive effect on the APM Server's throughput: ``` $ benchstat automaxprocs.txt noautoprocs.txt name old time/op new time/op delta 1000Transactions-64 99.5ms ±25% 119.2ms ±13% +19.78% (p=0.032 n=5+5) AgentGo-64 657ms ±21% 1602ms ± 5% +143.86% (p=0.008 n=5+5) AgentNodeJS-64 336ms ± 5% 875ms ±13% +160.50% (p=0.008 n=5+5) AgentPython-64 1.14s ±21% 3.37s ± 4% +196.62% (p=0.008 n=5+5) AgentRuby-64 540ms ±15% 1437ms ± 5% +166.17% (p=0.008 n=5+5) OTLPTraces-64 82.9µs ± 3% 98.1µs ± 8% +18.34% (p=0.016 n=4+5) name old error_responses/sec new error_responses/sec delta 1000Transactions-64 0.00 0.00 ~ (all equal) AgentGo-64 0.00 0.00 ~ (all equal) AgentNodeJS-64 0.00 0.00 ~ (all equal) AgentPython-64 0.00 0.00 ~ (all equal) AgentRuby-64 0.00 0.00 ~ (all equal) OTLPTraces-64 0.00 0.00 ~ (all equal) name old errors/sec new errors/sec delta 1000Transactions-64 0.00 0.00 ~ (all equal) AgentGo-64 162 ±18% 66 ± 5% -59.44% (p=0.008 n=5+5) AgentNodeJS-64 146 ± 6% 56 ±12% -61.51% (p=0.008 n=5+5) AgentPython-64 64.1 ±18% 21.4 ± 4% -66.61% (p=0.008 n=5+5) AgentRuby-64 250 ±16% 93 ± 5% -62.66% (p=0.008 n=5+5) OTLPTraces-64 0.00 0.00 ~ (all equal) name old events/sec new events/sec delta 1000Transactions-64 4.80k ±128% 4.85k ±76% ~ (p=0.841 n=5+5) AgentGo-64 8.06k ±18% 3.27k ± 5% -59.46% (p=0.008 n=5+5) AgentNodeJS-64 6.00k ± 6% 2.31k ±12% -61.49% (p=0.008 n=5+5) AgentPython-64 6.26k ±18% 2.09k ± 4% -66.65% (p=0.008 n=5+5) AgentRuby-64 7.00k ±16% 2.62k ± 4% -62.64% (p=0.008 n=5+5) OTLPTraces-64 12.1k ± 3% 10.2k ± 7% -15.31% (p=0.016 n=4+5) name old metrics/sec new metrics/sec delta 1000Transactions-64 0.69 ±301% 0.08 ±138% ~ (p=0.714 n=5+4) AgentGo-64 373 ±17% 150 ± 5% -59.85% (p=0.008 n=5+5) AgentNodeJS-64 360 ± 4% 139 ±13% -61.44% (p=0.008 n=5+5) AgentPython-64 1.89k ±18% 0.63k ± 4% -66.73% (p=0.008 n=5+5) AgentRuby-64 516 ±16% 194 ± 2% -62.48% (p=0.008 n=5+5) OTLPTraces-64 0.00 0.03 ±300% ~ (p=0.889 n=5+4) name old spans/sec new spans/sec delta 1000Transactions-64 0.00 0.00 ~ (all equal) AgentGo-64 5.32k ±18% 2.16k ± 5% -59.44% (p=0.008 n=5+5) AgentNodeJS-64 3.16k ± 6% 1.22k ±12% -61.51% (p=0.008 n=5+5) AgentPython-64 3.63k ±18% 1.21k ± 4% -66.62% (p=0.008 n=5+5) AgentRuby-64 4.12k ±16% 1.54k ± 5% -62.66% (p=0.008 n=5+5) OTLPTraces-64 0.00 0.00 ~ (all equal) name old txs/sec new txs/sec delta 1000Transactions-64 4.80k ±128% 4.85k ±76% ~ (p=0.841 n=5+5) AgentNodeJS-64 2.33k ± 6% 0.90k ±12% -61.50% (p=0.008 n=5+5) AgentRuby-64 2.12k ±16% 0.80k ± 2% -62.22% (p=0.016 n=5+4) OTLPTraces-64 12.1k ± 3% 10.2k ± 7% -15.33% (p=0.016 n=4+5) name old alloc/op new alloc/op delta 1000Transactions-64 3.23MB ±36% 3.11MB ±24% ~ (p=1.000 n=5+5) AgentNodeJS-64 3.49MB ± 5% 6.01MB ± 8% +72.34% (p=0.008 n=5+5) AgentRuby-64 4.69MB ±15% 9.59MB ± 0% +104.68% (p=0.016 n=5+4) OTLPTraces-64 1.41kB ± 0% 1.38kB ± 1% -2.08% (p=0.016 n=4+5) name old allocs/op new allocs/op delta 1000Transactions-64 6.58k ± 8% 6.40k ± 4% ~ (p=0.548 n=5+5) AgentNodeJS-64 6.69k ± 4% 10.45k ± 8% +56.12% (p=0.008 n=5+5) AgentRuby-64 11.9k ±18% 23.8k ± 0% +99.41% (p=0.016 n=5+4) OTLPTraces-64 9.00 ± 0% 8.00 ± 0% ~ (p=0.079 n=4+5) ``` Signed-off-by: Marc Lopez Rubio <marc5.12@outlook.com>
Contributor
|
This pull request does not have a backport label. Could you fix it @marclop? 🙏
NOTE: |
🌐 Coverage report
|
Contributor
|
Given the issues that @marclop encountered with packaging local apm-servers, I suggest to mark this as ready for review and get it merged into |
Member
|
@simitt sounds good. |
Member
|
Removing test-plan, will be covered by #8278 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation/summary
Adds a new explicit dependency on the Uber gomaxprocs library to check
if any CFS quotas have been set, adjusting the Go runtime gomaxprocs
setting accordingly. This prevents the APM Server from "starving" the
CFS allowed bandwidth and get into a situation where the majority of
threads are sleeping (throttled by the CFS scheduler), reducing the
amount of work that can be done.
The initial macro-benchmarks I ran seemed to indicate that the change
has a positive effect on the APM Server's throughput:
Gist with all the results: https://gist.github.com/marclop/2de5da81c5be7cd6dbab5080623c0ba1
Checklist
apmpackagehave been made)How to test these changes
docker-compose up -dcd systemtest/cmd/runapm && go run main.go, copy the image that is used.FLEET_ENROLLMENT_TOKENfrom the runapm container (you can stop the runapm container afterwards).docker-compose.override.ymlin the root of the APM Server with these contents:docker-compose up -d. Verify that the apm-server container is reachable.cd systemtest/cmd/apmbench && go build ../apmbench -benchtime=15s -count=5 -warmup-events=0 -agents=64Related issues
Closes #7967