Skip to content

server: Implement 'gossip' endpoint in multitenant setup#141212

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
shaikzakiriitm:issue_110022
Feb 21, 2025
Merged

server: Implement 'gossip' endpoint in multitenant setup#141212
craig[bot] merged 3 commits intocockroachdb:masterfrom
shaikzakiriitm:issue_110022

Conversation

@shaikzakiriitm
Copy link
Copy Markdown
Contributor

@shaikzakiriitm shaikzakiriitm commented Feb 12, 2025

What was there before: previously, gossip endpoint wasn't accessible from secondary tenant why it needed to change: gossip endpoint provides crucial information for debugging and this should be accessible from secondary tenant for troubleshooting issues within gossip protocol of the cluster
What you did about it: To address this, we created gossip endpoint in tenant status server. The implementation uses connector to redirect the call to the system tenant on the requested kv node. Access to this endpoint is guarded by can_debug_process capability as this is node's debugging information albeit gossip related one. Updated the gossip status unit test for both system tenant and application tenant.

Epic: CRDB-38968
Fixes: #110022

Release note: None

@shaikzakiriitm shaikzakiriitm requested a review from a team as a code owner February 12, 2025 05:53
@shaikzakiriitm shaikzakiriitm requested review from a team February 12, 2025 05:53
@shaikzakiriitm shaikzakiriitm requested review from a team as code owners February 12, 2025 05:53
@shaikzakiriitm shaikzakiriitm requested review from kyle-a-wong and rharding6373 and removed request for a team February 12, 2025 05:53
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@shaikzakiriitm shaikzakiriitm changed the title server/storage_api: Implement 'gossip' endpoint in tenant status serv… server: Implement 'gossip' endpoint in tenant status serv… Feb 18, 2025
@shaikzakiriitm shaikzakiriitm changed the title server: Implement 'gossip' endpoint in tenant status serv… server: Implement 'gossip' endpoint in tenant status server to serve gossip information for debugging purposes Feb 18, 2025
@shaikzakiriitm shaikzakiriitm changed the title server: Implement 'gossip' endpoint in tenant status server to serve gossip information for debugging purposes server: Implement 'gossip' endpoint in multitenant setup Feb 18, 2025
Copy link
Copy Markdown
Contributor

@cthumuluru-crdb cthumuluru-crdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments, please take a look. Also, use the commit guidelines from here: https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages. I think you also need to title on the commit, not just the PR.

what was there before: previously, gossip endpoint wasn't accessible from secondary tenant
why it needed to change: gossip endpoint provides crucial information for debugging and
this should be accessible from secondary tenant for troubleshooting issues within gossip
protocol of the cluster
what you did about it: To address this, we created gossip endpoint in tenant status server.
The implementation uses connector to redirect the call to the system tenant on the requested
kv node. Access to this endpoint is guarded by `can_debug_process` capability as this is
node's debugging information albeit gossip related one. Updated the gossip status unit test
for both system tenant and application tenant.

Epic: CRDB-38968
Fixes: cockroachdb#110022

Release note: None
@shaikzakiriitm
Copy link
Copy Markdown
Contributor Author

I left some comments, please take a look. Also, use the commit guidelines from here: https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages. I think you also need to title on the commit, not just the PR.

Fixed the commit title.

Copy link
Copy Markdown
Contributor

@cthumuluru-crdb cthumuluru-crdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides one comment on the capability we have to use, other changes look good to me. Please wait for the response from David or Steven before committing the changes.

Copy link
Copy Markdown
Contributor

@cthumuluru-crdb cthumuluru-crdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides one comment on the capability we have to use, other changes look good to me. Please wait for the response from David or Steven before committing the changes.

Copy link
Copy Markdown
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cthumuluru-crdb, @kyle-a-wong, @rharding6373, @shaikzakiriitm, @shubhamdhama, and @stevendanna)

@shaikzakiriitm shaikzakiriitm requested a review from a team as a code owner February 21, 2025 05:01
Copy link
Copy Markdown
Contributor

@cthumuluru-crdb cthumuluru-crdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Thanks!

Copy link
Copy Markdown
Contributor

@shubhamdhama shubhamdhama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!
LGTM!

small suggestions and don't let it block this PR: you can use this helper now: #141490

@shaikzakiriitm
Copy link
Copy Markdown
Contributor Author

Great work! LGTM!

small suggestions and don't let it block this PR: you can use this helper now: #141490

👍 got it. Will raise a followup pr to clean the usage. Thanks!

@github-actions
Copy link
Copy Markdown
Contributor

⚪ Sysbench [SQL, 3node, oltp_read_write]
Metric Old Commit New Commit Delta Note Threshold
sec/op 10.93m ±1% 10.89m ±1% ~ p=0.315 n=10 3.0%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 10.17k ±0% 10.19k ±0% ~ p=0.256 n=10 2.0%
B/op 2.204Mi ±0% 2.207Mi ±0% ~ p=0.063 n=10 2.0%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/3e93d0c/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/3e93d0cffa0faf754a99d991dee62ad4a56b8233/bin/pkg_sql_tests benchdiff/3e93d0c/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/3e93d0c/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/66bfa47/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/66bfa4733e3c016ea54f13f1104c395dd36bd25e/bin/pkg_sql_tests benchdiff/66bfa47/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/66bfa47/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/SQL/3node/oltp_read_write$ --old=66bfa47 --new=3e93d0c ./pkg/sql/tests
⚪ Sysbench [KV, 1node, local, oltp_read_only]
Metric Old Commit New Commit Delta Note Threshold
sec/op 667.3µ ±1% 666.7µ ±1% ~ p=0.247 n=10 2.0%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 439.0 ±0% 439.0 ±0% ~ p=1.000 n=10 1.5%
B/op 254.2Ki ±0% 254.2Ki ±0% ~ p=0.725 n=10 1.5%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/3e93d0c/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/3e93d0cffa0faf754a99d991dee62ad4a56b8233/bin/pkg_sql_tests benchdiff/3e93d0c/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/3e93d0c/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/66bfa47/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/66bfa4733e3c016ea54f13f1104c395dd36bd25e/bin/pkg_sql_tests benchdiff/66bfa47/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/66bfa47/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/KV/1node_local/oltp_read_only$ --old=66bfa47 --new=3e93d0c ./pkg/sql/tests
⚪ Sysbench [KV, 1node, local, oltp_write_only]
Metric Old Commit New Commit Delta Note Threshold
sec/op 1.364m ±1% 1.373m ±1% ~ p=0.190 n=10 2.5%
errs/op 0.000 ±0% 0.000 ±0% ~ p=1.000 n=10 0.0%
allocs/op 1.396k ±0% 1.397k ±0% ~ p=0.416 n=10 1.8%
B/op 290.4Ki ±0% 290.2Ki ±1% ~ p=0.123 n=10 1.8%
Reproduce

benchdiff binaries:

mkdir -p benchdiff/3e93d0c/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/3e93d0cffa0faf754a99d991dee62ad4a56b8233/bin/pkg_sql_tests benchdiff/3e93d0c/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/3e93d0c/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
mkdir -p benchdiff/66bfa47/bin/1058449141
gcloud storage cp gs://cockroach-microbench-ci/builds/66bfa4733e3c016ea54f13f1104c395dd36bd25e/bin/pkg_sql_tests benchdiff/66bfa47/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests
chmod +x benchdiff/66bfa47/bin/1058449141/cockroachdb_cockroach_pkg_sql_tests

benchdiff command:

benchdiff --run=^BenchmarkSysbench/KV/1node_local/oltp_write_only$ --old=66bfa47 --new=3e93d0c ./pkg/sql/tests
Artifacts

download:

mkdir -p new
gcloud storage cp gs://cockroach-microbench-ci/artifacts/3e93d0cffa0faf754a99d991dee62ad4a56b8233/13456898448-1/\* new/
mkdir -p old
gcloud storage cp gs://cockroach-microbench-ci/artifacts/66bfa4733e3c016ea54f13f1104c395dd36bd25e/13456898448-1/\* old/
Legend
  • Neutral: No significant performance change.
  • 🟡 Warning: Slight degradation, likely due to variance, but still within thresholds.
  • 🔴 Regression: Likely performance regression, requiring investigation.
  • 🟢 Improvement: Possible performance gain.

No regressions detected!

built with commit: 3e93d0cffa0faf754a99d991dee62ad4a56b8233

@shaikzakiriitm
Copy link
Copy Markdown
Contributor Author

bors r=shubhamdhama,cthumuluru-crdb

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Feb 21, 2025

@craig craig bot merged commit fe860f6 into cockroachdb:master Feb 21, 2025
24 checks passed
@dhartunian
Copy link
Copy Markdown
Collaborator

@shaikzakiriitm please squash fixups before merging in the future

@shaikzakiriitm
Copy link
Copy Markdown
Contributor Author

@shaikzakiriitm please squash fixups before merging in the future

@dhartunian My bad, squashing the fixup commits really slipped my mind. Thank you for pointing it out, will keep this in my mind from now on. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

server: the "gossip" status API endpoint should work with secondary tenants with sufficient capability

5 participants