*: fine-grained cpu attribution #82625
Closed
Labels
- A-kv-distribution: Relating to rebalancing and leasing.
- A-kv-observability
- C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
- T-kv: KV Team
- sync-me-7
- sync-me-8
Description
This is the tracking issue for #82356. We expect the work here to break down into the following deliverables.
Must have:
- Write a program to build Go archives for various OS/arch combinations CRDB officially supports/makes use of internally (linux arm64+amd64, darwin arm64+amd64, freebsd amd64, windows amd64). A prototype using hand-rolled scripts (but without CGO support) is here.
- The official steps, in code form, are here.
- We probably want a docker image with the right cross-compilers to run this program under, since these archives need to be built with CGO support.
- Being able to generate these archives in CI as downloadable artifacts would be a nice bonus. Ditto if we're able to run the Go tests alongside.
- When bumping Go versions (major or minor), update our steps to use a manually generated archive from above, one that also applies the patchset proposed in rfcs: fine-grained cpu attribution #82356 (kept in a cockroachdb/go repo, or checked into cockroachdb/cockroach as a patch file).
- Introduce a library in cockroachdb/cockroach to access per-goroutine running time. This needs to be gated behind a build tag to error/zero out when not built using the patched runtime.
Nice to have:
- Support a per-store CPU usage breakdown by (i) ranges and (ii) tenants, powered by per-goroutine running time. We'd make this accessible through vtables and include results in debug zips, serving as a CPU-only version of today's hot-ranges report. A prototype of this is available here.
- Surface per-statement cluster-wide CPU usage as part of EXPLAIN ANALYZE. This would likely require integrating the library above into the Stopper and recording as tracing events total running CPU time for a given request (propagated all the way up to the gateway). Tracked in sql: record CPU seconds per statement #87213.
Jira issue: CRDB-16580