Skip to content

kvserver/mmaprototype: add BenchmarkRebalanceStores#165284

Merged
trunk-io[bot] merged 6 commits intocockroachdb:masterfrom
tbg:tbg/mma-rebalance-bench
Mar 16, 2026
Merged

kvserver/mmaprototype: add BenchmarkRebalanceStores#165284
trunk-io[bot] merged 6 commits intocockroachdb:masterfrom
tbg:tbg/mma-rebalance-bench

Conversation

@tbg
Copy link
Copy Markdown
Member

@tbg tbg commented Mar 10, 2026

Summary

Reduce per-iteration allocations in the MMA rebalance loop from ~2000 to ~160
allocs/op, primarily by guarding expensive logging and pooling hot slices/maps.

name                old time/op    new time/op    delta
RebalanceStores-10     246µs ± 3%     104µs ± 2%  -57.87%  (p=0.000 n=10+10)

name                old alloc/op   new alloc/op   delta
RebalanceStores-10     101kB ± 0%      19kB ± 0%  -81.17%  (p=0.000 n=9+8)

name                old allocs/op  new allocs/op  delta
RebalanceStores-10     1.99k ± 0%     0.16k ± 0%  -91.75%  (p=0.000 n=10+10)

Commit arc

  1. Add BenchmarkRebalanceStores — microbenchmark for the rebalance loop
    (4 stores, 1000 ranges, 100 hot). Uses undoPendingChange to restore state
    between iterations. Baseline: 2013 allocs/op.

  2. Annotate top allocation sites — TODO comments at each site found via
    memory profiling, guiding the subsequent commits.

  3. Guard logging in sortTargetCandidateSetAndPick — extract
    formatCandidatesLog to a package-level function with an internal
    ExpensiveLogEnabled check, removing redundant guards at all 6 call sites.
    2013 → 1689 allocs/op.

  4. Guard top-K debug logging — wrap the redact.StringBuilder +
    LoadVector formatting block with ExpensiveLogEnabled.
    1689 → 778 allocs/op.

  5. Guard lease-transfer candidate logging — prevent boxing candsPL into
    interface{} when verbose logging is off. 778 → 678 allocs/op.

  6. Pool slice and map allocations in rebalance loop — introduce generic
    slicePool[T] and mapPool[K, V] wrappers around sync.Pool, replacing
    manual pool boilerplate and the rebalanceEnv.scratch struct (which had
    aliasing hazards). Five pools cover slices (storeAndLeasePreference,
    StoreID, candidateInfo) and maps (NodeID→*NodeLoad,
    StoreID→struct{}). candidatesToMoveLease accepts a reusable buffer.
    678 → 160 allocs/op.

Epic: CRDB-55052

@trunk-io
Copy link
Copy Markdown
Contributor

trunk-io bot commented Mar 10, 2026

😎 Merged successfully - details.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@tbg tbg force-pushed the tbg/mma-rebalance-bench branch 4 times, most recently from f7acd25 to 6e2a60c Compare March 10, 2026 09:42
@tbg tbg force-pushed the tbg/mma-rebalance-bench branch from 82751c0 to d04dfb1 Compare March 10, 2026 10:44
@tbg tbg force-pushed the tbg/mma-rebalance-bench branch 2 times, most recently from e0be4b8 to a95dd61 Compare March 10, 2026 11:40
@tbg tbg marked this pull request as ready for review March 10, 2026 14:23
@tbg tbg requested review from a team as code owners March 10, 2026 14:23
@tbg tbg requested a review from angeladietz March 10, 2026 14:23
Copy link
Copy Markdown
Contributor

@angeladietz angeladietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits but lgtm overall, nothing blocking!

I'm also curious how you attributed the number of allocs to each line (commit 2). was it just with pprof (& your agent)?

@tbg tbg force-pushed the tbg/mma-rebalance-bench branch from c61112c to 6f8b7ec Compare March 16, 2026 09:51
Copy link
Copy Markdown
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I told the agent to run the benchmark with memprofile and to top50 the output by allocation count, then iterated.

@tbg partially reviewed 10 files and all commit messages, made 3 comments, resolved 5 discussions, and dismissed @angeladietz from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on angeladietz).

@tbg tbg force-pushed the tbg/mma-rebalance-bench branch from 6f8b7ec to 11ee657 Compare March 16, 2026 09:54
tbg and others added 5 commits March 16, 2026 10:54
Add a microbenchmark for the MMA rebalance loop (rebalanceStores).
The benchmark uses the existing undoPendingChange production code to
reverse mutations after each iteration, plus targeted resets of leaked
fields (lastFailedChange, overload times, meansMemo cache).

Setup: 4 stores, 1000 ranges (100 hot, 900 cold), store 1 overloaded.

    BenchmarkRebalanceStores  100  323k ns/op  102kB/op  2013 allocs/op

Epic: none
Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Add TODO(tbg) comments at each allocation site found via
BenchmarkRebalanceStores memory profiling (4 stores, 1000 ranges,
100 hot). The allocs/op counts are from a 1000-iteration run.

Epic: none
Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
The `formatCandidatesLog` closure and `redact.StringBuilder` were
allocated on every call to `sortTargetCandidateSetAndPick`, even when
verbose logging at level 2 was disabled. Additionally, the standalone
VEventf for discarded candidates boxed multiple arguments into
interfaces unconditionally.

Extract `formatCandidatesLog` to a package-level function and guard all
call sites (plus the standalone VEventf) with `log.ExpensiveLogEnabled`.

    BenchmarkRebalanceStores  1999 -> 1689 allocs/op (-310)

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
…bled

The top-K debug logging block in `processSheddingStore` builds a
`redact.StringBuilder` and formats `LoadVector` values for every range in
the top-K set on every call, even when verbose logging at level 2 is
disabled. Wrap the entire block with `log.ExpensiveLogEnabled`.

    BenchmarkRebalanceStores  1689 -> 778 allocs/op (-911)

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
…Enabled

The VEventf call logging lease-transfer candidates boxes `candsPL`
(a `storeSet`) to `interface{}`, triggering formatting even when verbose
logging at level 2 is disabled. Guard with `log.ExpensiveLogEnabled`.

    BenchmarkRebalanceStores  778 -> 678 allocs/op (-100)

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@tbg tbg force-pushed the tbg/mma-rebalance-bench branch from 11ee657 to 11e77a4 Compare March 16, 2026 09:58
Reduce per-iteration allocations by pooling hot slices and maps via generic
`slicePool[T]` and `mapPool[K, V]` wrappers around `sync.Pool`. These handle
the `New` func, type assertion on `get`, and `clear`+reset on `put`.

Five pools are introduced:
- `storeAndLeasePreferencePool`, `storeIDPool`, `candidateInfoPool` for slices
- `nodeLoadMapPool`, `storeIDStructMapPool` for maps

The `rebalanceEnv.scratch` struct (which had aliasing hazards) is removed in
favor of these pools. `candidatesToMoveLease` now accepts a reusable buffer.

678 → 160 allocs/op.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@tbg tbg force-pushed the tbg/mma-rebalance-bench branch from 50bbd59 to f9fc017 Compare March 16, 2026 10:05
@tbg
Copy link
Copy Markdown
Member Author

tbg commented Mar 16, 2026

/trunk merge

@trunk-io trunk-io bot merged commit 559db16 into cockroachdb:master Mar 16, 2026
26 checks passed
@tbg tbg deleted the tbg/mma-rebalance-bench branch March 16, 2026 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants