mmaprototype: plumb node CPU in MakeStoreLoadMsg by wenyihu6 · Pull Request #164760 · cockroachdb/cockroach

wenyihu6 · 2026-03-03T20:05:03Z

The capped multiplier model (computeCPUCapacityWithCap) subtracts
background CPU when deriving per-store capacity. This intentionally
breaks the invariant maintained by the naive model:

sum(store loads) / sum(store capacities) = NodeCPURateUsage / NodeCPURateCapacity

When background CPU exceeds the cap, node-level overload detection can
no longer recover physical utilization by summing store-level values.

This change replaces the derived NodeLoad fields (ReportedCPU,
CapacityCPU — incremental sums of per-store values) which relied
on this invariance with explicit physical node-level CPU values
(NodeCPULoad, NodeCPUCapacity) carried through StoreLoadMsg
from NodeCapacity fields.

Note that each store generates its descriptor independently, so
NodeCPULoad values may differ slightly across stores on the same
node due to sampling time differences. This is acceptable since
they should be roughly the same.

Epic: CRDB-55052
Release note: none

cockroach-teamcity · 2026-03-03T20:05:17Z

This change is

trunk-io · 2026-03-03T20:08:58Z

😎 Merged successfully - details.

tbg

I'm a little sad about the extra noise in the datadriven files but great otherwise. Why is there no regression test? I figured since the old code was wrong, we should've been able to show an issue that then would've resolved with the fix? Did I miss it?

@tbg reviewed 25 files and all commit messages, and made 3 comments.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on sumeerbhola and wenyihu6).

pkg/kv/kvserver/allocator/mmaprototype/cluster_state.go line 1395 at r1 (raw file):

	// so we incrementally adjust for the baseline shift and strip only this
	// store's pending delta.
	storePendingDelta := ss.adjusted.load[CPURate] - ss.reportedLoad[CPURate]

ok just re-expressing to make sure I'm getting it

the previous adjustedCPU is basically ns.NodeCPULoad + all_store_pending_deltas and the new one should be storeMsg.NodeCPULoad + all_but_our_store_pending_deltas. And we have

newAdjustedCPU :=
= oldAdjustedCPU +  storeMsg.NodeCPULoad - oldNodeCPULoad - storePendingDelta
= (oldAdjustedCPU - oldNodeCPULoad) - storePendingDelta + storeMsg.NodeCPULoad
= allPendingDelta - storePendingDelta + storeMsg.NodeCPULoad
= storeMsg.NodeCPULoad + all_but_our_store_pending_delta

and we add our pending delta back in below. Checks out.

pkg/kv/kvserver/allocator/mmaprototype/testdata/cluster_state/rebalance_stores_cpu_replica_refusing_target.txt line 28 at r1 (raw file):

store-load-msg
  store-id=1 node-id=1 load=[1000,0,0] capacity=[1000,1000,1000] secondary-load=0 load-time=0s node-cpu-load=1000 node-cpu-capacity=1000

These additions seem noisy and redundant except when we're actively introducing background load. Can we automatically compute them by default so that they need to be specified only when targeted directly?

The capped multiplier model (`computeCPUCapacityWithCap`) subtracts background CPU when deriving per-store capacity. This intentionally breaks the invariant maintained by the naive model: sum(store loads) / sum(store capacities) = NodeCPURateUsage / NodeCPURateCapacity When background CPU exceeds the cap, node-level overload detection can no longer recover physical utilization by summing store-level values. This change replaces the derived `NodeLoad` fields (`ReportedCPU`, `CapacityCPU` — incremental sums of per-store values) which relied on this invariance with explicit physical node-level CPU values (`NodeCPULoad`, `NodeCPUCapacity`) carried through `StoreLoadMsg` from `NodeCapacity` fields. Note that each store generates its descriptor independently, so `NodeCPULoad` values may differ slightly across stores on the same node due to sampling time differences. This is acceptable since they should be roughly the same.

wenyihu6

@wenyihu6 made 2 comments.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on sumeerbhola and tbg).

pkg/kv/kvserver/allocator/mmaprototype/cluster_state.go line 1395 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

ok just re-expressing to make sure I'm getting it

the previous adjustedCPU is basically ns.NodeCPULoad + all_store_pending_deltas and the new one should be storeMsg.NodeCPULoad + all_but_our_store_pending_deltas. And we have
newAdjustedCPU :=
= oldAdjustedCPU +  storeMsg.NodeCPULoad - oldNodeCPULoad - storePendingDelta
= (oldAdjustedCPU - oldNodeCPULoad) - storePendingDelta + storeMsg.NodeCPULoad
= allPendingDelta - storePendingDelta + storeMsg.NodeCPULoad
= storeMsg.NodeCPULoad + all_but_our_store_pending_delta
and we add our pending delta back in below. Checks out.

Yes.

pkg/kv/kvserver/allocator/mmaprototype/testdata/cluster_state/rebalance_stores_cpu_replica_refusing_target.txt line 28 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

These additions seem noisy and redundant except when we're actively introducing background load. Can we automatically compute them by default so that they need to be specified only when targeted directly?

We didn't have regression tests for this invariance. Put up #164882 as a regression test.

wenyihu6 · 2026-03-05T01:01:06Z

/trunk merge

wenyihu6 requested review from a team as code owners March 3, 2026 20:05

wenyihu6 requested review from sumeerbhola and tbg March 3, 2026 20:05

wenyihu6 mentioned this pull request Mar 4, 2026

mmaintegration: introduce physical capacity model #164792

Closed

tbg approved these changes Mar 4, 2026

View reviewed changes

wenyihu6 force-pushed the fixnodelevel branch from f3e67dd to aeb1b56 Compare March 4, 2026 17:02

wenyihu6 commented Mar 5, 2026

View reviewed changes

trunk-io bot merged commit 06a4624 into cockroachdb:master Mar 5, 2026
37 of 39 checks passed

celeste-cockroachdb bot added the target-release-26.2.0 label Mar 5, 2026

5hubh4m mentioned this pull request Mar 11, 2026

release-26.1: batch backport of even more MMA related changes #165445

Merged

celeste-cockroachdb bot added v26.2.0-prerelease and removed target-release-26.2.0 labels Mar 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mmaprototype: plumb node CPU in MakeStoreLoadMsg#164760

mmaprototype: plumb node CPU in MakeStoreLoadMsg#164760
trunk-io[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:fixnodelevel

wenyihu6 commented Mar 3, 2026 •

edited

Loading

Uh oh!

cockroach-teamcity commented Mar 3, 2026

Uh oh!

trunk-io bot commented Mar 3, 2026 •

edited

Loading

Uh oh!

tbg left a comment

Uh oh!

wenyihu6 left a comment

Uh oh!

wenyihu6 commented Mar 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

wenyihu6 commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Mar 3, 2026

Uh oh!

trunk-io bot commented Mar 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tbg left a comment

Choose a reason for hiding this comment

Uh oh!

wenyihu6 left a comment

Choose a reason for hiding this comment

Uh oh!

wenyihu6 commented Mar 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wenyihu6 commented Mar 3, 2026 •

edited

Loading

trunk-io bot commented Mar 3, 2026 •

edited

Loading