mmaprototype: plumb node CPU in MakeStoreLoadMsg#164760
mmaprototype: plumb node CPU in MakeStoreLoadMsg#164760trunk-io[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
😎 Merged successfully - details. |
tbg
left a comment
There was a problem hiding this comment.
I'm a little sad about the extra noise in the datadriven files but great otherwise. Why is there no regression test? I figured since the old code was wrong, we should've been able to show an issue that then would've resolved with the fix? Did I miss it?
@tbg reviewed 25 files and all commit messages, and made 3 comments.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on sumeerbhola and wenyihu6).
pkg/kv/kvserver/allocator/mmaprototype/cluster_state.go line 1395 at r1 (raw file):
// so we incrementally adjust for the baseline shift and strip only this // store's pending delta. storePendingDelta := ss.adjusted.load[CPURate] - ss.reportedLoad[CPURate]
ok just re-expressing to make sure I'm getting it
the previous adjustedCPU is basically ns.NodeCPULoad + all_store_pending_deltas and the new one should be storeMsg.NodeCPULoad + all_but_our_store_pending_deltas. And we have
newAdjustedCPU :=
= oldAdjustedCPU + storeMsg.NodeCPULoad - oldNodeCPULoad - storePendingDelta
= (oldAdjustedCPU - oldNodeCPULoad) - storePendingDelta + storeMsg.NodeCPULoad
= allPendingDelta - storePendingDelta + storeMsg.NodeCPULoad
= storeMsg.NodeCPULoad + all_but_our_store_pending_delta
and we add our pending delta back in below. Checks out.
pkg/kv/kvserver/allocator/mmaprototype/testdata/cluster_state/rebalance_stores_cpu_replica_refusing_target.txt line 28 at r1 (raw file):
store-load-msg store-id=1 node-id=1 load=[1000,0,0] capacity=[1000,1000,1000] secondary-load=0 load-time=0s node-cpu-load=1000 node-cpu-capacity=1000
These additions seem noisy and redundant except when we're actively introducing background load. Can we automatically compute them by default so that they need to be specified only when targeted directly?
The capped multiplier model (`computeCPUCapacityWithCap`) subtracts background CPU when deriving per-store capacity. This intentionally breaks the invariant maintained by the naive model: sum(store loads) / sum(store capacities) = NodeCPURateUsage / NodeCPURateCapacity When background CPU exceeds the cap, node-level overload detection can no longer recover physical utilization by summing store-level values. This change replaces the derived `NodeLoad` fields (`ReportedCPU`, `CapacityCPU` — incremental sums of per-store values) which relied on this invariance with explicit physical node-level CPU values (`NodeCPULoad`, `NodeCPUCapacity`) carried through `StoreLoadMsg` from `NodeCapacity` fields. Note that each store generates its descriptor independently, so `NodeCPULoad` values may differ slightly across stores on the same node due to sampling time differences. This is acceptable since they should be roughly the same.
wenyihu6
left a comment
There was a problem hiding this comment.
@wenyihu6 made 2 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on sumeerbhola and tbg).
pkg/kv/kvserver/allocator/mmaprototype/cluster_state.go line 1395 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
ok just re-expressing to make sure I'm getting it
the previous adjustedCPU is basically
ns.NodeCPULoad + all_store_pending_deltasand the new one should bestoreMsg.NodeCPULoad + all_but_our_store_pending_deltas. And we havenewAdjustedCPU := = oldAdjustedCPU + storeMsg.NodeCPULoad - oldNodeCPULoad - storePendingDelta = (oldAdjustedCPU - oldNodeCPULoad) - storePendingDelta + storeMsg.NodeCPULoad = allPendingDelta - storePendingDelta + storeMsg.NodeCPULoad = storeMsg.NodeCPULoad + all_but_our_store_pending_deltaand we add our pending delta back in below. Checks out.
Yes.
pkg/kv/kvserver/allocator/mmaprototype/testdata/cluster_state/rebalance_stores_cpu_replica_refusing_target.txt line 28 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
These additions seem noisy and redundant except when we're actively introducing background load. Can we automatically compute them by default so that they need to be specified only when targeted directly?
We didn't have regression tests for this invariance. Put up #164882 as a regression test.
|
/trunk merge |
The capped multiplier model (
computeCPUCapacityWithCap) subtractsbackground CPU when deriving per-store capacity. This intentionally
breaks the invariant maintained by the naive model:
When background CPU exceeds the cap, node-level overload detection can
no longer recover physical utilization by summing store-level values.
This change replaces the derived
NodeLoadfields (ReportedCPU,CapacityCPU— incremental sums of per-store values) which reliedon this invariance with explicit physical node-level CPU values
(
NodeCPULoad,NodeCPUCapacity) carried throughStoreLoadMsgfrom
NodeCapacityfields.Note that each store generates its descriptor independently, so
NodeCPULoadvalues may differ slightly across stores on the samenode due to sampling time differences. This is acceptable since
they should be roughly the same.
Epic: CRDB-55052
Release note: none