kvflowcontrol: annotate/fix perf regressions#109833
craig[bot] merged 1 commit into cockroachdb:master
Conversation
Force-pushed from fe6c9b3 to a59d5ed.
- Replace the flow-controller-level mutex-backed kvflowcontrol.Stream => token bucket map with a sync.Map. On kv0/enc=false/nodes=3/cpu=96, accessing this map contributed a high amount of mutex contention. We observe that this map is effectively read-only: entries for keys are written once (on creation) and read frequently thereafter. We don't currently GC these buckets, but even if we did, the same access pattern would hold. We'll note that using a sync.Map is slightly more expensive CPU-wise.
- Replace various map accesses with individual variables. We were needlessly using maps to access one of two variables keyed by work class, for example when maintaining metrics per work class or tracking token adjustments. The map accesses appeared prominently in CPU profiles and were unnecessary overhead.
- Avoid using log.ExpensiveLogEnabled in hot code paths; it shows up in CPU profiles.
- Slightly reduce the surface area of kvflowhandle.Handle.mu when returning flow tokens.
- We also annotate various other points in the code where peephole optimizations exist, as surfaced by kv0/enc=false/nodes=3/cpu=96.

Part of cockroachdb#104154.

Release note: None
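The first two bullets above can be sketched together in a minimal standalone program. This is an illustration of the patterns described, not the actual kvflowcontroller code: `streamKey`, the `bucket` fields, and the token values are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// tokens is a stand-in for kvflowcontrol.Tokens.
type tokens int64

// bucket is a stand-in for the per-stream token bucket. Instead of a
// map keyed by work class, the two counters are plain struct fields, so
// hot paths avoid map hashing entirely (the second bullet above).
type bucket struct {
	regular, elastic tokens
}

// streamKey is a stand-in for kvflowcontrol.Stream.
type streamKey struct{ storeID int }

// streams maps a stream key to its *bucket. Entries are written once on
// creation and read frequently afterwards -- the read-mostly access
// pattern sync.Map is optimized for (the first bullet above).
var streams sync.Map // streamKey => *bucket

func getOrCreateBucket(k streamKey) *bucket {
	if b, ok := streams.Load(k); ok {
		return b.(*bucket)
	}
	// In the real controller, creation is additionally serialized under
	// a mutex; LoadOrStore keeps this sketch race-free on its own.
	b, _ := streams.LoadOrStore(k, &bucket{regular: 100, elastic: 50})
	return b.(*bucket)
}

func main() {
	b := getOrCreateBucket(streamKey{storeID: 1})
	fmt.Println(b.regular, b.elastic)
}
```

Repeated lookups for the same key hit the lock-free read path of sync.Map, which is where the mutex contention savings come from.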
Force-pushed from a59d5ed to e571ffe.
aadityasondhi left a comment:
Reviewable status: complete! 1 of 0 LGTMs obtained
I'll bors, but if there are any follow-ups or we need to revert, I'll do it then. bors r+

Build succeeded.
sumeerbhola left a comment:
Reviewed 4 of 9 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go line 72 at r1 (raw file):
```go
// or when completely inactive (no tokens deducted/returned over 30+
// minutes), clear these out.
buckets sync.Map // kvflowcontrol.Stream => *bucket
```
could use a code comment that says writes require the mutex.
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go line 113 at r1 (raw file):
```go
// NB: We're holding the controller mutex here, which guards against
// new buckets being added, synchronization we don't get out of
// sync.Map.Range() directly.
```
is this the only reason we need updates to the map to hold the mutex?
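For context on the question, here is a minimal sketch of the pattern under discussion (type and field names are illustrative, not taken from the PR): sync.Map.Range alone does not prevent concurrent Stores, so a walk that needs a stable set of entries takes the same mutex that writers take.

```go
package main

import (
	"fmt"
	"sync"
)

// controller sketches the locking discipline: the mutex serializes
// additions to buckets; reads of existing entries stay lock-free.
type controller struct {
	mu      sync.Mutex // guards additions to buckets
	buckets sync.Map   // string => int (stand-in for stream => *bucket)
}

func (c *controller) addBucket(name string, v int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.buckets.Store(name, v)
}

// inspect walks all buckets. Holding c.mu guarantees no bucket is added
// mid-walk; Range by itself makes no such guarantee.
func (c *controller) inspect() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	n := 0
	c.buckets.Range(func(_, _ any) bool { n++; return true })
	return n
}

func main() {
	c := &controller{}
	c.addBucket("s1", 1)
	c.addBucket("s2", 2)
	fmt.Println(c.inspect())
}
```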
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go line 376 at r1 (raw file):
```go
func (b *bucket) tokensLocked(wc admissionpb.WorkClass) kvflowcontrol.Tokens {
	if wc == regular {
		return b.mu.tokensPerWorkClass.regular
```
this would be simpler if tokensPerWorkClass was an array [NumWorkClasses]kvflowcontrol.Tokens.
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller.go line 401 at r1 (raw file):
```go
	defer b.mu.Unlock()
	// TODO(irfansharif,aaditya): On kv0/enc=false/nodes=3/cpu=96 this mutex is
	// responsible for ~1.8% of the mutex contention. Maybe address it as part
```
We could use some more context before deciding whether to try to optimize this.
For example https://cockroachlabs.slack.com/archives/C01SRKWGHG8/p1667521022234139?thread_ts=1667515074.752629&cid=C01SRKWGHG8
pkg/kv/kvserver/kvflowcontrol/kvflowcontroller/kvflowcontroller_metrics.go line 133 at r1 (raw file):
```go
type metrics struct {
	ElasticFlowTokensDeducted *metric.Counter
```
Why the change away from using an array here?
Is array indexing really a performance hit?