kvserver: scale Raft entry cache size with system memory#107424
kvserver: scale Raft entry cache size with system memory#107424craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
|
I could be convinced to increase the scaling here to e.g. 1/128 with 64 MB minimum, but I decided to err on the side of caution here, and I haven't seen compelling evidence that this scaling is insufficient for the majority of workloads. It might be more relevant in multi-region clusters, where replication latencies are higher. @nvanbenschoten Do you have any further workloads or experiments you'd like to calibrate this on? I'm currently rerunning #98666 (comment) and will post results in #98666, but after 1 hour I'm seeing ~100% hit rate with the new setting while the old default is at 98.9%. |
674c34f to
f1dac6a
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker and @pavelkalinnikov)
pkg/kv/kvserver/store.go line 1294 at r1 (raw file):
} if sc.RaftEntryCacheSize == 0 { sc.RaftEntryCacheSize = uint64(defaultRaftEntryCacheSize)
nit: should this just be sc.RaftEntryCacheSize = uint64(defaultRaftEntryCacheSize) / uint64(numStores)?
31a30dc to
3ef9509
Compare
erikgrinaker
left a comment
There was a problem hiding this comment.
TFTR!
bors r+
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @pavelkalinnikov)
pkg/kv/kvserver/store.go line 1294 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: should this just be
sc.RaftEntryCacheSize = uint64(defaultRaftEntryCacheSize) / uint64(numStores)?
Figured I'd guard against division by zero. This is called unconditionally on the server startup path, and I don't know if there are cases where we can end up running without a store (e.g. SQL pods).
The Raft entry cache size defaulted to 16 MB, which is rather small. This has been seen to cause tail latency and throughput degradation with high write volume on large nodes, correlating with a reduction in the entry cache hit rate. This patch linearly scales the Raft entry cache size as 1/256 of total system/cgroup memory, shared evenly between all stores, with a minimum 32 MB. For example, a 32 GB 8-vCPU node will have a 128 MB entry cache. This is a conservative default, since this memory is not accounted for in existing memory budgets nor by the `--cache` flag. We rarely see cache misses in production clusters anyway, and have seen significantly improved hit rates with this scaling (e.g. a 64 KB kv0 workload on 8-vCPU nodes increased from 87% to 99% hit rate). Epic: none Release note (performance improvement): The default Raft entry cache size has been increased from 16 MB to 1/256 of system memory with a minimum of 32 MB, divided evenly between all stores. This can be configured via `COCKROACH_RAFT_ENTRY_CACHE_SIZE`.
3ef9509 to
c435472
Compare
|
Canceled. |
|
bors r+ |
|
Build succeeded: |
The Raft entry cache size defaulted to 16 MB, which is rather small. This has been seen to cause tail latency and throughput degradation with high write volume on large nodes, correlating with a reduction in the entry cache hit rate.
This patch linearly scales the Raft entry cache size as 1/256 of total system/cgroup memory, shared evenly between all stores, with a minimum 32 MB. For example, a 32 GB 8-vCPU node will have a 128 MB entry cache.
This is a conservative default, since this memory is not accounted for in existing memory budgets nor by the
--cacheflag. We rarely see cache misses in production clusters anyway, and have seen significantly improved hit rates with this scaling (e.g. a 64 KB kv0 workload on 8-vCPU nodes increased from 87% to 99% hit rate).Resolves #98666.
Epic: none
Release note (performance improvement): The default Raft entry cache size has been increased from 16 MB to 1/256 of system memory with a minimum of 32 MB, divided evenly between all stores. This can be configured via
COCKROACH_RAFT_ENTRY_CACHE_SIZE.