kvserver: benchmark different cpu load based split thresholds

#96128 adds support for splitting a range once its leaseholder replica uses more CPU than `kv.range_split.load_cpu_threshold`. The default value of `kv.range_split.load_cpu_threshold` is `250ms` of CPU time per second, or 1/4 of a CPU core.
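To make the "250ms per second = 1/4 core" arithmetic concrete when picking candidate values, here is a minimal sketch. The helper names are hypothetical, and the assumption that the setting accepts a duration string such as `'250ms'` via `SET CLUSTER SETTING` should be checked against the branch.

```python
def core_fraction(threshold_ms: float) -> float:
    """Fraction of one CPU core implied by a per-second CPU budget in ms."""
    return threshold_ms / 1000.0


def set_threshold_statement(threshold_ms: int) -> str:
    # Assumption: the setting takes a duration string such as '250ms'.
    return (
        "SET CLUSTER SETTING kv.range_split.load_cpu_threshold = "
        f"'{threshold_ms}ms';"
    )


if __name__ == "__main__":
    # Illustrative candidate thresholds only; not a recommendation.
    for ms in (100, 250, 500, 1000):
        print(f"{ms}ms -> {core_fraction(ms):.2f} cores: {set_threshold_statement(ms)}")
```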

This issue tracks benchmarking performance under different `kv.range_split.load_cpu_threshold` values. The results should then inform the default value.


More specifically, benchmark ycsb, kv0, and kv95 on three nodes and bisect to find the threshold value that achieves the highest throughput.
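One way the search could be structured is sketched below. Since the goal is a maximum rather than a pass/fail boundary, this uses a ternary-style search instead of a plain bisect, and it assumes throughput is roughly unimodal in the threshold, which is not established here. `measure_throughput` is a hypothetical callable that would run one workload (e.g. kv95) at the given threshold and return its throughput.

```python
from typing import Callable, Dict


def find_best_threshold(
    measure_throughput: Callable[[int], float],
    lo_ms: int,
    hi_ms: int,
    tolerance_ms: int = 10,
) -> int:
    """Narrow [lo_ms, hi_ms] towards the threshold with the highest throughput.

    Assumes throughput is unimodal over the range (a single peak).
    """
    results: Dict[int, float] = {}

    def run(threshold_ms: int) -> float:
        # Benchmark runs are expensive, so never measure a value twice.
        if threshold_ms not in results:
            results[threshold_ms] = measure_throughput(threshold_ms)
        return results[threshold_ms]

    while hi_ms - lo_ms > tolerance_ms:
        third = (hi_ms - lo_ms) // 3
        left, right = lo_ms + third, hi_ms - third
        if run(left) < run(right):
            lo_ms = left  # peak cannot lie below `left`
        else:
            hi_ms = right  # peak cannot lie above `right`

    run((lo_ms + hi_ms) // 2)  # sample the middle of the final interval
    return max(results, key=lambda t: results[t])
```

In practice benchmark noise would likely call for several runs per threshold and comparing averages, so `measure_throughput` would wrap that repetition.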

The current value was selected by observing cluster performance from a rebalancing perspective. The specific criterion was to limit how often a store is overfull relative to the mean but has no action available to resolve being overfull. When running TPCE (50k), CPU splitting with a 250ms threshold performed 1 load-based split, whilst QPS splitting (2500) performed 12.5.

When running the allocbench/*/kv roachtest suite, CPU splitting (250ms) tended to make between 33% and 100% more load-based splits than QPS splitting (2500) on workloads involving reads (usually large scans), whilst on the write-heavy workloads the number of load-based splits was equally low for both.

Here's a comparison of splits when running TPCE on master (QPS splits) versus this branch with a 250ms threshold:

![image.png](https://files.reviewableusercontent.io/images/reviewable/39606633/wlzmujONIQ7vJrIABfnOJtV_/image.png)

The same for allocbench (5 runs of each type; order is r=0/access=skew, r=0/ops=skew, r=50/ops=skew, r=95/access=skew, r=95/ops=skew):
![image copy 1.png](https://files.reviewableusercontent.io/images/reviewable/39606633/uRvnzE5sH9YrXZY8U09PK3rk/image.png)




Jira issue: CRDB-24382

kvserver: benchmark different cpu load based split thresholds #96869

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

kvserver: benchmark different cpu load based split thresholds #96869

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions