sql: pre-split hash sharded indexes on shard boundaries before DELETE_AND_WRITE_ONLY #74558
Description
Is your feature request related to a problem? Please describe.
We create a hash sharded index when we know that the write volume will be sequential. As we've seen in #62672 (comment) and elsewhere, throughput can crater when we make these new indexes writable, because all of the load lands on a single range. Ideally we'd pre-split the new index before turning on traffic.
It's a somewhat tricky problem to figure out, in the general case, where to pre-split these new indexes. Fortunately, for hash sharded indexes, we get to make an assumption that underneath each shard, the data is uniformly distributed. That means we can safely split at each shard boundary and know that it's a smart place to put a split.
Describe the solution you'd like
Before making a hash sharded index DELETE_AND_WRITE_ONLY, split it with an expiring split point at each shard boundary.
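To make the proposal concrete, here is a minimal sketch of the statement such a pre-split could correspond to. It assumes the shard column is the leading column of the index and takes integer values from 0 to bucket_count - 1, and it uses the existing `ALTER INDEX ... SPLIT AT VALUES ... WITH EXPIRATION` syntax. The helper name and its parameters are hypothetical, and the real change would presumably issue splits internally from the schema changer rather than through SQL.

```python
def shard_boundary_split_statement(table, index, bucket_count, expiration):
    """Build an ALTER INDEX statement that splits at each shard boundary.

    Hypothetical helper for illustration only. Assumes the shard column is
    the first index column and holds values 0 .. bucket_count - 1.
    """
    if bucket_count < 2:
        raise ValueError("need at least two buckets to have a boundary to split at")
    # Shard 0 begins at the start of the index span, so this sketch only
    # splits at the start of shards 1 .. bucket_count - 1.
    values = ", ".join(f"({shard})" for shard in range(1, bucket_count))
    # The expiration lets the split points lapse once load-based splitting
    # has had a chance to take over.
    return (
        f"ALTER INDEX {table}@{index} SPLIT AT VALUES {values} "
        f"WITH EXPIRATION '{expiration}'"
    )


if __name__ == "__main__":
    print(shard_boundary_split_statement("t", "idx", 4, "2030-01-01 00:00:00"))
```

With 4 buckets this yields three split points and therefore four ranges, one per shard, so sequential writes fan out across ranges as soon as the index becomes writable.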
Describe alternatives you've considered
There was tons of discussion of the broader problem in this internal thread.
Additional context
We should also split other secondary indexes at the same time, but the heuristic for how many splits to make and where to put them is less obvious. We'll track that separately.
This change should be backportable.
Epic: CRDB-7363
gz#10941