
storage: splits don't seem to take into account range size properly #21689

@jordanlewis

Description

I noticed this while playing with the tpcc 1000 warehouse dataset, but it's easy to reproduce with just 10 warehouses.

Under the tpcc data generation load, ranges get queued for splitting due to their size despite not actually being over the range size limit.

To observe this, simply run a 1-node cockroach cluster locally and run ./tpcc -load -warehouses=10. After a short while, you should notice that the cluster is performing a ton of splits on many different tables.

For example, after loading 10 warehouses, the stock table has 232 ranges. The first one is pretty large, containing more than a single 100,000-row warehouse. The second is of similar size. The third is fairly small, containing just under 18,000 rows, and the rest are very small, containing only a few thousand rows each. Here's a snippet of the ranges:

root@:26257/tpcc> show testing_ranges from table stock;
+-----------+-----------+----------+----------+--------------+
| Start Key |  End Key  | Range ID | Replicas | Lease Holder |
+-----------+-----------+----------+----------+--------------+
| NULL      | /1/699    |       59 | {1}      |            1 |
| /1/699    | /2/1384   |       61 | {1}      |            1 |
| /2/1384   | /2/18000  |       62 | {1}      |            1 |
| /2/18000  | /2/20000  |       63 | {1}      |            1 |
| /2/20000  | /2/23000  |       64 | {1}      |            1 |
| /2/23000  | /2/27000  |       65 | {1}      |            1 |
| /2/27000  | /2/30000  |       66 | {1}      |            1 |
| /2/30000  | /2/34000  |       67 | {1}      |            1 |
| /2/34000  | /2/36000  |       68 | {1}      |            1 |
| /2/36000  | /2/41000  |       69 | {1}      |            1 |
| /2/41000  | /2/43000  |       70 | {1}      |            1 |
<snip>
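As a back-of-the-envelope sanity check (my numbers, not taken from the cluster): if one of those ~3,000-row ranges genuinely exceeded the default 64 MiB max range size, each row would have to average over 20 KB, which is far larger than a tpcc stock row:

```go
package main

import "fmt"

// impliedBytesPerRow computes how large each row would have to be, on
// average, for a range with the given row count to exceed maxBytes.
func impliedBytesPerRow(maxBytes, rows int64) int64 {
	return maxBytes / rows
}

func main() {
	const maxRangeBytes = 64 << 20 // default max range size: 64 MiB
	const rowsInSmallRange = 3000  // typical row count of the tiny ranges above
	fmt.Println(impliedBytesPerRow(maxRangeBytes, rowsInSmallRange)) // 22369
}
```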

As a total split queue novice, I poked around and added some debug output:

diff --git a/pkg/storage/split_queue.go b/pkg/storage/split_queue.go
index 8d9e5f9be..e26eea045 100644
--- a/pkg/storage/split_queue.go
+++ b/pkg/storage/split_queue.go
@@ -20,6 +20,8 @@ import (

        "github.com/pkg/errors"

+       "fmt"
+
        "github.com/cockroachdb/cockroach/pkg/config"
        "github.com/cockroachdb/cockroach/pkg/gossip"
        "github.com/cockroachdb/cockroach/pkg/internal/client"
@@ -83,6 +85,7 @@ func (sq *splitQueue) shouldQueue(
        // Add priority based on the size of range compared to the max
        // size for the zone it's in.
        if ratio := float64(repl.GetMVCCStats().Total()) / float64(repl.GetMaxBytes()); ratio > 1 {
+               fmt.Println("Ratio: ", ratio, desc.RangeID, desc.StartKey, desc.EndKey, repl.GetMVCCStats().Total(), repl.GetMaxBytes())
                priority += ratio
                shouldQ = true
        }

This log line fires many times for a given range when a split happens, claiming that the result of Total() is in fact greater than 64 megabytes. This is empirically false: the rows in these small ranges are no larger, byte-wise, than those in the large ranges.
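For reference, the size check above boils down to a simple overshoot ratio. Here is a minimal standalone sketch of that decision, with plain int64 arguments standing in for the replica and stats types (my simplification, not the actual signature):

```go
package main

import "fmt"

// shouldQueueForSize mirrors the size check in splitQueue.shouldQueue: a
// range is queued for a split when its total MVCC byte count exceeds the
// zone's max range size, with the overshoot ratio used as the priority.
func shouldQueueForSize(totalBytes, maxBytes int64) (priority float64, shouldQ bool) {
	if ratio := float64(totalBytes) / float64(maxBytes); ratio > 1 {
		return ratio, true
	}
	return 0, false
}

func main() {
	// A 96 MiB range against the 64 MiB default queues with priority 1.5.
	p, q := shouldQueueForSize(96<<20, 64<<20)
	fmt.Println(p, q) // 1.5 true
}
```

So whenever these small ranges queue, Total() must be reporting more bytes than the data they actually hold.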

So my hypothesis is that something is overcounting range size. Since this behavior takes a while to kick in (a couple of warehouses), I'd guess there's an issue during range splits themselves that causes the size of the new range to be overcounted.
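To make that hypothesis concrete, here's a toy illustration (not CockroachDB code) of how a stats bookkeeping bug during a split could inflate totals: if the bytes attributed to the right-hand side aren't deducted from the left-hand side, the post-split halves claim more data than actually exists, and each subsequent split compounds the error:

```go
package main

import "fmt"

// rangeStats is a toy stand-in for MVCC stats: just a byte total.
type rangeStats struct{ totalBytes int64 }

// buggySplit models a split that computes the right-hand side's stats but
// forgets to deduct them from the left-hand side, double-counting rhsBytes.
func buggySplit(lhs rangeStats, rhsBytes int64) (rangeStats, rangeStats) {
	rhs := rangeStats{totalBytes: rhsBytes}
	// HYPOTHETICAL BUG: should be lhs.totalBytes -= rhsBytes here.
	return lhs, rhs
}

func main() {
	orig := rangeStats{totalBytes: 128 << 20} // 128 MiB of real data
	lhs, rhs := buggySplit(orig, 64<<20)      // split off 64 MiB
	// The two halves now claim 128 + 64 = 192 MiB between them.
	fmt.Println(lhs.totalBytes+rhs.totalBytes, orig.totalBytes)
}
```

An error of this shape would make freshly split ranges look oversized immediately, re-triggering the split queue on data that is actually tiny, which matches the cascade of small ranges above.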

cc @petermattis @tschottdorf as likely candidates for people who know about MVCC stats.
