db: change FileMetadata (Version) consistency check for L0 files to allow flushed sstables that have overlapping seqnums #587
Description
The Pebble prototype that improves import performance by using L0 sub-levels for compaction, along with other compaction heuristics (see https://github.com/sumeerbhola/pebble/tree/sublevel and https://github.com/sumeerbhola/cockroach/tree/sublevel), showed very promising results: a large TPCC import ran about 3.5x faster than with RocksDB, finishing in ~65 min.
One of the changes this prototype relies on is splitting sstables being flushed based on split points provided by the L0SubLevels data-structure to avoid generating wide L0 files that block a sub-level from accepting more files. Without this change, the wide files overwhelm the ability to do narrow (in key space, and consequently bytes) compactions out of L0 into Lbase, and to do narrow intra-L0 compactions.
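To make the effect of flush splitting concrete, here is a minimal Go sketch. It is not the prototype's code: the `entry` and `flushedFile` types and the `splitFlush` function are hypothetical, and the split points are passed in as a plain slice rather than coming from L0SubLevels. It cuts a key-sorted flush at each split point and tracks the seqnum range of every output file.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a memtable entry: a user key plus the
// sequence number it was written with.
type entry struct {
	key    string
	seqNum uint64
}

// flushedFile is a hypothetical, simplified view of an sstable's metadata.
type flushedFile struct {
	smallest, largest             string
	smallestSeqNum, largestSeqNum uint64
}

// splitFlush assumes entries are sorted by key and splitPoints are sorted.
// A new output file starts at the first entry whose key is >= a split point
// that has not been crossed yet.
func splitFlush(entries []entry, splitPoints []string) []flushedFile {
	var files []flushedFile
	var cur *flushedFile
	next := 0
	for _, e := range entries {
		cut := false
		for next < len(splitPoints) && e.key >= splitPoints[next] {
			next++
			cut = true
		}
		if cur == nil || cut {
			files = append(files, flushedFile{
				smallest: e.key, largest: e.key,
				smallestSeqNum: e.seqNum, largestSeqNum: e.seqNum,
			})
			cur = &files[len(files)-1]
			continue
		}
		cur.largest = e.key
		if e.seqNum < cur.smallestSeqNum {
			cur.smallestSeqNum = e.seqNum
		}
		if e.seqNum > cur.largestSeqNum {
			cur.largestSeqNum = e.seqNum
		}
	}
	return files
}

func main() {
	// Key order and write order are unrelated, so seqnums are interleaved.
	entries := []entry{{"a", 5}, {"b", 1}, {"m", 4}, {"n", 2}, {"z", 3}}
	for _, f := range splitFlush(entries, []string{"m"}) {
		fmt.Printf("keys [%s, %s] seqnums [%d, %d]\n",
			f.smallest, f.largest, f.smallestSeqNum, f.largestSeqNum)
	}
}
```

With the sample input, the two output files cover disjoint key ranges but their seqnum ranges ([1, 5] and [2, 4]) overlap, because a single flush interleaves writes from across the key space.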
This change means flushed files have overlapping seqnums, which runs afoul of the current consistency checking (for the prototype I temporarily disabled it; see https://github.com/sumeerbhola/pebble/blob/sublevel/internal/manifest/version.go#L607-L631). Since we plan to support such flush splitting in the very near future, it is worthwhile to make the first production version of Pebble tolerate such files if it encounters them. Also, we should look into how RocksDB, which historically had weaker consistency checking, handles such files -- if it permits them, our compatibility story is better.
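As one illustration of what a relaxed check could look like, the Go sketch below accepts L0 files whose seqnum ranges overlap as long as their user-key ranges are disjoint, and rejects a pair that overlaps in both. This rule is an assumption made for the example, not the check in version.go and not necessarily what Pebble should adopt; `fileMeta` and `checkL0` are hypothetical names, and the quadratic pairwise loop is chosen for clarity rather than efficiency.

```go
package main

import "fmt"

// fileMeta is a hypothetical, simplified stand-in for FileMetadata.
type fileMeta struct {
	fileNum                       uint64
	smallest, largest             string // inclusive user-key bounds
	smallestSeqNum, largestSeqNum uint64
}

// seqNumsOverlap reports whether the two inclusive seqnum ranges intersect.
func seqNumsOverlap(a, b fileMeta) bool {
	return a.smallestSeqNum <= b.largestSeqNum && b.smallestSeqNum <= a.largestSeqNum
}

// keysOverlap reports whether the two inclusive user-key ranges intersect.
func keysOverlap(a, b fileMeta) bool {
	return a.smallest <= b.largest && b.smallest <= a.largest
}

// checkL0 implements the assumed rule: overlapping seqnums are tolerated only
// between files with disjoint key ranges.
func checkL0(files []fileMeta) error {
	for i := range files {
		for j := i + 1; j < len(files); j++ {
			if seqNumsOverlap(files[i], files[j]) && keysOverlap(files[i], files[j]) {
				return fmt.Errorf("L0 files %06d and %06d overlap in both keys and seqnums",
					files[i].fileNum, files[j].fileNum)
			}
		}
	}
	return nil
}

func main() {
	// Two files from one split flush: disjoint keys, overlapping seqnums.
	split := []fileMeta{
		{fileNum: 1, smallest: "a", largest: "b", smallestSeqNum: 1, largestSeqNum: 5},
		{fileNum: 2, smallest: "m", largest: "z", smallestSeqNum: 2, largestSeqNum: 4},
	}
	fmt.Println("split flush:", checkL0(split))

	// Overlap in both keys and seqnums: read order would be ambiguous.
	bad := []fileMeta{
		{fileNum: 3, smallest: "a", largest: "n", smallestSeqNum: 1, largestSeqNum: 5},
		{fileNum: 4, smallest: "m", largest: "z", smallestSeqNum: 2, largestSeqNum: 4},
	}
	fmt.Println("overlapping:", checkL0(bad))
}
```

The intuition behind the assumed rule is that seqnum ordering between two L0 files only matters for keys both files can contain, so files from a single split flush pass while a pair that is ambiguous for some key is still flagged.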
I've attached a MANIFEST from one of the import runs that we can use, in addition to unit testing, to check that the consistency checking is compatible with the future Pebble.
MANIFEST1.gz