-
Notifications
You must be signed in to change notification settings - Fork 555
internal/manifest: efficient range-key fileMetadata seeks #1742
Description
In designing range keys, we decided to store range keys in the same sstables as point keys to preserve the LSM invariant relationship between the two. However this intermingling poses a performance obstacle.
The range key and point key iterator stacks are distinct. A seek on a combined iterator needs to seek both iterators, which twice need to seek through a level's metadata to find the correct file. Because the two iterator stacks are separate, these two iterators care about separate but overlapping sets of sstables (eg, only those that hold point keys and only those that hold range keys)
For range keys which are expected to be rare, wading through many point key-only sstables adds unnecessary key comparisons. Additionally, some levels may not contain any range keys at all, and it would be preferable to exclude those levels from the range key iterator stack altogether.
A couple of potential solutions:
- Use a b-tree annotation to annotate subtrees of a level that contain range keys. When annotations are up to date, this allows O(1) testing whether a level contains range keys during iterator construction. With extensions to
manifest.LevelIteratorand itsFiltermethod, these annotations could also be used to skip subtrees that contain no range keys. The details of this approach might be tricky. - Maintain a parallel , range-key
[NumLevels]LevelMetadataonmanifest.Version. This parallel LSM would be a projection of the main LSM, holding only sstables that contain range keys. It would be incrementally updated alongside the combined LSM.