Skip to content

internal/manifest: Create new LevelMetadata for range keys#1771

Merged
itsbilal merged 1 commit intocockroachdb:masterfrom
itsbilal:rangekey-levelmeta
Jun 16, 2022
Merged

internal/manifest: Create new LevelMetadata for range keys#1771
itsbilal merged 1 commit intocockroachdb:masterfrom
itsbilal:rangekey-levelmeta

Conversation

@itsbilal
Copy link
Copy Markdown
Contributor

This change creates a new LevelMetadata btree just for
sstables containing range keys. This btree is incrementally
updated with sstable additions/deletions that contain range
keys. This should have a relatively light overhead, as
range keys and files containing range keys are expected to be
rare. This also has the benefit of reducing the time it takes
to determine whether files contain range keys.

Fixes #1742.

@itsbilal itsbilal requested review from a team and jbowens June 16, 2022 17:01
@itsbilal itsbilal self-assigned this Jun 16, 2022
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@itsbilal
Copy link
Copy Markdown
Contributor Author

@jbowens I thought about doing this for L0Sublevels as well, where I thought of two possible solutions:

  1. Re-running NewL0Sublevels (or AddL0Files in the incremental case) and creating a separate L0Sublevels struct for range key files only.
  2. Not doing 1, but just setting a new field, L0SublevelRangeFiles []LevelSlice to be a subset of L0SublevelFiles containing range key files only.

1 seems to be a lot of work (especially in the non-incremental case), and 2 seems to be... not that different from just letting the LevelIter do the filtering thing itself like it already does. Thoughts? I left the L0 sublevel case as-is for that reason.

@itsbilal
Copy link
Copy Markdown
Contributor Author

Oh, a third thing we could do: Look at the existing RangeKeyLevels[0] b-tree and shove each of those iterators directly into the MergingIter, skipping the keyspan.LevelIter entirely. This bypasses all sublevel logic and level-based optimizations but might actually be desirable in cases where L0 is very sparse on range keys anyway, as it avoids all sublevel generation work for range keys only.

@itsbilal itsbilal force-pushed the rangekey-levelmeta branch from 836a2c9 to a04137d Compare June 16, 2022 19:46
@itsbilal
Copy link
Copy Markdown
Contributor Author

Ended up implementing the last option mentioned above.

Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that last option too 👍 This is great.

:lgtm:

Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: 2 of 3 files reviewed, 3 unresolved discussions (waiting on @itsbilal)


range_keys.go line 55 at r1 (raw file):

			// No files with range keys.
			return
		}

We can drop this Filter and .First() call now, right?


internal/manifest/version.go line 657 at r1 (raw file):

	Levels [NumLevels]LevelMetadata

	// RangeKeyLevels holds a subset of the same files as Levels that only

nit: s/only// ? Reading this sounds like it's saying that these files contain no point keys, only range keys.


internal/manifest/version_edit.go line 734 at r1 (raw file):

			}
			if f.HasRangeKeys {
				err = lmRange.tree.insert(f)

something that I didn't think about before, this will add extra ref counts for range key files. that's totally fine, and since we also delete these files from this tree when they're removed, it all evens out.

This change creates a new LevelMetadata btree just for
sstables containing range keys. This btree is incrementally
updated with sstable additions/deletions that contain range
keys. This should have a relatively light overhead, as
range keys and files containing range keys are expected to be
rare. This also has the benefit of reducing the time it takes
to determine whether files contain range keys.

Fixes cockroachdb#1742.
@itsbilal itsbilal force-pushed the rangekey-levelmeta branch from a04137d to 767d066 Compare June 16, 2022 20:42
Copy link
Copy Markdown
Contributor Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @itsbilal and @jbowens)


range_keys.go line 55 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

We can drop this Filter and .First() call now, right?

Done. Good catch. I also inlined all of addLevelIterForFiles as it only has one call site now.


internal/manifest/version.go line 657 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: s/only// ? Reading this sounds like it's saying that these files contain no point keys, only range keys.

Done.


internal/manifest/version_edit.go line 734 at r1 (raw file):

Previously, jbowens (Jackson Owens) wrote…

something that I didn't think about before, this will add extra ref counts for range key files. that's totally fine, and since we also delete these files from this tree when they're removed, it all evens out.

Yep, that's right. Shouldn't be a problem as long as the refs are being properly cleaned up.

@itsbilal itsbilal merged commit 2798eb7 into cockroachdb:master Jun 16, 2022
nicktrav added a commit to nicktrav/pebble that referenced this pull request Oct 20, 2022
A manifest version uses reference counting to ensure that obsolete
SSTables are only removed once it is safe to do so (i.e. no iterators or
snapshots are open referencing tables in a older manifest version).

In cockroachdb#1771, a new slice of `LevelMetadata` was added to support range
keys. A slice already existed for point keys (plus range dels).
Currently, when a version's ref count reaches zero, only the files
present in the point key `LevelMetadata` slice have their ref counts
decremented. If a file also contains range keys, the reference count
never becomes zero and the file becomes "zombied", and can never be
deleted from disk.

Decrement the ref counts on files present in the range keys
`LevelMetadata` slice, allowing the ref counts to be correctly zeroed.

Add regression test to exercise the bug observed in
cockroachdb/cockroach#90149.

Fix cockroachdb/cockroach#90149.
nicktrav added a commit to nicktrav/pebble that referenced this pull request Oct 20, 2022
A manifest version uses reference counting to ensure that obsolete
SSTables are only removed once it is safe to do so (i.e. no iterators or
snapshots are open referencing tables in a older manifest version).

In cockroachdb#1771, a new slice of `LevelMetadata` was added to support range
keys. A slice already existed for point keys (plus range dels).
Currently, when a version's ref count reaches zero, only the files
present in the point key `LevelMetadata` slice have their ref counts
decremented. If a file also contains range keys, the reference count
never becomes zero and the file becomes "zombied", and can never be
deleted from disk.

Decrement the ref counts on files present in the range keys
`LevelMetadata` slice, allowing the ref counts to be correctly zeroed.

Add regression test to exercise the bug observed in
cockroachdb/cockroach#90149.

Fix cockroachdb/cockroach#90149.
nicktrav added a commit that referenced this pull request Oct 20, 2022
A manifest version uses reference counting to ensure that obsolete
SSTables are only removed once it is safe to do so (i.e. no iterators or
snapshots are open referencing tables in a older manifest version).

In #1771, a new slice of `LevelMetadata` was added to support range
keys. A slice already existed for point keys (plus range dels).
Currently, when a version's ref count reaches zero, only the files
present in the point key `LevelMetadata` slice have their ref counts
decremented. If a file also contains range keys, the reference count
never becomes zero and the file becomes "zombied", and can never be
deleted from disk.

Decrement the ref counts on files present in the range keys
`LevelMetadata` slice, allowing the ref counts to be correctly zeroed.

Add regression test to exercise the bug observed in
cockroachdb/cockroach#90149.

Fix cockroachdb/cockroach#90149.
nicktrav added a commit to nicktrav/pebble that referenced this pull request Oct 20, 2022
A manifest version uses reference counting to ensure that obsolete
SSTables are only removed once it is safe to do so (i.e. no iterators or
snapshots are open referencing tables in a older manifest version).

In cockroachdb#1771, a new slice of `LevelMetadata` was added to support range
keys. A slice already existed for point keys (plus range dels).
Currently, when a version's ref count reaches zero, only the files
present in the point key `LevelMetadata` slice have their ref counts
decremented. If a file also contains range keys, the reference count
never becomes zero and the file becomes "zombied", and can never be
deleted from disk.

Decrement the ref counts on files present in the range keys
`LevelMetadata` slice, allowing the ref counts to be correctly zeroed.

Add regression test to exercise the bug observed in
cockroachdb/cockroach#90149.

Fix cockroachdb/cockroach#90149.
nicktrav added a commit that referenced this pull request Oct 20, 2022
A manifest version uses reference counting to ensure that obsolete
SSTables are only removed once it is safe to do so (i.e. no iterators or
snapshots are open referencing tables in a older manifest version).

In #1771, a new slice of `LevelMetadata` was added to support range
keys. A slice already existed for point keys (plus range dels).
Currently, when a version's ref count reaches zero, only the files
present in the point key `LevelMetadata` slice have their ref counts
decremented. If a file also contains range keys, the reference count
never becomes zero and the file becomes "zombied", and can never be
deleted from disk.

Decrement the ref counts on files present in the range keys
`LevelMetadata` slice, allowing the ref counts to be correctly zeroed.

Add regression test to exercise the bug observed in
cockroachdb/cockroach#90149.

Fix cockroachdb/cockroach#90149.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

internal/manifest: efficient range-key fileMetadata seeks

3 participants