Fix point lookup on range tombstone sentinel endpoint#4829
Fix point lookup on range tombstone sentinel endpoint#4829ajkr wants to merge 4 commits intofacebook:masterfrom
Conversation
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
This is a corruption issue and it's in 5.18.fb. It is not in 5.17.fb or earlier. |
ajkr
left a comment
There was a problem hiding this comment.
We might be able to solve this in a cleaner way by modifying FindFileInRange to increment by one the limit passed to lower_bound. That lower_bound uses internal comparator so it should be able to pass over a file ending in range tombstone sentinel key when the lookup key has the same user key.
|
@ajkr has updated the pull request. Re-import the pull request |
092caf7 to
61b8571
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
|
OK, it is improved now that it reuses the existing |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
Yeah I'm happy with this approach. It should only prevent cascading in cases where the lookup key definitely does not exist in the file range chosen while making comparisons at the previous level. For example, if the lookup key is outside the level's range, or the lookup key is within a file's range from user-key perspective but not from internal key perspective. Arguably it's better to not let |
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Previously for point lookup we decided which file to look into based on user key overlap only. We also did not truncate range tombstones in the point lookup code path. These two ideas did not interact well in cases like this: - L1 has range tombstone [a, c)#1 and point key b#2. The data is split between file1 with range [a#1,1, b#72057594037927935,15], and file2 with range [b#2, c#1]. - L1's file2 gets compacted to L2. - User issues `Get()` for b#3. - L1's file1 is opened and the range tombstone [a, c)#1 is found for b, while no point-key for b is found in L1. - `Get()` assumes that the range tombstone must cover all data in that range in lower levels, so short circuits and returns `NotFound`. The solution to this problem is to not look into files that only overlap with the point lookup at a range tombstone sentinel endpoint. In the above example, this would mean not opening L1's file1 or its tombstones during the `Get()`. Pull Request resolved: #4829 Differential Revision: D13561355 Pulled By: ajkr fbshipit-source-id: a13c21c816870a2f5d32a48af6dbd719a7d9d19f
Previously for point lookup we decided which file to look into based on user key overlap only. We also did not truncate range tombstones in the point lookup code path. These two ideas did not interact well in cases like this:
Get()for b#3.Get()assumes that the range tombstone must cover all data in that range in lower levels, so short circuits and returnsNotFound.The solution to this problem is to not look into files that only overlap with the point lookup at a range tombstone sentinel endpoint. In the above example, this would mean not opening L1's file1 or its tombstones during the
Get().Test Plan: