save a key comparison in block seeks#6646
Conversation
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
|
edit: this problem is fixed. huh, after ~30 minutes stress test failed. not sure yet if related. |
d1534c0 to
a3881b2
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
a3881b2 to
430f3bc
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
430f3bc to
fcd0a44
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
|
alright, known problems are fixed so ready for review. |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
fcd0a44 to
262a7e2
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
262a7e2 to
92af65f
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
92af65f to
4e67eb7
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
4e67eb7 to
397469b
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
397469b to
efbc981
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
efbc981 to
1db34e7
Compare
3302eeb to
50f890b
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
Done. I also redid the BinarySeek() to avoid changing the binary search. The number of key comparisons saved lessened as a result, but I am fine with leaving those for a future PR..
ajkr
left a comment
There was a problem hiding this comment.
couple explanatory comments
50f890b to
c55640a
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
pdillinger
left a comment
There was a problem hiding this comment.
I like this. Some suggestions for readability, but good.
Wouldn't hurt to have another set of eyes on this for correctness.
|
MacOS compilation failure |
ajkr
left a comment
There was a problem hiding this comment.
MacOS compilation failure
Fixed (I think).
Thanks for the review!
c55640a to
7f6c018
Compare
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
anand1976
left a comment
There was a problem hiding this comment.
LGTM! Just a few nits and suggestions.
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
A comment here might help. It took me a little while to figure out that this Next() actually doesn't advance to the next key, but just ends up decoding the key at the restart point we seek'd to.
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
We land at 0 if restart key at index 1 > "target", right?
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
Something like skip_linear_scan might be clearer.
There was a problem hiding this comment.
Done. Also renamed "ScanAfterBinarySeek" -> "FindKeyAfterBinarySeek" to make it fit in better.
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
Can we just use ParseNextDataKey instead of Next and Valid?
There was a problem hiding this comment.
It's making use of polymorphism here -- IndexBlockIter::Next() calls ParseNextIndexKey() while DataBlockIter::Next() calls ParseNextDataKey().
This saves up to two key comparisons in block seeks. The first key comparison saved is a redundant key comparison against the restart key where the linear scan starts. This comparison is saved in all cases except when the found key is in the first restart interval. The second key comparison saved is a redundant key comparison against the restart key where the linear scan ends. This is only saved in cases where all keys in the restart interval are less than the target (probability roughly `1/restart_interval`). Test Plan: ran readrandom with mostly default settings and counted key comparisons using `PerfContext`. before: `user_key_comparison_count = 30370310` after: `user_key_comparison_count = 28881965` setup command: ``` $ TEST_TMPDIR=/dev/shm/dbbench ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -level_compaction_dynamic_level_bytes=true -num=10000000 ``` benchmark command: ``` $ TEST_TMPDIR=/dev/shm/dbbench/ ./db_bench -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=10000000 -compression_type=none -reads=1000000 -perf_level=3 ```
7f6c018 to
463f274
Compare
ajkr
left a comment
There was a problem hiding this comment.
Thanks for the review! Addressed the comments.
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
Done. Also renamed "ScanAfterBinarySeek" -> "FindKeyAfterBinarySeek" to make it fit in better.
table/block_based/block.cc
Outdated
table/block_based/block.cc
Outdated
There was a problem hiding this comment.
It's making use of polymorphism here -- IndexBlockIter::Next() calls ParseNextIndexKey() while DataBlockIter::Next() calls ParseNextDataKey().
table/block_based/block.cc
Outdated
|
@ajkr has updated the pull request. Re-import the pull request |
facebook-github-bot
left a comment
There was a problem hiding this comment.
@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This is a followup to #6646. In that PR, for simplicity I just appended a comparison against the 0th restart key in case `BinarySeek()`'s binary search landed at index 0. As a result there were `2/(N+1) + log_2(N)` key comparisons. This PR does it differently. Now we expand the binary search range by one so it also covers the case where target is at or before the restart key at index 0. As a result, it involves `log_2(N+1)` key comparisons. Pull Request resolved: #7068 Test Plan: ran readrandom with mostly default settings and counted key comparisons using `PerfContext`. before: `user_key_comparison_count = 28881965` after: `user_key_comparison_count = 27823245` setup command: ``` $ TEST_TMPDIR=/dev/shm/dbbench ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -level_compaction_dynamic_level_bytes=true -num=10000000 ``` benchmark command: ``` $ TEST_TMPDIR=/dev/shm/dbbench/ ./db_bench -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=10000000 -compression_type=none -reads=1000000 -perf_level=3 ``` Reviewed By: anand1976 Differential Revision: D22357032 Pulled By: ajkr fbshipit-source-id: 8b01e9c1c2a4e9d02fc9dfe16c1cc0327f8bdf24
Summary: This is a followup to facebook#6646. In that PR, for simplicity I just appended a comparison against the 0th restart key in case `BinarySeek()`'s binary search landed at index 0. As a result there were `2/(N+1) + log_2(N)` key comparisons. This PR does it differently. Now we expand the binary search range by one so it also covers the case where target is at or before the restart key at index 0. As a result, it involves `log_2(N+1)` key comparisons. Pull Request resolved: facebook#7068 Test Plan: ran readrandom with mostly default settings and counted key comparisons using `PerfContext`. before: `user_key_comparison_count = 28881965` after: `user_key_comparison_count = 27823245` setup command: ``` $ TEST_TMPDIR=/dev/shm/dbbench ./db_bench -benchmarks=fillrandom,compact -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -level_compaction_dynamic_level_bytes=true -num=10000000 ``` benchmark command: ``` $ TEST_TMPDIR=/dev/shm/dbbench/ ./db_bench -use_existing_db=true -benchmarks=readrandom -disable_auto_compactions=true -num=10000000 -compression_type=none -reads=1000000 -perf_level=3 ``` Reviewed By: anand1976 Differential Revision: D22357032 Pulled By: ajkr fbshipit-source-id: 8b01e9c1c2a4e9d02fc9dfe16c1cc0327f8bdf24
This saves up to two key comparisons in block seeks. The first key
comparison saved is a redundant key comparison against the restart key
where the linear scan starts. This comparison is saved in all cases
except when the found key is in the first restart interval. The
second key comparison saved is a redundant key comparison against the
restart key where the linear scan ends. This is only saved in cases
where all keys in the restart interval are less than the target
(probability roughly
1/restart_interval).Test Plan:
ran readrandom with mostly default settings and counted key comparisons
using
PerfContext.before:
user_key_comparison_count = 30370310after:
user_key_comparison_count = 28881965setup command:
benchmark command: