Skip to content

Reduce comparator objects init cost in BlockIter#9611

Closed
XinyuZeng wants to merge 3 commits intofacebook:mainfrom
XinyuZeng:reduce_cmp_init
Closed

Reduce comparator objects init cost in BlockIter#9611
XinyuZeng wants to merge 3 commits intofacebook:mainfrom
XinyuZeng:reduce_cmp_init

Conversation

@XinyuZeng
Copy link
Copy Markdown
Contributor

@XinyuZeng XinyuZeng commented Feb 21, 2022

This PR solves the problem discussed in #7149. By storing the pointer of InternalKeyComparator as icmp_ in BlockIter, the object size remains the same. And for each call to CompareCurrentKey, there is no need to create Comparator objects. One can use icmp_ directly or use the "user_comparator" from the icmp_.

Test Plan: with #9903,

$ TEST_TMPDIR=/dev/shm python3.6 ../benchmark/tools/compare.py benchmarks ./db_basic_bench ../rocksdb-pr9611/db_basic_bench --benchmark_filter=DBGet/comp_style:0/max_data:134217728/per_key_size:256/enable_statistics:0/negative_query:0/enable_filter:0/mmap:1/iterations:262144/threads:1 --benchmark_repetitions=50
...
Comparing ./db_basic_bench to ../rocksdb-pr9611/db_basic_bench                                                                                                                                                                                            
Benchmark                                                                                                                                                               Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
...
DBGet/comp_style:0/max_data:134217728/per_key_size:256/enable_statistics:0/negative_query:0/enable_filter:0/mmap:1/iterations:262144/threads:1_pvalue                 0.0001          0.0001      U Test, Repetitions: 50 vs 50
DBGet/comp_style:0/max_data:134217728/per_key_size:256/enable_statistics:0/negative_query:0/enable_filter:0/mmap:1/iterations:262144/threads:1_mean                  -0.0483         -0.0483          3924          3734          3924          3734
DBGet/comp_style:0/max_data:134217728/per_key_size:256/enable_statistics:0/negative_query:0/enable_filter:0/mmap:1/iterations:262144/threads:1_median                -0.0713         -0.0713          3971          3687          3970          3687
DBGet/comp_style:0/max_data:134217728/per_key_size:256/enable_statistics:0/negative_query:0/enable_filter:0/mmap:1/iterations:262144/threads:1_stddev                -0.0342         -0.0344           225           217           225           217
DBGet/comp_style:0/max_data:134217728/per_key_size:256/enable_statistics:0/negative_query:0/enable_filter:0/mmap:1/iterations:262144/threads:1_cv                    +0.0148         +0.0146             0             0             0             0
OVERALL_GEOMEAN                                                                                                                                                      -0.0483         -0.0483             0             0             0             0

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@XinyuZeng has updated the pull request. You must reimport the pull request before landing.

@XinyuZeng XinyuZeng closed this Apr 24, 2022
@ajkr ajkr reopened this Apr 25, 2022
@facebook-github-bot
Copy link
Copy Markdown
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Copy Markdown
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Updated the "Test Plan" to show results of microbenchmark

@facebook-github-bot
Copy link
Copy Markdown
Contributor

@ajkr has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


UserComparatorWrapper ucmp() { return UserComparatorWrapper(raw_ucmp_); }
icmp_ =
std::make_unique<InternalKeyComparator>(raw_ucmp, false /* named */);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can InternalKeyComparator wrap a UserComparatorWrapper?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like InternalKeyComparator already has a UserComparatorWrapper as a private member. Calling icmp_->user_comparator() refers to it.

facebook-github-bot pushed a commit that referenced this pull request Apr 26, 2022
Summary:
I tried evaluating #9611 using DBGet microbenchmarks but mostly found the change is well within the noise even for hundreds of repetitions; meanwhile, the InternalKeyComparator CPU it saves is 1-2% according to perf so it should be measurable. In this PR I tried adding a mmap mode that will bypass compression/checksum/block cache/file read to focus more on the block lookup paths, and also increased the Get() count.

Pull Request resolved: #9903

Reviewed By: jay-zhuang, riversand963

Differential Revision: D35907375

Pulled By: ajkr

fbshipit-source-id: 69490d5040ef0863e1ce296724104d0aa7667215
@XinyuZeng
Copy link
Copy Markdown
Contributor Author

hey @ajkr, do you mind share the compare.py script you used in the Test Plan? Thanks!

@ajkr
Copy link
Copy Markdown
Contributor

ajkr commented May 1, 2022

hey @ajkr, do you mind share the compare.py script you used in the Test Plan? Thanks!

Sure, it's a script provided in google benchmark -- https://github.com/google/benchmark/blob/main/tools/compare.py

@riversand963
Copy link
Copy Markdown
Contributor

@XinyuZeng can you take a look at #10340 ?

@XinyuZeng
Copy link
Copy Markdown
Contributor Author

@riversand963 Through a quick look, I am not sure why #9611 introduces more allocations using glibc because it should reduce the number of allocations compared to the commit before it. Before the commit, each call to CompareCurrentKey will have an allocation. But I do agree with #10342:

As a side effect, internal key comparator was made configurable too. This introduces overhead to this simple wrapper. For example, every InternalKeyComparator will have an std::vector attached to it, which consumes memory and possible allocation overhead too.

This is also what I profiled and found when I did #9611

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants