box: speed up tuple_new() for large sparse tuples by 3.5x #9793
Merged
sergepetrenko merged 4 commits into tarantool:master from Gumix:iverbin/gh-711-optimize-validation-of-sparse-tuples-in-memcs on Mar 18, 2024
locker
requested changes
Mar 12, 2024
Currently `class MpData` generates msgpack data with a predefined format, let's call it `FORMAT_BASIC`. This patch allows extending it with other formats. No functional changes.

Needed for tarantool/tarantool-ee#711

NO_DOC=perf test
NO_TEST=perf test
NO_CHANGELOG=perf test
locker
approved these changes
Mar 13, 2024
Implement `class MpData<FORMAT_SPARSE>`, which generates tuples with 1000 fields: 10 of them contain unsigned integers, and the remaining 990 are null.

Needed for tarantool/tarantool-ee#711

NO_DOC=perf test
NO_TEST=perf test
NO_CHANGELOG=perf test
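To make the sparse layout concrete, here is a minimal sketch of how such a tuple could be encoded as raw MsgPack. It is not the actual `MpData<FORMAT_SPARSE>` code: the function name, the constants, and the choice to place an integer in every 100th field are assumptions made for illustration. Only the totals (1000 fields, 10 unsigned integers, 990 nulls) come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout constants. Only the totals (1000 fields, 10 integers)
// come from the commit message; the spacing is an assumption.
constexpr uint32_t kFieldCount = 1000;
constexpr uint32_t kStride = 100;

// Encode one sparse tuple as raw MsgPack bytes.
std::vector<uint8_t>
make_sparse_tuple()
{
	static_assert(kFieldCount <= UINT16_MAX, "length must fit array16");
	std::vector<uint8_t> buf;
	buf.reserve(3 + kFieldCount);
	// array16 header: 0xdc followed by a big-endian 16-bit length.
	buf.push_back(0xdc);
	buf.push_back(static_cast<uint8_t>(kFieldCount >> 8));
	buf.push_back(static_cast<uint8_t>(kFieldCount & 0xff));
	for (uint32_t i = 0; i < kFieldCount; i++) {
		if (i % kStride == 0) {
			// Positive fixint: values 0..127 take a single byte.
			uint8_t value = static_cast<uint8_t>(i / kStride);
			assert(value <= 0x7f);
			buf.push_back(value);
		} else {
			// MP_NIL is the single byte 0xc0.
			buf.push_back(0xc0);
		}
	}
	return buf;
}
```

Every field in this layout occupies exactly one byte, so the cost of decoding such a tuple is dominated by per-field overhead rather than by payload size.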
It is possible to skip MP_NIL by mp_decode_nil(), which is faster than mp_next().

This patch improves bench_tuple_new<FORMAT_SPARSE> by 2.2x.

NO_WRAP
$ taskset 0x2 ~/benchmark/tools/compare.py benchmarks \
    ./tuple.perftest.old ./tuple.perftest.new \
    --benchmark_min_warmup_time=10 \
    --benchmark_repetitions=30 \
    --benchmark_report_aggregates_only=true \
    --benchmark_filter=tuple_new\<FORMAT_SPARSE\>
[...]
Comparing ./tuple.perftest.old to ./tuple.perftest.new
Benchmark                                  Time      CPU   Time Old  Time New  CPU Old  CPU New
-----------------------------------------------------------------------------------------------
bench_tuple_new<FORMAT_SPARSE>_mean     -0.5525  -0.5525       6985      3126     6985     3126
bench_tuple_new<FORMAT_SPARSE>_median   -0.5445  -0.5444       6838      3115     6838     3115
bench_tuple_new<FORMAT_SPARSE>_stddev   -0.8368  -0.8367        541        88      541       88
bench_tuple_new<FORMAT_SPARSE>_cv       -0.6354  -0.6352          0         0        0        0
NO_WRAP

Needed for tarantool/tarantool-ee#711

NO_DOC=perf improvement
NO_TEST=perf improvement
NO_CHANGELOG=next commit
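The idea behind the nil fast path can be sketched without the msgpuck library. In the sketch below, `skip_value_generic()` is a hypothetical stand-in for a general-purpose skipper such as `mp_next()` and covers only the few types needed here. The one-byte check in `count_non_nil()` plays the role that the commit message assigns to `mp_decode_nil()`. Both function names are illustrative and do not come from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a general skipper such as mp_next().
// It handles only nil, positive fixint, and uint8..uint64.
// Returns a pointer just past the value at `p`.
const uint8_t *
skip_value_generic(const uint8_t *p)
{
	uint8_t tag = *p;
	if (tag <= 0x7f || tag == 0xc0)
		return p + 1; // positive fixint or nil
	switch (tag) {
	case 0xcc: return p + 2; // uint8: tag + 1 byte
	case 0xcd: return p + 3; // uint16: tag + 2 bytes
	case 0xce: return p + 5; // uint32: tag + 4 bytes
	case 0xcf: return p + 9; // uint64: tag + 8 bytes
	default:
		assert(!"type not covered by this sketch");
		return p + 1;
	}
}

// Walk `count` values starting at `p`. Nil fields take the fast path:
// one comparison and one increment, with no type dispatch.
// Stores the position after the last value in `*end` and returns
// the number of non-nil values seen.
size_t
count_non_nil(const uint8_t *p, size_t count, const uint8_t **end)
{
	size_t non_nil = 0;
	for (size_t i = 0; i < count; i++) {
		if (*p == 0xc0) {
			p++; // fast path for MP_NIL
			continue;
		}
		non_nil++;
		p = skip_value_generic(p);
	}
	*end = p;
	return non_nil;
}
```

For a tuple where 990 of 1000 fields are nil, almost every iteration takes the fast path, which is consistent with the large gain reported for the sparse benchmark.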
If the number of tuple fields is less than `format->min_field_count`, then some required field is missed, i.e., there is no need to update the `required_fields` bitmap during msgpack decoding. This optimization is valid only if tuple format doesn't contain fields accessed by JSON paths.

This patch improves bench_tuple_new by 15-50%, depending on field count.

NO_WRAP
$ taskset 0x2 ~/benchmark/tools/compare.py benchmarks \
    ./tuple.perftest.old ./tuple.perftest.new \
    --benchmark_min_warmup_time=10 \
    --benchmark_repetitions=30 \
    --benchmark_report_aggregates_only=true \
    --benchmark_filter=tuple_new
[...]
Comparing ./tuple.perftest.old to ./tuple.perftest.new
Benchmark                                  Time      CPU   Time Old  Time New  CPU Old  CPU New
-----------------------------------------------------------------------------------------------
bench_tuple_new<FORMAT_BASIC>_mean      -0.1469  -0.1470        126       107      126      107
bench_tuple_new<FORMAT_BASIC>_median    -0.1428  -0.1429        124       106      124      106
bench_tuple_new<FORMAT_BASIC>_stddev    +0.0589  +0.0600          4         5        4        5
bench_tuple_new<FORMAT_BASIC>_cv        +0.2412  +0.2427          0         0        0        0
bench_tuple_new<FORMAT_SPARSE>_mean     -0.3754  -0.3753       3104      1939     3104     1939
bench_tuple_new<FORMAT_SPARSE>_median   -0.3749  -0.3747       3071      1920     3071     1920
bench_tuple_new<FORMAT_SPARSE>_stddev   -0.3482  -0.3482         85        55       85       55
bench_tuple_new<FORMAT_SPARSE>_cv       +0.0434  +0.0434          0         0        0        0
NO_WRAP

Needed for tarantool/tarantool-ee#711

NO_DOC=perf improvement
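The reasoning can be illustrated with a simplified model. Everything in the sketch below is hypothetical: the `Format` struct, `make_format()`, `validate()`, and the `Result` values are not Tarantool's types. It also assumes that `min_field_count` equals the index of the last non-nullable field plus one. Under that assumption, a single field-count comparison detects a missing required field, so no per-field bitmap has to be maintained while decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical, simplified tuple format (top-level fields only).
struct Format {
	std::vector<bool> is_nullable; // per-field nullability
	uint32_t min_field_count;      // assumed: last required index + 1
	bool has_json_paths;           // true disables the fast path
};

enum class Result { Ok, MissingRequiredField, NilInRequiredField, NeedsBitmap };

Format
make_format(std::vector<bool> is_nullable, bool has_json_paths)
{
	uint32_t min_count = 0;
	for (size_t i = 0; i < is_nullable.size(); i++) {
		if (!is_nullable[i])
			min_count = static_cast<uint32_t>(i + 1);
	}
	return Format{std::move(is_nullable), min_count, has_json_paths};
}

// `field_is_nil[i]` says whether the i-th decoded tuple field is nil.
Result
validate(const Format &fmt, const std::vector<bool> &field_is_nil)
{
	assert(fmt.min_field_count <= fmt.is_nullable.size());
	// Nested fields cannot be counted this way, so fall back.
	if (fmt.has_json_paths)
		return Result::NeedsBitmap;
	// One comparison replaces per-field bitmap bookkeeping.
	if (field_is_nil.size() < fmt.min_field_count)
		return Result::MissingRequiredField;
	// All required fields are physically present; only the ordinary
	// per-field check (nil in a non-nullable field) remains.
	for (size_t i = 0; i < field_is_nil.size() &&
	     i < fmt.is_nullable.size(); i++) {
		if (field_is_nil[i] && !fmt.is_nullable[i])
			return Result::NilInRequiredField;
	}
	return Result::Ok;
}
```

The JSON-path case falls back to the bitmap because a required field nested inside a map or array cannot be accounted for by counting top-level fields alone.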
sergepetrenko
approved these changes
Mar 13, 2024
Collaborator
sergepetrenko
left a comment
Thanks for the patch!
p7nov
approved these changes
Mar 14, 2024
ochaplashkin
approved these changes
Mar 14, 2024
The first 2 patches add the `bench_tuple_new<FORMAT_SPARSE>` benchmark, which creates sparse tuples (1K fields each, 990 of which are nils).

The next 2 patches speed up the benchmark from 7 μs to 2 μs per iteration.
Needed for tarantool/tarantool-ee#711
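The headline speedup can be cross-checked against the mean times reported in the two optimization commits (6985 to 3126 for the nil fast path, then 3104 to 1939 for the bitmap change). The helper below is only an arithmetic sketch. The function names are hypothetical, and the time unit is presumably nanoseconds, which is an assumption based on the "7 μs to 2 μs" figure above.

```cpp
#include <cassert>

// Speedup factor between two timings in the same unit.
double
speedup(double old_time, double new_time)
{
	assert(new_time > 0);
	return old_time / new_time;
}

// Mean times copied from the benchmark tables in the commit messages.
double nil_skip_speedup() { return speedup(6985, 3126); } // roughly 2.2x
double bitmap_speedup()   { return speedup(3104, 1939); } // roughly 1.6x
double overall_speedup()  { return speedup(6985, 1939); } // roughly 3.6x
```

The overall factor of about 3.6x from the first "old" mean to the last "new" mean is in line with the 3.5x quoted in the PR title.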