Skip to content

read view closing performance improvement#12019

Merged
sergepetrenko merged 1 commit intotarantool:masterfrom
Astronomax:ghe-1043-read-view-close-perf
Dec 9, 2025
Merged

read view closing performance improvement#12019
sergepetrenko merged 1 commit intotarantool:masterfrom
Astronomax:ghe-1043-read-view-close-perf

Conversation

@Astronomax
Copy link
Contributor

@Astronomax Astronomax commented Nov 6, 2025

Inside tuple_dictionary_delete_hash, there was a loop that, on each iteration, searched for the first non-deleted element using mh_first, which has a complexity of O(n), and deleted it using mh_strnu32_del. After the loop finished executing, tuple_dictionary_delete_hash immediately called mh_strnu32_delete.
All elements are stored in the hashmap inplace, so del doesn't invoke any memory deallocation functions. del also doesn't call destructors for the elements; it merely sets a specific bit in the bitmap to 0. Immediately after the loop, mh_strnu32_delete(hash) will delete the hashmap hash along with the array hash->p containing all elements inplace.

This PR simply removes this useless loop, which was causing significant performance issues when closing a read view.

Closes tarantool/tarantool-ee#1043, #12115

NO_DOC=performance improvement
NO_TEST=performance improvement

@coveralls
Copy link

coveralls commented Nov 6, 2025

Coverage Status

coverage: 87.639%. remained the same
when pulling 8ab02e6 on Astronomax:ghe-1043-read-view-close-perf
into 49eea0e
on tarantool:master
.

@Astronomax Astronomax force-pushed the ghe-1043-read-view-close-perf branch 2 times, most recently from 23c54bb to b2964b5 Compare November 6, 2025 15:50
@Astronomax Astronomax changed the title read view closing performance issue read view closing performance improvement Nov 6, 2025
@Astronomax Astronomax force-pushed the ghe-1043-read-view-close-perf branch 3 times, most recently from bb6008f to 8b9accc Compare November 7, 2025 07:18
@Astronomax Astronomax requested a review from a team as a code owner November 7, 2025 07:18
@sergepetrenko sergepetrenko self-assigned this Nov 7, 2025
@Astronomax Astronomax marked this pull request as draft November 7, 2025 12:18
Copy link
Collaborator

@sergepetrenko sergepetrenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks valid. It's strange such an obvious error wasn't noticed earlier.

@Astronomax Astronomax force-pushed the ghe-1043-read-view-close-perf branch from 8b9accc to 1ebedd6 Compare November 10, 2025 12:45
@Astronomax
Copy link
Contributor Author

Astronomax commented Nov 10, 2025

Let's consider the following benchmark:

benchmark
local log = require('log')
local clock = require('clock')

local field_count = tonumber(arg[1])
local test_iters = tonumber(arg[2])

os.execute('rm -rf *.snap *.xlog')
box.cfg{}

local format = {}
for i = 1, field_count do
    table.insert(format, {name = 'field'..i, type = 'uint64'})
end

box.schema.space.create('test', {
    engine = 'memcs',
    format = format,
    field_count = field_count,
})

local rv_func = function()
    local a = box.read_view.open()
    a:close()
end

local tm = clock.bench(function() for i = 1, test_iters do rv_func() end end)[1]
log.info(string.format("field count: %d, sec: %f", field_count, tm / test_iters))
os.exit()

Let's examine the open + close time of a single read view against field_count. The relationship appears piecewise linear. The most interesting features here are the three sharp performance jumps (slightly to the right of 2000, 4000, and 8000; in fact, this pattern appears near every power of 2, at least up to and including 256):

the open + close time of a single read view against `field_count`

tarantool_bench_10000

Please disregard that this graph and all subsequent ones say "Tarantool read_view.open() benchmark"; this is a typo. In reality, this is an open + close benchmark (exactly the script provided at the beginning).

As noted correctly by Gumix, most of the time (approximately 80-90% depending on field_count) is spent in tuple_dictionary_delete_hash.

This function is called approximately the same number of times regardless of field_count. The issue appears to be specifically in the relationship between the execution time of a single call and field_count.

A careful examination of tuple_dictionary_delete_hash quickly raises the question: why are we manually removing each element from the hashmap at all, if we immediately call mh_strnu32_delete afterward? All elements are stored in the hashmap inplace, so del doesn't invoke any memory deallocation functions. del also doesn't call destructors for the elements; it merely sets a specific bit in the bitmap to 0. After that, mh_strnu32_delete will delete the hashmap along with the array p containing all elements inplace.

It seems we could completely remove this loop and just keep a single call to mh_strnu32_delete? The second commit implements exactly that.

the open + close time of a single read view against `field_count` after this patch

tarantool_bench_after_the_second_commit

After the second commit, the contribution of lbox_read_view_close becomes so insignificant that it doesn't even appear on the flamegraph.

@Astronomax Astronomax marked this pull request as ready for review November 10, 2025 13:13
@Astronomax Astronomax requested a review from Gumix November 10, 2025 13:13
@Astronomax Astronomax assigned Gumix and unassigned Astronomax Nov 10, 2025
Copy link
Contributor

@Gumix Gumix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add the benchmark to perf/lua/?
Even if it doesn't test anything in CE, it would be useful for EE perf testing. See perf/lua/column_scan_module.c for reference.

@Gumix Gumix assigned Astronomax and unassigned Gumix Nov 10, 2025
@Astronomax Astronomax requested a review from nshy November 18, 2025 12:17
@nshy
Copy link
Contributor

nshy commented Nov 18, 2025

Nit. Let's mention that mh_first() has O(n) complexity to clarify why this loop take so much time.

@Astronomax Astronomax force-pushed the ghe-1043-read-view-close-perf branch 5 times, most recently from 00ced93 to b3ee80f Compare November 24, 2025 20:37
@nshy nshy removed their assignment Nov 25, 2025
@Gumix Gumix removed their assignment Nov 27, 2025
Inside `tuple_dictionary_delete_hash`, there was a loop that, on each
iteration, searched for the first non-deleted element using `mh_first`,
which has a complexity of `O(n)`, and deleted it using `mh_strnu32_del`.
After the loop finished executing, `tuple_dictionary_delete_hash`
immediately called `mh_strnu32_delete`.
All elements are stored in the hashmap inplace, so `del` doesn't invoke
any memory deallocation functions. `del` also doesn't call destructors for
the elements; it merely sets a specific bit in the bitmap to 0.
Immediately after the loop, `mh_strnu32_delete(hash)` will delete the
hashmap `hash` along with the array `hash->p` containing all elements
inplace.

This commit simply removes this useless loop, which was causing
significant performance issues when closing a read view.

Closes tarantool/tarantool-ee#1043, tarantool#12115

NO_DOC=performance improvement
NO_TEST=performance improvement
@Astronomax Astronomax force-pushed the ghe-1043-read-view-close-perf branch from b3ee80f to 8ab02e6 Compare December 8, 2025 09:24
@Astronomax Astronomax requested a review from nshy December 8, 2025 09:31
@Astronomax Astronomax assigned nshy and Gumix and unassigned Astronomax Dec 8, 2025
@nshy nshy removed their assignment Dec 8, 2025
@Gumix Gumix assigned Astronomax and unassigned Gumix Dec 8, 2025
@sergepetrenko sergepetrenko added backport/3.2 Automatically create a 3.2 backport PR backport/3.3 Automatically create a 3.3 backport PR backport/3.5 Automatically create a 3.5 backport PR full-ci Enables all tests for a pull request and removed full-ci Enables all tests for a pull request labels Dec 8, 2025
@sergepetrenko sergepetrenko merged commit 4b72007 into tarantool:master Dec 9, 2025
93 checks passed
@TarantoolBot
Copy link
Collaborator

Successfully created backport PR for release/3.2:

@TarantoolBot
Copy link
Collaborator

Successfully created backport PR for release/3.3:

@TarantoolBot
Copy link
Collaborator

Successfully created backport PR for release/3.5:

@TarantoolBot
Copy link
Collaborator

Backport summary

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport/3.2 Automatically create a 3.2 backport PR backport/3.3 Automatically create a 3.3 backport PR backport/3.5 Automatically create a 3.5 backport PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Slowdown in tuple_dictionary_delete_hash when updating space format or dropping space

7 participants