read view closing performance improvement #12019
sergepetrenko merged 1 commit into tarantool:master
sergepetrenko
left a comment
Looks valid. It's strange that such an obvious error wasn't noticed earlier.
Let's consider the following benchmark:

```lua
local log = require('log')
local clock = require('clock')

local field_count = tonumber(arg[1])
local test_iters = tonumber(arg[2])

os.execute('rm -rf *.snap *.xlog')
box.cfg{}

local format = {}
for i = 1, field_count do
    table.insert(format, {name = 'field' .. i, type = 'uint64'})
end

box.schema.space.create('test', {
    engine = 'memcs',
    format = format,
    field_count = field_count,
})

local rv_func = function()
    local a = box.read_view.open()
    a:close()
end

local tm = clock.bench(function()
    for i = 1, test_iters do rv_func() end
end)[1]
log.info(string.format("field count: %d, sec: %f", field_count, tm / test_iters))
os.exit()
```

Let's examine the open + close time of a single read view against

Please disregard that this graph and all subsequent ones say "Tarantool read_view.open() benchmark"; this is a typo. In reality, this is an open + close benchmark (exactly the script provided at the beginning).

As correctly noted by Gumix, most of the time (approximately 80-90%, depending on field_count) is spent in `tuple_dictionary_delete_hash`. This function is called approximately the same number of times regardless of

A careful examination of

It seems we could completely remove this loop and just keep a single call to

After the second commit, the contribution of
Gumix
left a comment
Could you please add the benchmark to `perf/lua/`?
Even if it doesn't test anything in CE, it would be useful for EE perf testing. See `perf/lua/column_scan_module.c` for reference.
Nit. Let's mention that
Inside `tuple_dictionary_delete_hash`, there was a loop that, on each iteration, searched for the first non-deleted element using `mh_first`, which has a complexity of `O(n)`, and deleted it using `mh_strnu32_del`. After the loop finished executing, `tuple_dictionary_delete_hash` immediately called `mh_strnu32_delete`.

All elements are stored in the hashmap inplace, so `del` doesn't invoke any memory deallocation functions. `del` also doesn't call destructors for the elements; it merely sets a specific bit in the bitmap to 0. Immediately after the loop, `mh_strnu32_delete(hash)` deletes the hashmap `hash` along with the array `hash->p` containing all elements inplace.

This commit simply removes this useless loop, which was causing significant performance issues when closing a read view.

Closes tarantool/tarantool-ee#1043, tarantool#12115

NO_DOC=performance improvement
NO_TEST=performance improvement
Successfully created backport PR for
Successfully created backport PR for
Successfully created backport PR for

Backport summary

