Umbrella task for memory allocators #34157
Description
We can easily add and test various malloc implementations with ClickHouse.
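As a rough illustration of what "testing an allocator" can mean outside the ClickHouse performance suite, here is a minimal sketch of an allocator-agnostic timing helper. It only calls `malloc`/`free`, so it measures whichever allocator is linked in or preloaded. The name `timeAllocFree` and the escape variable are hypothetical and are not part of ClickHouse; the caveats listed in this issue apply to any number it produces.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical escape hatch: storing the pointer here keeps the compiler
// from optimizing the malloc/free pair away.
static void * volatile g_escape = nullptr;

// Hypothetical helper (not ClickHouse code): time `iterations` rounds of
// malloc + memset + free for one block size (size must be > 0).
// Returns elapsed nanoseconds, or -1 if an allocation failed.
inline long long timeAllocFree(std::size_t size, int iterations)
{
    using Clock = std::chrono::steady_clock;
    const auto start = Clock::now();
    for (int i = 0; i < iterations; ++i)
    {
        void * p = std::malloc(size);
        if (!p)
            return -1;
        g_escape = p;                     // make the pointer observable
        std::memset(p, i & 0xFF, size);   // touch the memory so it is really mapped
        std::free(p);
    }
    const auto stop = Clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling, say, `timeAllocFree(1 << 20, 1000)` under two different allocators gives two numbers to compare, but only for this one synthetic pattern.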
Caveats
If an allocator shows better or worse results on ClickHouse performance tests, that does not mean it is better or worse in general. It only shows that it was better or worse on a slightly representative subset of one specific workload.
Sometimes performance depends mostly on thresholds or other knobs inside the allocator. It would be a pity to reject an allocator when a small change to one threshold is all it needs to perform well.
Automated performance tests in ClickHouse do not cover long-term memory fragmentation or performance over long runs. We cannot test that without deploying to production.
The performance of a memory allocator depends on the specific CPU and CPU model (example: the internal mapping used by CPU caches may affect false sharing). But the automated ClickHouse performance tests use only a single type of machine in AWS.
ClickHouse does not depend much on the allocator's performance for frequent small allocations, because we don't do frequent small allocations.
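To make the false-sharing caveat concrete, here is a small layout sketch. It assumes 64-byte cache lines (typical for x86-64, but not universal), and the names `PackedCounters`, `PaddedCounters` and `sameCacheLine` are hypothetical. It only shows when two objects land on the same assumed line; whether that actually hurts depends on the CPU, which is the point of the caveat.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Real line size depends on the CPU.
constexpr std::size_t kAssumedCacheLine = 64;

// Two counters laid out back to back; they will usually share one line,
// so two threads updating them separately can still contend.
struct PackedCounters
{
    std::atomic<std::uint64_t> a{0};
    std::atomic<std::uint64_t> b{0};
};

// The same counters, each forced onto its own assumed cache line.
struct PaddedCounters
{
    alignas(kAssumedCacheLine) std::atomic<std::uint64_t> a{0};
    alignas(kAssumedCacheLine) std::atomic<std::uint64_t> b{0};
};

// Hypothetical helper: true if both addresses fall into the same
// kAssumedCacheLine-sized, kAssumedCacheLine-aligned block.
inline bool sameCacheLine(const void * x, const void * y)
{
    const auto lx = reinterpret_cast<std::uintptr_t>(x) / kAssumedCacheLine;
    const auto ly = reinterpret_cast<std::uintptr_t>(y) / kAssumedCacheLine;
    return lx == ly;
}
```

An allocator that packs small objects from different threads tightly can produce the `PackedCounters` situation on the heap, while one that separates them per thread or per size class may avoid it.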
What allocators to test
jemalloc - the current choice.
Note: one friendly company uses a decade-old version of jemalloc in their monorepository, and their developers have kept the wrong impression that jemalloc is bad. jemalloc has a long history, is well maintained, and is constantly improved. We use it with some patches.
tcmalloc - the new tcmalloc should not be confused with the old tcmalloc.
Tested here: #11590, without success due to the specific memory allocation pattern in ClickHouse.
mimalloc
Tested here: #5775, but that was just after its release, when it was immature.
Those results are probably no longer relevant.
lfalloc
Allocator from Yandex, can be found here: https://github.com/catboost/catboost/tree/master/library/cpp/lfalloc
It is not used due to a large number of page faults for medium-sized allocations.
Many teams have made their own modifications under names such as "NUMA-Aware lfalloc for KikiMR", "lfalloc for YT", etc.
hualloc
New allocator based on an experimental design, tested here: #31376
Intel's oneTBB
rpmalloc
Add more variants here.
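When trying a new variant, the page-fault concern mentioned for lfalloc above can be checked with a quick sketch like the following. It uses the POSIX `getrusage` call and its `ru_minflt` field, so it assumes a Linux-like system. The helper names, the 4 KiB page size, and whatever block size counts as "medium-sized" are assumptions for illustration, not values taken from this issue.

```cpp
#include <sys/resource.h>   // getrusage, struct rusage (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Minor page faults of the current process so far (0 if getrusage fails).
inline long minorFaultsSoFar()
{
    struct rusage usage{};
    getrusage(RUSAGE_SELF, &usage);
    return usage.ru_minflt;
}

// Hypothetical helper: allocate `count` blocks of `size` bytes one after
// another, write one byte per assumed 4 KiB page, free each block, and
// return how many minor page faults happened in between.
// Returns -1 if an allocation failed.
inline long faultsForAllocations(std::size_t size, int count)
{
    constexpr std::size_t kAssumedPage = 4096;   // assumption: 4 KiB pages
    const long before = minorFaultsSoFar();
    for (int i = 0; i < count; ++i)
    {
        auto * p = static_cast<volatile unsigned char *>(std::malloc(size));
        if (!p)
            return -1;
        for (std::size_t off = 0; off < size; off += kAssumedPage)
            p[off] = 1;                          // volatile write, cannot be elided
        std::free(const_cast<unsigned char *>(p));
    }
    return minorFaultsSoFar() - before;
}
```

An allocator that returns memory to the OS on every `free` will tend to report a fault count that grows with `count`, while one that caches freed blocks should stay closer to the cost of the first allocation.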