Optimize aggregation performance of nullable String key when use AggregationMethodSerialized #51399
Conversation
|
Same as #46812. The current implementation can improve TPCH as a whole by 4%. |
71a1298 to
b9280d5
Compare
|
This is an automated comment for commit 65aeb05 with description of existing statuses. It's updated for the latest CI running
|
d35d8b8 to
abad35b
Compare
src/Columns/ColumnNullable.cpp
Outdated
There was a problem hiding this comment.
am I right that the whole idea of optimisation is to devirtualize serializeValueIntoArena? if so I think it would be better to just static_cast to ColumnString and call serializeValueIntoArena as usual.
There was a problem hiding this comment.
I'm not quite sure that the key to this optimization is devirtualize serializeValueIntoArena, and I've tested that this optimization has performance degradation on the number type. So need to determine whether the nested column is a ColumnString
There was a problem hiding this comment.
I'm not quite sure that the key to this optimization is devirtualize serializeValueIntoArena, and I've tested that this optimization has performance degradation on the number type. So need to determine whether the nested column is a ColumnString
There are some new discoveries that I will try again to apply in the fixed size column
There was a problem hiding this comment.
I would assume that calculating this condition for every element may hurt perf (especially for numeric types, because it is the fastest case). better would be to move this condition out of the hot loop
There was a problem hiding this comment.
@nickitat After my test, the performance improvement should be due to function inline and reduced allocContinue calls. If want to modify this part of the code to each column class, the amount of change will be very large
There was a problem hiding this comment.
I added two variables to ColumnNullable to represent the type of nested column. I thought about moving these two variables to HashMethodSerialized, but this optimization can only be used in aggregation, and other places where serializeIntoArea is used cannot obtain optimization effects. So I think it's more appropriate to place these two variables in ColumnNullable
|
If use getNestedColumn().valuesHaveFixedSize() to make a judgment, a lock instruction prefix is generated, which reduces performance |
|
@nickitat any suggestion? |
|
@alexey-milovidov @nickitat Is this change acceptable or are there any other suggestions |
my view is the following. ideally we want to have batch version of ClickHouse/src/Common/HashTable/HashTableKeyHolder.h Lines 128 to 134 in dea5cbc and for serialised aggregation saving memory is really important. |
|
@nickitat IIUC, we should change right? |
ecf8c82 to
94124d3
Compare
|
@nickitat I re-modified a version and added a new parameter There are still some problems with this change, some code can be reused, and generally it can guarantee that there is only one serialization algorithm for a type. If this solution is acceptable in general, I will further improve the code. If the method of modifying the interface is unacceptable, another way is to add additional functions for specific types (String, Number), and then call the function in nullable. If this method is acceptable, I can also implement it. |
|
current solution looks good to me, let's continue |
7e7ac09 to
0388dc3
Compare
af0ca7c to
65aeb05
Compare
|
Any comments? would we merge this PR? |
guys, I really apologies for delayed responding. busy with some other tasks. will take another look by the end of the week. |
|
I came across this PR when trying to traceback the mysterious "const UInt8 *" argument in serializeValueIntoArena. The optimization is likely not related to devirtualization, but to a reduced number of allocations. I've proposed a cleaner implementation in #55809. The complexity is constrained inside ColumnNullable instead of leaking around. @nickitat Could you help take a look? |






Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
optimize aggregation performance of Nullable String key when using AggregationMethodSerialized