Automatic GROUP/ORDER BY to disk based on the memory usage #71406
Michicosun merged 17 commits into ClickHouse:master from
Conversation
This is an automated comment for commit dad435a with a description of existing statuses.
d392357 to 8b3ce12
b72e705 to 2c52da9
Fixed, and also added a separate setting for
19cdf22 to 2df43d4
2df43d4 to a429980
a429980 to b33be86
edd5c4f to a773d3d
Wow, green CI, haven't seen this for a while.
I think we can merge the per-query setting (max_bytes_ratio_before_external_group_by), because it is a sugar extension for the currently used max_bytes_before_external_group_by.
Or use the method suggested in the issue and compute the memory limit for the query once, before it starts, as ratio_setting * remaining_memory_amount, and use it similarly to max_bytes_before_external_group_by. In this case, to compute remaining_memory_amount, we need to find the memory tracker in the hierarchy with the strictest hard limit and check its occupancy.
What do you think? I agree that it is a kind of configuration sugar, but not completely. The difference is that this ratio is enforced all the time, not only at query start (as described in #69286; the ~10% of cases not covered by that proposal are covered by these patches), and I do find it useful: imagine a situation where the user spawned 100 queries at one point in time that will eventually require more RAM than the node has.
Again, I don't really like this idea, because it does not cover the case of query peaks. I also looked through the more advanced functionality described in #41887, but I'm not sure the complexity described there is worth it, taking into account that these patches cover 99% of cases. P.S. I also followed the same convention as some other settings, like:
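The "compute once at query start" approach discussed above can be sketched as follows. This is a purely illustrative Python model (the `MemoryTracker` class and `remaining_memory_amount` helper are hypothetical names, not ClickHouse's actual C++ API): walk the tracker hierarchy to the strictest hard limit and apply the ratio to the remaining headroom.

```python
# Illustrative sketch, not ClickHouse code: find the strictest hard limit
# in a query -> user -> server tracker chain and compute a byte budget.
from dataclasses import dataclass
from typing import Optional

@dataclass
class MemoryTracker:
    used: int                      # bytes currently tracked at this level
    hard_limit: Optional[int]      # None means "no limit at this level"
    parent: Optional["MemoryTracker"] = None

def remaining_memory_amount(tracker: MemoryTracker) -> Optional[int]:
    """Smallest (hard_limit - used) over the whole tracker chain."""
    best = None
    node = tracker
    while node is not None:
        if node.hard_limit is not None:
            remaining = node.hard_limit - node.used
            if best is None or remaining < best:
                best = remaining
        node = node.parent
    return best

# Limit computed once at query start, as the reviewer suggests:
server = MemoryTracker(used=6 * 2**30, hard_limit=10 * 2**30)
user = MemoryTracker(used=1 * 2**30, hard_limit=8 * 2**30, parent=server)
query = MemoryTracker(used=0, hard_limit=None, parent=user)

ratio = 0.5
limit_bytes = int(ratio * remaining_memory_amount(query))  # 0.5 * 4 GiB
```

The server tracker is the strictest here (4 GiB headroom vs the user's 7 GiB), so the query gets a 2 GiB spill budget. The weakness the author points out is that this budget is frozen at query start and ignores later memory peaks.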
a773d3d to 9534a35
I meant to merge the changes with this setting (
If you set … But I agree that it is more convenient than the previous one.
Right now you have implemented 2 new types of settings:
Obviously, the first can't solve #69286 because it has the old logic. The second, on the other hand, affects all queries at the same time, so it is not a per-query setting. For example, if the limit is 0.7 and current memory consumption is 0.69, a newly started query will spill to disk right after it starts, because total memory consumption will exceed 0.7. In the proposed solution it is important to recalculate the limit from the free memory available to the current user. So why does your solution handle 99% of cases?
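The reviewer's 0.7/0.69 example can be made concrete with a toy calculation (the numbers come from the comment above; the total memory figure and function names are illustrative assumptions, not ClickHouse code):

```python
# Toy illustration of the reviewer's point: a server-wide ratio threshold
# makes a freshly started query spill immediately once *total* usage
# crosses the ratio, regardless of how little the query itself has used.
total_memory = 100 * 2**30      # assume a 100 GiB hard limit for the example
ratio = 0.7                     # max_bytes_ratio_before_external_* setting
current_usage = 69 * 2**30      # 0.69 of total, used by other queries

def should_spill(total_used: int) -> bool:
    return total_used > ratio * total_memory

new_query_usage = 2 * 2**30     # the new query allocates just 2 GiB...
assert not should_spill(current_usage)
assert should_spill(current_usage + new_query_usage)  # ...and trips the check

# The alternative proposed in the issue: give each query its own budget,
# recalculated from the free memory at query start.
per_query_budget = ratio * (total_memory - current_usage)  # 0.7 * 31 GiB
```

Under the server-wide check, the new query spills almost immediately even though it has barely used any memory; the recalculated per-query budget would instead let it use about 21.7 GiB before spilling.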
Thanks, I guess now I understand your point and it does make sense; let's verify that I understood you correctly:
But it is still unclear what you are suggesting to merge.
I haven't thought about this case; my point was that #41887 looks too complex to me (with an extra hook), and I think it is better to try a dumb implementation first (like this patch, but using per-user memory limits).
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
…r_server under TSan
The problem is that TSan has some memory overhead which increases the
process RSS to 10Gi in the middle of the query, so it fails.
Another option is to avoid syncing with RSS too frequently, but I doubt
that it is important to run this test under TSan.
v1: increase memory to 20Gi
Fixes: https://s3.amazonaws.com/clickhouse-test-reports/71406/b33be86dee3c7616e9193c339837b9e250810557/integration_tests__tsan__[6_6].html
v2: instead just disable under TSan
Fixes: https://s3.amazonaws.com/clickhouse-test-reports/71406/df68cf7362d825c3d83991fd74c391a49e73a8b1/integration_tests__tsan__[6_6].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
…storage check Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
…::mergeOnBlock() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
…eded() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
- remove max_bytes_ratio_before_external_{order,group_by}_for_server
- change the way max_bytes_ratio_before_external_{order,group_by} works
Note that it is not enough to transform the ratio to bytes in
executeQuery(), since in that case it will not work for merges and
internal queries; plus, you would have to reset the settings for the
Distributed engine and update them for Merge/View/...
This patch also introduces some helpers (see MemoryTrackerUtils) and
adjusts the Aggregator::Params constructor to accept a Settings object
instead of tons of arguments.
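The idea of resolving the ratio at the point of use, rather than once in executeQuery(), can be sketched like this. This is a hypothetical Python model (function name and the precedence between the absolute and ratio settings are assumptions for illustration, not ClickHouse's actual behavior):

```python
# Illustrative sketch: resolve the ratio to a byte threshold wherever the
# spill decision is made, so every consumer (queries, merges, internal
# queries) sees a correct limit. Converting once in executeQuery() would
# miss the non-query paths entirely.
def external_spill_threshold(ratio: float, hard_limit_bytes: int,
                             max_bytes_setting: int) -> int:
    """Effective 'spill to disk' threshold in bytes.

    Assumed precedence for this sketch: a non-zero absolute setting
    (e.g. max_bytes_before_external_group_by) wins; otherwise the ratio
    is applied to the strictest hard limit in the tracker hierarchy.
    0 means "never spill automatically".
    """
    if max_bytes_setting:
        return max_bytes_setting
    if ratio:
        return int(ratio * hard_limit_bytes)
    return 0

# Both a query and a background merge can call this with their own limit:
query_threshold = external_spill_threshold(0.5, 8 * 2**30, 0)     # 4 GiB
merge_threshold = external_spill_threshold(0.5, 8 * 2**30, 2**30)  # 1 GiB
```

Because the helper takes the current hard limit as an argument, each caller (query pipeline, merge, internal query) evaluates the ratio against its own tracker, which is the property the commit message describes.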
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
…it is not configured Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
af822de to dad435a
I've rebased on top of upstream (to fix conflicts in settings changes), but the changes have been done on top for easier incremental review.
* [GLUTEN-1632][CH] Daily Update Clickhouse Version (20241130)
* Fix build due to ClickHouse/ClickHouse#71406
* Fix build due to ClickHouse/ClickHouse#72460
Co-authored-by: kyligence-git <gluten@kyligence.io>
Co-authored-by: Chang Chen <baibaichen@gmail.com>
…ry consumption Make max_bytes_before_external_sort more user-friendly: previously it was the number of bytes in the sorting block for one sorting thread; now it has the same meaning as max_bytes_before_external_group_by, i.e. a total limit for the whole query memory across all threads. But after this change the files on disk can be very small (a few kilobytes), so to avoid too many files I've added a separate setting to control the on-disk block size: min_external_sort_block_bytes. Note that max_bytes_ratio_before_external_sort is already based on memory consumption, not the sorting block size (added in ClickHouse#71406). Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
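The motivation for a minimum on-disk block size can be sketched with a small coalescing model (purely illustrative; `pack_blocks` is a hypothetical name, not ClickHouse's implementation):

```python
# Hypothetical sketch of why min_external_sort_block_bytes helps: without
# a minimum, spilling tiny sort blocks produces many small files.
def pack_blocks(block_sizes, min_block_bytes):
    """Coalesce in-memory block sizes into on-disk chunks of at least
    min_block_bytes each (except possibly the last)."""
    chunks, current = [], 0
    for size in block_sizes:
        current += size
        if current >= min_block_bytes:
            chunks.append(current)
            current = 0
    if current:
        chunks.append(current)
    return chunks

# 10 blocks of 4 KiB each, coalesced to >= 16 KiB chunks: 3 files, not 10.
print(pack_blocks([4096] * 10, 16384))  # [16384, 16384, 8192]
```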
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Automatic GROUP BY/ORDER BY to disk based on the server/user memory usage. Controlled with max_bytes_ratio_before_external_group_by / max_bytes_ratio_before_external_sort query settings.
Fixes: #69286