Query cache: Enable compression of cache entries#45912

Merged
rschu1ze merged 6 commits into master from
qc-compression
Apr 3, 2023

@rschu1ze
Member

@rschu1ze rschu1ze commented Feb 1, 2023

This PR changes two things in the query cache:

  • ClickHouse reads table data in blocks of max_block_size rows, but because of filtering, aggregation, etc., result blocks are typically much smaller than max_block_size (and sometimes bigger). The new setting query_cache_squash_partial_results (enabled by default) controls whether result blocks are squashed (if they are tiny) or split (if they are large) into blocks of max_block_size rows before insertion into the query result cache. This slows down writes into the query cache but improves the compressibility of cache entries and provides a "natural" block granularity when query results are later served from the query cache.

  • Entries in the query cache are now compressed by default. This reduces the overall memory consumption at the cost of slower writes into / reads from the query cache. To disable compression, use the new setting query_cache_compress_entries.
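
To illustrate the two changes above, here is a minimal Python sketch of the idea, not the actual implementation in this PR. The helper names rechunk and compress_block are hypothetical, rows are modeled as plain strings, and zlib is only a stand-in codec chosen for the example; it says nothing about which codec the query cache really uses.

```python
import zlib
from typing import Iterable, Iterator, List


def rechunk(blocks: Iterable[List[str]], max_block_size: int) -> Iterator[List[str]]:
    """Squash tiny blocks and split large ones into blocks of max_block_size rows.

    Hypothetical helper for illustration only; the last block may be smaller.
    """
    if max_block_size <= 0:
        raise ValueError("max_block_size must be positive")
    pending: List[str] = []
    for block in blocks:
        pending.extend(block)
        # Emit full-sized blocks as soon as enough rows have accumulated.
        while len(pending) >= max_block_size:
            yield pending[:max_block_size]
            pending = pending[max_block_size:]
    if pending:
        yield pending  # remainder that did not fill a whole block


def compress_block(block: List[str]) -> bytes:
    """Compress one block; zlib is an assumed stand-in codec."""
    return zlib.compress("\n".join(block).encode("utf-8"))


# Result blocks of 3, 10 and 2 rows, re-chunked to blocks of 4 rows.
result_blocks = [["a"] * 3, ["b"] * 10, ["c"] * 2]
cache_entry = [compress_block(b) for b in rechunk(result_blocks, max_block_size=4)]
```

The extra pass over the rows is the write-time cost mentioned above; in exchange, each compressed entry covers a regular number of rows and the cache serves blocks of a predictable size.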

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Entries in the query cache are now squashed to max_block_size and compressed

@rschu1ze rschu1ze added the do not test disable testing on pull request label Feb 1, 2023
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-improvement Pull request with some product improvements label Feb 1, 2023
@rschu1ze rschu1ze changed the title (wip) Implement compression of query cache entries (wip) Query cache: Enable compression of cache entries Mar 14, 2023
@rschu1ze rschu1ze removed the do not test disable testing on pull request label Mar 14, 2023
@rschu1ze rschu1ze force-pushed the qc-compression branch 3 times, most recently from b7e30f2 to 2db2fba March 19, 2023 14:25
@rschu1ze rschu1ze changed the title (wip) Query cache: Enable compression of cache entries Query cache: Enable compression of cache entries Mar 20, 2023
@rschu1ze rschu1ze marked this pull request as ready for review March 20, 2023 19:55
ClickHouse reads table data in blocks of 'max_block_size' rows. Due to
filtering, aggregation, etc., result blocks are typically much smaller
than 'max_block_size' but there are also cases where they are much
bigger. Setting 'query_cache_squash_partial_results' (enabled by
default) now controls if result blocks are squashed (if they are tiny)
or split (if they are large) into blocks of 'max_block_size' size before
insertion into the query result cache. This reduces performance of
writes into the query cache but improves compressibility of cache
entries and provides more natural block granularity when query results
are later served from the query cache.

Entries in the query cache are now also compressed by default. This
reduces the overall memory consumption at the cost of slower writes into
/ reads from the query cache. To disable compression, use setting
'query_cache_compress_entries'.
@SmitaRKulkarni SmitaRKulkarni self-assigned this Mar 31, 2023
Member

@SmitaRKulkarni SmitaRKulkarni left a comment

Rest all LGTM

@rschu1ze rschu1ze merged commit a50e741 into master Apr 3, 2023
@rschu1ze rschu1ze deleted the qc-compression branch April 3, 2023 11:33
@CurtizJ
Member

CurtizJ commented Apr 3, 2023

See #48358 (comment).

rschu1ze added a commit that referenced this pull request Apr 3, 2023
Failing with high rate in master after #45912 was merged

-- Create test table with lots of rows
CREATE TABLE t(c String) ENGINE=MergeTree ORDER BY c;
INSERT INTO t values ('abc') ('def') ('abc') ('jkl');
Member

@devcrafter devcrafter Apr 3, 2023

Please

INSERT INTO t SELECT multiIf(n = 0, 'abc', n = 1, 'def', n = 2, 'abc', 'jkl')
FROM
(
    SELECT number % 4 AS n
    FROM numbers(4 * 400)
)

Member Author

#48373, thanks.

rschu1ze added a commit that referenced this pull request Apr 4, 2023
- clear query cache at the end of the tests to minimize interaction with
  other query cache tests

- generate data more elegantly
rschu1ze added a commit that referenced this pull request Apr 4, 2023
Labels

pr-improvement Pull request with some product improvements

6 participants