Skip to content

Query result caching for sub-queries#76252

Open
nbarannik wants to merge 37 commits intoClickHouse:masterfrom
nbarannik:query_cache_for_subqueries
Open

Query result caching for sub-queries#76252
nbarannik wants to merge 37 commits intoClickHouse:masterfrom
nbarannik:query_cache_for_subqueries

Conversation

@nbarannik
Copy link
Copy Markdown

@nbarannik nbarannik commented Feb 17, 2025

Resolves #51656

Settings use_query_cache = true, query_cache_for_subqueries = true now enable caching of subquery results in the query cache.

Example:

SELECT some_expensive_calculation(column_1, column_2)
FROM table
SETTINGS use_query_cache = true, query_cache_for_subqueries = true;

-- returns cached results
SELECT column
FROM (
   SELECT some_expensive_calculation(column_1, column_2)
   FROM table
)
SETTINGS use_query_cache = true, query_cache_for_subqueries = true;

-- returns cached results
SELECT column
FROM table
WHERE column IN (
   SELECT some_expensive_calculation(column_1, column_2)
   FROM table
)
SETTINGS use_query_cache = true, query_cache_for_subqueries = true;

The field is_subquery was added to the system.query_cache table. It is set to true for all records related to sub-queries (including the main query that was cached as a sub-query for future use)

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Add subqueries caching in query cache

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Feb 17, 2025

CLA assistant check
All committers have signed the CLA.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Feb 17, 2025
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Feb 17, 2025

Workflow [PR], commit [86d636e]

@clickhouse-gh clickhouse-gh bot added the pr-feature Pull request with new product feature label Feb 17, 2025
@rschu1ze rschu1ze self-assigned this Feb 17, 2025
@rschu1ze rschu1ze changed the title Query cache for subqueries Query result caching for sub-queries Feb 18, 2025
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Apr 15, 2025

Dear @rschu1ze, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

@fastio
Copy link
Copy Markdown
Contributor

fastio commented Apr 25, 2025

@rschu1ze Hi, are you still working on this PR? I really need this PR as it can improve query performance.

@rschu1ze
Copy link
Copy Markdown
Member

@fastio Not at the moment but it is still on my radar.

@alexey-milovidov
Copy link
Copy Markdown
Member

I also need this PR, it is an amazing feature!

@movy
Copy link
Copy Markdown

movy commented Jul 14, 2025

Any progress on this one? It looks very promising!

@alexey-milovidov
Copy link
Copy Markdown
Member

+1, very eager to see it merged! It is in the ClickHouse Roadmap 2025.

@balajisraghavan
Copy link
Copy Markdown

+1 a very useful feature. Looking forward to using this in production

@alexey-milovidov
Copy link
Copy Markdown
Member

alexey-milovidov commented Nov 9, 2025

It appears that this does not offer flexibility for selectively enabling caching for one or another subquery.

Let's do it this way:

  1. Make sure that the use_query_cache setting does not automatically propagate to subqueries by default. So if it is set outside, then in the subqueries, it is considered unset by default.
  2. Users can enable the setting use_query_cache explicitly inside subqueries to turn on/off the caching.
  3. The new setting query_cache_for_subqueries will change this default and enable the propagation of use_query_cache into all subqueries by default.

@kiran-4444
Copy link
Copy Markdown

Hey @rschu1ze can I continue this if you're busy? I'd like to complete this feature implementation.

@alexey-milovidov
Copy link
Copy Markdown
Member

@kiran-4444, yes, this will be amazing! You can create a new pull request that contains all the commits from this one and adds your commits. Also, take the comment above about the logic of this feature.

vvo added a commit to vvo/ClickHouse that referenced this pull request Mar 17, 2026
Implement CTO feedback on PR ClickHouse#76252: use_query_cache on the outer query
no longer auto-propagates to subqueries. Instead:

1. No auto-propagation: use_query_cache on outer query does NOT propagate
   to subqueries by default
2. Explicit per-subquery opt-in: SETTINGS use_query_cache = true on a
   specific subquery enables caching for that subquery only
3. Bulk propagation: query_cache_for_subqueries = true enables propagation
   of use_query_cache into ALL subqueries (existing behavior)

Changes:
- Planner.cpp: Add shouldUseQueryCacheForSubquery() helper implementing
  the three-rule decision logic for cache read/write
- ExecuteScalarSubqueriesVisitor.cpp: Add shouldCacheScalarSubquery()
  checking explicit subquery SETTINGS before falling back to propagation
- PreparedSets.cpp: Gate IN-subquery cache finalization on
  query_cache_for_subqueries setting
- New test: 03381_query_result_cache_subquery_settings_propagation
  covering all 4 scenarios (no propagation, explicit opt-in, bulk
  propagation, explicit opt-out override)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vvo added a commit to vvo/ClickHouse that referenced this pull request Mar 17, 2026
Implement CTO feedback on PR ClickHouse#76252: `use_query_cache` on the outer
query no longer auto-propagates to subqueries. Instead:

1. No auto-propagation: `use_query_cache` on outer query does NOT
   propagate to subqueries by default
2. Explicit per-subquery opt-in: `SETTINGS use_query_cache = true` on
   a specific subquery enables caching for that subquery only
3. Bulk propagation: `query_cache_for_subqueries = true` enables
   propagation of `use_query_cache` into ALL subqueries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vvo
Copy link
Copy Markdown

vvo commented Mar 17, 2026

@alexey-milovidov I too would love this feature (example: metadata table in a logs UI to fetch a list of projects: can be cached as a subquery like where projects IN ...).

I fired #99804 which is based on your comment in #76252 (comment), let me know what you think 🙏

vvo added a commit to vvo/ClickHouse that referenced this pull request Mar 17, 2026
Implement CTO feedback on PR ClickHouse#76252: `use_query_cache` on the outer
query no longer auto-propagates to subqueries. Instead:

1. No auto-propagation: `use_query_cache` on outer query does NOT
   propagate to subqueries by default
2. Explicit per-subquery opt-in: `SETTINGS use_query_cache = true` on
   a specific subquery enables caching for that subquery only
3. Bulk propagation: `query_cache_for_subqueries = true` enables
   propagation of `use_query_cache` into ALL subqueries

Key implementation details:
- Gate subquery cache on `select_query_options.is_subquery` to prevent
  outer queries from going through the subquery cache path
- Set context `canUseQueryResultCache` flag for explicit per-subquery
  opt-in so `checkCanWriteQueryResultCache` works
- Always call `finalizeWriteInQueryResultCache` to support subqueries
  that explicitly opt in independently of the outer query

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-feature Pull request with new product feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can the features of Query Cache support subqueries?

9 participants