Roadmap 2025 #74046
Closed
This is the ClickHouse roadmap for 2025.
This roadmap does not cover tasks related to infrastructure, orchestration, documentation, marketing, external integrations, drivers, etc.
See also:
Roadmap 2024: #58392
Roadmap 2023: #44767
Roadmap 2022: #32513
Roadmap 2021: #17623
Roadmap 2020: link
Data Lakes
- Automatic use of cluster functions Use cluster table functions automatically if parallel replicas are enabled #70659
- Parallel distributed INSERT SELECT by default Enable parallel distributed insert select by default #80425
- Enable Hive-style partitioning by default Enable use_hive_partitioning by default #71636
- Uniform distribution of load across cluster s3Cluster is suboptimal when the number of files is comparable to the number of servers #70190
- Partition pruning for Iceberg Iceberg Partition Pruning for time-related partition transforms #72044
- Prewhere support for Parquet [Feature] Add support for prewhere in parquet native reader #65527
- Unify different Parquet readers into one A new parquet reader that supports filter push down, which improves the total time on clickbench by 50+% compared to arrow parquet reader #70611
- Writing bloom filters for Parquet Write Parquet bloom filters #71681
- Time travel for Iceberg Add setting to query Iceberg tables as of a specific timestamp #71072
- Support for deletions in Iceberg Cannot read Iceberg table: positional and equality deletes are not supported #66588 support Iceberg equality deletes #83627
- In-memory metadata cache A cache for data lake's metadata #71579
- Integration with Delta kernel Integrate with delta_kernel #75255
- Glue catalog support Add glue catalog integration #77257
- Unity catalog support Unity catalog integration #76988
- Support for Delta Lake on Azure Support deltalake for AzureBlobStorage #74541
- Materialized views on top of data lakes
- Support arbitrary nesting in database and table names Support arbitrary nesting in database and table names. #71171
- Writing into data lakes Support for Writing to Apache Iceberg Tables in ClickHouse #49973
- Background merges for data lakes ☁
- Subsets of a large file as a unit of work
Query Engine
- Non-constant CASE, non-constant IN Support non-constant second argument of IN #65398
- Using a partition key to optimize JOINs
- Optimization of anti-joins: LEFT JOIN ... WHERE ... IS NULL to NOT IN
- Deriving index condition from the right-hand side of INNER JOIN
- Automatic JOINs in external memory, if needed
- Optimization for certain cases of distributed JOINs
- JOIN reordering based on finer-grained statistics
- Correlated subqueries with decorrelation Analyzer: identifier resolution in parent scopes #66143 Support correlated subqueries in WHERE clause #76078
- PREWHERE to work with the FINAL clause Additional primary key scan for skip index FINAL queries #70210
- Secondary indices to work with the FINAL clause Secondary indexes should be used for queries with FINAL #70292
- Support for certain SQL functions without parentheses Allow certain functions without parentheses in SQL #52102
- Unification of table structures in Merge tables Improve usability of Merge tables #73956
- Optimizations for a single dictionary in LowCardinality [RFC] Use global LowCardinality dictionary for optimizations if it is small enough #72717
- Materialized CTE Materialized CTE #61086
- Query conditions cache Implement Query Condition Cache #69236
- Userspace page cache Userspace page cache v2 #70509
- Lazy columns Optimize performance with lazy projection to avoid reading unused columns #55518
- Block-level hints WIP Add Sorting Optimization with General Data Hints #48800
- Query cache for partial results Use query cache when a query shares most of the query pipeline. #57490
- On-disk query cache Allow using cache disk for query result cache. #52141
- Streaming queries Streaming queries model with cursors #63312
Data Storage
- JSON data type from beta to production JSON/Dynamic/Variant are production-ready #77785 See https://jsonbench.com/
- Unique key constraint RFC: Unique key constraint #70589
- Lightweight updates with patch parts Lightweight Updates with patch parts #82004
- Automatic LowCardinality columns Automatic LowCardinality representation of columns #69916
- Secondary indices by default Add minmax indices by default #72090
- ShardedMap data type Add shards to data type Map #47045
- Time data type [WIP] Implementing Time/Time64 data types #71943
- Unify statistics with secondary indices
- Production-ready full-text index Inverted Text Index v2 #86485
- Functions and codecs for quantization
- Production-ready vector search
- Allow instantly attaching backups Backup disk #64222
- Transactions for Replicated tables
Interfaces
- Random early drop on overload Reject queries when the server is overloaded #63206
- Resource scheduler by CPU CPU scheduling for workloads #77595
- Support for PromQL Basic support for the PromQL dialect #75036
- Configuration of query handlers with SQL
- Remote and Cloud database engines Remote database engine #59304
- Drivers for UDFs Drivers for User-Defined Functions #71172
- API for query construction Add combining http parameters with queries, new handler for accessing tables as files #64336
- Even simpler data upload Even simpler data upload. #38775
- Passing input data to the url table function Support input data for reading from the URL table engine #45994
- Support for TLS in the PostgreSQL wire protocol Support TLS for Postgres wire protocol #73812
- Support for HTTP Event-Stream
- Persistent databases in clickhouse-local Persistent databases in clickhouse-local #71722
- Implicit tables in clickhouse-local clickhouse-local: if there is input, use it as a default table in SELECT instead of system.one #65023
- SSH protocol for the server SSH interface with PTY #48902
- Predictive autocomplete in the CLI Autocomplete #69641
Cleanups
- Remove old analyzer
- Remove old predicate pushdown mechanics
- Remove Object data type in favor of JSON
- Remove Window View in favor of streaming queries
- Remove Live View in favor of streaming queries
- Remove the old cache for Hive Remove obsolete cache for Hive #77795
- Remove the "send metadata" feature Remove send_metadata logic related to zero-copy replication #82508