Cluster engines (such as `s3Cluster`) should be used automatically with parallel replicas

Introduce a new setting, `use_parallel_replicas_for_cluster_engines`, which we will enable by default.

If the `use_parallel_replicas` and `use_parallel_replicas_for_cluster_engines` settings are enabled, `parallel_replicas_mode` is task-based, the query contains one of `s3`, `url`, `hdfs`, or `azure` (all file-like engines except the `file` engine), and none of the other tables in the query are distributed, then these engines should be automatically transformed to the corresponding `-Cluster` engines.
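The conditions above could be sketched as a single predicate. This is only an illustration of the proposed rule, not ClickHouse code: the `Settings` class, the `rewrite_table_function` helper, and the `"task_based"` mode label are hypothetical names made up for this example.

```python
from dataclasses import dataclass

# Engines named in the proposal; `file` is deliberately excluded.
CLUSTER_ELIGIBLE = {"s3", "url", "hdfs", "azure"}


@dataclass
class Settings:
    # Setting names follow the proposal; the defaults here are illustrative.
    use_parallel_replicas: bool = True
    use_parallel_replicas_for_cluster_engines: bool = True
    # Placeholder label for the task-based mode, not a real enum value.
    parallel_replicas_mode: str = "task_based"


def rewrite_table_function(name: str, settings: Settings, has_distributed_tables: bool) -> str:
    """Return the -Cluster variant when every condition holds, else the name unchanged."""
    eligible = (
        settings.use_parallel_replicas
        and settings.use_parallel_replicas_for_cluster_engines
        and settings.parallel_replicas_mode == "task_based"
        and name in CLUSTER_ELIGIBLE
        and not has_distributed_tables
    )
    return f"{name}Cluster" if eligible else name
```

For example, `rewrite_table_function("s3", Settings(), False)` yields `"s3Cluster"`, while `file`, or any query that also touches a distributed table, is left unchanged.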

The cluster for these engines should be controlled by the `cluster_for_parallel_replicas` setting, and the maximum number of servers should be controlled by the `max_parallel_replicas` setting.
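A rough sketch of how the two settings might interact: the cluster named by `cluster_for_parallel_replicas` supplies the candidate servers, and `max_parallel_replicas` caps how many are used. The `pick_replicas` helper and the choice to take the first servers in order are assumptions made for illustration; the proposal does not say how servers are selected.

```python
def pick_replicas(cluster_hosts: list[str], max_parallel_replicas: int) -> list[str]:
    """Cap the servers taken from the configured cluster at max_parallel_replicas."""
    if max_parallel_replicas < 1:
        raise ValueError("max_parallel_replicas must be at least 1")
    # Assumption: take servers in their listed order; real selection may differ.
    return cluster_hosts[:max_parallel_replicas]
```

With a four-server cluster and `max_parallel_replicas = 3`, only three servers would take part; with a two-server cluster, both would.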

**Additional context**

The setting `parallel_distributed_insert_select` should be enabled by default.

This automatic transformation should also be extended to all data lake engines, such as Iceberg.


Cluster engines (such as `s3Cluster`) should be used automatically with parallel replicas #65024
