Introduce a new setting, use_parallel_replicas_for_cluster_engines, which we will enable by default.
When both use_parallel_replicas and use_parallel_replicas_for_cluster_engines are enabled, parallel_replicas_mode is task-based, and no other table in the query is distributed, any s3, url, hdfs, or azure engine in the query (that is, every file-like engine except the file engine) should be automatically transformed to its corresponding -Cluster engine.
The cluster for these engines should be controlled by the cluster_for_parallel_replicas setting, and the maximum number of servers should be controlled by the max_parallel_replicas setting.
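The rule above can be modelled as a small decision function. The following Python sketch is only an illustration of the proposed behaviour, not ClickHouse code: the `Settings` class, `rewrite_engines`, and `servers_used` are hypothetical names, the default for `use_parallel_replicas` is an assumption, and capping the server count with `min` is one plausible reading of "controlled by max_parallel_replicas". The proposal's "azure" is assumed to mean the `azureBlobStorage` table function.

```python
from dataclasses import dataclass

# File-like engines named in the proposal and their -Cluster counterparts.
# `file` is intentionally absent: the proposal excludes it.
CLUSTER_COUNTERPARTS = {
    "s3": "s3Cluster",
    "url": "urlCluster",
    "hdfs": "hdfsCluster",
    "azureBlobStorage": "azureBlobStorageCluster",  # "azure" in the proposal
}


@dataclass
class Settings:
    """Hypothetical container for the settings the proposal mentions."""
    use_parallel_replicas: bool = False                     # default assumed
    use_parallel_replicas_for_cluster_engines: bool = True  # proposed default
    task_based_mode: bool = True   # stands in for parallel_replicas_mode
    cluster_for_parallel_replicas: str = ""
    max_parallel_replicas: int = 1


def rewrite_engines(engines, has_distributed_table, settings):
    """Return the engine names after applying the proposed rewrite."""
    enabled = (
        settings.use_parallel_replicas
        and settings.use_parallel_replicas_for_cluster_engines
        and settings.task_based_mode
        and not has_distributed_table
    )
    if not enabled:
        return list(engines)
    # Engines without a -Cluster counterpart (e.g. `file`) are left as-is.
    return [CLUSTER_COUNTERPARTS.get(name, name) for name in engines]


def servers_used(cluster_size, settings):
    """Assumed interpretation: max_parallel_replicas caps the server count."""
    return min(cluster_size, settings.max_parallel_replicas)
```

Under this model, a query that reads from `s3` and `file` with all conditions met would have only the `s3` source rewritten, and any distributed table in the query disables the rewrite entirely.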
Additional context
The setting parallel_distributed_insert_select should be enabled by default.
It should be extended to all data lake engines, such as Iceberg.