S3 Table with local table joins don't work in one direction on distributed tables #52022

@EDsCODE

Describe what's wrong
Joining a remote table (read through s3Cluster) to a local table fails with an error when the local table is on the left side of the JOIN.

How to reproduce

  • Set up a multi-node ClickHouse cluster
  • Create a Distributed table on it
  • Put a Parquet file in an S3 bucket
  • Query the distributed table with a JOIN against the remote data source, wrapped in s3Cluster
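
The setup steps above could look roughly like the following. This is only a sketch: the cluster name my_cluster, the database default, the per-node table some_local_table_shard, and the single String column are assumptions made for illustration, not details from the report.

```sql
-- Hypothetical names: my_cluster, default, some_local_table_shard.
-- The column type is assumed; the report only shows that a column
-- named "event" exists on the local side.

-- Per-node storage table, created on every node of the cluster.
CREATE TABLE default.some_local_table_shard ON CLUSTER my_cluster
(
    event String
)
ENGINE = MergeTree
ORDER BY event;

-- Distributed table that fans queries out to the per-node tables.
CREATE TABLE default.some_local_table ON CLUSTER my_cluster
AS default.some_local_table_shard
ENGINE = Distributed(my_cluster, default, some_local_table_shard, rand());
```

Any schema should do here, since the report says the error does not depend on the data model.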

Which ClickHouse server version to use

  • 23.6.1.1524

Queries to run that lead to unexpected result
Notes:

  • An S3 bucket with a Parquet file; any data model should cause the error
  • A ClickHouse cluster with more than one node
  • The query fails both with and without the CTE

Note: if you flip the tables and join the local table into the remote table, it works. It fails only when the local table comes before the remote table.

WITH some_remote_table AS (
    SELECT *
    FROM s3Cluster('<clickhouse cluster name>', '<s3 object url>', '<s3 object access key>', '<s3 object secret key>', 'Parquet')
)
SELECT some_local_table.event
FROM some_local_table
JOIN some_remote_table ON some_local_table.event = some_remote_table.id
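
For contrast, the flipped form described in the note above (remote table first, local table joined into it) would presumably look like this. It is reconstructed from the description rather than copied from a query that was run, and the placeholders stand in for the same values as in the failing query.

```sql
-- Reconstructed variant: the s3Cluster source is on the left side of the JOIN.
-- According to the report, this ordering does not raise the error.
WITH some_remote_table AS (
    SELECT *
    FROM s3Cluster('<clickhouse cluster name>', '<s3 object url>', '<s3 object access key>', '<s3 object secret key>', 'Parquet')
)
SELECT some_local_table.event
FROM some_remote_table
JOIN some_local_table ON some_local_table.event = some_remote_table.id
```

The two queries differ only in which table appears in FROM and which in JOIN, which is what points to the join direction as the trigger.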

Expected behavior

The query should return a result instead of failing.

Error message and/or stacktrace

Distributed task iterator is not initialized

Metadata

    Labels

    potential bug: To be reviewed by developers and confirmed/rejected.
