Skip to content

Allow to explicitly pass query string to table functions for federated queries. #46758

@alexey-milovidov

Description

@alexey-milovidov

Use case

We have a bunch of table functions to integrate with external databases:

  • mysql;
  • postgresql;
  • mongodb;
  • odbc;
  • jdbc;
  • hive;
  • sqlite;
  • meilisearch;
  • duckdb (under development);

We implement automatic query rewrite and condition pushdown to query these external storages.
Typically, it rewrites a query in the form of

SELECT columns... FROM table WHERE compatible expressions

Sometimes the user might want full control of the query that is being pushed to the external storage.

Describe the solution you'd like

For both table functions and the corresponding table engines, allow providing the query instead of database and table names in the following possible forms:

  1. mysql('address', (SELECT ...), ...) - when the external query can be transparently parsed by ClickHouse;
  2. mysql('address', 'SELECT ...', ...) - compatible with any query dialects, but less convenient;

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context

It can be extended to support the S3 SELECT feature by AWS and Yandex Cloud: https://docs.aws.amazon.com/AmazonS3/latest/API/API_SelectObjectContent.html

For variant 1, it's unclear if we should keep the lexer's output and pass it directly to the storage or if we should obtain the AST and pass the reformatted AST. The latter can be better due to support for ClickHouse-style string literal escaping and numeric literals, but the former is more direct.

A similar syntax is already supported for Meilisearch.

It should look consistent with the executable table function and the upcoming feature of HTTP services as UDTFs: #45994

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions