
massive S3 request can run suboptimal #92482

@filimonov

Observed behavior

  • When ClickHouse scans Glue/Iceberg tables, it creates a new S3 client per table. For each client, the region is auto-discovered and credentials are resolved afresh.
  • In massive scans (hundreds or thousands of tables) this leads to repeated region discovery and auth attempts for the same bucket.
  • Log excerpt (single thread, multiple requests to the same bucket; newest entries first):
    2025.12.12 02:46:10.000605 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data
    2025.12.12 02:46:09.830721 [770] <Information> S3Client: Found region eu-central-1 for bucket dwh--prod--data
    2025.12.12 02:46:09.763560 [770] <Error> AWSClient: Response status: 400, Bad Request
    2025.12.12 02:46:09.742310 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data
    2025.12.12 02:46:09.583405 [770] <Information> S3Client: Found region eu-central-1 for bucket dwh--prod--data
    2025.12.12 02:46:09.560895 [770] <Error> AWSClient: Response status: 400, Bad Request
    2025.12.12 02:46:09.539264 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data
    2025.12.12 02:46:09.387190 [770] <Information> S3Client: Found region eu-central-1 for bucket dwh--prod--data
    2025.12.12 02:46:09.305615 [770] <Error> AWSClient: Response status: 400, Bad Request
    2025.12.12 02:46:09.236829 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data
    2025.12.12 02:46:09.081699 [770] <Information> S3Client: Found region eu-central-1 for bucket dwh--prod--data
    2025.12.12 02:46:09.058927 [770] <Error> AWSClient: Response status: 400, Bad Request
    2025.12.12 02:46:08.992935 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data
    2025.12.12 02:46:08.858249 [770] <Information> S3Client: Found region eu-central-1 for bucket dwh--prod--data
    2025.12.12 02:46:08.835508 [770] <Error> AWSClient: Response status: 400, Bad Request
    2025.12.12 02:46:08.812712 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data

This pattern repeats per table; with bad/expired credentials it amplifies the log spam and load on IMDS/STS.
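
To quantify the repetition from a server log, something like the following could be used. It is only a rough sketch: the regular expression is derived from the log lines quoted above, and the helper name `count_region_lookups` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the "Resolving region" lines in the excerpt above.
RESOLVE_RE = re.compile(
    r"\[(?P<thread>\d+)\] <Information> S3Client: "
    r"Resolving region for bucket (?P<bucket>\S+)"
)


def count_region_lookups(lines):
    """Count 'Resolving region' events per (thread id, bucket)."""
    counts = Counter()
    for line in lines:
        match = RESOLVE_RE.search(line)
        if match:
            counts[(match["thread"], match["bucket"])] += 1
    return counts


# Four consecutive lines copied from the excerpt above (newest first).
sample = [
    "2025.12.12 02:46:09.742310 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data",
    "2025.12.12 02:46:09.583405 [770] <Information> S3Client: Found region eu-central-1 for bucket dwh--prod--data",
    "2025.12.12 02:46:09.560895 [770] <Error> AWSClient: Response status: 400, Bad Request",
    "2025.12.12 02:46:09.539264 [770] <Information> S3Client: Resolving region for bucket dwh--prod--data",
]

lookups = count_region_lookups(sample)
for (thread, bucket), n in lookups.items():
    print(f"thread {thread}: {n} region lookups for bucket {bucket}")
```

Any count above 1 for the same bucket is a lookup that a per-bucket cache would have avoided.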

Impact

  • Unnecessary repeated region discovery and credential resolution for the same bucket.
  • Extra latency and log noise; with invalid credentials, the logs are flooded with a 400/403 per table.
  • Scales poorly when enumerating many tables.

Suggestion

  • Reuse S3 clients/credential providers per catalog/endpoint instead of per table.
  • Cache region resolution per bucket (or allow configuring region explicitly for the catalog) to avoid repeated “Resolving region” calls.
  • Optionally short-circuit if credentials are empty/invalid before sending S3 requests to reduce 400s.
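
To illustrate the caching idea, here is a minimal language-agnostic sketch in Python. It is not ClickHouse code: `BucketRegionCache` and the injected `resolve` callable are hypothetical names, and the real fix would live in the C++ S3 client code. The point is only that the lookup runs once per bucket rather than once per table.

```python
import threading
from typing import Callable, Dict


class BucketRegionCache:
    """Memoize bucket -> region so discovery runs once per bucket."""

    def __init__(self, resolve: Callable[[str], str]):
        # Hypothetical hook: performs the real (expensive) region lookup.
        self._resolve = resolve
        self._regions: Dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, bucket: str) -> str:
        # The lock is held during the lookup. This serializes lookups
        # across buckets, but guarantees a single lookup per bucket.
        with self._lock:
            region = self._regions.get(bucket)
            if region is None:
                # If resolve() raises, nothing is cached and the next
                # caller retries.
                region = self._resolve(bucket)
                self._regions[bucket] = region
            return region


# Stand-in resolver that records how often it is called.
resolver_calls = []


def fake_resolve(bucket: str) -> str:
    resolver_calls.append(bucket)
    return "eu-central-1"


region_cache = BucketRegionCache(fake_resolve)

# Simulate enumerating 1000 tables that all live in the same bucket.
regions = [region_cache.get("dwh--prod--data") for _ in range(1000)]
print(f"{len(regions)} tables scanned, {len(resolver_calls)} region lookup(s)")
```

The same memoization pattern, keyed by catalog/endpoint instead of bucket, could presumably hold the S3 clients or credential providers themselves. Cache invalidation (for example on a region mismatch error) is left out of this sketch.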

Affected version: 25.8 (official)

Labels: performance, unexpected behaviour (result is unexpected, but not entirely wrong at the same time)
