Fix flaky test 01079_parallel_alter_detach_table_zookeeper #96888

Merged
alexey-milovidov merged 3 commits into master from fix-flaky-01079-parallel-alter-detach
Feb 14, 2026

@alexey-milovidov
Member

Summary

  • Add a retry loop after SYSTEM SYNC REPLICA that waits for pending mutations and replication queue entries (ALTER_METADATA/MUTATE_PART) to drain before asserting they're empty
  • SYSTEM SYNC REPLICA may return while a MUTATE_PART entry is still postponed (e.g. waiting for an in-progress MERGE_PARTS on the same part to complete), causing a spurious diff
  • Add database filtering to the system.replication_queue query for proper test isolation
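
The retry loop described above could be sketched roughly as follows. This is an illustration, not the code from the PR: `wait_for_zero`, `fake_pending_count`, and the attempt and sleep values are hypothetical names chosen here, and a temp-file stub stands in for the real query so the sketch runs without a server.

```shell
#!/usr/bin/env bash
# Sketch only: poll a "count" command until it prints 0 or attempts run out.
# wait_for_zero and its arguments are hypothetical, not taken from the test.

wait_for_zero() {
    local max_attempts="$1"
    local sleep_seconds="$2"
    shift 2
    local attempt=1
    local count
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the remaining arguments as the command that reports pending work.
        count="$("$@")"
        if [ "$count" = "0" ]; then
            return 0
        fi
        sleep "$sleep_seconds"
        attempt=$((attempt + 1))
    done
    return 1
}

# Stand-in for a query against system.replication_queue: each call prints the
# current value and then decrements it, so the queue appears to drain.
state_file="$(mktemp)"
echo 2 > "$state_file"
fake_pending_count() {
    local n
    n="$(cat "$state_file")"
    echo "$n"
    if [ "$n" -gt 0 ]; then
        echo $((n - 1)) > "$state_file"
    fi
}

if wait_for_zero 10 0 fake_pending_count; then
    drain_status="drained"
else
    drain_status="timed out"
fi
echo "$drain_status"
rm -f "$state_file"
```

In the real test the count command would presumably be a `$CLICKHOUSE_CLIENT --query` call that counts `ALTER_METADATA`/`MUTATE_PART` rows in `system.replication_queue` for the test's own database (for example, filtering on `database = currentDatabase()`), but the exact query is an assumption here. Bounding the number of attempts keeps a genuinely stuck queue from hanging the test indefinitely.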

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=96874&sha=7d053394dc3bdd556f989a8546389fdab344aa41&name_0=PR&name_1=Stateless%20tests%20%28amd_debug%2C%20AsyncInsert%2C%20s3%20storage%2C%20sequential%29

Closes #91413

Changelog category (leave one):

  • CI Fix or Improvement (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

🤖 Generated with Claude Code

`SYSTEM SYNC REPLICA` may return while a `MUTATE_PART` entry is still
postponed in the replication queue (e.g. waiting for an in-progress
`MERGE_PARTS` on the same part to complete). The test then immediately
checks `system.replication_queue` and finds the not-yet-processed entry,
causing a spurious diff against the reference file.

Add a retry loop that waits for both pending mutations and replication
queue entries (`ALTER_METADATA`/`MUTATE_PART`) to drain before asserting.
Also add `database` filtering to the `system.replication_queue` query
for proper test isolation.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=96874&sha=7d053394dc3bdd556f989a8546389fdab344aa41&name_0=PR&name_1=Stateless%20tests%20%28amd_debug%2C%20AsyncInsert%2C%20s3%20storage%2C%20sequential%29

Closes #91413

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh bot commented Feb 14, 2026

Workflow [PR], commit [b3c613b]

@clickhouse-gh clickhouse-gh bot added the pr-ci label Feb 14, 2026
$CLICKHOUSE_CLIENT --receive_timeout 120 --query "SYSTEM SYNC REPLICA concurrent_alter_detach_$i"
$CLICKHOUSE_CLIENT --query "SELECT SUM(toUInt64(value1)) > $INITIAL_SUM FROM concurrent_alter_detach_$i"

# Wait for all mutations and replication queue entries to finish.
Member Author

Not sure about the exact reasoning. At least I'd expect that when the entry is in the queue, a subsequent SYNC REPLICA has to wait for it.

Member Author

I'll edit the comment.

@alexey-milovidov alexey-milovidov self-assigned this Feb 14, 2026
@alexey-milovidov alexey-milovidov merged commit bcdf506 into master Feb 14, 2026
1 check failed
@alexey-milovidov alexey-milovidov deleted the fix-flaky-01079-parallel-alter-detach branch February 14, 2026 23:37
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Feb 14, 2026
Labels

pr-ci pr-synced-to-cloud The PR is synced to the cloud repo

Development

Successfully merging this pull request may close these issues.

Test 01079_parallel_alter_detach_table_zookeeper is flaky

2 participants