Fix flaky test_disallow_concurrency #93612
Fix LOGICAL_ERROR when restoring ReplicatedMergeTree with deduplication race

When restoring a ReplicatedMergeTree table ON CLUSTER, multiple replicas restore the same parts concurrently. If one replica commits a part to ZooKeeper first, the other replica's part gets deduplicated. The code assumed deduplicated parts always came from ATTACH PART (located in detached/attaching_*) and threw LOGICAL_ERROR for any other path, causing the server to abort. During restore, however, parts sit in tmp_restore_* directories. Now deduplicated parts from restore are simply removed, since the data already exists on another replica and will be fetched via normal replication.
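The decision described above can be modeled roughly as follows. This is an illustrative Python sketch only; the actual fix is C++ in src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp, and the function and return values here are hypothetical:

```python
from pathlib import PurePosixPath

def classify_deduplicated_part(relative_path: str, part_name: str) -> str:
    """Hypothetical model of how a deduplicated part's origin is decided
    from its on-disk path (names and return values are illustrative)."""
    # Directory that directly contains the part, e.g. "tmp_restore_all_1_1_0".
    part_dir = PurePosixPath(relative_path.rstrip("/")).name
    if relative_path.endswith(f"detached/attaching_{part_name}/"):
        # ATTACH PART case: the part can be rolled back to detached/.
        return "rollback_to_detached"
    if part_dir.startswith("tmp_restore_"):
        # RESTORE case: another replica already committed this part to
        # ZooKeeper, so the local copy is removed; replication fetches it.
        return "remove"
    # Any other location is unexpected and still a logical error.
    return "logical_error"
```

A deduplicated part in any other location still indicates a bug, so the error path is kept for that case.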
Workflow [PR], commit [d2b3915] Summary: ❌
Pull request overview
This PR fixes a LOGICAL_ERROR crash that occurred when restoring ReplicatedMergeTree tables ON CLUSTER due to a deduplication race condition between replicas.
Key Changes:
- Fixed handling of deduplicated parts during concurrent restore operations by removing them instead of throwing LOGICAL_ERROR
- Removed unused import from test file
- Increased flaky test repeat counts to better catch concurrency issues
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp | Fixed deduplication logic to handle parts from RESTORE operations by removing them instead of throwing LOGICAL_ERROR |
| tests/integration/test_backup_restore_on_cluster/test_disallow_concurrency.py | Removed unused typing.List import |
| ci/jobs/integration_test_job.py | Increased flaky check repeat counts for better test coverage |
```cpp
const auto & relative_path = part->getDataPartStorage().getRelativePath();
const auto part_dir = fs::path(relative_path).parent_path().filename().string();

if (relative_path.ends_with("detached/attaching_" + part->name + "/"))
```
The original logic checked that the path did NOT match the expected pattern and threw an error. The new logic inverts this to a positive check. However, this creates inconsistency: the first condition uses ends_with on relative_path, while the second condition uses starts_with on part_dir (which is just the parent directory name). Consider using a consistent approach for both checks, either both operating on relative_path or both on part_dir, to improve code clarity and reduce the chance of logic errors.
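The symmetry the comment asks for could look like this sketch, where both patterns are tested against the same derived directory name. This is illustrative Python, not the C++ under review; the function name is hypothetical, and the `detached/` parent check is omitted for brevity:

```python
from pathlib import PurePosixPath

def part_origin(relative_path: str, part_name: str) -> str:
    """Hypothetical refactor: both branches inspect the same part_dir value,
    rather than mixing ends_with on relative_path with starts_with on part_dir."""
    part_dir = PurePosixPath(relative_path.rstrip("/")).name
    if part_dir == f"attaching_{part_name}":
        return "attach"
    if part_dir.startswith("tmp_restore_"):
        return "restore"
    return "unknown"
```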
It shouldn't be like this. There is a condition (see) which checks that only one replica inserts restored parts into each replicated table.
Cherry pick #93612 to 25.12: Fix flaky test_disallow_concurrency
Cherry pick #93612 to 25.3: Fix flaky test_disallow_concurrency
Cherry pick #93612 to 25.8: Fix flaky test_disallow_concurrency
Cherry pick #93612 to 25.10: Fix flaky test_disallow_concurrency
Cherry pick #93612 to 25.11: Fix flaky test_disallow_concurrency
Backport #93612 to 25.12: Fix flaky test_disallow_concurrency
Backport #93612 to 25.3: Fix flaky test_disallow_concurrency
Backport #93612 to 25.11: Fix flaky test_disallow_concurrency
Backport #93612 to 25.8: Fix flaky test_disallow_concurrency
Backport #93612 to 25.10: Fix flaky test_disallow_concurrency
Fix LOGICAL_ERROR when restoring ReplicatedMergeTree with deduplication race
When restoring a ReplicatedMergeTree table ON CLUSTER, multiple replicas
restore the same parts concurrently. If one replica commits a part to
ZooKeeper first, the other replica's part gets deduplicated. The code
assumed deduplicated parts always came from ATTACH PART (located in
detached/attaching_*) and threw LOGICAL_ERROR for other paths, causing
the server to abort.
During restore, parts are in tmp_restore_* directories instead. Deduplicated parts from those directories are now simply removed: the data already exists on another replica and will be fetched via normal replication.
Log traces can be found in the logs of any of the failures:
In 89da254 I changed the flaky check to run 1000 times to ensure the tests are no longer flaky. Given the 2-8% failure rate seen in prior failures, we would expect at least some failures if the flakiness were still present. Fortunately, all tests succeeded after 5 hours of job runtime:
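The 1000-run choice can be sanity-checked with a quick calculation: even at the low end of the observed flake rate, the probability of 1000 consecutive clean runs is vanishingly small (sketch, assuming independent runs):

```python
p_fail = 0.02          # lower bound of the observed 2-8% flake rate
runs = 1000            # repeat count used in the flaky check
p_all_pass = (1 - p_fail) ** runs
# ~1.7e-9: 1000 clean runs is strong evidence the flakiness is fixed.
```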
Closes #93413
Closes #93235
Closes #93177
Closes #68012
Closes #93549
Closes #93550
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fix LOGICAL_ERROR when restoring ReplicatedMergeTree with deduplication race
Documentation entry for user-facing changes