fix logical-error when undoing quorum insert transaction #61953
Merged
hanfei1991 merged 4 commits into ClickHouse:master on Mar 28, 2024
Author: @CheSema PTAL
CheSema approved these changes on Mar 28, 2024 and left a comment:
Actually cool work with fail points.
robot-clickhouse added two commits that referenced this pull request on May 23, 2024.
robot-ch-test-poll added a commit that referenced this pull request on May 23, 2024:
Backport #61953 to 24.2: fix logical-error when undoing quorum insert transaction
robot-ch-test-poll2 added a commit that referenced this pull request on May 23, 2024:
Backport #61953 to 24.3: fix logical-error when undoing quorum insert transaction
See https://github.com/ClickHouse/clickhouse-core-incidents/issues/68
The bug goes like this:
1. `server 2` does a quorum insert and gets a connection loss when writing to Keeper, but Keeper has the record.
2. `server 1` sees the log entry but does not see the part, because `server 2` is read-only. `server 1` then marks this part under `quorum/failed_part`.
3. `server 2` wakes up, and the `restarting thread` sees that there are `quorum/failed_part` entries and decides to clean them.
4. `restarting thread`, report LOGICAL_ERROR.

Solution:
In the retry logic, if we see that `quorum/failed_part/ part_name` exists, do nothing, because the `restarting thread` will clean it.

Sorry that I have not tested it yet, the logic is complex :( I will do it slowly.
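The check described in the solution can be sketched roughly as below. This is only an illustration of the idea, not the code from this PR: `FakeKeeper`, `RetryAction`, `decideOnRetry`, and the `table_path` argument are hypothetical names, and the Keeper path layout is taken literally from the description above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical in-memory stand-in for Keeper: a set of existing node paths.
struct FakeKeeper
{
    std::set<std::string> nodes;

    bool exists(const std::string & path) const { return nodes.count(path) != 0; }
};

// Hypothetical outcome of the retry logic after a connection loss.
enum class RetryAction
{
    DoNothing,        // the part is already marked as failed; the restarting thread cleans it up
    ProceedWithRetry, // no failed marker; continue with whatever the retry normally does
};

// Assumption: failed quorum parts live under <table_path>/quorum/failed_part/<part_name>,
// following the wording of the PR description.
RetryAction decideOnRetry(const FakeKeeper & keeper, const std::string & table_path, const std::string & part_name)
{
    assert(!part_name.empty());

    const std::string failed_path = table_path + "/quorum/failed_part/" + part_name;

    // Another replica has already marked this part as failed. Undoing the insert
    // here as well would race with the restarting thread, so leave the cleanup to it.
    if (keeper.exists(failed_path))
        return RetryAction::DoNothing;

    return RetryAction::ProceedWithRetry;
}
```

The point of the sketch is that the failed marker is checked before any undo is attempted, so the part is cleaned up by one owner only (the restarting thread) rather than by both code paths.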
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
fix logical-error when undoing quorum insert transaction