Skip to content

Mutations having deleted table in subquery don't retry indefinitely#55946

Closed
myrrc wants to merge 6 commits intoClickHouse:masterfrom
myrrc:fix/mutation-for-deleted-subtable
Closed

Mutations having deleted table in subquery don't retry indefinitely#55946
myrrc wants to merge 6 commits intoClickHouse:masterfrom
myrrc:fix/mutation-for-deleted-subtable

Conversation

@myrrc
Copy link
Copy Markdown
Contributor

@myrrc myrrc commented Oct 23, 2023

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Mutations having deleted table in subquery don't retry indefinitely and fail fast.

Resolves #37435
Related: #36987

@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-bugfix Pull request with bugfix, not backported by default label Oct 23, 2023
@robot-ch-test-poll2
Copy link
Copy Markdown
Contributor

robot-ch-test-poll2 commented Oct 23, 2023

This is an automated comment for commit 37f2fb2 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Successful checks
Check nameDescriptionStatus
Mergeable CheckChecks if all other necessary checks are successful✅ success
Push to DockerhubThe check for building and pushing the CI related docker images to docker hub✅ success
Style CheckRuns a set of checks to keep the code style clean. If some of tests failed, see the related log from the report✅ success
Check nameDescriptionStatus
CI runningA meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR⏳ pending
Fast testNormally this is the first check that is ran for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here❌ failure

@alexey-milovidov alexey-milovidov removed the pr-bugfix Pull request with bugfix, not backported by default label Oct 23, 2023
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-improvement Pull request with some product improvements label Oct 24, 2023
@myrrc myrrc marked this pull request as ready for review October 24, 2023 12:33
@myrrc
Copy link
Copy Markdown
Contributor Author

myrrc commented Oct 24, 2023

This PR has multiple flaws I'd like to address

  1. Tests require system.errors having a specific state at start. Re-running tests may produce incorrect results. I haven't found another reasonable way to query errors state (maybe using part_log?).
  2. Deleting a mutation when a table doesn't exist is a cure for a specific case. The general problem is that mutations' info doesn't include neither retry count, nor error_code (only message, which isn't a reliable filter). However, building a proper fix is quite a big task.
  3. This fix (currently) doesn't work for replicated tables. The only solution I see is to delete mutations on each replica individually
  4. It is possible that a mutation for table A referencing table B starts when table B is not loaded yet. In this case I believe my solution is incorrect

A good solution could manage an exponential backoff for failed mutations, but we have only start time (not finish time) in system.mutations, and time is not used while planning the mutation job to MergeTreeBackgroundExecutor

Mike Kot added 2 commits October 24, 2023 13:52
@myrrc
Copy link
Copy Markdown
Contributor Author

myrrc commented Dec 11, 2023

As Mikhail is working on a better approach for mutations, closing this PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mutation hangs if right join is deleted

3 participants