sql, jobs: add schema change job resumer#45870
Merged
thoszhang merged 3 commits intocockroachdb:masterfrom Mar 9, 2020
Merged
Conversation
Member
dt
approved these changes
Mar 9, 2020
f43b118 to
74b39b9
Compare
This PR adds a `SessionBoundInternalExecutorFactory` to the job registry, and exposes the ability to create session bound internal executors in jobs via a method `MakeSessionBoundInternalExecutor()` on `Job`. This is intended for use by the schema change job resumer, which stores state on the internal executors it creates so that it can override the tables in the TableCollection for each internal validation query. Release note: None
Previously, the job adoption loop would pick a job to adopt by attempting to adopt jobs in decreasing order by created timestamp. On a single-node cluster, this could cause a deadlock where a schema change job is never adopted because it's stuck behind a newer schema change job with a higher mutation ID, which itself is stuck in a retry loop due to not being first in line in the mutations list. This PR randomizes the order in which jobs are up for attempted adoption when triggered by the timer. In the case where we attempt to adopt a job upon being notified of a queued job, we continue to try to adopt the newest job. Release note (general change): Non-running jobs are now considered for adoption in randomized order instead of in a determistic order by creation time, to avoid potential deadlocks when schema change jobs need to execute in a specific order. This is a preemptive change, not a bug fix, but it affects all jobs.
74b39b9 to
bc12f50
Compare
Author
|
The comment table migration ended up causing test failures since we're now creating jobs for the The most recent failure is because |
This commit adds a `SchemaChangeResumer` for the schema change job, which creates and runs a `SchemaChanger` to process outstanding schema changes for a given table ID and mutation ID (with the exception of dropping a database). A schema change job is now queued to run after a transaction that initiates a schema change before results are returned to the client. Schema changes are now managed like other jobs and can be paused, cancelled, etc. Specifically: - `SchemaChanger`s are now created solely within jobs. `SchemaChanger.exec()` has been split into a main schema change phase (for `Resume`) and a rollback phase (for `OnFailOrCancel`), and most of the code that updates job state (including making a new rollback job) has been removed from the schema changer. - The `SchemaChangeManager`, which used to handle async schema changes, and the `SchemaChanger`s queued in `extraTxnState` are all removed. `TableDescriptor.Lease` is no longer used. - We now queue a job for every schema change, even ones with no mutation, for every table affected (including, e.g., referenced tables for foreign keys), so we can wait for old leases to expire. The API for writing an updated table descriptor (used in, e.g., the execution of `ALTER TABLE`) has been refactored to accommodate this. Creating jobs for dropping tables (including views, etc.) has been unified with creating jobs for other schema changes. - The schema change tests have been refactored and (partially or entirely) skipped or removed in a lot of cases. Testing knobs only relevant to the old way of running schema changes have been removed. Tests are skipped for now either because this PR doesn't include any way to GC tables/indexes (see below) or because they were too complex and tightly coupled to the previous implementation and there seemed to be diminishing returns in getting them to work right now. There's some follow-up work left for PRs to follow shortly. The most significant pre-existing functionality not implemented in this PR is GC for tables and indexes (generally using `ClearRange`) after `gc.ttlseconds` have elapsed. There are some tests commented out or skipped, some of which are due to the lack of GC functionality and some of which are due to tests being complex and tightly coupled to the previous schema change implementation that need to be reconsidered. Release note (general change): Schema changes are now scheduled and run fully like other jobs and can take advantage of recent improvements in stability in the job framework, so they now can be canceled, paused, resumed after a crash or after being paused, and so on. Some other UI differences come with this implementation change; notably, all schema changes now have an associated job, failed schema changes are now rolled back within the Reverting phase of the same job, and GC for dropped indexes and tables is deferred to a later job.
bc12f50 to
e340ca9
Compare
Author
|
Now I'm getting |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR is the same as #44391, which was originally reverted. This new PR has a few updates:
jobs: allow jobs to create internal executors
This PR adds a
SessionBoundInternalExecutorFactoryto the jobregistry, and exposes the ability to create session bound internal
executors in jobs via a method
MakeSessionBoundInternalExecutor()onJob.This is intended for use by the schema change job resumer, which stores
state on the internal executors it creates so that it can override the
tables in the TableCollection for each internal validation query.
Release note: None
jobs: randomize job order in adoption loop
Previously, the job adoption loop would pick a job to adopt by
attempting to adopt jobs in decreasing order by created timestamp. On a
single-node cluster, this could cause a deadlock where a schema change
job is never adopted because it's stuck behind a newer schema change job
with a higher mutation ID, which itself is stuck in a retry loop due to
not being first in line in the mutations list.
This PR randomizes the order in which jobs are up for attempted
adoption when triggered by the timer. In the case where we attempt to
adopt a job upon being notified of a queued job, we continue to try to
adopt the newest job.
Release note (general change): Non-running jobs are now considered for
adoption in randomized order instead of in a determistic order by
creation time, to avoid potential deadlocks when schema change jobs need
to execute in a specific order. This is a preemptive change, not a bug
fix, but it affects all jobs.
sql, jobs: add schema change job resumer
This commit adds a
SchemaChangeResumerfor the schema change job,which creates and runs a
SchemaChangerto process outstanding schemachanges for a given table ID and mutation ID (with the exception of
dropping a database). A schema change job is now queued to run after a
transaction that initiates a schema change before results are returned
to the client. Schema changes are now managed like other jobs and can be
paused, cancelled, etc.
Specifically:
SchemaChangers are now created solely within jobs.SchemaChanger.exec()has been split into a main schema change phase(for
Resume) and a rollback phase (forOnFailOrCancel), and mostof the code that updates job state (including making a new rollback
job) has been removed from the schema changer.
SchemaChangeManager, which used to handle async schema changes,and the
SchemaChangers queued inextraTxnStateare all removed.TableDescriptor.Leaseis no longer used.mutation, for every table affected (including, e.g., referenced tables
for foreign keys), so we can wait for old leases to expire. The API
for writing an updated table descriptor (used in, e.g., the execution
of
ALTER TABLE) has been refactored to accommodate this. Creatingjobs for dropping tables (including views, etc.) has been unified with
creating jobs for other schema changes.
entirely) skipped or removed in a lot of cases. Testing knobs only
relevant to the old way of running schema changes have been removed.
Tests are skipped for now either because this PR doesn't include any
way to GC tables/indexes (see below) or because they were too complex
and tightly coupled to the previous implementation and there seemed to
be diminishing returns in getting them to work right now.
There's some follow-up work left for PRs to follow shortly. The most
significant pre-existing functionality not implemented in this PR is GC
for tables and indexes (generally using
ClearRange) aftergc.ttlsecondshave elapsed. There are some tests commented out orskipped, some of which are due to the lack of GC functionality and some
of which are due to tests being complex and tightly coupled to the
previous schema change implementation that need to be reconsidered.
Release note (general change): Schema changes are now scheduled and run
fully like other jobs and can take advantage of recent improvements in
stability in the job framework, so they now can be canceled, paused,
resumed after a crash or after being paused, and so on. Some other UI
differences come with this implementation change; notably, all schema
changes now have an associated job, failed schema changes are now rolled
back within the Reverting phase of the same job, and GC for dropped
indexes and tables is deferred to a later job.