sql, jobs: add schema change job resumer by thoszhang · Pull Request #45870 · cockroachdb/cockroach

thoszhang · 2020-03-09T05:19:43Z

This PR is the same as #44391, which was originally reverted. This new PR has a few updates:

It's rebased on top of server: start jobs registry before running startup migrations #45828, so startup migrations involving schema changes now work and and all commented-out tests that were broken are now restored
Minor merge conflict in users.go is fixed
Commit message is fixed

jobs: allow jobs to create internal executors

This PR adds a SessionBoundInternalExecutorFactory to the job
registry, and exposes the ability to create session bound internal
executors in jobs via a method MakeSessionBoundInternalExecutor() on
Job.

This is intended for use by the schema change job resumer, which stores
state on the internal executors it creates so that it can override the
tables in the TableCollection for each internal validation query.

Release note: None

jobs: randomize job order in adoption loop

Previously, the job adoption loop would pick a job to adopt by
attempting to adopt jobs in decreasing order by created timestamp. On a
single-node cluster, this could cause a deadlock where a schema change
job is never adopted because it's stuck behind a newer schema change job
with a higher mutation ID, which itself is stuck in a retry loop due to
not being first in line in the mutations list.

This PR randomizes the order in which jobs are up for attempted
adoption when triggered by the timer. In the case where we attempt to
adopt a job upon being notified of a queued job, we continue to try to
adopt the newest job.

Release note (general change): Non-running jobs are now considered for
adoption in randomized order instead of in a determistic order by
creation time, to avoid potential deadlocks when schema change jobs need
to execute in a specific order. This is a preemptive change, not a bug
fix, but it affects all jobs.

sql, jobs: add schema change job resumer

This commit adds a SchemaChangeResumer for the schema change job,
which creates and runs a SchemaChanger to process outstanding schema
changes for a given table ID and mutation ID (with the exception of
dropping a database). A schema change job is now queued to run after a
transaction that initiates a schema change before results are returned
to the client. Schema changes are now managed like other jobs and can be
paused, cancelled, etc.

Specifically:

SchemaChangers are now created solely within jobs.
SchemaChanger.exec() has been split into a main schema change phase
(for Resume) and a rollback phase (for OnFailOrCancel), and most
of the code that updates job state (including making a new rollback
job) has been removed from the schema changer.
The SchemaChangeManager, which used to handle async schema changes,
and the SchemaChangers queued in extraTxnState are all removed.
TableDescriptor.Lease is no longer used.
We now queue a job for every schema change, even ones with no
mutation, for every table affected (including, e.g., referenced tables
for foreign keys), so we can wait for old leases to expire. The API
for writing an updated table descriptor (used in, e.g., the execution
of ALTER TABLE) has been refactored to accommodate this. Creating
jobs for dropping tables (including views, etc.) has been unified with
creating jobs for other schema changes.
The schema change tests have been refactored and (partially or
entirely) skipped or removed in a lot of cases. Testing knobs only
relevant to the old way of running schema changes have been removed.
Tests are skipped for now either because this PR doesn't include any
way to GC tables/indexes (see below) or because they were too complex
and tightly coupled to the previous implementation and there seemed to
be diminishing returns in getting them to work right now.

There's some follow-up work left for PRs to follow shortly. The most
significant pre-existing functionality not implemented in this PR is GC
for tables and indexes (generally using ClearRange) after
gc.ttlseconds have elapsed. There are some tests commented out or
skipped, some of which are due to the lack of GC functionality and some
of which are due to tests being complex and tightly coupled to the
previous schema change implementation that need to be reconsidered.

Release note (general change): Schema changes are now scheduled and run
fully like other jobs and can take advantage of recent improvements in
stability in the job framework, so they now can be canceled, paused,
resumed after a crash or after being paused, and so on. Some other UI
differences come with this implementation change; notably, all schema
changes now have an associated job, failed schema changes are now rolled
back within the Reverting phase of the same job, and GC for dropped
indexes and tables is deferred to a later job.

cockroach-teamcity · 2020-03-09T05:19:51Z

This change is

This PR adds a `SessionBoundInternalExecutorFactory` to the job registry, and exposes the ability to create session bound internal executors in jobs via a method `MakeSessionBoundInternalExecutor()` on `Job`. This is intended for use by the schema change job resumer, which stores state on the internal executors it creates so that it can override the tables in the TableCollection for each internal validation query. Release note: None

Previously, the job adoption loop would pick a job to adopt by attempting to adopt jobs in decreasing order by created timestamp. On a single-node cluster, this could cause a deadlock where a schema change job is never adopted because it's stuck behind a newer schema change job with a higher mutation ID, which itself is stuck in a retry loop due to not being first in line in the mutations list. This PR randomizes the order in which jobs are up for attempted adoption when triggered by the timer. In the case where we attempt to adopt a job upon being notified of a queued job, we continue to try to adopt the newest job. Release note (general change): Non-running jobs are now considered for adoption in randomized order instead of in a determistic order by creation time, to avoid potential deadlocks when schema change jobs need to execute in a specific order. This is a preemptive change, not a bug fix, but it affects all jobs.

thoszhang · 2020-03-09T20:01:07Z

The comment table migration ended up causing test failures since we're now creating jobs for the REVOKE statements, and sometimes tests assume a certain number of jobs in the job table. Those tests have gotten some quick fixes which should probably be cleaned up later. I also rebased on top of #45873.

The most recent failure is because TestLogic/fakedist-disk/aggregate/string_agg is flaky (see #45905). Trying again....

This commit adds a `SchemaChangeResumer` for the schema change job, which creates and runs a `SchemaChanger` to process outstanding schema changes for a given table ID and mutation ID (with the exception of dropping a database). A schema change job is now queued to run after a transaction that initiates a schema change before results are returned to the client. Schema changes are now managed like other jobs and can be paused, cancelled, etc. Specifically: - `SchemaChanger`s are now created solely within jobs. `SchemaChanger.exec()` has been split into a main schema change phase (for `Resume`) and a rollback phase (for `OnFailOrCancel`), and most of the code that updates job state (including making a new rollback job) has been removed from the schema changer. - The `SchemaChangeManager`, which used to handle async schema changes, and the `SchemaChanger`s queued in `extraTxnState` are all removed. `TableDescriptor.Lease` is no longer used. - We now queue a job for every schema change, even ones with no mutation, for every table affected (including, e.g., referenced tables for foreign keys), so we can wait for old leases to expire. The API for writing an updated table descriptor (used in, e.g., the execution of `ALTER TABLE`) has been refactored to accommodate this. Creating jobs for dropping tables (including views, etc.) has been unified with creating jobs for other schema changes. - The schema change tests have been refactored and (partially or entirely) skipped or removed in a lot of cases. Testing knobs only relevant to the old way of running schema changes have been removed. Tests are skipped for now either because this PR doesn't include any way to GC tables/indexes (see below) or because they were too complex and tightly coupled to the previous implementation and there seemed to be diminishing returns in getting them to work right now. There's some follow-up work left for PRs to follow shortly. The most significant pre-existing functionality not implemented in this PR is GC for tables and indexes (generally using `ClearRange`) after `gc.ttlseconds` have elapsed. There are some tests commented out or skipped, some of which are due to the lack of GC functionality and some of which are due to tests being complex and tightly coupled to the previous schema change implementation that need to be reconsidered. Release note (general change): Schema changes are now scheduled and run fully like other jobs and can take advantage of recent improvements in stability in the job framework, so they now can be canceled, paused, resumed after a crash or after being paused, and so on. Some other UI differences come with this implementation change; notably, all schema changes now have an associated job, failed schema changes are now rolled back within the Reverting phase of the same job, and GC for dropped indexes and tables is deferred to a later job.

thoszhang · 2020-03-09T20:48:56Z

Now I'm getting No space left on device :(

thoszhang requested a review from dt March 9, 2020 05:22

dt approved these changes Mar 9, 2020

View reviewed changes

thoszhang force-pushed the schema-changer-job branch 2 times, most recently from f43b118 to 74b39b9 Compare March 9, 2020 17:30

lucy-zhang added 2 commits March 9, 2020 14:15

thoszhang force-pushed the schema-changer-job branch from 74b39b9 to bc12f50 Compare March 9, 2020 18:32

thoszhang force-pushed the schema-changer-job branch from bc12f50 to e340ca9 Compare March 9, 2020 20:01

thoszhang merged commit d8b0caa into cockroachdb:master Mar 9, 2020

thoszhang deleted the schema-changer-job branch March 9, 2020 22:39

andreimatei mentioned this pull request Mar 10, 2020

sql: TestSchemaChangeRetry, TestSchemaChangeRetryOnVersionChange started taking a minute after jobs change #45970

Closed

jseldess mentioned this pull request Mar 19, 2020

sql, jobs: add schema change job resumer cockroachdb/docs#6894

Closed

thoszhang mentioned this pull request Apr 9, 2020

schema change: DROP DATABASE still waiting for GC TTL after 25h #42427

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql, jobs: add schema change job resumer#45870

sql, jobs: add schema change job resumer#45870
thoszhang merged 3 commits intocockroachdb:masterfrom
thoszhang:schema-changer-job

thoszhang commented Mar 9, 2020 •

edited

Loading

Uh oh!

cockroach-teamcity commented Mar 9, 2020

Uh oh!

thoszhang commented Mar 9, 2020

Uh oh!

thoszhang commented Mar 9, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

thoszhang commented Mar 9, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

jobs: allow jobs to create internal executors

jobs: randomize job order in adoption loop

sql, jobs: add schema change job resumer

Uh oh!

cockroach-teamcity commented Mar 9, 2020

Uh oh!

thoszhang commented Mar 9, 2020

Uh oh!

thoszhang commented Mar 9, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

thoszhang commented Mar 9, 2020 •

edited

Loading