release-20.1: schemachange: speed up slow schema changes #48621

Merged: ajwerner merged 1 commit into cockroachdb:release-20.1 from spaskob:backport20.1-48608 on May 11, 2020

@spaskob commented May 9, 2020

Backport 1/1 commits from #48608.

/cc @cockroachdb/release


Touches #45150.
Fixes #47607.
Touches #47790.

Release note (performance improvement):
Before this change, a simple schema change could take 30s or more.
If the schema change was not first in line in the table mutation
queue, it returned a retriable error, and the jobs framework would
re-adopt and run it later. The problem is that the job adoption loop
only runs every 30s, so the schema change sat idle until the next
adoption cycle.

To reproduce, run this for some time:
```
cockroach sql --insecure --watch 1s -e 'drop table if exists users cascade; create table users (id uuid not null, name varchar(255) not null, email varchar(255) not null, password varchar(255) not null, remember_token varchar(100) null, created_at timestamp(0) without time zone null, updated_at timestamp(0) without time zone null, deleted_at timestamp(0) without time zone null); alter table users add primary key (id); alter table users add constraint users_email_unique unique (email);'
```

Instead of returning on retriable errors, we retry with exponential
backoff in the schema change code. Handling retriable errors in client
job code is encouraged over relying on the registry, because the latter
is slow and also leads to more complicated test fixtures that rely on
hacking the internals of the job registry.
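
For readers unfamiliar with the pattern, here is a minimal sketch of retrying on a retriable error with exponential backoff in the caller, rather than returning and waiting for re-adoption. This is not the code from this PR: `errNotFirstInQueue`, `backoffOptions`, and `retryWithBackoff` are hypothetical names, and the delays are illustrative values, not CockroachDB settings.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFirstInQueue is a hypothetical stand-in for the retriable error a
// schema change returns while an earlier mutation is still ahead of it.
var errNotFirstInQueue = errors.New("schema change is not first in the mutation queue")

// backoffOptions holds illustrative knobs; these are assumptions, not
// CockroachDB configuration. maxAttempts must be at least 1.
type backoffOptions struct {
	initial     time.Duration // delay before the second attempt
	max         time.Duration // upper bound on any single delay
	multiplier  float64       // growth factor applied after each retry
	maxAttempts int           // total number of calls to fn
}

// retryWithBackoff calls fn until it succeeds, fails with a non-retriable
// error, or maxAttempts calls have been made. Between retriable failures it
// sleeps, multiplying the delay each time up to opts.max.
func retryWithBackoff(opts backoffOptions, fn func() error) error {
	delay := opts.initial
	var err error
	for attempt := 1; attempt <= opts.maxAttempts; attempt++ {
		err = fn()
		if err == nil || !errors.Is(err, errNotFirstInQueue) {
			// Success, or an error that retrying will not fix.
			return err
		}
		if attempt == opts.maxAttempts {
			break // do not sleep after the final attempt
		}
		time.Sleep(delay)
		delay = time.Duration(float64(delay) * opts.multiplier)
		if delay > opts.max {
			delay = opts.max
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", opts.maxAttempts, err)
}

func main() {
	// Simulate a mutation queue with two schema changes ahead of ours.
	ahead := 2
	opts := backoffOptions{
		initial:     20 * time.Millisecond,
		max:         time.Second,
		multiplier:  2,
		maxAttempts: 10,
	}
	start := time.Now()
	err := retryWithBackoff(opts, func() error {
		if ahead > 0 {
			ahead--
			return errNotFirstInQueue
		}
		return nil
	})
	fmt.Printf("err=%v, waited about %v\n", err, time.Since(start).Round(10*time.Millisecond))
}
```

With these example values the simulated schema change proceeds after two short sleeps (20ms, then 40ms) instead of waiting for a 30s adoption cycle. The cap on the delay keeps a long queue from pushing individual waits arbitrarily high.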

@spaskob requested review from ajwerner and thoszhang on May 9, 2020 01:36
@cockroach-teamcity

This change is Reviewable

@ajwerner left a comment

:lgtm: too

I'm going to push the button

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)

@ajwerner merged commit 864a8f3 into cockroachdb:release-20.1 on May 11, 2020