Skip to content

jobs: configure more frequent retries for polling in Registry.Run() #45677

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
thoszhang:queued-job-retry
Mar 4, 2020
Merged

jobs: configure more frequent retries for polling in Registry.Run() #45677
craig[bot] merged 1 commit intocockroachdb:masterfrom
thoszhang:queued-job-retry

Conversation

@thoszhang
Copy link
Copy Markdown

Previously, we were using SHOW JOBS WHEN COMPLETE to wait for queued
jobs to finish, which uses a constant retry interval (on the server
side) of 1s. This will be too long for short schema change jobs that
require much less than 1s to complete. This PR replaces the use of SHOW JOBS WHEN COMPLETE with a retry loop that starts with a 10ms backoff
and increases to a maximum of 1s in Registry.Run().

We currently have no jobs using the queuing interface, so there's no
user-facing change.

Release note: None

@thoszhang thoszhang requested review from ajwerner and spaskob March 3, 2020 22:43
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@thoszhang thoszhang changed the title jobs: configure more frequent job retries in Registry.Run() jobs: configure more frequent retries for polling in Registry.Run() Mar 3, 2020
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: mod nits

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang and @spaskob)


pkg/jobs/registry.go, line 275 at r1 (raw file):

		// WHEN COMPLETE, if one of the jobs is missing from the jobs table for
		// whatever reason, we'll fail later when we try to load the job.
		rows, err := ex.QueryRowEx(

nit: it's a single row, call it row?


pkg/jobs/registry.go, line 280 at r1 (raw file):

			nil, /* txn */
			sqlbase.InternalExecutorSessionDataOverride{User: security.RootUser},
			fmt.Sprintf("SELECT count(*) FROM [SHOW JOBS VALUES %s] WHERE finished IS NULL", buf.String()),

nit: can you format the query above the loop?

Copy link
Copy Markdown
Contributor

@spaskob spaskob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang and @spaskob)

Copy link
Copy Markdown
Contributor

@spaskob spaskob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @lucy-zhang and @spaskob)

Previously, we were using `SHOW JOBS WHEN COMPLETE` to wait for queued
jobs to finish, which uses a constant retry interval (on the server
side) of 1s. This will be too long for short schema change jobs that
require much less than 1s to complete. This PR replaces the use of `SHOW
JOBS WHEN COMPLETE` with a retry loop that starts with a 10ms backoff
and increases to a maximum of 1s in `Registry.Run()`.

We currently have no jobs using the queuing interface, so there's no
user-facing change.

Release note: None
Copy link
Copy Markdown
Author

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajwerner)


pkg/jobs/registry.go, line 275 at r1 (raw file):

Previously, ajwerner wrote…

nit: it's a single row, call it row?

done


pkg/jobs/registry.go, line 280 at r1 (raw file):

Previously, ajwerner wrote…

nit: can you format the query above the loop?

done

@thoszhang
Copy link
Copy Markdown
Author

TFTRs

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 4, 2020

Build succeeded

@craig craig bot merged commit 223c08c into cockroachdb:master Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants