-
Notifications
You must be signed in to change notification settings - Fork 4.1k
jobs: add better index for job claim and cancellation queries #65080
Description
Is your feature request related to a problem? Please describe.
Each node in the cluster runs a query periodically to find jobs in a claimable state which are claimed by expired sessions and to remove that assignment.
cockroach/pkg/jobs/registry.go
Lines 671 to 675 in 88754f2
| UPDATE system.jobs | |
| SET claim_session_id = NULL | |
| WHERE claim_session_id <> $1 | |
| AND status IN `+claimableStatusTupleString+` | |
| AND NOT crdb_internal.sql_liveness_is_alive(claim_session_id)`, |
Unassigned jobs in a claimable state are then claimed in a different periodic loop.
Lines 51 to 58 in 88754f2
| ctx, "claim-jobs", txn, ` | |
| UPDATE system.jobs | |
| SET claim_session_id = $1, claim_instance_id = $2 | |
| WHERE claim_session_id IS NULL | |
| AND status IN `+claimableStatusTupleString+` | |
| ORDER BY created DESC | |
| LIMIT $3 | |
| RETURNING id;`, |
Both of these transactions will lay down locks during their execution. These locks have an arbitrary duration and can block users reading the table, leading to a bad user experience. Another big problem with these queries is that they require scanning and performing an index-join on all rows in the relevant states even though they are only looking for rows with a specific claim_session_id.
Describe the solution you'd like
Add a partial index to the jobs table which includes the state and the claim_session_id for the claimable states. This will allow for much more efficient update queries which scan narrower spans.
Epic: CRDB-7912