spanconfig: consult job lease interval when updating span configs #79171
spanconfig: consult job lease interval when updating span configs #79171craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
1e63686 to
5678cc8
Compare
ajwerner
left a comment
There was a problem hiding this comment.
Here's some comments. Didn't make it all the way through.
Reviewed 4 of 4 files at r1, 14 of 32 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @irfansharif, @nvanbenschoten, and @shermanCRL)
pkg/migration/migrations/seed_tenant_span_configs.go, line 89 at r2 (raw file):
} if err := scKVAccessor.UpdateSpanConfigRecords( ctx, nil, toUpsert, hlc.MinTimestamp, hlc.MaxTimestamp,
nit: add back comment or make a var that you set to nil
pkg/roachpb/span_config.proto, line 320 at r2 (raw file):
// the delete+upsert semantics described above. // // The delete/upsert updates are performed transactionally and the timestamp of
Is it important or relevant that the updates are performed transactionally? Would it break anything if, internally, more than one transaction were used, but all of the transactions carried the supplied deadline?
pkg/spanconfig/spanconfig.go, line 195 at r2 (raw file):
// [1]: It's possible for system.{zones,descriptor} to have been GC-ed away; // think suspended tenants. // TODO(arul): write words about the session being passed in.
nit: do this in this PR.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 38 at r2 (raw file):
// NB: don't change the string here; this will cause cross-version issues since // this singleton is used as a marker. var errMarkCanRetryInvalidLease = errors.New("should retry with valid lease start time")
would it be better to use a type and then register that type with the errors framework? That provides a path for migration. I don't know how much better it is. Does this definitely work across RPC boundaries, even in the face of wrapping? I'd love to see a test where the accessor implementation is accessed via a connector.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 290 at r2 (raw file):
txnCommitTimestamp = k.knobs.KVAccessorOverrideUpdateTransactionTimestamp } if leaseExpirationTime.Less(txnCommitTimestamp) {
this is not what you're looking for. You want:
Lines 782 to 788 in 1d4a1a3
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 309 at r2 (raw file):
) } if txnCommitTimestamp.Less(leaseStartTime) {
this also isn't quite right. You want to just do the checking before you even create the transaction. I'd expect that you'd read the hlc, and if it's in the past of the timestamp, wait until it is not.
5678cc8 to
4bbc242
Compare
arulajmani
left a comment
There was a problem hiding this comment.
RFAL
Dismissed @ajwerner from a discussion.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @irfansharif, @nvanbenschoten, and @shermanCRL)
pkg/roachpb/span_config.proto, line 320 at r2 (raw file):
Previously, ajwerner wrote…
Is it important or relevant that the updates are performed transactionally? Would it break anything if, internally, more than one transaction were used, but all of the transactions carried the supplied deadline?
No, I don't think so -- "atomically" is probably a better word here, switched.
pkg/spanconfig/spanconfig.go, line 195 at r2 (raw file):
Previously, ajwerner wrote…
nit: do this in this PR.
oops, done.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 38 at r2 (raw file):
Previously, ajwerner wrote…
would it be better to use a type and then register that type with the errors framework? That provides a path for migration. I don't know how much better it is. Does this definitely work across RPC boundaries, even in the face of wrapping? I'd love to see a test where the accessor implementation is accessed via a connector.
Good call, this didn't work across RPC boundaries. Fixed.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 290 at r2 (raw file):
Previously, ajwerner wrote…
this is not what you're looking for. You want:
Lines 782 to 788 in 1d4a1a3
Done.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 309 at r2 (raw file):
Previously, ajwerner wrote…
this also isn't quite right. You want to just do the checking before you even create the transaction. I'd expect that you'd read the hlc, and if it's in the past of the timestamp, wait until it is not.
I wasn't sure if we wanted to wait things out or return errors if we weren't in the given interval initially; switched.
4bbc242 to
71070e7
Compare
| }) | ||
| timer := timeutil.NewTimer() | ||
| for { | ||
| timer.Reset(leaseStartTime.GoTime().Sub(k.clock.Now().GoTime())) |
There was a problem hiding this comment.
Can you refactor this logic to be like:
if err := waitForLeaseStart(ctx, k.clock, leaseStartTime); err != nil {
return err
}
return k.db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
// Set the deadline of the transaction so that it commits before the
// lease of the job on behalf of which the KVAccessor is acting
// expires.
if leaseExpirationTime.Less(txn.ReadTimestamp()) {
return newRetryableLeaseExpiredError()
}
if err := txn.UpdateDeadline(ctx, leaseExpirationTime); err != nil {
return err
}
return k.updateSpanConfigRecordsWithTxn(ctx, toDelete, toUpsert, txn)
})| // TestKVAccessorUpdateLeastStartWaitRespondsToContextCancellation ensures that | ||
| // KVAccessor update requests which are waiting for their lease start times to | ||
| // be "valid" respond to context cancellations. | ||
| func TestKVAccessorUpdateLeaseStartWaitRespondsTo(t *testing.T) { |
pkg/jobs/jobs.go
Outdated
| // A Traceable job will dump a trace zip on failure. | ||
| dumpOnFail | ||
| // A Traceable job will dump a trace zip in any of paused, canceled, failed, | ||
| // A Traceable job will dump a trace zip in any o8 paused, canceled, failed, |
pkg/roachpb/span_config.proto
Outdated
| // the delete+upsert semantics described above. | ||
| // | ||
| // All delete/upsert updates are performed atomically at a timestamp within the | ||
| // [leaseStartTime, leaseExpirationTime) supplied. Typically, this corresponds |
There was a problem hiding this comment.
[nit] Use the capitalized "Lease{Start,Expiration}Time" in the comment, easier to grep for.
| message UpdateSpanConfigsResponse { | ||
| // This field stores any error that occurs on the server, allowing us to | ||
| // differentiate between those and RPC errors. | ||
| errorspb.EncodedError error = 1 [(gogoproto.nullable) = false]; |
There was a problem hiding this comment.
+cc @knz, out of curiosity, is this the pattern we should be using when propagating specific errors back up to the client?
| ctx context.Context, | ||
| toDelete []Target, | ||
| toUpsert []Record, | ||
| leaseStartTime hlc.Timestamp, |
There was a problem hiding this comment.
[nit] Drop "lease" phrasing here, it could instead be "txn{Start,End} hlc.Timestamp" or "commit{Min,Max}" etc. At the level of the interface, dropping references to specific impls (like the reconciliation job) makes things clearer -- you don't don't need to know at this level where these timestamps are coming from and that they are tied to lease durations. In our tests and other usages (SeedSpanConfigs) for e.g. they're not tied to leases at all.
nvb
left a comment
There was a problem hiding this comment.
Reviewed 3 of 4 files at r1, 33 of 33 files at r3, 39 of 39 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, and @miretskiy)
pkg/roachpb/span_config.proto, line 320 at r2 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
No, I don't think so -- "atomically" is probably a better word here, switched.
Is it possible to guarantee atomicity with multiple txns? If half of the update commits, then time advances (or the spanconfig rows are read) such that the other half cannot commit within the lease interval, what can we do?
pkg/roachpb/span_config.proto, line 324 at r4 (raw file):
// [leaseStartTime, leaseExpirationTime) supplied. Typically, this corresponds // to the lease interval of the auto span config reconciliation job issuing the // request.
nit: mention what happens if committing the txn within this interval is not possible.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 180 at r4 (raw file):
timer := timeutil.NewTimer() for { timer.Reset(leaseStartTime.GoTime().Sub(k.clock.Now().GoTime()))
Can this use hlc.Clock.SleepUntil? That function does its best to avoid busy-looping and is also context aware.
pkg/spanconfig/spanconfigreconciler/reconciler.go, line 406 at r4 (raw file):
for retrier := retry.StartWithCtx(ctx, retryOpts); retrier.Next(); { err := kvAccessor.UpdateSpanConfigRecords( ctx, toDelete, toUpsert, session.StartTimestamp(), session.Expiration(),
Do we expect the Expiration to implicitly advance from call to call? That's needed for this retry loop to be effective. If so, could we add a comment somewhere?
Purely a refactor. We now store the sqlliveness.Session on jobs instead of sqlliveness.SessionID. Release note: None
Span config reconciliation is driven by the auto span config reconciliation job which builds over our existing jobs infrastructure. Jobs can be claimed by any node in the system. The jobs subsystem uses sqlliveness to claim jobs and a node has a disjoint lease over a claimed job until its session expires. This means transactions on behalf of this job must commit before the session expires if they wish to make strong mutual exclusion guarantess. The auto span config reconciliation job relies on such mutual exclusion guarantees. This patch changes the UpdateSpanConfigRequest to now include lease information in its proto. Typically, this is the lease interval of the span config job issuing these requests. Downstream, the KVAccessor ensures the transaction used to perform the update commits at a timestamp within this interval. Release note: None
71070e7 to
750b193
Compare
arulajmani
left a comment
There was a problem hiding this comment.
TFTRs!
bors r+
Dismissed @ajwerner and @irfansharif from 4 discussions.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, @irfansharif, @miretskiy, and @nvanbenschoten)
pkg/jobs/jobs.go, line 52 at r4 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Accidental typo?
Done.
pkg/roachpb/span_config.proto, line 320 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is it possible to guarantee atomicity with multiple txns? If half of the update commits, then time advances (or the spanconfig rows are read) such that the other half cannot commit within the lease interval, what can we do?
You're right, I don't think we'd be able to use more than one transactions here in the future given our atomicity guarantee.
pkg/roachpb/span_config.proto, line 322 at r4 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] Use the capitalized "Lease{Start,Expiration}Time" in the comment, easier to grep for.
Done.
pkg/spanconfig/spanconfig.go, line 58 at r4 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit] Drop "lease" phrasing here, it could instead be "txn{Start,End} hlc.Timestamp" or "commit{Min,Max}" etc. At the level of the interface, dropping references to specific impls (like the reconciliation job) makes things clearer -- you don't don't need to know at this level where these timestamps are coming from and that they are tied to lease durations. In our tests and other usages (SeedSpanConfigs) for e.g. they're not tied to leases at all.
I don't think it's that bad, given the "typically" phrasing.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 180 at r4 (raw file):
Previously, ajwerner wrote…
Can you refactor this logic to be like:
if err := waitForLeaseStart(ctx, k.clock, leaseStartTime); err != nil { return err } return k.db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error { // Set the deadline of the transaction so that it commits before the // lease of the job on behalf of which the KVAccessor is acting // expires. if leaseExpirationTime.Less(txn.ReadTimestamp()) { return newRetryableLeaseExpiredError() } if err := txn.UpdateDeadline(ctx, leaseExpirationTime); err != nil { return err } return k.updateSpanConfigRecordsWithTxn(ctx, toDelete, toUpsert, txn) })
Done.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 180 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can this use
hlc.Clock.SleepUntil? That function does its best to avoid busy-looping and is also context aware.
Good call, done.
pkg/spanconfig/spanconfigreconciler/reconciler.go, line 406 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we expect the
Expirationto implicitly advance from call to call? That's needed for this retry loop to be effective. If so, could we add a comment somewhere?
Yup, whenever the underlying sqlliveness session is extended. Added some words.
pkg/spanconfig/spanconfigkvaccessor/kvaccessor_test.go, line 302 at r4 (raw file):
Previously, ajwerner wrote…
name got trunctated
Done.
|
Build failed (retrying...): |
nvb
left a comment
There was a problem hiding this comment.
Reviewed 40 of 40 files at r6, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, @irfansharif, and @miretskiy)
|
Build succeeded: |
|
blathers backport release-22.1 |
Span config reconciliation is driven by the auto span config
reconciliation job which builds over our existing jobs infrastructure.
Jobs can be claimed by any node in the system. The jobs subsystem uses
sqlliveness to claim jobs and a node has a disjoint lease over a claimed
job until its session expires. This means transactions on behalf of this
job must commit before the session expires if they wish to make strong
mutual exclusion guarantess. The auto span config reconciliation job
relies on such mutual exclusion guarantees.
This patch changes the UpdateSpanConfigRequest to now include lease
information in its proto. Typically, this is the lease interval of the
span config job issuing these requests. Downstream, the KVAccessor
ensures the transaction used to perform the update commits at a
timestamp within this interval.
Release note: None