Skip to content

spanconfig: consult job lease interval when updating span configs #79171

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
arulajmani:scfg-job-mutual-exclusion-73789
Apr 18, 2022
Merged

spanconfig: consult job lease interval when updating span configs #79171
craig[bot] merged 2 commits intocockroachdb:masterfrom
arulajmani:scfg-job-mutual-exclusion-73789

Conversation

@arulajmani
Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani commented Mar 31, 2022

Span config reconciliation is driven by the auto span config
reconciliation job which builds over our existing jobs infrastructure.
Jobs can be claimed by any node in the system. The jobs subsystem uses
sqlliveness to claim jobs and a node has a disjoint lease over a claimed
job until its session expires. This means transactions on behalf of this
job must commit before the session expires if they wish to make strong
mutual exclusion guarantess. The auto span config reconciliation job
relies on such mutual exclusion guarantees.

This patch changes the UpdateSpanConfigRequest to now include lease
information in its proto. Typically, this is the lease interval of the
span config job issuing these requests. Downstream, the KVAccessor
ensures the transaction used to perform the update commits at a
timestamp within this interval.

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@arulajmani arulajmani force-pushed the scfg-job-mutual-exclusion-73789 branch 2 times, most recently from 1e63686 to 5678cc8 Compare April 6, 2022 23:02
@arulajmani arulajmani changed the title [WIP, DNM] spanconfig: consult job lease interval when updating spanconfigs spanconfig: consult job lease interval when updating span configs Apr 6, 2022
@arulajmani arulajmani marked this pull request as ready for review April 6, 2022 23:03
@arulajmani arulajmani requested a review from a team as a code owner April 6, 2022 23:03
@arulajmani arulajmani requested a review from a team April 6, 2022 23:03
@arulajmani arulajmani requested review from a team as code owners April 6, 2022 23:03
@arulajmani arulajmani requested review from ajwerner, irfansharif, nvb and shermanCRL and removed request for a team April 6, 2022 23:03
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's some comments. Didn't make it all the way through.

Reviewed 4 of 4 files at r1, 14 of 32 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @irfansharif, @nvanbenschoten, and @shermanCRL)


pkg/migration/migrations/seed_tenant_span_configs.go, line 89 at r2 (raw file):

			}
			if err := scKVAccessor.UpdateSpanConfigRecords(
				ctx, nil, toUpsert, hlc.MinTimestamp, hlc.MaxTimestamp,

nit: add back comment or make a var that you set to nil


pkg/roachpb/span_config.proto, line 320 at r2 (raw file):

// the delete+upsert semantics described above.
//
// The delete/upsert updates are performed transactionally and the timestamp of

Is it important or relevant that the updates are performed transactionally? Would it break anything if, internally, more than one transaction were used, but all of the transactions carried the supplied deadline?


pkg/spanconfig/spanconfig.go, line 195 at r2 (raw file):

	// [1]: It's possible for system.{zones,descriptor} to have been GC-ed away;
	//      think suspended tenants.
	// TODO(arul): write words about the session being passed in.

nit: do this in this PR.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 38 at r2 (raw file):

// NB: don't change the string here; this will cause cross-version issues since
// this singleton is used as a marker.
var errMarkCanRetryInvalidLease = errors.New("should retry with valid lease start time")

would it be better to use a type and then register that type with the errors framework? That provides a path for migration. I don't know how much better it is. Does this definitely work across RPC boundaries, even in the face of wrapping? I'd love to see a test where the accessor implementation is accessed via a connector.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 290 at r2 (raw file):

		txnCommitTimestamp = k.knobs.KVAccessorOverrideUpdateTransactionTimestamp
	}
	if leaseExpirationTime.Less(txnCommitTimestamp) {

this is not what you're looking for. You want:

cockroach/pkg/kv/txn.go

Lines 782 to 788 in 1d4a1a3

// UpdateDeadline sets the transactions deadline to the passed deadline.
// It may move the deadline to any timestamp above the current read timestamp.
// If the deadline is below the current provisional commit timestamp (write timestamp),
// then the transaction will fail with a deadline error during the commit.
// The deadline cannot be lower than txn.ReadTimestamp and we make the assumption
// the read timestamp will not change during execution, which is valid today.
func (txn *Txn) UpdateDeadline(ctx context.Context, deadline hlc.Timestamp) error {


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 309 at r2 (raw file):

		)
	}
	if txnCommitTimestamp.Less(leaseStartTime) {

this also isn't quite right. You want to just do the checking before you even create the transaction. I'd expect that you'd read the hlc, and if it's in the past of the timestamp, wait until it is not.

@arulajmani arulajmani force-pushed the scfg-job-mutual-exclusion-73789 branch from 5678cc8 to 4bbc242 Compare April 13, 2022 01:15
Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFAL

Dismissed @ajwerner from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @irfansharif, @nvanbenschoten, and @shermanCRL)


pkg/roachpb/span_config.proto, line 320 at r2 (raw file):

Previously, ajwerner wrote…

Is it important or relevant that the updates are performed transactionally? Would it break anything if, internally, more than one transaction were used, but all of the transactions carried the supplied deadline?

No, I don't think so -- "atomically" is probably a better word here, switched.


pkg/spanconfig/spanconfig.go, line 195 at r2 (raw file):

Previously, ajwerner wrote…

nit: do this in this PR.

oops, done.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 38 at r2 (raw file):

Previously, ajwerner wrote…

would it be better to use a type and then register that type with the errors framework? That provides a path for migration. I don't know how much better it is. Does this definitely work across RPC boundaries, even in the face of wrapping? I'd love to see a test where the accessor implementation is accessed via a connector.

Good call, this didn't work across RPC boundaries. Fixed.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 290 at r2 (raw file):

Previously, ajwerner wrote…

this is not what you're looking for. You want:

cockroach/pkg/kv/txn.go

Lines 782 to 788 in 1d4a1a3

// UpdateDeadline sets the transactions deadline to the passed deadline.
// It may move the deadline to any timestamp above the current read timestamp.
// If the deadline is below the current provisional commit timestamp (write timestamp),
// then the transaction will fail with a deadline error during the commit.
// The deadline cannot be lower than txn.ReadTimestamp and we make the assumption
// the read timestamp will not change during execution, which is valid today.
func (txn *Txn) UpdateDeadline(ctx context.Context, deadline hlc.Timestamp) error {

Done.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 309 at r2 (raw file):

Previously, ajwerner wrote…

this also isn't quite right. You want to just do the checking before you even create the transaction. I'd expect that you'd read the hlc, and if it's in the past of the timestamp, wait until it is not.

I wasn't sure if we wanted to wait things out or return errors if we weren't in the given interval initially; switched.

@arulajmani arulajmani requested a review from ajwerner April 13, 2022 01:16
@arulajmani arulajmani force-pushed the scfg-job-mutual-exclusion-73789 branch from 4bbc242 to 71070e7 Compare April 13, 2022 02:31
@shermanCRL shermanCRL requested review from miretskiy and removed request for shermanCRL April 13, 2022 02:34
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

})
timer := timeutil.NewTimer()
for {
timer.Reset(leaseStartTime.GoTime().Sub(k.clock.Now().GoTime()))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you refactor this logic to be like:

if err := waitForLeaseStart(ctx, k.clock, leaseStartTime); err != nil {
    return err
}
return k.db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
	// Set the deadline of the transaction so that it commits before the
	// lease of the job on behalf of which the KVAccessor is acting
	// expires.
	if leaseExpirationTime.Less(txn.ReadTimestamp()) {
		return newRetryableLeaseExpiredError()
	}
	if err := txn.UpdateDeadline(ctx, leaseExpirationTime); err != nil {
		return err
	}
	return k.updateSpanConfigRecordsWithTxn(ctx, toDelete, toUpsert, txn)
})

// TestKVAccessorUpdateLeastStartWaitRespondsToContextCancellation ensures that
// KVAccessor update requests which are waiting for their lease start times to
// be "valid" respond to context cancellations.
func TestKVAccessorUpdateLeaseStartWaitRespondsTo(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name got trunctated

pkg/jobs/jobs.go Outdated
// A Traceable job will dump a trace zip on failure.
dumpOnFail
// A Traceable job will dump a trace zip in any of paused, canceled, failed,
// A Traceable job will dump a trace zip in any o8 paused, canceled, failed,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental typo?

// the delete+upsert semantics described above.
//
// All delete/upsert updates are performed atomically at a timestamp within the
// [leaseStartTime, leaseExpirationTime) supplied. Typically, this corresponds
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Use the capitalized "Lease{Start,Expiration}Time" in the comment, easier to grep for.

message UpdateSpanConfigsResponse {
// This field stores any error that occurs on the server, allowing us to
// differentiate between those and RPC errors.
errorspb.EncodedError error = 1 [(gogoproto.nullable) = false];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+cc @knz, out of curiosity, is this the pattern we should be using when propagating specific errors back up to the client?

ctx context.Context,
toDelete []Target,
toUpsert []Record,
leaseStartTime hlc.Timestamp,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Drop "lease" phrasing here, it could instead be "txn{Start,End} hlc.Timestamp" or "commit{Min,Max}" etc. At the level of the interface, dropping references to specific impls (like the reconciliation job) makes things clearer -- you don't don't need to know at this level where these timestamps are coming from and that they are tied to lease durations. In our tests and other usages (SeedSpanConfigs) for e.g. they're not tied to leases at all.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 4 files at r1, 33 of 33 files at r3, 39 of 39 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, and @miretskiy)


pkg/roachpb/span_config.proto, line 320 at r2 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

No, I don't think so -- "atomically" is probably a better word here, switched.

Is it possible to guarantee atomicity with multiple txns? If half of the update commits, then time advances (or the spanconfig rows are read) such that the other half cannot commit within the lease interval, what can we do?


pkg/roachpb/span_config.proto, line 324 at r4 (raw file):

// [leaseStartTime, leaseExpirationTime) supplied. Typically, this corresponds
// to the lease interval of the auto span config reconciliation job issuing the
// request.

nit: mention what happens if committing the txn within this interval is not possible.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 180 at r4 (raw file):

	timer := timeutil.NewTimer()
	for {
		timer.Reset(leaseStartTime.GoTime().Sub(k.clock.Now().GoTime()))

Can this use hlc.Clock.SleepUntil? That function does its best to avoid busy-looping and is also context aware.


pkg/spanconfig/spanconfigreconciler/reconciler.go, line 406 at r4 (raw file):

	for retrier := retry.StartWithCtx(ctx, retryOpts); retrier.Next(); {
		err := kvAccessor.UpdateSpanConfigRecords(
			ctx, toDelete, toUpsert, session.StartTimestamp(), session.Expiration(),

Do we expect the Expiration to implicitly advance from call to call? That's needed for this retry loop to be effective. If so, could we add a comment somewhere?

Purely a refactor. We now store the sqlliveness.Session on jobs instead
of sqlliveness.SessionID.

Release note: None
Span config reconciliation is driven by the auto span config
reconciliation job which builds over our existing jobs infrastructure.
Jobs can be claimed by any node in the system. The jobs subsystem uses
sqlliveness to claim jobs and a node has a disjoint lease over a claimed
job until its session expires. This means transactions on behalf of this
job must commit before the session expires if they wish to make strong
mutual exclusion guarantess. The auto span config reconciliation job
relies on such mutual exclusion guarantees.

This patch changes the UpdateSpanConfigRequest to now include lease
information in its proto. Typically, this is the lease interval of the
span config job issuing these requests. Downstream, the KVAccessor
ensures the transaction used to perform the update commits at a
timestamp within this interval.

Release note: None
@arulajmani arulajmani force-pushed the scfg-job-mutual-exclusion-73789 branch from 71070e7 to 750b193 Compare April 18, 2022 16:10
@blathers-crl blathers-crl bot requested a review from irfansharif April 18, 2022 16:10
Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Dismissed @ajwerner and @irfansharif from 4 discussions.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, @irfansharif, @miretskiy, and @nvanbenschoten)


pkg/jobs/jobs.go, line 52 at r4 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Accidental typo?

Done.


pkg/roachpb/span_config.proto, line 320 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is it possible to guarantee atomicity with multiple txns? If half of the update commits, then time advances (or the spanconfig rows are read) such that the other half cannot commit within the lease interval, what can we do?

You're right, I don't think we'd be able to use more than one transactions here in the future given our atomicity guarantee.


pkg/roachpb/span_config.proto, line 322 at r4 (raw file):

Previously, irfansharif (irfan sharif) wrote…

[nit] Use the capitalized "Lease{Start,Expiration}Time" in the comment, easier to grep for.

Done.


pkg/spanconfig/spanconfig.go, line 58 at r4 (raw file):

Previously, irfansharif (irfan sharif) wrote…

[nit] Drop "lease" phrasing here, it could instead be "txn{Start,End} hlc.Timestamp" or "commit{Min,Max}" etc. At the level of the interface, dropping references to specific impls (like the reconciliation job) makes things clearer -- you don't don't need to know at this level where these timestamps are coming from and that they are tied to lease durations. In our tests and other usages (SeedSpanConfigs) for e.g. they're not tied to leases at all.

I don't think it's that bad, given the "typically" phrasing.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 180 at r4 (raw file):

Previously, ajwerner wrote…

Can you refactor this logic to be like:

if err := waitForLeaseStart(ctx, k.clock, leaseStartTime); err != nil {
    return err
}
return k.db.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
	// Set the deadline of the transaction so that it commits before the
	// lease of the job on behalf of which the KVAccessor is acting
	// expires.
	if leaseExpirationTime.Less(txn.ReadTimestamp()) {
		return newRetryableLeaseExpiredError()
	}
	if err := txn.UpdateDeadline(ctx, leaseExpirationTime); err != nil {
		return err
	}
	return k.updateSpanConfigRecordsWithTxn(ctx, toDelete, toUpsert, txn)
})

Done.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor.go, line 180 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can this use hlc.Clock.SleepUntil? That function does its best to avoid busy-looping and is also context aware.

Good call, done.


pkg/spanconfig/spanconfigreconciler/reconciler.go, line 406 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we expect the Expiration to implicitly advance from call to call? That's needed for this retry loop to be effective. If so, could we add a comment somewhere?

Yup, whenever the underlying sqlliveness session is extended. Added some words.


pkg/spanconfig/spanconfigkvaccessor/kvaccessor_test.go, line 302 at r4 (raw file):

Previously, ajwerner wrote…

name got trunctated

Done.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 18, 2022

Build failed (retrying...):

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 40 of 40 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @arulajmani, @irfansharif, and @miretskiy)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 18, 2022

Build succeeded:

@craig craig bot merged commit 630153d into cockroachdb:master Apr 18, 2022
@arulajmani
Copy link
Copy Markdown
Collaborator Author

blathers backport release-22.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

spanconfig: ensure safety without mutual exclusion in the reconciliation job

5 participants