Skip to content

kv: non-blocking write-read conflicts for weak isolation txns#103267

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
nvb:nvanbenschoten/pushSnapshotTxn2
May 19, 2023
Merged

kv: non-blocking write-read conflicts for weak isolation txns#103267
craig[bot] merged 1 commit intocockroachdb:masterfrom
nvb:nvanbenschoten/pushSnapshotTxn2

Conversation

@nvb
Copy link
Copy Markdown
Contributor

@nvb nvb commented May 14, 2023

Fixes #102014.

This commit updates write-read conflict rules to allow non-blocking behavior for weak isolation transactions. Specifically, PUSH_TIMESTAMP requests are now allowed to succeed when encountering PENDING pushees if any of the following conditions is true:

  • the pushee is a weak isolation transaction
  • OR the pusher is a weak isolation transaction with an equal priority as the pushee
  • OR the pusher has a greater priority than the pushee (previous behavior)

The rationale for this behavior is that weak isolation transactions can tolerate write skew without a refresh or retry. Because of this, they face no consequence from being pushed and also expect others to be pushable (non-blocking).

Longer-term, we would like all write-read conflicts to become non-blocking. For now, these rules ensure that all write-read conflicts between weak isolation transactions are non-blocking, and also that write-read conflicts between a weak isolation transaction and a strong isolation (serializable) transaction has reasonable, tunable behavior.

Release note: None

@nvb nvb requested a review from arulajmani May 14, 2023 22:15
@nvb nvb requested a review from a team as a code owner May 14, 2023 22:15
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@nvb nvb force-pushed the nvanbenschoten/pushSnapshotTxn2 branch from 9cabf3a to db455f7 Compare May 18, 2023 22:38
Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table_waiter.go line 1252 at r1 (raw file):

	}
	var pushType kvpb.PushTxnType
	if s.guardStrength == lock.None {

nit: consider using a switch statement on the guard's strength to assign pushType. That way, we can panic if any of the strengths are unexpected (for now, we only expect Intent and None, but soon we'll expand this to include Exclusive and Shared as well).


pkg/kv/kvserver/txnwait/queue.go line 107 at r1 (raw file):

	case kvpb.PUSH_TIMESTAMP:
		// If the pushee transaction tolerates write skew, the PUSH_TIMESTAMP is
		// harmless, so let it through. If the pushee transaction does not tolerate

nit: consider breaking this comment and placing the relevant bits closer to where the condition is being checked.

I saw this over at IsolatedAtLaterTimestamps and have since been a fan. It's got your blame on it, so maybe you like it too? Feel free to disregard if you no longer agree.


pkg/kv/kvclient/kvcoord/txn_test.go line 369 at r1 (raw file):

// TestTxnWriteReadConflict verifies that write-read conflicts are non-blocking
// to the reader, except when both the writer and reader are both serializable
// transactions. In that case, the reader will block until the writer completes.

Should we add a TODO here? :trollface:


pkg/kv/kvserver/txnwait/queue_test.go line 188 at r1 (raw file):

	}{
		// PUSH_ABORT
		{kvpb.PUSH_ABORT, SSI, SSI, min, min, false},

How do you feel about pulling the PUSH_ABORT and PUSH_TOUCH cases into a (or 2) different test, and running them with all isolation level combinations? That allows us to test they're agnostic to isolation levels, and we can test the more interesting PUSH_TIMESTAMP cases here.

Fixes cockroachdb#102014.

This commit updates write-read conflict rules to allow non-blocking
behavior for weak isolation transactions. Specifically, PUSH_TIMESTAMP
requests are now allowed to succeed when encountering PENDING pushees
if any of the following conditions is true:
- the pushee is a weak isolation transaction
- OR the pusher is a weak isolation transaction with an equal priority
  as the pushee
- OR the pusher has a greater priority than the pushee (previous behavior)

The rationale for this behavior is that weak isolation transactions can
tolerate write skew without a refresh or retry. Because of this, they
face no consequence from being pushed and also expect others to be
pushable (non-blocking).

Longer-term, we would like all write-read conflicts to become non-blocking. For
now, these rules ensure that all write-read conflicts between weak isolation
transactions are non-blocking, and also that write-read conflicts between a weak
isolation transaction and a strong isolation (serializable) transaction has
reasonable, tunable behavior.

Release note: None
@nvb nvb force-pushed the nvanbenschoten/pushSnapshotTxn2 branch from db455f7 to 5a53d25 Compare May 19, 2023 19:27
Copy link
Copy Markdown
Contributor Author

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table_waiter.go line 1252 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: consider using a switch statement on the guard's strength to assign pushType. That way, we can panic if any of the strengths are unexpected (for now, we only expect Intent and None, but soon we'll expand this to include Exclusive and Shared as well).

While I'm generally a strong proponent of switch statements, I think the if-else structure here is more suggestive of the meaning behind this logic. If the caller is not locking (lock.None), then it wants to use a PUSH_TIMESTAMP. Otherwise, if the caller is locking, regardless of what the specific locking strength is, it wants to use a PUSH_ABORT.

I'm actually going to switch the other instance of this over as well.


pkg/kv/kvserver/txnwait/queue.go line 107 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: consider breaking this comment and placing the relevant bits closer to where the condition is being checked.

I saw this over at IsolatedAtLaterTimestamps and have since been a fan. It's got your blame on it, so maybe you like it too? Feel free to disregard if you no longer agree.

Done.


pkg/kv/kvclient/kvcoord/txn_test.go line 369 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Should we add a TODO here? :trollface:

Heh, only if you want your name attached to it.


pkg/kv/kvserver/txnwait/queue_test.go line 188 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

How do you feel about pulling the PUSH_ABORT and PUSH_TOUCH cases into a (or 2) different test, and running them with all isolation level combinations? That allows us to test they're agnostic to isolation levels, and we can test the more interesting PUSH_TIMESTAMP cases here.

That's a great idea. Done.

Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table_waiter.go line 1252 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

While I'm generally a strong proponent of switch statements, I think the if-else structure here is more suggestive of the meaning behind this logic. If the caller is not locking (lock.None), then it wants to use a PUSH_TIMESTAMP. Otherwise, if the caller is locking, regardless of what the specific locking strength is, it wants to use a PUSH_ABORT.

I'm actually going to switch the other instance of this over as well.

That's a fair point, sgtm.

@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented May 19, 2023

TFTR!

bors r=arulajmani

@craig
Copy link
Copy Markdown
Contributor

craig bot commented May 19, 2023

Build failed:

@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented May 19, 2023

Bors' CI has not failed, so it's confused. I now canceled the CI runs. Let's try again.

bors r=arulajmani

@craig
Copy link
Copy Markdown
Contributor

craig bot commented May 19, 2023

Build succeeded:

@craig craig bot merged commit fd63cd6 into cockroachdb:master May 19, 2023
@nvb nvb deleted the nvanbenschoten/pushSnapshotTxn2 branch May 19, 2023 21:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kv: read committed transactions should not wait on PENDING intents

3 participants