Skip to content

kv: correctly handle shared lock replays in KVNemesis#111512

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:kvnemesis-fix-111426
Sep 29, 2023
Merged

kv: correctly handle shared lock replays in KVNemesis#111512
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:kvnemesis-fix-111426

Conversation

@arulajmani
Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani commented Sep 29, 2023

Previously, we'd return an AssertionFailed error if a SKIP LOCKED request discovered another request from its own transaction waiting in a lock's wait queue. In SQL's use of KV, this can only happen if the SKIP LOCKED request is being replayed -- so returning an error here is fine. However, this tripped KVNemesis up.

This patch marks such errors, for the benefit of KVNemesis, and doesn't call them assertion failed errors.

Fixes #111426
Fixes #111506
Fixes #111513

Release note: None

@arulajmani arulajmani requested a review from a team as a code owner September 29, 2023 16:33
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Sep 29, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: thanks for the fix! Feel free to merge this, though the need for this change it making me reconsider whether we should figure out a way to just handle the case properly. See the comment thread.

Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)


pkg/kv/kvserver/concurrency/lock_table.go line 755 at r1 (raw file):

			// own transaction is if we're a replay. Instead of handling this case,
			// and defining sane semantics, we simply return an error. We mark the
			// error for the benefit of KVNemesis.

What would these sane semantics be? I don't recall the reason why we decided not to handle this case. I do wonder if it can actually be hit by SQL if the DistSender ever decides to replay a SKIP LOCKED Get/Scan.

Previously, we'd return an AssertionFailed error if a SKIP LOCKED
request discovered another request from its own transaction waiting in
a lock's wait queue. In SQL's use of KV, this can only happen if the
SKIP LOCKED request is being replayed -- so returning an error here
is fine. However, this tripped KVNemesis up.

This patch marks such errors, for the benefit of KVNemesis, and doesn't
call them assertion failed errors.

Fixes cockroachdb#111426

Release note: None
Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replied in-line and updated the commentary. I'll merge this for now, but we can come back to handling this case in the future. I'll also open an issue for it.

bors r=nvanbenschoten

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @miraradeva and @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table.go line 755 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What would these sane semantics be? I don't recall the reason why we decided not to handle this case. I do wonder if it can actually be hit by SQL if the DistSender ever decides to replay a SKIP LOCKED Get/Scan.

Went back to the PR that introduced this, and this is what we had:

#108918 (review)

I think treating it as a non-conflict if you find another request from your own transaction makes sense. I'll update this comment so as to not mislead us in the future. The testing surface area point still stands -- I've captured some words in the comment here.

I'm not against handling this case, but maybe let's defer it for stability (or next release)?

@arulajmani
Copy link
Copy Markdown
Collaborator Author

bors r=nvanbenschoten

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 29, 2023

Already running a review

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 29, 2023

Build succeeded:

@craig craig bot merged commit 1787e21 into cockroachdb:master Sep 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants