kv: correctly handle shared lock replays in KVNemesis#111512
kv: correctly handle shared lock replays in KVNemesis#111512craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
nvb
left a comment
There was a problem hiding this comment.
thanks for the fix! Feel free to merge this, though the need for this change it making me reconsider whether we should figure out a way to just handle the case properly. See the comment thread.
Reviewed 2 of 3 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @arulajmani and @miraradeva)
pkg/kv/kvserver/concurrency/lock_table.go line 755 at r1 (raw file):
// own transaction is if we're a replay. Instead of handling this case, // and defining sane semantics, we simply return an error. We mark the // error for the benefit of KVNemesis.
What would these sane semantics be? I don't recall the reason why we decided not to handle this case. I do wonder if it can actually be hit by SQL if the DistSender ever decides to replay a SKIP LOCKED Get/Scan.
Previously, we'd return an AssertionFailed error if a SKIP LOCKED request discovered another request from its own transaction waiting in a lock's wait queue. In SQL's use of KV, this can only happen if the SKIP LOCKED request is being replayed -- so returning an error here is fine. However, this tripped KVNemesis up. This patch marks such errors, for the benefit of KVNemesis, and doesn't call them assertion failed errors. Fixes cockroachdb#111426 Release note: None
ab215dc to
0859e83
Compare
arulajmani
left a comment
There was a problem hiding this comment.
Replied in-line and updated the commentary. I'll merge this for now, but we can come back to handling this case in the future. I'll also open an issue for it.
bors r=nvanbenschoten
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @miraradeva and @nvanbenschoten)
pkg/kv/kvserver/concurrency/lock_table.go line 755 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
What would these sane semantics be? I don't recall the reason why we decided not to handle this case. I do wonder if it can actually be hit by SQL if the DistSender ever decides to replay a SKIP LOCKED Get/Scan.
Went back to the PR that introduced this, and this is what we had:
I think treating it as a non-conflict if you find another request from your own transaction makes sense. I'll update this comment so as to not mislead us in the future. The testing surface area point still stands -- I've captured some words in the comment here.
I'm not against handling this case, but maybe let's defer it for stability (or next release)?
|
bors r=nvanbenschoten |
|
Already running a review |
|
Build succeeded: |
Previously, we'd return an AssertionFailed error if a SKIP LOCKED request discovered another request from its own transaction waiting in a lock's wait queue. In SQL's use of KV, this can only happen if the SKIP LOCKED request is being replayed -- so returning an error here is fine. However, this tripped KVNemesis up.
This patch marks such errors, for the benefit of KVNemesis, and doesn't call them assertion failed errors.
Fixes #111426
Fixes #111506
Fixes #111513
Release note: None