sql: fix data race in MemberOfWithAdminOption#98617
sql: fix data race in MemberOfWithAdminOption#98617craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
|
This PR is an implementation of the idea discussed in #98469, but it seems to be hitting an issue where the |
that seems likely. if the modification time is not set, then we should be able to safely not change the txn timestamp -- since this means that the |
|
Modification time will always be set -- we assert it deeply. The fact that that migration fails to update it and set the version is sad. Maybe one thing we can do is detect version 1 and use a hard-coded set of data, or just keep the transaction timestamp? |
Would you mind elaborating a bit more about what the advantages of the original suggestion to set a fixed timestamp were? Are those benefits not strictly required for correctness? |
|
If the version is 1, that means there have been no modifications to the table. So using a historical timestamp or the current timestamp would both return the same results. |
I think the problem right now is that the permanent upgrade below inserts rows into the cockroach/pkg/upgrade/upgrades/permanent_upgrades.go Lines 59 to 87 in 08f2dc4 |
|
Right, I'm saying that if the version is 1 then we know:
So we can work around this issue of the upgrade not bumping the version by just using the current timestamp for the txn timestamp. Or to resurrect this thread #98469 (comment)
Maybe this is a case where we can use the outer txn's ReadTimestamp? |
4ad8802 to
abf67c3
Compare
|
Ah I see, thanks for the clarification. I've updated the PR to do that! |
rafiss
left a comment
There was a problem hiding this comment.
great! this lgtm, but just wanted to discuss my comment from one of the other PRs:
if possible, could we add a more targeted test that catches this bug? it feels kind of lucky to me that TestBackendDialTLS is the one that caught this. i could imagine that test being refactored in the future, and no longer catching a future regression on this.
We could add a test to pkg/sql/user_test.go that surfaces the bug? It's possible that TestGetUserTimeout already would have revealed the bug, but I see that test is skipped under race
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
5ab7028 to
f55d945
Compare
andyyang890
left a comment
There was a problem hiding this comment.
Looks like TestGetUserTimeout does catch the bug so I've unskipped it under race! I also removed the backport-22.2.x tag since this issue only started happening after the internal executor refactor, which was merged in 23.1.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)
This patch fixes a race condition in MemberOfWithAdminOption by using a fresh transaction within the singleflight call. Release note: None
This test is being unskipped under race to catch any regressions for a data race in MemberOfWithAdminOption that was discovered and subsequently patched. Release note: None
f55d945 to
37e1066
Compare
|
TFTR! bors r=rafiss |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build failed: |
|
bors retry |
|
Build succeeded: |
This patch fixes a race condition in MemberOfWithAdminOption
by using a fresh transaction within the singleflight call.
Fixes #95642
Fixes #96539
Release note: None