kvcoord: propagate lock spans in LeafTxnFinalState#94388
kvcoord: propagate lock spans in LeafTxnFinalState#94388yuzefovich wants to merge 2 commits intocockroachdb:masterfrom
Conversation
| condensed := tp.lockFootprint.maybeCondense(ctx, tp.riGen, locksBudget) | ||
| if condensed && !alreadyCondensed { | ||
| if tp.condensedIntentsEveryN.ShouldLog() || log.ExpensiveLogEnabled(ctx, 2) { | ||
| var baDetails string |
There was a problem hiding this comment.
redact.StringBuilder, then use log.Warningf and include it with %s in the message.
| // Copy mutable state so access is safe for the caller. | ||
| lockSpans := tp.lockFootprint.asSlice() | ||
| tfs.LockSpans = make([]roachpb.Span, len(lockSpans)) | ||
| copy(tfs.LockSpans, lockSpans) |
There was a problem hiding this comment.
I don't think the LeafTxn gets reused/extended afterwards, so maybe you could avoid the copy?
There was a problem hiding this comment.
(If you know a copy is needed, maybe explain why in a comment)
9c8b83c to
5198f9e
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
The newly-added test in select_for_update now fails on 3node-tenant config (before this change it wouldn't fail):
--- FAIL: TestLogic_tmp (2.04s)
test_log_scope.go:161: test logs captured to: /Users/yuzefovich/go/src/github.com/cockroachdb/cockroach/tmp/_tmp/967ce6772a6c5a2809661c88cfdfc502/logTestLogic_tmp362815130
test_log_scope.go:79: use -show-logs to present logs inline
logic.go:3198:
/private/var/tmp/_bazel_yahor/e311b1c2454b7c301a6b73bddff450e7/sandbox/darwin-sandbox/16694/execroot/com_github_cockroachdb_cockroach/bazel-out/darwin_arm64-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/_tmp:7:
expected success, but found
(XXUUU) RangeIterator failed to seek to /Min: rpc error: code = Unauthenticated desc = kvtenant.Proxy does not have access to FirstRange
range_iter.go:236: in Seek()
logic.go:3990:
/private/var/tmp/_bazel_yahor/e311b1c2454b7c301a6b73bddff450e7/sandbox/darwin-sandbox/16694/execroot/com_github_cockroachdb_cockroach/bazel-out/darwin_arm64-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/_tmp:7: error while processing
logic.go:3990: pq: RangeIterator failed to seek to /Min: rpc error: code = Unauthenticated desc = kvtenant.Proxy does not have access to FirstRange
panic.go:522: -- test log scope end --
FAIL
Do you have an idea of what could be the issue?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)
pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go line 732 at r1 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
redact.StringBuilder, then use
log.Warningfand include it with %s in the message.
Done.
pkg/kv/kvclient/kvcoord/txn_interceptor_pipeliner.go line 820 at r1 (raw file):
Previously, knz (Raphael 'kena' Poss) wrote…
(If you know a copy is needed, maybe explain why in a comment)
I don't know whether it is needed - I simply copied what we have in txnSpanRefresher, and there we do make a copy. I was hoping that @nvanbenschoten would point out whether it is, in fact, needed here and the reason why. (The copying in the refresh spans was added in 51ade9f.)
Update: discussed on slack, will keep the copy out of caution.
There was a problem hiding this comment.
The newly-added test in select_for_update now fails on 3node-tenant config (before this change it wouldn't fail):
Do you have an idea of what could be the issue?
Yes I believe it is a recent regression related to the work that @ecwall and myself are doing on the master branch.
My recommendation would be to:
- verify that your fix + this test succeed properly on v22.2 (i.e. this failure is specific to
master) - skip this test on
master, with an issue filed for Evan to investigate further - do the backport and ensure the test is not skipped on the 22.2 branch.
If the test still fails on 22.2, I'm out of ideas and I fear we won't fix this today.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
knz
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
|
Summary of what we encountered:
The call to |
|
I debugged this a bit further, and I no longer think Not sure about other details yet. |
|
So the problem is that when committing the txn we issue I tried some hacky things (like intersecting that |
|
filed that last issue as #95003. |
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
This commit makes it so that we propagate the lock footprint of the leaf txns in `LeafTxnFinalState`. This is needed in order to disable the read-only fast path when committing the txn that acquired locks while using the leaf txn (which can happen, for example, if we use the streamer API). Without this propagation, previously the locks would only be released after the txn expiration duration (5s). Release note (bug fix): Previously, CockroachDB could delay the release of the locks acquired when evaluating SELECT FOR UPDATE statements in some cases. This delay (up to 5s) could then block the future readers. The bug was introduced in 22.2.0, and the temporary workaround without upgrading to a release with this fix is to set undocumented cluster setting `sql.distsql.use_streamer.enabled` to `false`.
…ocking" This reverts commit 9c2c4f2 while keeping the regression test. Release note: None
This commit makes it so that we propagate the lock footprint of the leaf txns in
LeafTxnFinalState. This is needed in order to disable the read-only fast path when committing the txn that acquired locks while using the leaf txn (which can happen, for example, if we use the streamer API). Without this propagation, previously the locks would only be released after the txn expiration duration (5s).Addresses: #94290.
Epic: None.
Release note (bug fix): Previously, CockroachDB could delay the release of the locks acquired when evaluating SELECT FOR UPDATE statements in some cases. This delay (up to 5s) could then block the future readers. The bug was introduced in 22.2.0, and the temporary workaround without upgrading to a release with this fix is to set undocumented cluster setting
sql.distsql.use_streamer.enabledtofalse.