logictest: add support for async statements that block due to lock contention#79010
Conversation
|
FYI: The reason for the backport label is that this change is in service of additional test coverage for #77876 (intended for 22.1). |
169adf3 to
aa72b3b
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
The implementation LGTM, but I'm a bit on the fence whether this feature makes sense in the context of the logic tests.
The image of logic tests in my head has always been that a single statement is executed fully and then you can optionally check whether the results match the provided expectation. In the past I think we've implemented tests that want to assert the locking via unit tests. However, I could see an argument that this change provides an infrastructure for writing those blocking tests easier. Curious what Andrei/Nathan think. I'm also curious what @cockroachdb/sql-queries think.
Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)
pkg/sql/logictest/logic.go, line 992 at r1 (raw file):
// expected rows affected count. -1 to avoid testing this. expectCount int64 // if we expect this statement to hang, and become a pendingStatement
nit: missing periods here and below.
pkg/sql/logictest/logic.go, line 2478 at r1 (raw file):
name := fields[1] if pending, ok := t.pendingStatements[name]; ok {
nit: I think we tend to short-circuit on !ok case and then proceed with the main happy case. I.e. something like
pending, ok := t.pendingStatements[name]
if !ok {
return errors.New("")
}
// happy case
pkg/sql/logictest/logic.go, line 3129 at r1 (raw file):
if stmt.expectHang { if _, ok := t.pendingStatements[stmt.statementName]; ok { return false, errors.Newf("pending statement with name \"%s\" already exists", stmt.statementName)
nit: you can use %q instead of "%s".
pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):
user testuser finishstatement readReq
nit: I think the txn is still open for testuser.
AlexTalks
left a comment
There was a problem hiding this comment.
Thanks for the feedback. I'm curious to hear some other opinions too. I had thought about trying to add generic async/await modifiers (inspired by JavaScript/C#/etc) but this seemed like a more straightforward method of getting something working quickly with the logictest infrastructure.
We have some logictests that incorporate locking, however I don't believe we have any that capture the concept of "waiting". The tests I've seen that have transactions hold locks make sure that other operations that attempt to obtain those locks have timeouts, so that they fail after some short wait. This means that we aren't able to test what happens when we have actively waiting operations in logictests unfortunately.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, @nvanbenschoten, and @yuzefovich)
pkg/sql/logictest/logic.go, line 992 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: missing periods here and below.
Working - thanks
pkg/sql/logictest/logic.go, line 2478 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think we tend to short-circuit on
!okcase and then proceed with the main happy case. I.e. something likepending, ok := t.pendingStatements[name] if !ok { return errors.New("") } // happy case
Working - sounds good
pkg/sql/logictest/logic.go, line 3129 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: you can use
%qinstead of"%s".
Working - thanks for the tip
pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think the txn is still open for
testuser.
This one is actually ok, though, correct? Any open transactions are rolled back at the end of a logictest I think?
Alex and I talked through this. I don't see what's here as much of a deviation from the existing logic test infrastructure. We already support switching between multiple sessions and we also have some non-determinism in certain cases, which is why we have the |
aa72b3b to
b4d7dff
Compare
AlexTalks
left a comment
There was a problem hiding this comment.
Thanks @nvanbenschoten. @yuzefovich, let me know if this seems ok to merge - I've modified the syntax a bit to use instead statement async and awaitstatement, and cleaned things up a bit.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @yuzefovich)
yuzefovich
left a comment
There was a problem hiding this comment.
Looks @michae2 was typing something, so it might be worth to wait a bit to give him a chance to share his feedback.
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)
pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):
Previously, AlexTalks (Alex Sarkesian) wrote…
This one is actually ok, though, correct? Any open transactions are rolled back at the end of a logictest I think?
Hm, I don't know - tbh I don't think I've ever written a logic test that begins a txn 😅
To me it seems cleaner to close things explicitly in case we'll be appending to this test file in the future, but I don't have strong feelings about it (that's why it's a "nit"), so feel free to ignore this comment :)
|
Don't forget to update the PR description based on the latest commit message. |
michae2
left a comment
There was a problem hiding this comment.
Cool! I really like this!
Something is bothering me, though. Do we check that the async statement is still blocking while executing the following statements? If I understand the implementation correctly, the async statement could finish at any time, including before the next statement is executed. That doesn't seem like much of a test if we want to verify that locks are actually blocking a SELECT FOR UPDATE until after the UPDATE transaction commits or aborts.
Maybe we could add something like:
select {
execRes := <-pending.resultChan:
// not blocking any more, failure
default:
// still blocking, yay, proceed
}
to all of the statement executions after the async statement?
(Or maybe not all of the statement executions... maybe only statements with another option like blocks namedStatement which would then cause this check to happen before execution.)
Then we could write tests like:
statement count 7
BEGIN; UPDATE t SET v = concat(v, '_updated') WHERE k >= 'b' and k < 'z'
user testuser
statement ok
BEGIN
statement async readReq ok
select * from t for update
user root
sleep 2 blocks readReq
statement ok blocks readReq
ROLLBACK
user testuser
awaitstatement readReq
which would give a little more assurance that the SELECT FOR UPDATE was actually blocking until the ROLLBACK.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)
AlexTalks
left a comment
There was a problem hiding this comment.
@michae2 I actually talked with @nvanbenschoten about this a bit (with a code example just like the one you gave!), but the problem is even with such a check we can't reason deterministically about where the async goroutine actually is in its execution... therefore we thought it might be better to leave it as is with a more naive approach, and validate that the locks are held by polling the crdb_internal.cluster_locks table (introduced in #77876) until we see that we have both held locks and waiters as expected. This will give us a more specific signal to test than just if the goroutine hasn't completed yet, as we can test for what we want to see - the waiting operation in the lock table.
This is also part of why I changed the terminology to async rather than hangs - the async statement may take some time, or it may complete quickly, the implementation doesn't care - but we can test for the other effects (like showing up in the lock table) that we are looking for anyway.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)
michae2
left a comment
There was a problem hiding this comment.
@AlexTalks that makes sense. Checking cluster_locks sounds even better. Thanks for the response!
Then I only have one small nit: could we use "block" as the terminology in comments, rather than "hang"? 🙂
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)
…ntention While previously each statement in a logictest was expected to complete (with success or error), this change adds support for a new type of statement in a logictest that is expected to block and be completed later. This change enables the ability to test what happens during lock contention, as in the case a statement is waiting on locks and does not have a lock timeout set, the statement will block until it can proceed (by obtaining the locks). This feature is supported by the syntax: ``` statement async namedStatement ok <some statement that will block as it cannot obtain locks> ... <operations that would free the locks namedStatement is waiting on> ... awaitstatement namedStatement <wait on namedStatement to complete and validate that results are ok> ``` Release note: None Release Justification: Testing change only.
b4d7dff to
12b1a01
Compare
|
bors r+ |
|
bors retry |
|
Already running a review |
|
Build succeeded: |
While async statements in logictests were introduced in cockroachdb#79010, in some cases tests with async statements would fail due to a race condition. This is fixed by updating the logic such that when an async statement is issued, the goroutine running that statement is ensured to have started before execution continues to other lines in the test. Fixes cockroachdb#79565. Release note: None
79628: logictest: deflake tests with async statements r=AlexTalks a=AlexTalks While async statements in logictests were introduced in #79010, in some cases tests with async statements would fail due to a race condition. This is fixed by updating the logic such that when an async statement is issued, the goroutine running that statement is ensured to have started before execution continues to other lines in the test. Fixes #79565. Release note: None Release Justification: Testing change only. Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
While async statements in logictests were introduced in #79010, in some cases tests with async statements would fail due to a race condition. This is fixed by updating the logic such that when an async statement is issued, the goroutine running that statement is ensured to have started before execution continues to other lines in the test. Fixes #79565. Release note: None
This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. This change also fixes a bug with asynchronous statements that could cause a race condition in testing. Release note: None
This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None
This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None
79871: logictest: support async queries in logictests r=AlexTalks a=AlexTalks This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in #79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. This change also fixes a bug with asynchronous statements that could cause a race condition in testing. Release note: None Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None
79871: logictest: support async queries in logictests r=AlexTalks a=AlexTalks This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in #79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. This change also fixes a bug with asynchronous statements that could cause a race condition in testing. Release note: None Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None
While previously each statement in a logictest was expected to complete
(with success or error), this change adds support for a new type of
statement in a logictest that is expected to block and be completed
later. This change enables the ability to test what happens during lock
contention, as in the case a statement is waiting on locks and does not
have a lock timeout set, the statement will block until it can proceed
(by obtaining the locks).
This feature is supported by the syntax:
Release note: None
Release Justification: Testing change only.