logictest: add support for async statements that block due to lock contention by AlexTalks · Pull Request #79010 · cockroachdb/cockroach

AlexTalks · 2022-03-30T00:40:53Z

While previously each statement in a logictest was expected to complete
(with success or error), this change adds support for a new type of
statement in a logictest that is expected to block and be completed
later. This change enables the ability to test what happens during lock
contention, as in the case a statement is waiting on locks and does not
have a lock timeout set, the statement will block until it can proceed
(by obtaining the locks).

This feature is supported by the syntax:

statement async namedStatement ok
<some statement that will block as it cannot obtain locks>
...
<operations that would free the locks namedStatement is waiting on>
...
awaitstatement namedStatement
<wait on namedStatement to complete and validate that results are ok>

Release note: None

Release Justification: Testing change only.

cockroach-teamcity · 2022-03-30T00:41:01Z

This change is

AlexTalks · 2022-03-30T01:31:44Z

FYI: The reason for the backport label is that this change is in service of additional test coverage for #77876 (intended for 22.1).

yuzefovich

The implementation LGTM, but I'm a bit on the fence whether this feature makes sense in the context of the logic tests.

The image of logic tests in my head has always been that a single statement is executed fully and then you can optionally check whether the results match the provided expectation. In the past I think we've implemented tests that want to assert the locking via unit tests. However, I could see an argument that this change provides an infrastructure for writing those blocking tests easier. Curious what Andrei/Nathan think. I'm also curious what @cockroachdb/sql-queries think.

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

pkg/sql/logictest/logic.go, line 992 at r1 (raw file):

	// expected rows affected count. -1 to avoid testing this.
	expectCount int64
	// if we expect this statement to hang, and become a pendingStatement

nit: missing periods here and below.

pkg/sql/logictest/logic.go, line 2478 at r1 (raw file):

			name := fields[1]

			if pending, ok := t.pendingStatements[name]; ok {

nit: I think we tend to short-circuit on !ok case and then proceed with the main happy case. I.e. something like

pending, ok := t.pendingStatements[name]
if !ok {
  return errors.New("")
}
// happy case

pkg/sql/logictest/logic.go, line 3129 at r1 (raw file):

	if stmt.expectHang {
		if _, ok := t.pendingStatements[stmt.statementName]; ok {
			return false, errors.Newf("pending statement with name \"%s\" already exists", stmt.statementName)

nit: you can use %q instead of "%s".

pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):

user testuser

finishstatement readReq

nit: I think the txn is still open for testuser.

AlexTalks

Thanks for the feedback. I'm curious to hear some other opinions too. I had thought about trying to add generic async/await modifiers (inspired by JavaScript/C#/etc) but this seemed like a more straightforward method of getting something working quickly with the logictest infrastructure.

We have some logictests that incorporate locking, however I don't believe we have any that capture the concept of "waiting". The tests I've seen that have transactions hold locks make sure that other operations that attempt to obtain those locks have timeouts, so that they fail after some short wait. This means that we aren't able to test what happens when we have actively waiting operations in logictests unfortunately.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, @nvanbenschoten, and @yuzefovich)

pkg/sql/logictest/logic.go, line 992 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: missing periods here and below.

Working - thanks

pkg/sql/logictest/logic.go, line 2478 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think we tend to short-circuit on !ok case and then proceed with the main happy case. I.e. something like
pending, ok := t.pendingStatements[name]
if !ok {
  return errors.New("")
}
// happy case

Working - sounds good

pkg/sql/logictest/logic.go, line 3129 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: you can use %q instead of "%s".

Working - thanks for the tip

pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think the txn is still open for testuser.

This one is actually ok, though, correct? Any open transactions are rolled back at the end of a logictest I think?

nvb · 2022-03-31T22:22:40Z

The image of logic tests in my head has always been that a single statement is executed fully and then you can optionally check whether the results match the provided expectation. In the past I think we've implemented tests that want to assert the locking via unit tests. However, I could see an argument that this change provides an infrastructure for writing those blocking tests easier. Curious what Andrei/Nathan think.

Alex and I talked through this. I don't see what's here as much of a deviation from the existing logic test infrastructure. We already support switching between multiple sessions and we also have some non-determinism in certain cases, which is why we have the retry directive. Allowing statements to block using an async/await model seems fine to me. It's actually how I would have preferred that we implemented locking-related testing in the past.

AlexTalks

Thanks @nvanbenschoten. @yuzefovich, let me know if this seems ok to merge - I've modified the syntax a bit to use instead statement async and awaitstatement, and cleaned things up a bit.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @yuzefovich)

yuzefovich

Sounds good

Looks @michae2 was typing something, so it might be worth to wait a bit to give him a chance to share his feedback.

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

This one is actually ok, though, correct? Any open transactions are rolled back at the end of a logictest I think?

Hm, I don't know - tbh I don't think I've ever written a logic test that begins a txn 😅

To me it seems cleaner to close things explicitly in case we'll be appending to this test file in the future, but I don't have strong feelings about it (that's why it's a "nit"), so feel free to ignore this comment :)

yuzefovich · 2022-03-31T23:32:39Z

Don't forget to update the PR description based on the latest commit message.

michae2

Cool! I really like this!

Something is bothering me, though. Do we check that the async statement is still blocking while executing the following statements? If I understand the implementation correctly, the async statement could finish at any time, including before the next statement is executed. That doesn't seem like much of a test if we want to verify that locks are actually blocking a SELECT FOR UPDATE until after the UPDATE transaction commits or aborts.

Maybe we could add something like:

select {
execRes := <-pending.resultChan:
  // not blocking any more, failure
default:
  // still blocking, yay, proceed
}

to all of the statement executions after the async statement?

(Or maybe not all of the statement executions... maybe only statements with another option like blocks namedStatement which would then cause this check to happen before execution.)

Then we could write tests like:

statement count 7
BEGIN; UPDATE t SET v = concat(v, '_updated') WHERE k >= 'b' and k < 'z'

user testuser

statement ok
BEGIN

statement async readReq ok
select * from t for update

user root

sleep 2 blocks readReq

statement ok blocks readReq
ROLLBACK

user testuser

awaitstatement readReq

which would give a little more assurance that the SELECT FOR UPDATE was actually blocking until the ROLLBACK.

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

AlexTalks

@michae2 I actually talked with @nvanbenschoten about this a bit (with a code example just like the one you gave!), but the problem is even with such a check we can't reason deterministically about where the async goroutine actually is in its execution... therefore we thought it might be better to leave it as is with a more naive approach, and validate that the locks are held by polling the crdb_internal.cluster_locks table (introduced in #77876) until we see that we have both held locks and waiters as expected. This will give us a more specific signal to test than just if the goroutine hasn't completed yet, as we can test for what we want to see - the waiting operation in the lock table.

This is also part of why I changed the terminology to async rather than hangs - the async statement may take some time, or it may complete quickly, the implementation doesn't care - but we can test for the other effects (like showing up in the lock table) that we are looking for anyway.

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

michae2

@AlexTalks that makes sense. Checking cluster_locks sounds even better. Thanks for the response!

Then I only have one small nit: could we use "block" as the terminology in comments, rather than "hang"? 🙂

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

…ntention While previously each statement in a logictest was expected to complete (with success or error), this change adds support for a new type of statement in a logictest that is expected to block and be completed later. This change enables the ability to test what happens during lock contention, as in the case a statement is waiting on locks and does not have a lock timeout set, the statement will block until it can proceed (by obtaining the locks). This feature is supported by the syntax: ``` statement async namedStatement ok <some statement that will block as it cannot obtain locks> ... <operations that would free the locks namedStatement is waiting on> ... awaitstatement namedStatement <wait on namedStatement to complete and validate that results are ok> ``` Release note: None Release Justification: Testing change only.

AlexTalks · 2022-04-06T21:00:53Z

bors r+

AlexTalks · 2022-04-06T21:54:55Z

bors retry

craig · 2022-04-06T21:54:56Z

Already running a review

craig · 2022-04-06T23:28:53Z

Build succeeded:

GitHub CI (Cockroach)

While async statements in logictests were introduced in cockroachdb#79010, in some cases tests with async statements would fail due to a race condition. This is fixed by updating the logic such that when an async statement is issued, the goroutine running that statement is ensured to have started before execution continues to other lines in the test. Fixes cockroachdb#79565. Release note: None

79628: logictest: deflake tests with async statements r=AlexTalks a=AlexTalks While async statements in logictests were introduced in #79010, in some cases tests with async statements would fail due to a race condition. This is fixed by updating the logic such that when an async statement is issued, the goroutine running that statement is ensured to have started before execution continues to other lines in the test. Fixes #79565. Release note: None Release Justification: Testing change only. Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>

While async statements in logictests were introduced in #79010, in some cases tests with async statements would fail due to a race condition. This is fixed by updating the logic such that when an async statement is issued, the goroutine running that statement is ensured to have started before execution continues to other lines in the test. Fixes #79565. Release note: None

This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. This change also fixes a bug with asynchronous statements that could cause a race condition in testing. Release note: None

This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None

79871: logictest: support async queries in logictests r=AlexTalks a=AlexTalks This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in #79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. This change also fixes a bug with asynchronous statements that could cause a race condition in testing. Release note: None Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>

This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None

79871: logictest: support async queries in logictests r=AlexTalks a=AlexTalks This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in #79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. This change also fixes a bug with asynchronous statements that could cause a race condition in testing. Release note: None Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>

This change adds support for asynchronous queries in logictests, which will execute the query in a separate goroutine, prior to awaiting and validating the results when unblocked by any contending operations. While previously logictests had supported asynchronous statements as introduced in cockroachdb#79010, the `query` execution, which returns rows to be validated, is now supported in asynchronous mode with this change. This feature is supported with the following syntax: ``` query <typestring> async,<options...> namedQuery <some query that may block during execution> ---- <expected query results> ... <operations that would free the locks namedQuery is waiting on> ... awaitquery namedQuery <waits on namedQuery to complete and validate the results> ``` While this feature is not supported in conjunction with some query options such as `retry`, or when placed in a `repeat` block, it does support being run with the `--rewrite` flag. Release note: None

AlexTalks added the backport-22.1.x label Mar 30, 2022

AlexTalks requested review from andreimatei, nvb and yuzefovich March 30, 2022 01:30

AlexTalks force-pushed the logic_test_hung_stmt branch from 169adf3 to aa72b3b Compare March 30, 2022 06:13

AlexTalks mentioned this pull request Mar 30, 2022

sql: introduce crdb_internal.cluster_locks virtual table #77876

Merged

yuzefovich reviewed Mar 30, 2022

View reviewed changes

AlexTalks commented Mar 30, 2022

View reviewed changes

AlexTalks force-pushed the logic_test_hung_stmt branch from aa72b3b to b4d7dff Compare March 31, 2022 23:14

AlexTalks commented Mar 31, 2022

View reviewed changes

yuzefovich approved these changes Mar 31, 2022

View reviewed changes

michae2 requested changes Apr 1, 2022

View reviewed changes

AlexTalks commented Apr 1, 2022

View reviewed changes

michae2 approved these changes Apr 1, 2022

View reviewed changes

AlexTalks force-pushed the logic_test_hung_stmt branch from b4d7dff to 12b1a01 Compare April 6, 2022 19:28

AlexTalks changed the title ~~logictest: add support for statements that hang due to lock contention~~ logictest: add support for async statements that block due to lock contention Apr 6, 2022

craig bot merged commit 301a424 into cockroachdb:master Apr 6, 2022

blathers-crl bot mentioned this pull request Apr 6, 2022

release-22.1: logictest: add support for async statements that block due to lock contention #79552

Merged

AlexTalks deleted the logic_test_hung_stmt branch April 7, 2022 00:55

AlexTalks mentioned this pull request Apr 8, 2022

logictest: deflake tests with async statements #79628

Merged

AlexTalks mentioned this pull request Apr 13, 2022

logictest: support async queries in logictests #79871

Merged

AlexTalks mentioned this pull request Apr 21, 2022

release-22.1: logictest: support async queries in logictests #80291

Merged

tbg mentioned this pull request Jun 22, 2022

roachperf: regression around 2022-04-06 #82136

Closed

Conversation

AlexTalks commented Mar 30, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Mar 30, 2022

Uh oh!

AlexTalks commented Mar 30, 2022

Uh oh!

yuzefovich left a comment

Choose a reason for hiding this comment

Uh oh!

AlexTalks left a comment

Choose a reason for hiding this comment

Uh oh!

nvb commented Mar 31, 2022

Uh oh!

AlexTalks left a comment

Choose a reason for hiding this comment

Uh oh!

yuzefovich left a comment

Choose a reason for hiding this comment

Uh oh!

yuzefovich commented Mar 31, 2022

Uh oh!

michae2 left a comment

Choose a reason for hiding this comment

Uh oh!

AlexTalks left a comment

Choose a reason for hiding this comment

Uh oh!

michae2 left a comment

Choose a reason for hiding this comment

Uh oh!

AlexTalks commented Apr 6, 2022

Uh oh!

AlexTalks commented Apr 6, 2022

Uh oh!

craig bot commented Apr 6, 2022

Uh oh!

craig bot commented Apr 6, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

AlexTalks commented Mar 30, 2022 •

edited

Loading