Skip to content

logictest: add support for async statements that block due to lock contention#79010

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:logic_test_hung_stmt
Apr 6, 2022
Merged

logictest: add support for async statements that block due to lock contention#79010
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:logic_test_hung_stmt

Conversation

@AlexTalks
Copy link
Copy Markdown
Contributor

@AlexTalks AlexTalks commented Mar 30, 2022

While previously each statement in a logictest was expected to complete
(with success or error), this change adds support for a new type of
statement in a logictest that is expected to block and be completed
later. This change enables the ability to test what happens during lock
contention, as in the case a statement is waiting on locks and does not
have a lock timeout set, the statement will block until it can proceed
(by obtaining the locks).

This feature is supported by the syntax:

statement async namedStatement ok
<some statement that will block as it cannot obtain locks>
...
<operations that would free the locks namedStatement is waiting on>
...
awaitstatement namedStatement
<wait on namedStatement to complete and validate that results are ok>

Release note: None

Release Justification: Testing change only.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@AlexTalks
Copy link
Copy Markdown
Contributor Author

FYI: The reason for the backport label is that this change is in service of additional test coverage for #77876 (intended for 22.1).

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation LGTM, but I'm a bit on the fence whether this feature makes sense in the context of the logic tests.

The image of logic tests in my head has always been that a single statement is executed fully and then you can optionally check whether the results match the provided expectation. In the past I think we've implemented tests that want to assert the locking via unit tests. However, I could see an argument that this change provides an infrastructure for writing those blocking tests easier. Curious what Andrei/Nathan think. I'm also curious what @cockroachdb/sql-queries think.

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)


pkg/sql/logictest/logic.go, line 992 at r1 (raw file):

	// expected rows affected count. -1 to avoid testing this.
	expectCount int64
	// if we expect this statement to hang, and become a pendingStatement

nit: missing periods here and below.


pkg/sql/logictest/logic.go, line 2478 at r1 (raw file):

			name := fields[1]

			if pending, ok := t.pendingStatements[name]; ok {

nit: I think we tend to short-circuit on !ok case and then proceed with the main happy case. I.e. something like

pending, ok := t.pendingStatements[name]
if !ok {
  return errors.New("")
}
// happy case

pkg/sql/logictest/logic.go, line 3129 at r1 (raw file):

	if stmt.expectHang {
		if _, ok := t.pendingStatements[stmt.statementName]; ok {
			return false, errors.Newf("pending statement with name \"%s\" already exists", stmt.statementName)

nit: you can use %q instead of "%s".


pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):

user testuser

finishstatement readReq

nit: I think the txn is still open for testuser.

Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. I'm curious to hear some other opinions too. I had thought about trying to add generic async/await modifiers (inspired by JavaScript/C#/etc) but this seemed like a more straightforward method of getting something working quickly with the logictest infrastructure.

We have some logictests that incorporate locking, however I don't believe we have any that capture the concept of "waiting". The tests I've seen that have transactions hold locks make sure that other operations that attempt to obtain those locks have timeouts, so that they fail after some short wait. This means that we aren't able to test what happens when we have actively waiting operations in logictests unfortunately.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, @nvanbenschoten, and @yuzefovich)


pkg/sql/logictest/logic.go, line 992 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: missing periods here and below.

Working - thanks


pkg/sql/logictest/logic.go, line 2478 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think we tend to short-circuit on !ok case and then proceed with the main happy case. I.e. something like

pending, ok := t.pendingStatements[name]
if !ok {
  return errors.New("")
}
// happy case

Working - sounds good


pkg/sql/logictest/logic.go, line 3129 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: you can use %q instead of "%s".

Working - thanks for the tip


pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think the txn is still open for testuser.

This one is actually ok, though, correct? Any open transactions are rolled back at the end of a logictest I think?

@nvb
Copy link
Copy Markdown
Contributor

nvb commented Mar 31, 2022

The image of logic tests in my head has always been that a single statement is executed fully and then you can optionally check whether the results match the provided expectation. In the past I think we've implemented tests that want to assert the locking via unit tests. However, I could see an argument that this change provides an infrastructure for writing those blocking tests easier. Curious what Andrei/Nathan think.

Alex and I talked through this. I don't see what's here as much of a deviation from the existing logic test infrastructure. We already support switching between multiple sessions and we also have some non-determinism in certain cases, which is why we have the retry directive. Allowing statements to block using an async/await model seems fine to me. It's actually how I would have preferred that we implemented locking-related testing in the past.

@AlexTalks AlexTalks force-pushed the logic_test_hung_stmt branch from aa72b3b to b4d7dff Compare March 31, 2022 23:14
Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nvanbenschoten. @yuzefovich, let me know if this seems ok to merge - I've modified the syntax a bit to use instead statement async and awaitstatement, and cleaned things up a bit.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @yuzefovich)

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good :lgtm:

Looks @michae2 was typing something, so it might be worth to wait a bit to give him a chance to share his feedback.

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)


pkg/sql/logictest/testdata/logic_test/cluster_locks, line 46 at r1 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

This one is actually ok, though, correct? Any open transactions are rolled back at the end of a logictest I think?

Hm, I don't know - tbh I don't think I've ever written a logic test that begins a txn 😅

To me it seems cleaner to close things explicitly in case we'll be appending to this test file in the future, but I don't have strong feelings about it (that's why it's a "nit"), so feel free to ignore this comment :)

@yuzefovich
Copy link
Copy Markdown
Member

Don't forget to update the PR description based on the latest commit message.

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! I really like this!

Something is bothering me, though. Do we check that the async statement is still blocking while executing the following statements? If I understand the implementation correctly, the async statement could finish at any time, including before the next statement is executed. That doesn't seem like much of a test if we want to verify that locks are actually blocking a SELECT FOR UPDATE until after the UPDATE transaction commits or aborts.

Maybe we could add something like:

select {
execRes := <-pending.resultChan:
  // not blocking any more, failure
default:
  // still blocking, yay, proceed
}

to all of the statement executions after the async statement?

(Or maybe not all of the statement executions... maybe only statements with another option like blocks namedStatement which would then cause this check to happen before execution.)

Then we could write tests like:

statement count 7
BEGIN; UPDATE t SET v = concat(v, '_updated') WHERE k >= 'b' and k < 'z'

user testuser

statement ok
BEGIN

statement async readReq ok
select * from t for update

user root

sleep 2 blocks readReq

statement ok blocks readReq
ROLLBACK

user testuser

awaitstatement readReq

which would give a little more assurance that the SELECT FOR UPDATE was actually blocking until the ROLLBACK.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michae2 I actually talked with @nvanbenschoten about this a bit (with a code example just like the one you gave!), but the problem is even with such a check we can't reason deterministically about where the async goroutine actually is in its execution... therefore we thought it might be better to leave it as is with a more naive approach, and validate that the locks are held by polling the crdb_internal.cluster_locks table (introduced in #77876) until we see that we have both held locks and waiters as expected. This will give us a more specific signal to test than just if the goroutine hasn't completed yet, as we can test for what we want to see - the waiting operation in the lock table.

This is also part of why I changed the terminology to async rather than hangs - the async statement may take some time, or it may complete quickly, the implementation doesn't care - but we can test for the other effects (like showing up in the lock table) that we are looking for anyway.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexTalks that makes sense. Checking cluster_locks sounds even better. Thanks for the response!

Then I only have one small nit: could we use "block" as the terminology in comments, rather than "hang"? 🙂

:lgtm_strong:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)

…ntention

While previously each statement in a logictest was expected to complete
(with success or error), this change adds support for a new type of
statement in a logictest that is expected to block and be completed
later. This change enables the ability to test what happens during lock
contention, as in the case a statement is waiting on locks and does not
have a lock timeout set, the statement will block until it can proceed
(by obtaining the locks).

This feature is supported by the syntax:
```
statement async namedStatement ok
<some statement that will block as it cannot obtain locks>
...
<operations that would free the locks namedStatement is waiting on>
...
awaitstatement namedStatement
<wait on namedStatement to complete and validate that results are ok>
```

Release note: None

Release Justification: Testing change only.
@AlexTalks AlexTalks force-pushed the logic_test_hung_stmt branch from b4d7dff to 12b1a01 Compare April 6, 2022 19:28
@AlexTalks
Copy link
Copy Markdown
Contributor Author

bors r+

@AlexTalks AlexTalks changed the title logictest: add support for statements that hang due to lock contention logictest: add support for async statements that block due to lock contention Apr 6, 2022
@AlexTalks
Copy link
Copy Markdown
Contributor Author

bors retry

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 6, 2022

Already running a review

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 6, 2022

Build succeeded:

@craig craig bot merged commit 301a424 into cockroachdb:master Apr 6, 2022
@AlexTalks AlexTalks deleted the logic_test_hung_stmt branch April 7, 2022 00:55
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 8, 2022
While async statements in logictests were introduced in cockroachdb#79010, in some
cases tests with async statements would fail due to a race condition.
This is fixed by updating the logic such that when an async statement is
issued, the goroutine running that statement is ensured to have started
before execution continues to other lines in the test.

Fixes cockroachdb#79565.

Release note: None
craig bot pushed a commit that referenced this pull request Apr 8, 2022
79628: logictest: deflake tests with async statements r=AlexTalks a=AlexTalks

While async statements in logictests were introduced in #79010, in some
cases tests with async statements would fail due to a race condition.
This is fixed by updating the logic such that when an async statement is
issued, the goroutine running that statement is ensured to have started
before execution continues to other lines in the test.

Fixes #79565.

Release note: None

Release Justification: Testing change only.

Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
AlexTalks added a commit that referenced this pull request Apr 8, 2022
While async statements in logictests were introduced in #79010, in some
cases tests with async statements would fail due to a race condition.
This is fixed by updating the logic such that when an async statement is
issued, the goroutine running that statement is ensured to have started
before execution continues to other lines in the test.

Fixes #79565.

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 13, 2022
This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in cockroachdb#79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag. This change also fixes a
bug with asynchronous statements that could cause a race condition in
testing.

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 14, 2022
This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in cockroachdb#79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag.

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 14, 2022
This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in cockroachdb#79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag.

Release note: None
craig bot pushed a commit that referenced this pull request Apr 14, 2022
79871: logictest: support async queries in logictests r=AlexTalks a=AlexTalks

This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in #79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag. This change also fixes a
bug with asynchronous statements that could cause a race condition in
testing.

Release note: None

Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 15, 2022
This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in cockroachdb#79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag.

Release note: None
craig bot pushed a commit that referenced this pull request Apr 15, 2022
79871: logictest: support async queries in logictests r=AlexTalks a=AlexTalks

This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in #79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag. This change also fixes a
bug with asynchronous statements that could cause a race condition in
testing.

Release note: None

Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Apr 21, 2022
This change adds support for asynchronous queries in logictests, which
will execute the query in a separate goroutine, prior to awaiting and
validating the results when unblocked by any contending operations.
While previously logictests had supported asynchronous statements as
introduced in cockroachdb#79010, the `query` execution, which returns rows to be
validated, is now supported in asynchronous mode with this change.

This feature is supported with the following syntax:
```
query <typestring> async,<options...> namedQuery
<some query that may block during execution>
----
<expected query results>
...
<operations that would free the locks namedQuery is waiting on>
...
awaitquery namedQuery
<waits on namedQuery to complete and validate the results>
```

While this feature is not supported in conjunction with some query
options such as `retry`, or when placed in a `repeat` block, it does
support being run with the `--rewrite` flag.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants