roachtest: enable txn heartbeat loops for 1PC txns in kv/contention/nodes=4 #45568
andreimatei
left a comment
LGTM
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/cmd/roachtest/kv.go, line 225 at r1 (raw file):
// Drop the deadlock detection delay because the test creates a
// large number transaction deadlocks.
if _, err := conn.Exec(`
I think there's a way to check the cluster version instead of swallowing failures
nvb
left a comment
TFTRs!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/cmd/roachtest/kv.go, line 225 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I think there's a way to check the cluster version instead of swallowing failures
Yeah, that's another option. We already have this pattern all over the place though and I don't want to get into the game of guessing what the right cluster version is, so I'm going to stick with this.
Canceled (will resume)
Build failed (retrying...)
bors r+
9122438 to 898383f
bors r+
Build failed (retrying...)
Bors crashed because of my PR, so: bors r+
Already running a review
Build succeeded
I noticed when debugging issues in #45482 that unhandled deadlocks occasionally
resolved themselves because txns would eventually time out. This was because
we don't start the txn heartbeat loop for 1PC txns. In this kind of test, we
want any unhandled deadlocks to be as loud as possible, so just like we set
a very long txn expiration, we also enable the txn heartbeat loop for all
txns, even those that we expect will be 1PC.
This commit also lowers kv.lock_table.deadlock_detection_push_delay for the
test, since the change is already touching this code.