@nvanbenschoten, the `RETRY_ASYNC_WRITE_FAILURE` error I was telling you about is very easy to repro on master, so I'm thinking maybe it's worth verifying that it indeed happens because of a lease transfer. Maybe there's something else going on too (based on nothing but the frequency).
I'm running something like `bin/roachtest run interleavedpartitioned --count 100`, reusing a cluster (so `--cluster <name>`), and more often than not the test fails within 20s or so. It looks like if it doesn't fail in the beginning, it won't fail later on either.
I've also changed the duration below to `30s` to speed up the process:

    duration := " --duration " + ifLocal("10s", "10m")

What do you think? Feel free to close or pass back if you're not interested.
I've separately opened #28873 complaining that the test doesn't do retries.
The `duration` line quoted above is at `cockroach/pkg/cmd/roachtest/interleavedpartitioned.go`, line 62 in 2e905a0.