storage: fix incorrect LookupRange error causing TestSplitSnapshotRace_SplitWins flakiness#8988
Conversation
storage/stores.go
Outdated
| repDesc, err = replica.GetReplicaDescriptor() | ||
| if err != nil { | ||
| if _, ok := err.(*roachpb.RangeNotFoundError); ok { | ||
| // Ignore this replica. |
|
Reviewed 1 of 1 files at r1, 2 of 2 files at r2. storage/client_test.go, line 398 [r2] (raw file):
this is test code - feel free to remove the V(2) storage/stores.go, line 234 [r2] (raw file):
can you move this up to where storage/stores.go, line 237 [r2] (raw file):
This looks good to me. Can you write a focused test for it? storage/stores_test.go, line 205 [r1] (raw file):
you need to log the expected error here - it is no longer evident from the code. also, every one of these calls to Comments from Reviewable |
a9b4f1b to
284396a
Compare
|
Review status: 0 of 3 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending. storage/stores.go, line 221 [r2] (raw file):
|
|
Reviewed 3 of 3 files at r3, 3 of 3 files at r4. storage/stores.go, line 223 [r4] (raw file):
"range could have gone away" isn't very descriptive. specifically, this would only happen if this replica was removed from the range via down-replication. Comments from Reviewable |
Rewrite part of TestStoresLookupReplica to use a table.
284396a to
6d06aaf
Compare
|
Reviewed 3 of 3 files at r5, 3 of 3 files at r6. Comments from Reviewable |
…e_SplitWins flakiness The test occasionally times out under `stressrace`. The culprit is a condition where the first replica doesn't find the range and incorrectly returns RangeKeyMismatchError. The request is not retried on other replicas, but is then retried endlessly by dist_sender. The fix is to return a `RangeNotFound` in this case. Rewriting the code in the function a bit to make it more readable. Fixes cockroachdb#8868.
6d06aaf to
b8b8173
Compare
|
Review status: 2 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending. storage/stores.go, line 223 [r4] (raw file):
|
|
Reviewed 2 of 3 files at r6, 1 of 1 files at r7. Comments from Reviewable |
The test occasionally times out under
stressrace. The culprit is a conditionwhere the first replica doesn't find the range and incorrectly returns
RangeKeyMismatchError. The request is not retried on other replicas, but is then
retried endlessly by dist_sender.
The fix is to return a
RangeNotFoundin this case. Rewriting the code in thefunction a bit to make it more readable.
Fixes #8868.
This change is