roachtest: deflake+improve replicagc-changed-peers test #51394
craig[bot] merged 1 commit into cockroachdb:master
tbg left a comment:
Thank you!
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif)
pkg/cmd/roachtest/replicagc.go, line 38 at r1 (raw file):
While you have this paged in, could you add a comment with a synopsis here? Something like a (better version of)
// Checks that when a node has all of its replicas taken away in absentia and restarts without being able to talk to any of its old peers, it will still replicaGC its replicas quickly.
pkg/cmd/roachtest/replicagc.go, line 69 at r1 (raw file):
}
t.Status("waiting for zero replicas on n1 and n2")
remove "and n2"
pkg/cmd/roachtest/replicagc.go, line 81 at r1 (raw file):
// attribute. We'll later start n3 using this attribute to test GC replica
// count.
h.isolateDeadNodes(ctx, 4)
// run this on n4 (it's live, that's all that matters)
Fixes cockroachdb#51097.
Fixes cockroachdb#51367.

This is fallout from cockroachdb#50329: this test previously attempted to recommission a fully decommissioned node. It seems we relied on the decomm/recomm subsystems to trigger the replica GC operations that the test was then asserting on. It suffices to simply mark the nodes as decommissioning instead of fully decommissioning them. While here, I've rewritten this test in the more stateful style of the `decommission-recommission` roachtest.

Release note: None
Force-pushed from 207e79c to d1521dd.
TFTR! bors r=tbg

bors r+
Build failed

Flaked on #51263.

bors retry

bors r+
Build failed

bors r+
Build failed

This is getting a bit ridiculous. Flaked on #51331. bors r+
Build failed

Flaked on #51263. bors r+
Build succeeded

Only took 15 hours to get merged, woot.