Reduce execution time for replica migration tests#4006
Conversation
📝 WalkthroughWalkthroughThree cluster replica-migration test procedures ( ChangesReplica migration test updates
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Looks @enjoy-binbin is the author of this test. Can you take a look please? Thanks. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/cluster/replica-migration.tcl`:
- Around line 30-34: The test writes a 10KB large value to key_977613 in slot 0
on primary 3, and when slot 0 is migrated to primary 0, this large payload
inflates server 4's replication offset, which weakens the regression check. To
fix this, keep the large padding write to generate the initial large
repl_offset, but then overwrite key_977613 with a small final value before the
slot migration/rebalancing occurs. This ensures the offset padding is
established without the migrated value itself being large. Apply the same fix to
the similar code at lines 110-112 where the large value write pattern is
repeated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 8777d27d-edaf-4ff5-9847-727d4c35521f
📒 Files selected for processing (1)
tests/unit/cluster/replica-migration.tcl
I see the test fail in "failover not happen", did you know the reason? |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #4006 +/- ##
============================================
+ Coverage 76.75% 76.77% +0.02%
============================================
Files 162 162
Lines 81017 81017
============================================
+ Hits 62187 62204 +17
+ Misses 18830 18813 -17 🚀 New features to boost your workflow:
|
No, my hunch is something to do with long running time. I have done 20+ runs and have not encountered any failure. (finger crossed) this fixes the flakiness as well. |
Signed-off-by: Sushil Paneru <sushil.paneru1@gmail.com>
5273e4e to
75857af
Compare
Replace ~10k individual INCR round-trips with single SET commands using large values. And set shutdown-timeout 0 before shutting down the node. Signed-off-by: Sushil Paneru <sushil.paneru1@gmail.com>
Description
The test "Migrated replica reports zero repl offset and rank, and fails to win election - shutdown" takes upto 25s. This PR reduces it to ~4s while maintaining the same coverage.
This might end up closing #3992 as well
From https://github.com/valkey-io/valkey/actions/runs/27586402593/job/81557596685
After the fix
Update - Applied similar fix to other tests as well resulting
Fix