Skip to content

Rewrite the faster failover test case#3495

Merged
enjoy-binbin merged 2 commits into
valkey-io:unstablefrom
enjoy-binbin:failover_test
Apr 16, 2026
Merged

Rewrite the faster failover test case#3495
enjoy-binbin merged 2 commits into
valkey-io:unstablefrom
enjoy-binbin:failover_test

Conversation

@enjoy-binbin

Copy link
Copy Markdown
Member

The test introduced in #2227 is fragile because it unconditionally asserted
that "best ranked replica" would be logged for every shard's replica. This
assumption was incorrect for two reasons:

  1. myselfIsBestRankedReplica() requires failover_failed_primary_rank == 0.
    When two primaries fail simultaneously, only the shard with the lowest
    shard_id gets failed_primary_rank == 0. The other shard's replicas can
    only trigger "Myself become the best ranked replica" after the first
    shard completes failover and the result propagates via gossip.

  2. clusterAllReplicasThinkPrimaryIsFail() requires all sibling replicas
    to have their MY_PRIMARY_FAIL flag propagated via gossip. If the
    election delay is shorter than the gossip propagation time, the
    replica may start the election before receiving the flag.

It's also too strict and checks too many things. Like we can check that there
is no delay before starting the failover, but we shouldn't check that the rank
0 replica actually wins, because maybe it doesn't. Also we're not sure the other
replica does a psync afterwards, it can also do a full sync.

Restructure the test into three focused scenarios with retry logic:

  • Test 0 (3 primaries + 1 replica): The sole replica is deterministically
    the best ranked replica. All conditions are trivially satisfied.

  • Test 1 (3 primaries + 2 replicas): Single primary failure with two
    competing replicas. failed_primary_rank is deterministically 0, but
    MY_PRIMARY_FAIL gossip timing may prevent triggering in some runs.
    Uses internal retry with cluster recovery to ensure at least one
    successful trigger.

  • Test 2 (5 primaries + 7 replicas): Two primaries failure. Verifies no
    failover timeout and uses retry to cover the "best ranked replica"
    path for at least one shard.

The test introduced in valkey-io#2227 is fragile because it unconditionally asserted
that "best ranked replica" would be logged for every shard's replica. This
assumption was incorrect for two reasons:

1. myselfIsBestRankedReplica() requires failover_failed_primary_rank == 0.
   When two primaries fail simultaneously, only the shard with the lowest
   shard_id gets failed_primary_rank == 0. The other shard's replicas can
   only trigger "Myself become the best ranked replica" after the first
   shard completes failover and the result propagates via gossip.

2. clusterAllReplicasThinkPrimaryIsFail() requires all sibling replicas
   to have their MY_PRIMARY_FAIL flag propagated via gossip. If the
   election delay is shorter than the gossip propagation time, the
   replica may start the election before receiving the flag.

It's also too strict and checks too many things. Like we can check that there
is no delay before starting the failover, but we shouldn't check that the rank
0 replica actually wins, because maybe it doesn't. Also we're not sure the other
replica does a psync afterwards, it can also do a full sync.

Restructure the test into three focused scenarios with retry logic:

- Test 0 (3 primaries + 1 replica): The sole replica is deterministically
  the best ranked replica. All conditions are trivially satisfied.

- Test 1 (3 primaries + 2 replicas): Single primary failure with two
  competing replicas. failed_primary_rank is deterministically 0, but
  MY_PRIMARY_FAIL gossip timing may prevent triggering in some runs.
  Uses internal retry with cluster recovery to ensure at least one
  successful trigger.

- Test 2 (5 primaries + 7 replicas): Two primaries failure. Verifies no
  failover timeout and uses retry to cover the "best ranked replica"
  path for at least one shard.

Signed-off-by: Binbin <binloveplay1314@qq.com>
Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin

Copy link
Copy Markdown
Member Author

@codecov

codecov Bot commented Apr 13, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.53%. Comparing base (2871efd) to head (0108813).
⚠️ Report is 7 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3495      +/-   ##
============================================
- Coverage     76.80%   76.53%   -0.27%     
============================================
  Files           157      157              
  Lines         79048    79048              
============================================
- Hits          60711    60499     -212     
- Misses        18337    18549     +212     

see 14 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@zuiderkwast zuiderkwast added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Apr 13, 2026

@zuiderkwast zuiderkwast left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very advanced test cases! Looks definitely more robust that before.

@enjoy-binbin enjoy-binbin merged commit 087280e into valkey-io:unstable Apr 16, 2026
99 of 100 checks passed
@github-project-automation github-project-automation Bot moved this to To be backported in Valkey 9.1 Apr 16, 2026
@enjoy-binbin enjoy-binbin deleted the failover_test branch April 16, 2026 11:15
sarthakaggarwal97 pushed a commit to sarthakaggarwal97/valkey that referenced this pull request Apr 16, 2026
The test introduced in valkey-io#2227 is fragile because it unconditionally asserted
that "best ranked replica" would be logged for every shard's replica. This
assumption was incorrect for two reasons:

1. myselfIsBestRankedReplica() requires failover_failed_primary_rank == 0.
   When two primaries fail simultaneously, only the shard with the lowest
   shard_id gets failed_primary_rank == 0. The other shard's replicas can
   only trigger "Myself become the best ranked replica" after the first
   shard completes failover and the result propagates via gossip.

2. clusterAllReplicasThinkPrimaryIsFail() requires all sibling replicas
   to have their MY_PRIMARY_FAIL flag propagated via gossip. If the
   election delay is shorter than the gossip propagation time, the
   replica may start the election before receiving the flag.

It's also too strict and checks too many things. Like we can check that there
is no delay before starting the failover, but we shouldn't check that the rank
0 replica actually wins, because maybe it doesn't. Also we're not sure the other
replica does a psync afterwards, it can also do a full sync.

Restructure the test into three focused scenarios with retry logic:

- Test 0 (3 primaries + 1 replica): The sole replica is deterministically
  the best ranked replica. All conditions are trivially satisfied.

- Test 1 (3 primaries + 2 replicas): Single primary failure with two
  competing replicas. failed_primary_rank is deterministically 0, but
  MY_PRIMARY_FAIL gossip timing may prevent triggering in some runs.
  Uses internal retry with cluster recovery to ensure at least one
  successful trigger.

- Test 2 (5 primaries + 7 replicas): Two primaries failure. Verifies no
  failover timeout and uses retry to cover the "best ranked replica"
  path for at least one shard.

Signed-off-by: Binbin <binloveplay1314@qq.com>
madolson pushed a commit that referenced this pull request Apr 27, 2026
The test introduced in #2227 is fragile because it unconditionally asserted
that "best ranked replica" would be logged for every shard's replica. This
assumption was incorrect for two reasons:

1. myselfIsBestRankedReplica() requires failover_failed_primary_rank == 0.
   When two primaries fail simultaneously, only the shard with the lowest
   shard_id gets failed_primary_rank == 0. The other shard's replicas can
   only trigger "Myself become the best ranked replica" after the first
   shard completes failover and the result propagates via gossip.

2. clusterAllReplicasThinkPrimaryIsFail() requires all sibling replicas
   to have their MY_PRIMARY_FAIL flag propagated via gossip. If the
   election delay is shorter than the gossip propagation time, the
   replica may start the election before receiving the flag.

It's also too strict and checks too many things. Like we can check that there
is no delay before starting the failover, but we shouldn't check that the rank
0 replica actually wins, because maybe it doesn't. Also we're not sure the other
replica does a psync afterwards, it can also do a full sync.

Restructure the test into three focused scenarios with retry logic:

- Test 0 (3 primaries + 1 replica): The sole replica is deterministically
  the best ranked replica. All conditions are trivially satisfied.

- Test 1 (3 primaries + 2 replicas): Single primary failure with two
  competing replicas. failed_primary_rank is deterministically 0, but
  MY_PRIMARY_FAIL gossip timing may prevent triggering in some runs.
  Uses internal retry with cluster recovery to ensure at least one
  successful trigger.

- Test 2 (5 primaries + 7 replicas): Two primaries failure. Verifies no
  failover timeout and uses retry to cover the "best ranked replica"
  path for at least one shard.

Signed-off-by: Binbin <binloveplay1314@qq.com>
@enjoy-binbin enjoy-binbin moved this from To be backported to Done in Valkey 9.1 May 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants