Skip to content

fix: stabilize flaky TestControlRequestFeatureEnabled#915

Merged
mattisonchao merged 4 commits intomainfrom
fix/flaky-test-control-request-feature-enabled
Feb 27, 2026
Merged

fix: stabilize flaky TestControlRequestFeatureEnabled#915
mattisonchao merged 4 commits intomainfrom
fix/flaky-test-control-request-feature-enabled

Conversation

@mattisonchao
Copy link
Copy Markdown
Member

@mattisonchao mattisonchao commented Feb 26, 2026

Summary

  • Fix flaky TestControlRequestFeatureEnabled that intermittently fails with expected: int(1), actual: int64(2)
  • The test snapshot-checked follower commit offsets and asserted they were exactly 1 behind the leader, but async replication timing can cause the gap to vary
  • Replace the strict point-in-time equality check with assert.Eventually that waits for followers to catch up to within 1 offset of the leader

Test plan

  • go vet ./tests/control/... passes
  • CI passes (this was the only flaky test in the prior run)

@mattisonchao mattisonchao changed the title fix: stabilize flaky TestControlRequestFeatureEnabled fix: stabilize flaky TestControlRequestFeatureEnabled and TestLeaderHint tests Feb 26, 2026
@mattisonchao mattisonchao force-pushed the fix/flaky-test-control-request-feature-enabled branch from d8ff28d to 0bc4ffb Compare February 26, 2026 05:49
@mattisonchao mattisonchao changed the title fix: stabilize flaky TestControlRequestFeatureEnabled and TestLeaderHint tests fix: stabilize flaky TestControlRequestFeatureEnabled Feb 26, 2026
The test had two flaky assertions:
1. Strict equality check that followers are exactly 1 commit offset
   behind the leader - replaced with assert.Eventually that waits for
   followers to catch up
2. Point-in-time checksum comparison across replicas without waiting
   for follower replication to complete - replaced with assert.Eventually
   that waits for follower checksums to match the leader

Also increased assert.Eventually timeouts from 10s to 30s for
robustness under CI load.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mattisonchao mattisonchao force-pushed the fix/flaky-test-control-request-feature-enabled branch from 0bc4ffb to e9dc144 Compare February 26, 2026 08:55
mattisonchao and others added 3 commits February 27, 2026 01:18
The test was flaky because commit notifications are piggybacked on
replication messages, so followers never learn the last committed
entry's status without a subsequent write.

- Write all keys before checking feature enabled (flushes the
  feature-enable commit notification to followers)
- Record leader checksum, then write a flush entry to propagate
  the commit notification for the last key
- Wait for followers to reach the recorded commit offset
- Compare checksums with assert.Equal for clear error messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mattisonchao mattisonchao self-assigned this Feb 27, 2026
@mattisonchao mattisonchao merged commit 9264a27 into main Feb 27, 2026
9 checks passed
@mattisonchao mattisonchao deleted the fix/flaky-test-control-request-feature-enabled branch February 27, 2026 05:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant