Skip to content

feat: refactor follower controller for better lifecycle management#887

Merged
mattisonchao merged 17 commits intomainfrom
feat.refactor.lifecycle
Feb 22, 2026
Merged

feat: refactor follower controller for better lifecycle management#887
mattisonchao merged 17 commits intomainfrom
feat.refactor.lifecycle

Conversation

@mattisonchao
Copy link
Copy Markdown
Member

@mattisonchao mattisonchao commented Feb 20, 2026

Motivation

The follower controller had fragile lifecycle management with race conditions during shutdown and tightly coupled synchronization logic.

Modification

  • Refactor followerController to use atomic state and sync.RWMutex with double-check locking for safe concurrent access.
  • Extract LogSynchronizer into its own file for cleaner replication stream handling.
  • Introduce dserror package for domain errors; convert to gRPC status errors at the server boundary.
  • Make NewTerm idempotent during leader negotiations.
  • Fix InstallSnapshot to recover the database on failure and validate term before destructive operations.
  • Make MockServerReplicateStream context-aware to prevent goroutine leaks in tests.

mattisonchao and others added 4 commits February 12, 2026 15:45
…liability

- Add context-aware Send/Recv to MockServerReplicateStream with Cancel()
  to prevent goroutine leaks in tests
- Add double-check pattern on closed state after acquiring locks in
  follower controller to prevent race conditions during shutdown
- Separate LogSynchronizer.Sync() and Close() into distinct operations,
  adding SyncAndClose() for the AppendEntries call path
- Migrate error returns from constant.ErrAlreadyClosed/ErrInvalidTerm/
  ErrInvalidStatus to dserror equivalents for consistent error handling
- Allow idempotent NewTerm calls during negotiations (term == current is
  no longer rejected)
- Fix InstallSnapshot to assign db before SendAndClose to avoid using a
  nil db reference on success path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mattisonchao
Copy link
Copy Markdown
Member Author

Code Review Analysis

Critical

1. Potential deadlock between NewTerm and becomeFollower callback
follower_controller.go — The becomeFollower callback (line 275-279) acquires fc.rwMutex.Lock() inside the bgAppender goroutine. Meanwhile, NewTerm() acquires fc.rwMutex.Lock() (line 297), then calls fc.logSynchronizer.Close() which calls ls.waitGroup.Wait(). If bgAppender is inside becomeFollower() waiting for the lock, and NewTerm holds the lock waiting for bgAppender to finish — deadlock.

2. Race condition: term check before lock in NewTerm allows term regression
follower_controller.go:293-297 — The term comparison newTerm < fc.term.Load() happens before acquiring rwMutex. Between the check and the lock, another concurrent NewTerm could set a higher term. The stale check then allows overwriting with a lower term, violating the monotonically-increasing term invariant.

High

3. Delete() uses WAL and DB after Close() already closed them
follower_controller.go:643-652Close() on line 643 closes fc.wal and fc.db, then lines 651-652 call fc.wal.Delete() and fc.db.Delete() on the already-closed resources. This could panic or silently fail to delete data.

4. initDatabase silently ignores ReadTerm error
follower_controller.go:125 — The error from db.ReadTerm() is never checked. The := shadows the named return err, so execution continues with potentially undefined term/options values.

Medium

5. Unbuffered signal channels cause lost notifications
follower_controller.go:192stateApplierCond is created with zero buffer. Signals sent via PushNoBlock are dropped when the receiver is busy. Same issue with syncCond in log_synchronizer.go:166. This can delay entry application or WAL sync indefinitely if no new appends arrive.

6. InstallSnapshot holds write lock during entire stream receive
follower_controller.go:483-484 — The lock is held for the entire snapshot transfer including all stream.Recv() calls, blocking GetStatus, NewTerm, Truncate, and AppendEntries for potentially minutes.

Low

7. InstallSnapshot does not update fc.status on success — Status remains whatever it was before (e.g. FENCED), which may prevent accepting append entries without an additional NewTerm cycle.

8. Test TestFollower_DisconnectLeader reuses an already-cancelled stream (line 851-854) — The test passes for the wrong reason since the stream context is already cancelled before AppendEntries is called.

…m and onAppend

- Change fc.status from proto.ServingStatus to atomic.Int32 to allow
  lock-free status updates from the LogSynchronizer's onAppend callback
- Rename becomeFollower to onAppend in LogSynchronizer
- Simplify Status() and GetStatus() to read directly from atomic
@mattisonchao mattisonchao self-assigned this Feb 22, 2026
- Protect fc.db access in processCommitRequest with RLock to prevent
  race with InstallSnapshot swapping the database
- Recover database on failed snapshot install instead of leaving
  follower in a broken state with closed db
- Check ReadTerm error in initDatabase (was silently ignored due to
  variable shadowing)
- Reorganize struct fields with clear documentation of locking
  requirements: invariants, atomic state, and guarded resources
mattisonchao and others added 8 commits February 22, 2026 15:12
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The follower controller now returns domain errors (dserror package) instead
of gRPC status errors. Add toGRPCError conversion at the gRPC handler
boundary so remote callers (e.g. coordinator) can still match on expected
gRPC error codes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Make stateApplierCond a buffered channel (size 1) to prevent missed signals
- Add rwMutex.RLock in test helper to avoid data race on logSynchronizer access
- Remove stale comment in log_synchronizer append path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The syncCond channel between bgAppender and bgSyncer was unbuffered.
Combined with PushNoBlock (non-blocking send), if bgSyncer hadn't
reached its select when bgAppender pushed, the signal was silently
dropped, causing bgSyncer to never wake up and the test to hang.

This is the same class of bug fixed in the previous commit for
stateApplierCond.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move UpdateTerm and atomic offset stores before SendAndClose to prevent
term loss if the node crashes between sending the response and persisting
the term.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The snapshot DB already contains the leader's persisted term and
TermOptions. Reading them via initDatabase is sufficient. The explicit
UpdateTerm call was overwriting the leader's TermOptions (including
notification settings) with empty defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mattisonchao
Copy link
Copy Markdown
Member Author

Updated review after detailed analysis against main:

Items from the original review — status:

  1. Deadlock between NewTerm and onAppend — Fixed in 4883d79 (status is now atomic, no lock in callback).
  2. Term regression race in NewTerm — Fixed in 06c2cb8 (double-check after lock acquisition).
  3. Delete() uses WAL/DB after Close() — Same pattern as main (original also closed before deleting via saved references). Acceptable.
  4. initDatabase ignores ReadTerm error — Not an issue, error is checked on line 128.
  5. Unbuffered signal channels — Fixed in 224bfc1 and c8d6299 (channels are now buffered).
  6. InstallSnapshot holds lock during stream — By design; same as main which held sync.Mutex throughout handleSnapshot.
  7. InstallSnapshot doesn't update status — Acceptable; coordinator always calls NewTerm after snapshot.
  8. Test reuses cancelled stream — Fixed in test changes (new stream created after NewTerm).

Two regressions found and fixed:

  • InstallSnapshot was persisting term after sending ack — crash could lose term (037bd6d).
  • InstallSnapshot was overwriting TermOptions with empty defaults — removed redundant UpdateTerm since snapshot DB already contains the leader's term and options (54ea495).

Update prepareTestDb to store the term in the snapshot database,
reflecting that real leader snapshots already contain the persisted term.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mattisonchao mattisonchao merged commit 1724010 into main Feb 22, 2026
9 checks passed
@mattisonchao mattisonchao deleted the feat.refactor.lifecycle branch February 22, 2026 14:12
mattisonchao added a commit that referenced this pull request Feb 23, 2026
…se (#893)

### Motivation

`TestCoordinatorE2E` is flaky due to a data race between the WAL
`runSync` goroutine calling `segment.Flush()` (which reads the mmap) and
`wal.Close()` calling `segment.Close()` → `MMap.Unmap()` (which
writes/invalidates the mmap). The race was latent in the WAL code and
surfaced after the follower controller lifecycle refactor in #887
changed shutdown timing.

CI failure:
https://github.com/oxia-db/oxia/actions/runs/22312993235/job/64549930899

### Modification

- Add `RLock`/`RUnlock` to `readWriteSegment.Flush()` to synchronize
against the write lock already held by `Close()`. This ensures either
`Flush()` completes before `Unmap()` begins, or `Unmap()` completes
first and `Flush()` sees `txnMappedFile == nil` and returns safely.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant