Comparing changes

base repository: rabbitmq/ra
base: v3.0.0
head repository: rabbitmq/ra
compare: v3.0.1
  • 9 commits
  • 12 files changed
  • 1 contributor

Commits on Mar 10, 2026

  1. Recover corrupt snapshot indexes file from machine state

    The indexes file is intentionally not fsynced, so corruption after a
    crash is expected. Previously this caused a badmatch crash in
    find_snapshots during init. Now ra_snapshot carries the machine config
    and can recover live indexes by reading the snapshot and calling
    ra_machine:live_indexes/2, the same approach used in complete_accept.
    kjnilsson committed Mar 10, 2026
    adf11ab
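
The fallback described in this commit can be sketched as follows. This is a minimal Python model, not ra's Erlang code: the JSON file format, the `read_live_indexes` name and the `recompute` callback are all assumptions made for illustration. The callback stands in for reading the snapshot and calling ra_machine:live_indexes/2.

```python
import json

def read_live_indexes(indexes_path, recompute):
    """Return the live indexes stored at indexes_path.

    If the file is missing or does not parse (expected after a crash,
    since the file is not fsynced), fall back to recompute(), a
    hypothetical callback that rebuilds the indexes from machine state.
    """
    try:
        with open(indexes_path, "r", encoding="utf-8") as f:
            data = json.load(f)
        # Treat any unexpected shape as corruption rather than crashing.
        if not isinstance(data, list) or not all(isinstance(i, int) for i in data):
            raise ValueError("unexpected indexes file contents")
        return data
    except (OSError, ValueError):
        # JSON and Unicode decode errors are subclasses of ValueError.
        return recompute()
```

The point of the design is that a derived, unsynced file is treated as a cache: a read failure triggers recomputation instead of a pattern-match crash during init.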
  2. Fix segment deletion during init after dual WAL flush

    When a crash occurs after the segment writer flushes a WAL file to
    segments but before the WAL file is deleted, recovery replays the
    same WAL creating segments that overlap with those from the first
    flush. compact_segrefs correctly handles this by truncating the
    range of partially overlapping segment refs. However, the deletion
    logic used the -- operator which compares full {Filename, Range}
    tuples. A segment whose range was truncated (but not removed) no
    longer matched its original ref, so it appeared in the diff and
    was deleted even though the reader still referenced it. The
    subsequent fold during state machine recovery then crashed with
    ra_log_failed_to_open_segment enoent.
    
    Compare by filename only when deciding which segments to delete,
    so that segments still referenced by the reader (even with a
    truncated range) are preserved.
    kjnilsson committed Mar 10, 2026
    2364616
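
The difference between the two comparisons can be modelled in a few lines. This is a Python sketch, not the Erlang from the commit: the function names, segment filenames and ranges are hypothetical, and the list comprehension only approximates Erlang's -- operator (which removes one occurrence per element).

```python
def segments_to_delete_by_ref(before, after):
    """Old behaviour: diff on full (filename, range) tuples."""
    return [ref for ref in before if ref not in after]

def segments_to_delete_by_filename(before, after):
    """Fixed behaviour: delete a file only if no remaining ref names it."""
    kept = {filename for filename, _range in after}
    return [ref for ref in before if ref[0] not in kept]

# Hypothetical segment refs before compaction: two overlapping segments
# produced by flushing the same WAL file twice.
before = [("0001.segment", (1, 100)), ("0002.segment", (50, 150))]
# After compaction the first ref is truncated but still referenced.
after = [("0001.segment", (1, 49)), ("0002.segment", (50, 150))]

# The tuple diff wrongly selects the truncated segment for deletion.
assert segments_to_delete_by_ref(before, after) == [("0001.segment", (1, 100))]
# Comparing by filename keeps it.
assert segments_to_delete_by_filename(before, after) == []
```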
  3. Send RPCs to snapshot_backoff peers when leader enforces leadership

    When a leader receives a pre_vote_rpc from a follower with a stale
    term, make_all_rpcs now includes peers in snapshot_backoff status
    alongside normal peers. This ensures the lagging follower that
    triggered the pre-vote gets its snapshot expeditiously rather than
    waiting for the backoff timer to fire. The pending backoff timer is
    cancelled via a new cancel_snapshot_retry_timer effect before the
    RPC is sent.
    kjnilsson committed Mar 10, 2026
    e1256f8
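
A rough model of the peer selection described in this commit, written in Python. The peer map layout, the `include_backoff` flag and the shape of the effect tuple are assumptions for illustration; only the names make_all_rpcs, snapshot_backoff and cancel_snapshot_retry_timer come from the commit message.

```python
def make_all_rpcs(peers, include_backoff=False):
    """Return (peer ids to send an RPC to, effects to run first).

    peers maps a peer id to a dict with a "status" key. When
    include_backoff is set, peers waiting on a snapshot retry timer are
    included, and a cancel effect is emitted for each of them.
    """
    targets, effects = [], []
    for peer_id, peer in peers.items():
        if peer["status"] == "normal":
            targets.append(peer_id)
        elif peer["status"] == "snapshot_backoff" and include_backoff:
            # Cancel the pending timer so it cannot fire after this RPC.
            effects.append(("cancel_snapshot_retry_timer", peer_id))
            targets.append(peer_id)
    return targets, effects

peers = {
    "a": {"status": "normal"},
    "b": {"status": "snapshot_backoff"},
}
assert make_all_rpcs(peers) == (["a"], [])
assert make_all_rpcs(peers, include_backoff=True) == (
    ["a", "b"],
    [("cancel_snapshot_retry_timer", "b")],
)
```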
  4. Fix WAL recovery crash when segment writer deletes mem table entries concurrently
    
    During multi-file WAL recovery after a power-off, the segment writer
    processes mem tables from earlier WAL files asynchronously. When servers
    have no Pid (normal during recovery), the segment writer deletes entries
    directly from the mem table ETS. If this deletion races with recovery of
    the next WAL file, recover_entry calls mem_table_please which re-scans
    the (now partially depleted) ETS table. The resulting ra_mt state has a
    LastSeq that no longer matches the PrevIdx tracked in the writers map,
    causing ra_mt:insert_sparse to return {error, gap_detected} — an
    unhandled case_clause in recover_entry that crashes the node at boot.
    
    Fix by carrying the Tables map across WAL files in the recovery fold,
    alongside the already-carried Writers map. This way recover_entry reuses
    the ra_mt state it built during earlier file recovery rather than
    re-scanning a potentially mutated ETS table.
    
    Made-with: Cursor
    kjnilsson committed Mar 10, 2026
    9aa04f1
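
The race and the fix can be illustrated with a small simulation. This is a heavily simplified Python model, not ra's code: a dict of sets stands in for the mem table ETS tables, `rebuild_from_shared` stands in for the re-scan, and a callback simulates the segment writer deleting entries between WAL files. All names are hypothetical.

```python
def rebuild_from_shared(indexes):
    """Derive table state by re-scanning the shared, mutable table."""
    return {"last": max(indexes) if indexes else None}

def recover(files, shared, carry_tables, mutate_between=None):
    """Fold over WAL files, each a list of (uid, index) entries."""
    writers = {}  # uid -> previous index, always carried across files
    tables = {}   # uid -> table state built during recovery
    for n, entries in enumerate(files):
        if not carry_tables:
            tables = {}  # old behaviour: table state dropped per file
        if n > 0 and mutate_between:
            mutate_between(shared)  # models the concurrent deletion
        for uid, idx in entries:
            mt = tables.get(uid)
            if mt is None:
                mt = rebuild_from_shared(shared.setdefault(uid, set()))
            if mt["last"] != writers.get(uid):
                return ("error", "gap_detected")
            shared[uid].add(idx)
            tables[uid] = {"last": idx}
            writers[uid] = idx
    return ("ok", writers)

files = [[("a", 1), ("a", 2)], [("a", 3)]]

def delete_flushed(shared):
    shared["a"].discard(2)  # entry removed while recovery is running

# Re-scanning the depleted table yields a last index that no longer
# matches the previous index tracked for the writer.
assert recover(files, {}, False, delete_flushed) == ("error", "gap_detected")
# Carrying the table state across files is unaffected by the deletion.
assert recover(files, {}, True, delete_flushed) == ("ok", {"a": 3})
```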
  5. test fix

    kjnilsson committed Mar 10, 2026
    500435b

Commits on Mar 11, 2026

  1. Sync parent directory after creating config file.

    Otherwise the server may fail to boot. The sync is ignored on Windows.
    kjnilsson committed Mar 11, 2026
    86e6e2b
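
The general technique named in this commit, syncing the parent directory so that a newly created file's directory entry survives a crash, looks roughly like this in Python. It is a generic sketch rather than ra's implementation; the function name is hypothetical.

```python
import os
import sys

def write_file_durably(path, data):
    """Write data to path, then fsync the file and its parent directory."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # persist the file contents
    if sys.platform == "win32":
        return  # directories cannot be opened and fsynced this way on Windows
    # Persist the directory entry; without this the file itself may be
    # missing after a power loss even though its contents were synced.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```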
  2. Change registration vs log init order for new servers.

    New servers should register _after_ log initialisation to ensure the
    config file is fully written, as it is required for successful recovery.
    kjnilsson committed Mar 11, 2026
    3ec78e4
  3. Merge pull request #589 from rabbitmq/recover-indexes

    Improve crash recovery resilience
    kjnilsson authored Mar 11, 2026
    d674387
  4. v3.0.1

    kjnilsson committed Mar 11, 2026
    5179304