[Detail Bug] Durability notifications can report writes as durable after DB is already closed #307

@detail-app

Detail Bug Report

https://app.detail.dev/org_89d327b3-b883-4365-b6a3-46b6701342a9/bugs/bug_082b619d-8b3f-47bf-9f85-1e5109328517

Summary

  • Context: DurabilityNotifier is a new component that manages durability notifications for in-flight database writes by subscribing to SlateDB status updates.
  • Bug: When DurabilityNotifier::spawn is called on an already-closed database, it initializes the internal state with closed_reason = None, ignoring the database's actual close status.
  • Actual vs. expected: New subscribers receive Ok(durable_seq) responses even though the database is closed, when they should immediately receive Err(CloseReason).
  • Impact: Operations that should fail immediately upon database closure can incorrectly succeed, violating the invariant that no operations should succeed on a closed database.

Code with Bug

In lite/src/backend/durability_notifier.rs:

pub fn spawn(db: &slatedb::Db) -> Self {
    let status_rx = db.subscribe();
    let initial_durable_seq = status_rx.borrow().durable_seq;  // <-- BUG 🔴 reads durable_seq but ignores close_reason
    let state = Arc::new(Mutex::new(State {
        closed_reason: None,  // <-- BUG 🔴 always initialized to None (even if DB already closed)
        last_durable_seq: initial_durable_seq,
        waiters: VecDeque::new(),
    }));
    tokio::spawn(run_notifier(status_rx, state.clone()));
    Self { state }
}

Explanation

status_rx.borrow() returns the current DbStatus, which includes both durable_seq and close_reason. spawn only copies durable_seq and hardcodes closed_reason: None, so a notifier created after the DB is already closed temporarily appears “open”.

This creates a race window between the return from spawn and the moment the spawned run_notifier task first observes the close status. During this window, subscribe sees closed_reason == None and, if last_durable_seq >= target_durable_seq, immediately resolves the request with Ok(durable_seq) instead of returning Err(CloseReason).
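The decision that subscribe makes during this window can be modeled in isolation. The following is a minimal sketch, not the code from durability_notifier.rs: the CloseReason variant, the u64 sequence type, the check_fast_path name, and the Result<Option<u64>, _> return shape are all assumptions made for illustration.

```rust
// Stand-in types for illustration; the real ones live in slatedb and
// lite/src/backend/durability_notifier.rs.
#[derive(Debug, Clone, PartialEq)]
pub enum CloseReason {
    Clean, // hypothetical variant
}

#[derive(Debug)]
pub struct State {
    pub closed_reason: Option<CloseReason>,
    pub last_durable_seq: u64, // assumed to be a u64 sequence number
}

/// Hypothetical model of the fast path taken by subscribe.
/// Ok(Some(seq)): the target is already durable.
/// Ok(None): the caller has to wait for a later status update.
/// Err(reason): the database is closed.
pub fn check_fast_path(
    state: &State,
    target_durable_seq: u64,
) -> Result<Option<u64>, CloseReason> {
    // A recorded close reason always wins over the durability check.
    if let Some(reason) = &state.closed_reason {
        return Err(reason.clone());
    }
    if state.last_durable_seq >= target_durable_seq {
        return Ok(Some(state.last_durable_seq));
    }
    Ok(None)
}
```

With closed_reason hardcoded to None and last_durable_seq at 10, check_fast_path(&state, 5) yields Ok(Some(10)) even if the database was closed before spawn ran. The same call returns Err only once closed_reason has been populated, which in the buggy code happens after run_notifier has observed the status.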

Recommended Fix

Initialize closed_reason from the initial subscribed status: clone the full DbStatus once and copy both durable_seq and close_reason from that snapshot, so the notifier reflects an already-closed DB immediately.
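One way the fixed initialization might look is sketched below with stand-in types so it compiles without slatedb or tokio. DbStatus, CloseReason, the u64 sequence type, the waiter element type, and the initial_state helper are assumptions for illustration; only the field names come from this report.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Stand-in for the close reason type; the variant is hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub enum CloseReason {
    Clean,
}

// Stand-in for the status snapshot, limited to the two fields named above.
#[derive(Debug, Clone)]
pub struct DbStatus {
    pub durable_seq: u64,
    pub close_reason: Option<CloseReason>,
}

pub struct State {
    pub closed_reason: Option<CloseReason>,
    pub last_durable_seq: u64,
    pub waiters: VecDeque<u64>, // placeholder element type
}

/// Hypothetical helper: builds the notifier state from one status snapshot,
/// so both fields come from the same observation of the database.
pub fn initial_state(status: &DbStatus) -> Arc<Mutex<State>> {
    Arc::new(Mutex::new(State {
        // Carry over the close reason instead of hardcoding None.
        closed_reason: status.close_reason.clone(),
        last_durable_seq: status.durable_seq,
        waiters: VecDeque::new(),
    }))
}
```

In spawn, the snapshot would presumably come from something like status_rx.borrow().clone(), assuming the status type implements Clone, with the resulting state then passed to run_notifier as before. Taking a single snapshot also avoids reading the two fields from different status versions.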

History

This bug was introduced in commit f23d225. The commit added DurabilityNotifier but initialized state in spawn using only durable_seq and hardcoded closed_reason: None, missing the initial close_reason from DbStatus.
