Skip to content

[Detail Bug] Delete-on-empty: streams may not be auto-deleted when multiple DOE deadlines exist with different min_age #274

@detail-app

Description

@detail-app

Detail Bug Report

https://app.detail.dev/org_89d327b3-b883-4365-b6a3-46b6701342a9/bugs/bug_0f0cc24d-edf4-4b2e-a9f8-c980d3860150

Summary

  • Context: The delete-on-empty (DOE) feature allows streams to be automatically deleted after they've been empty for a configured minimum age. When a stream's DOE configuration changes or the stream becomes empty multiple times, multiple deadline entries can exist with different min_age values.
  • Bug: The eligibility check uses max_deadline with max_min_age from different deadline entries, instead of checking each (deadline, min_age) pair individually.
  • Actual vs. expected: The current code fails to delete streams that should be deleted when one of the deadline entries is eligible but the max-value check is not satisfied. Expected behavior is to delete if ANY deadline entry is eligible.
  • Impact: Streams that should be automatically deleted based on their delete-on-empty configuration are not deleted, leading to unexpected resource retention and violating user expectations.

Code with Bug

In lite/src/backend/bgtasks/stream_doe.rs:

async fn process_stream_doe(
    &self,
    stream_id: StreamId,
    pending: PendingStreamDoe,
) -> Result<(), StreamDeleteOnEmptyError> {
    if !self.stream_has_records(stream_id).await?
        && self
            .stream_doe_is_eligible(stream_id, pending.max_min_age, pending.max_deadline)  // <-- BUG 🔴 mixes max values from different entries
            .await?
        && let Some((basin, stream)) = self.stream_id_mapping(stream_id).await?
    {
        match self.delete_stream(basin, stream).await {
            Ok(()) | Err(DeleteStreamError::StreamNotFound(_)) => {}
            Err(err) => return Err(err.into()),
        }
    }
    self.clear_doe_deadlines(stream_id, &pending.deadlines)
        .await?;
    Ok(())
}

PendingStreamDoe aggregates independent entries into unrelated maxima:

struct PendingStreamDoe {
    deadlines: Vec<TimestampSecs>,
    max_deadline: TimestampSecs,
    max_min_age: Duration,
}

impl PendingStreamDoe {
    fn push(&mut self, deadline: TimestampSecs, min_age: Duration) {
        self.deadlines.push(deadline);
        if deadline > self.max_deadline {
            self.max_deadline = deadline;  // <-- BUG 🔴 aggregated with max_min_age from potentially different entry
        }
        if min_age > self.max_min_age {
            self.max_min_age = min_age;
        }
    }
}

Explanation

DOE deadlines are stored as multiple independent (deadline, min_age) entries. The correct semantics are: delete the stream if any entry indicates it has been empty long enough.

Current behavior collapses multiple entries into (max_deadline, max_min_age) and checks only that single pair. This can create a false negative when the entry with the largest deadline is not the same entry with the largest min_age.

Counter-example (from the proof):

  • Entry 1: deadline=1100, min_age=100
  • Entry 2: deadline=1062, min_age=10
  • write_timestamp=1050

Entry 2 is eligible (1050 + 10 <= 1062), so the stream should be deleted. The current code instead checks 1050 + 100 <= 1100, which fails, so the stream is not deleted.

Codebase Inconsistency

Multiple deadline entries are intentionally created and not cleaned up when new ones are added:

  • Reconfiguration adds a new DOE deadline when delete_on_empty.min_age changes (in lite/src/backend/streams.rs):
if let Some(min_age) = meta.config.delete_on_empty.min_age.filter(|age| !age.is_zero())
    && prior_doe_min_age != Some(min_age)
{
    txn.put(
        kv::stream_doe_deadline::ser_key(
            kv::timestamp::TimestampSecs::after(min_age),
            stream_id,
        ),
        kv::stream_doe_deadline::ser_value(min_age),
    )?;
}

This makes it expected that multiple (deadline, min_age) pairs exist, but process_stream_doe evaluates only a single max-derived pair.

Failing Test

A regression test demonstrates the false-negative deletion:

Command:

cargo test --lib -p s2-lite "backend::bgtasks::stream_doe::tests::stream_doe_bug_max_values_mismatch"

Output:

thread 'backend::bgtasks::stream_doe::tests::stream_doe_bug_max_values_mismatch' panicked at lite/src/backend/bgtasks/stream_doe.rs:677:9:
Stream should have been deleted based on Entry 2 eligibility

Recommended Fix

Store each min_age alongside its corresponding deadline in PendingStreamDoe, and in process_stream_doe iterate through the (deadline, min_age) pairs, deleting if any call to stream_doe_is_eligible returns true.

History

This bug was introduced in commit 471ac62. This commit implemented the initial delete-on-empty feature, which tracks multiple deadline entries per stream. The bug stems from aggregating multiple entries using max_deadline and max_min_age and then using those unrelated maxima together in a single eligibility check rather than evaluating each entry independently.

Metadata

Metadata

Assignees

Labels

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions