-
Notifications
You must be signed in to change notification settings - Fork 17
Description
Detail Bug Report
Summary
- Context: The delete-on-empty (DOE) feature allows streams to be automatically deleted after they've been empty for a configured minimum age. When a stream's DOE configuration changes or the stream becomes empty multiple times, multiple deadline entries can exist with different min_age values.
- Bug: The eligibility check uses
max_deadlinewithmax_min_agefrom different deadline entries, instead of checking each (deadline, min_age) pair individually. - Actual vs. expected: The current code fails to delete streams that should be deleted when one of the deadline entries is eligible but the max-value check is not satisfied. Expected behavior is to delete if ANY deadline entry is eligible.
- Impact: Streams that should be automatically deleted based on their delete-on-empty configuration are not deleted, leading to unexpected resource retention and violating user expectations.
Code with Bug
In lite/src/backend/bgtasks/stream_doe.rs:
async fn process_stream_doe(
&self,
stream_id: StreamId,
pending: PendingStreamDoe,
) -> Result<(), StreamDeleteOnEmptyError> {
if !self.stream_has_records(stream_id).await?
&& self
.stream_doe_is_eligible(stream_id, pending.max_min_age, pending.max_deadline) // <-- BUG 🔴 mixes max values from different entries
.await?
&& let Some((basin, stream)) = self.stream_id_mapping(stream_id).await?
{
match self.delete_stream(basin, stream).await {
Ok(()) | Err(DeleteStreamError::StreamNotFound(_)) => {}
Err(err) => return Err(err.into()),
}
}
self.clear_doe_deadlines(stream_id, &pending.deadlines)
.await?;
Ok(())
}PendingStreamDoe aggregates independent entries into unrelated maxima:
struct PendingStreamDoe {
deadlines: Vec<TimestampSecs>,
max_deadline: TimestampSecs,
max_min_age: Duration,
}
impl PendingStreamDoe {
fn push(&mut self, deadline: TimestampSecs, min_age: Duration) {
self.deadlines.push(deadline);
if deadline > self.max_deadline {
self.max_deadline = deadline; // <-- BUG 🔴 aggregated with max_min_age from potentially different entry
}
if min_age > self.max_min_age {
self.max_min_age = min_age;
}
}
}Explanation
DOE deadlines are stored as multiple independent (deadline, min_age) entries. The correct semantics are: delete the stream if any entry indicates it has been empty long enough.
Current behavior collapses multiple entries into (max_deadline, max_min_age) and checks only that single pair. This can create a false negative when the entry with the largest deadline is not the same entry with the largest min_age.
Counter-example (from the proof):
- Entry 1:
deadline=1100,min_age=100 - Entry 2:
deadline=1062,min_age=10 write_timestamp=1050
Entry 2 is eligible (1050 + 10 <= 1062), so the stream should be deleted. The current code instead checks 1050 + 100 <= 1100, which fails, so the stream is not deleted.
Codebase Inconsistency
Multiple deadline entries are intentionally created and not cleaned up when new ones are added:
- Reconfiguration adds a new DOE deadline when
delete_on_empty.min_agechanges (inlite/src/backend/streams.rs):
if let Some(min_age) = meta.config.delete_on_empty.min_age.filter(|age| !age.is_zero())
&& prior_doe_min_age != Some(min_age)
{
txn.put(
kv::stream_doe_deadline::ser_key(
kv::timestamp::TimestampSecs::after(min_age),
stream_id,
),
kv::stream_doe_deadline::ser_value(min_age),
)?;
}This makes it expected that multiple (deadline, min_age) pairs exist, but process_stream_doe evaluates only a single max-derived pair.
Failing Test
A regression test demonstrates the false-negative deletion:
Command:
cargo test --lib -p s2-lite "backend::bgtasks::stream_doe::tests::stream_doe_bug_max_values_mismatch"
Output:
thread 'backend::bgtasks::stream_doe::tests::stream_doe_bug_max_values_mismatch' panicked at lite/src/backend/bgtasks/stream_doe.rs:677:9:
Stream should have been deleted based on Entry 2 eligibility
Recommended Fix
Store each min_age alongside its corresponding deadline in PendingStreamDoe, and in process_stream_doe iterate through the (deadline, min_age) pairs, deleting if any call to stream_doe_is_eligible returns true.
History
This bug was introduced in commit 471ac62. This commit implemented the initial delete-on-empty feature, which tracks multiple deadline entries per stream. The bug stems from aggregating multiple entries using max_deadline and max_min_age and then using those unrelated maxima together in a single eligibility check rather than evaluating each entry independently.