Skip to content

Id tracker single write of pending updates#7872

Closed
IvanPleshkov wants to merge 4 commits intodevfrom
id-tracker-single-write-in-flush
Closed

Id tracker single write of pending updates#7872
IvanPleshkov wants to merge 4 commits intodevfrom
id-tracker-single-write-in-flush

Conversation

@IvanPleshkov
Copy link
Contributor

@IvanPleshkov IvanPleshkov commented Jan 6, 2026

Problem:

The mutable ID tracker collects all pending updates and stores them in the file in flush.

But it is implemented as a separate write call for each mapping item. In the out-of-disk case, there is a bug where data may be written partially, and the mappings file will be corrupted.

Solution:

This PR collects all pending updates into the temporary buffer first and then applies them as a single write call. Temporary buffer uses try_set_capacity_exact to catch OOM situations.

Please make attention while reviewing that file flushing was changed

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 6, 2026

Warning

Rate limit exceeded

@IvanPleshkov has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 35 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 4ee66c5 and 2edd366.

📒 Files selected for processing (3)
  • lib/common/io/src/counting_write.rs
  • lib/common/io/src/lib.rs
  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📝 Walkthrough

Walkthrough

Adds a new public CountingWrite writer in lib/common/io that implements Write and tracks total bytes written without storing data. Exposes bytes_written() and a Default constructor. Integrates CountingWrite into lib/segment/src/id_tracker/mutable_id_tracker.rs to perform a two-phase write: first estimate total size by writing to CountingWrite, allocate an exact-capacity buffer, then write all data into that buffer and persist it in one operation; adjusts flush/durability logic around the final write.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

release:1.16.3

Suggested reviewers

  • timvisee
  • coszio
  • JojiiOfficial

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: consolidating multiple write calls into a single write operation for pending ID tracker updates.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed The pull request description clearly explains the problem (partial writes causing file corruption), the solution (single write call with pre-allocated buffer), and highlights important implementation details for reviewers.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/segment/src/id_tracker/mutable_id_tracker.rs (1)

400-460: Two-phase buffering approach looks solid.

The implementation correctly:

  1. Estimates required size without allocation
  2. Pre-allocates exact buffer size with OOM handling via try_set_capacity_exact
  3. Performs single write_all to disk
  4. Explicitly drops the buffer before fsync to minimize memory pressure

This addresses the partial-write corruption risk from the PR objectives.

Optional: Consider using slice reference for the parameter

Using &[MappingChange] instead of &Vec<MappingChange> is more idiomatic and flexible:

 fn store_mapping_changes(
     mappings_path: &Path,
-    changes: &Vec<MappingChange>,
+    changes: &[MappingChange],
 ) -> OperationResult<()> {
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08785c5 and c1344d0.

📒 Files selected for processing (3)
  • lib/common/io/src/counting_write.rs
  • lib/common/io/src/lib.rs
  • lib/segment/src/id_tracker/mutable_id_tracker.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/common/io/src/lib.rs
  • lib/common/io/src/counting_write.rs
  • lib/segment/src/id_tracker/mutable_id_tracker.rs
🧠 Learnings (12)
📓 Common learnings
Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 6352
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:289-299
Timestamp: 2025-04-17T10:52:59.516Z
Learning: In Qdrant's MutableIdTracker, FileVersionTracker flushers are designed to capture the pending version at creation time (not at flush time) to ensure they remain in sync with pending_mappings flusher. This design ensures consistency between data and version information during flush operations.
Learnt from: timvisee
Repo: qdrant/qdrant PR: 6150
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:62-71
Timestamp: 2025-03-12T16:23:44.715Z
Learning: In the MutableIdTracker implementation in Qdrant, the error case where versions file exists but mappings file is missing is considered a system-level error (potential corruption) rather than a user error, and is handled via logging and debug assertions without surfacing user-facing errors.
Learnt from: timvisee
Repo: qdrant/qdrant PR: 7388
File: lib/shard/src/segment_holder/flush.rs:73-76
Timestamp: 2025-10-13T09:34:22.740Z
Learning: In Qdrant's SegmentHolder flush logic, when `sync=false` and a background flush is already running, returning early is acceptable because: (1) flushes normally only start if there are changes, (2) the lowest version number is acknowledged in the WAL when changes exist, and (3) missed operations will replay on restart. However, an edge case exists when the `force` flag is used: if the first forced flush has no changes but a follow-up non-force flush with changes is ignored, the system might acknowledge a version that is too recent.
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

Applied to files:

  • lib/common/io/src/lib.rs
  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-10-13T22:58:03.121Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-04-17T10:52:59.516Z
Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 6352
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:289-299
Timestamp: 2025-04-17T10:52:59.516Z
Learning: In Qdrant's MutableIdTracker, FileVersionTracker flushers are designed to capture the pending version at creation time (not at flush time) to ensure they remain in sync with pending_mappings flusher. This design ensures consistency between data and version information during flush operations.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-07-11T11:35:21.549Z
Learnt from: generall
Repo: qdrant/qdrant PR: 6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using `id.to_string().parse().unwrap()` is acceptable for testing purposes and doesn't need to match production code's `id_tracker.internal_id()` approach. Test code can use mock implementations that serve the testing goals.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-08-15T11:41:10.629Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:857-857
Timestamp: 2025-08-15T11:41:10.629Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the multivector offset storage has two different patterns: for RAM storage, offsets are collected into Vec<MultivectorOffset> and used directly; for MMAP storage, offsets are consumed to create a file via create_offsets_file_from_iter, then the file is loaded back as MultivectorOffsetsStorageMmap. The direct consumption of offsets iterator in the MMAP case is intentional.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-06-02T18:10:47.203Z
Learnt from: coszio
Repo: qdrant/qdrant PR: 6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-05-19T14:40:20.068Z
Learnt from: coszio
Repo: qdrant/qdrant PR: 6528
File: lib/posting_list/src/view.rs:118-118
Timestamp: 2025-05-19T14:40:20.068Z
Learning: In the bitpacking crate, the BitPacker::decompress_strictly_sorted function takes an Option<PointOffsetType> as its first parameter, which means using checked_sub(1) without unwrapping is intentional and correct.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-08-14T11:31:21.777Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7048
File: lib/quantization/src/encoded_storage.rs:61-79
Timestamp: 2025-08-14T11:31:21.777Z
Learning: In test storage implementations (like TestEncodedStorage in lib/quantization/src/encoded_storage.rs), IvanPleshkov prefers to keep the code simple rather than adding complex overflow protection, since test storage is not used in production and can be allowed to panic on edge cases.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-05-08T10:34:42.223Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 6444
File: lib/segment/src/index/field_index/map_index/immutable_map_index.rs:373-375
Timestamp: 2025-05-08T10:34:42.223Z
Learning: In the Qdrant codebase, storage backends may have different error handling approaches. Specifically, the mmap variant of `remove_point` method does not return errors, while the RocksDB variant does propagate errors using the `?` operator. These differences are intentional by design.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
📚 Learning: 2025-09-01T11:19:26.371Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7193
File: lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs:17-30
Timestamp: 2025-09-01T11:19:26.371Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs, the ChunkedMmapVectors underlying implementation does not validate against zero-sized vectors, so adding such validation in QuantizedChunkedMmapStorage::new is unnecessary and would be redundant.

Applied to files:

  • lib/segment/src/id_tracker/mutable_id_tracker.rs
🧬 Code graph analysis (1)
lib/segment/src/id_tracker/mutable_id_tracker.rs (1)
lib/segment/src/id_tracker/immutable_id_tracker.rs (6)
  • writer (212-212)
  • writer (234-234)
  • writer (238-238)
  • writer (243-243)
  • new (281-357)
  • drop (422-431)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Test Python bindings
  • GitHub Check: lint
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: rust-tests (windows-latest)
🔇 Additional comments (4)
lib/common/io/src/counting_write.rs (1)

1-25: LGTM!

Clean implementation of a byte-counting writer. The design is minimal and serves its purpose well for pre-calculating buffer sizes.

lib/common/io/src/lib.rs (1)

1-1: LGTM!

Module export for the new counting_write utility.

lib/segment/src/id_tracker/mutable_id_tracker.rs (2)

6-6: LGTM!

New imports for CountingWrite and TrySetCapacityExact to support the two-phase buffering strategy.

Also applies to: 20-20


470-478: LGTM!

The function correctly serializes changes without flushing—caller handles durability. Same optional &[MappingChange] suggestion applies here.

@IvanPleshkov IvanPleshkov force-pushed the id-tracker-single-write-in-flush branch from 4ee66c5 to 5d5d30a Compare January 6, 2026 13:13
@IvanPleshkov IvanPleshkov requested a review from agourlay January 6, 2026 13:18
Comment on lines -441 to -444

// Explicitly flush writer to catch IO errors
writer.flush()?;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this removed?

// Create a buffer with exact required size.
let mut writer: Vec<u8> = Vec::default();
writer
.try_set_capacity_exact(counting_write.bytes_written() as usize)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this atomic?

@timvisee
Copy link
Member

timvisee commented Jan 8, 2026

On second thought this approach has other problems.

Because now we might:

  • grow mappings file by 100 bytes (fill with zeroes)
  • write only 50 bytes of data
  • crash

Which leaves us with a file having a lot of zeroes at the end. In that case we cannot properly determine what entries are valid.

This is superseded by the alternative in #7877

@timvisee timvisee closed this Jan 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants