Id tracker single write of pending updates by IvanPleshkov · Pull Request #7872 · qdrant/qdrant

IvanPleshkov · 2026-01-06T13:04:53Z

Problem:

The mutable ID tracker collects all pending updates and stores them in the file in flush.

But it is implemented as a separate write call for each mapping item. In the out-of-disk case, there is a bug where data may be written partially, and the mappings file will be corrupted.

Solution:

This PR collects all pending updates into the temporary buffer first and then applies them as a single write call. Temporary buffer uses try_set_capacity_exact to catch OOM situations.

Please make attention while reviewing that file flushing was changed

coderabbitai · 2026-01-06T13:08:01Z

Warning

Rate limit exceeded

@IvanPleshkov has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 35 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 4ee66c5 and 2edd366.

📒 Files selected for processing (3)

lib/common/io/src/counting_write.rs
lib/common/io/src/lib.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs

📝 Walkthrough

Walkthrough

Adds a new public CountingWrite writer in lib/common/io that implements Write and tracks total bytes written without storing data. Exposes bytes_written() and a Default constructor. Integrates CountingWrite into lib/segment/src/id_tracker/mutable_id_tracker.rs to perform a two-phase write: first estimate total size by writing to CountingWrite, allocate an exact-capacity buffer, then write all data into that buffer and persist it in one operation; adjusts flush/durability logic around the final write.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Mutable in-memory ID tracker without RocksDB #6150: Modifies the same mutable_id_tracker.rs write path and is closely related to the changes integrating CountingWrite.
Mutable ID tracker: improve version flushing, grow file in one shot #6267: Adjusts file-size handling and one-shot write behavior in mutable_id_tracker.rs, overlapping with the new buffering strategy.
Clone pending updates in buffered storages #7801: Changes flush and mapping-write logic in mutable_id_tracker.rs, touching durability and write-flow similar to this PR.

Suggested labels

release:1.16.3

Suggested reviewers

timvisee
coszio
JojiiOfficial

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately reflects the main change: consolidating multiple write calls into a single write operation for pending ID tracker updates.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check	✅ Passed	The pull request description clearly explains the problem (partial writes causing file corruption), the solution (single write call with pre-allocated buffer), and highlights important implementation details for reviewers.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

lib/segment/src/id_tracker/mutable_id_tracker.rs (1)
400-460: Two-phase buffering approach looks solid.

The implementation correctly:

Estimates required size without allocation

Pre-allocates exact buffer size with OOM handling via try_set_capacity_exact

Performs single write_all to disk

Explicitly drops the buffer before fsync to minimize memory pressure

This addresses the partial-write corruption risk from the PR objectives.
Optional: Consider using slice reference for the parameter

Using &[MappingChange] instead of &Vec<MappingChange> is more idiomatic and flexible:
 fn store_mapping_changes(
     mappings_path: &Path,
-    changes: &Vec<MappingChange>,
+    changes: &[MappingChange],
 ) -> OperationResult<()> {

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08785c5 and c1344d0.

📒 Files selected for processing (3)

lib/common/io/src/counting_write.rs
lib/common/io/src/lib.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs

🧰 Additional context used

📓 Path-based instructions (1)

**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

lib/common/io/src/lib.rs
lib/common/io/src/counting_write.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs

🧠 Learnings (12)

📓 Common learnings

Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 6352
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:289-299
Timestamp: 2025-04-17T10:52:59.516Z
Learning: In Qdrant's MutableIdTracker, FileVersionTracker flushers are designed to capture the pending version at creation time (not at flush time) to ensure they remain in sync with pending_mappings flusher. This design ensures consistency between data and version information during flush operations.

Learnt from: timvisee
Repo: qdrant/qdrant PR: 6150
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:62-71
Timestamp: 2025-03-12T16:23:44.715Z
Learning: In the MutableIdTracker implementation in Qdrant, the error case where versions file exists but mappings file is missing is considered a system-level error (potential corruption) rather than a user error, and is handled via logging and debug assertions without surfacing user-facing errors.

Learnt from: timvisee
Repo: qdrant/qdrant PR: 7388
File: lib/shard/src/segment_holder/flush.rs:73-76
Timestamp: 2025-10-13T09:34:22.740Z
Learning: In Qdrant's SegmentHolder flush logic, when `sync=false` and a background flush is already running, returning early is acceptable because: (1) flushes normally only start if there are changes, (2) the lowest version number is acknowledged in the WAL when changes exist, and (3) missed operations will replay on restart. However, an edge case exists when the `force` flag is used: if the first forced flush has no changes but a follow-up non-force flush with changes is ignored, the system might acknowledge a version that is too recent.

Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

📚 Learning: 2025-08-15T11:42:00.297Z

Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

Applied to files:

lib/common/io/src/lib.rs
lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-10-13T22:58:03.121Z

Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-04-17T10:52:59.516Z

Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 6352
File: lib/segment/src/id_tracker/mutable_id_tracker.rs:289-299
Timestamp: 2025-04-17T10:52:59.516Z
Learning: In Qdrant's MutableIdTracker, FileVersionTracker flushers are designed to capture the pending version at creation time (not at flush time) to ensure they remain in sync with pending_mappings flusher. This design ensures consistency between data and version information during flush operations.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-07-11T11:35:21.549Z

Learnt from: generall
Repo: qdrant/qdrant PR: 6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using `id.to_string().parse().unwrap()` is acceptable for testing purposes and doesn't need to match production code's `id_tracker.internal_id()` approach. Test code can use mock implementations that serve the testing goals.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-08-15T11:41:10.629Z

Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:857-857
Timestamp: 2025-08-15T11:41:10.629Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the multivector offset storage has two different patterns: for RAM storage, offsets are collected into Vec<MultivectorOffset> and used directly; for MMAP storage, offsets are consumed to create a file via create_offsets_file_from_iter, then the file is loaded back as MultivectorOffsetsStorageMmap. The direct consumption of offsets iterator in the MMAP case is intentional.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-06-02T18:10:47.203Z

Learnt from: coszio
Repo: qdrant/qdrant PR: 6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-05-19T14:40:20.068Z

Learnt from: coszio
Repo: qdrant/qdrant PR: 6528
File: lib/posting_list/src/view.rs:118-118
Timestamp: 2025-05-19T14:40:20.068Z
Learning: In the bitpacking crate, the BitPacker::decompress_strictly_sorted function takes an Option<PointOffsetType> as its first parameter, which means using checked_sub(1) without unwrapping is intentional and correct.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-08-10T18:30:02.986Z

Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-08-14T11:31:21.777Z

Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7048
File: lib/quantization/src/encoded_storage.rs:61-79
Timestamp: 2025-08-14T11:31:21.777Z
Learning: In test storage implementations (like TestEncodedStorage in lib/quantization/src/encoded_storage.rs), IvanPleshkov prefers to keep the code simple rather than adding complex overflow protection, since test storage is not used in production and can be allowed to panic on edge cases.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-05-08T10:34:42.223Z

Learnt from: timvisee
Repo: qdrant/qdrant PR: 6444
File: lib/segment/src/index/field_index/map_index/immutable_map_index.rs:373-375
Timestamp: 2025-05-08T10:34:42.223Z
Learning: In the Qdrant codebase, storage backends may have different error handling approaches. Specifically, the mmap variant of `remove_point` method does not return errors, while the RocksDB variant does propagate errors using the `?` operator. These differences are intentional by design.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

📚 Learning: 2025-09-01T11:19:26.371Z

Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7193
File: lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs:17-30
Timestamp: 2025-09-01T11:19:26.371Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs, the ChunkedMmapVectors underlying implementation does not validate against zero-sized vectors, so adding such validation in QuantizedChunkedMmapStorage::new is unnecessary and would be redundant.

Applied to files:

lib/segment/src/id_tracker/mutable_id_tracker.rs

🧬 Code graph analysis (1)

lib/segment/src/id_tracker/mutable_id_tracker.rs (1)

lib/segment/src/id_tracker/immutable_id_tracker.rs (6)

writer (212-212)

writer (234-234)

writer (238-238)

writer (243-243)

new (281-357)

drop (422-431)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Test Python bindings
GitHub Check: lint
GitHub Check: test-consensus-compose
GitHub Check: integration-tests
GitHub Check: integration-tests-consensus
GitHub Check: test-consistency
GitHub Check: test-shard-snapshot-api-s3-minio
GitHub Check: rust-tests (macos-latest)
GitHub Check: rust-tests (ubuntu-latest)
GitHub Check: e2e-tests
GitHub Check: rust-tests (windows-latest)

🔇 Additional comments (4)

lib/common/io/src/counting_write.rs (1)

1-25: LGTM!

Clean implementation of a byte-counting writer. The design is minimal and serves its purpose well for pre-calculating buffer sizes.

lib/common/io/src/lib.rs (1)

1-1: LGTM!

Module export for the new counting_write utility.

lib/segment/src/id_tracker/mutable_id_tracker.rs (2)

6-6: LGTM!

New imports for CountingWrite and TrySetCapacityExact to support the two-phase buffering strategy.

Also applies to: 20-20

470-478: LGTM!

The function correctly serializes changes without flushing—caller handles durability. Same optional &[MappingChange] suggestion applies here.

generall · 2026-01-06T13:37:06Z

lib/segment/src/id_tracker/mutable_id_tracker.rs

-
-    // Explicitly flush writer to catch IO errors
-    writer.flush()?;
-


why is this removed?

generall · 2026-01-06T13:39:53Z

lib/segment/src/id_tracker/mutable_id_tracker.rs

+    // Create a buffer with exact required size.
+    let mut writer: Vec<u8> = Vec::default();
+    writer
+        .try_set_capacity_exact(counting_write.bytes_written() as usize)


Is this atomic?

timvisee · 2026-01-08T13:18:24Z

On second thought this approach has other problems.

Because now we might:

grow mappings file by 100 bytes (fill with zeroes)
write only 50 bytes of data
crash

Which leaves us with a file having a lot of zeroes at the end. In that case we cannot properly determine what entries are valid.

This is superseded by the alternative in #7877

coderabbitai bot reviewed Jan 6, 2026

View reviewed changes

id tracker single write in flush

5d5d30a

IvanPleshkov force-pushed the id-tracker-single-write-in-flush branch from 4ee66c5 to 5d5d30a Compare January 6, 2026 13:13

review remarks

83f66a0

IvanPleshkov requested a review from agourlay January 6, 2026 13:18

IvanPleshkov added 2 commits January 6, 2026 14:24

check written bytes count

d751c99

fix comments

2edd366

generall reviewed Jan 6, 2026

View reviewed changes

IvanPleshkov marked this pull request as draft January 6, 2026 13:54

github-actions bot mentioned this pull request Jan 6, 2026

Flaky test tests::hw_metrics::test_hw_metrics_cancellation #6070

Open

IvanPleshkov mentioned this pull request Jan 7, 2026

id tracker persisted mappings offset #7877

Merged

timvisee closed this Jan 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Id tracker single write of pending updates#7872

Id tracker single write of pending updates#7872
IvanPleshkov wants to merge 4 commits intodevfrom
id-tracker-single-write-in-flush

IvanPleshkov commented Jan 6, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Jan 6, 2026 •

edited

Loading

Rate limit exceeded

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Uh oh!

generall Jan 6, 2026

Uh oh!

generall Jan 6, 2026

Uh oh!

timvisee commented Jan 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		// Explicitly flush writer to catch IO errors
		writer.flush()?;

Conversation

IvanPleshkov commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem:

Solution:

Uh oh!

coderabbitai bot commented Jan 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

generall Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

generall Jan 6, 2026

Choose a reason for hiding this comment

Uh oh!

timvisee commented Jan 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

IvanPleshkov commented Jan 6, 2026 •

edited

Loading

coderabbitai bot commented Jan 6, 2026 •

edited

Loading