Skip to content

faster segment holder locks#8007

Merged
generall merged 6 commits intodevfrom
faster-segment-holder-locks
Feb 3, 2026
Merged

faster segment holder locks#8007
generall merged 6 commits intodevfrom
faster-segment-holder-locks

Conversation

@generall
Copy link
Member

@generall generall commented Jan 28, 2026

This PR showcases an alternative approach for dealing with a lock of SegmentsHolder.

Previously, we used to lock SegmentsHolder for the whole duration of the read and write operations, while locking for write during swap of optimized segments.
This approach could cause the following:

  • when we want to lock for write, we need to wait for updates and searches to finish
  • searches and updates keep piling up
  • Once optimization swap finishes, all updates and searches are allowed to proceed

This can cause tail latencies spikes and maybe even timeouts.

What new approach does:

  • on search: we lock SegmentHolder just to copy list of segments, holder it immidiatelly released
  • on updates: we introduce an extra mutex to prevent parallel updates during optimizer swap
  • on optimizer: instead of write, we use upgradable read + write, minimizing critical section.

As a result, searches should be minimally affected by optimized segment swap.
Downside: we need to do extra copy of segment arcs, but it seems to be an unmeasurable difference.

UPD:

During implementation, I faced unexpected deadlock scenario, which led to further investigation of how LockerSegmentHolder is used. I changed it from type alias to dedicated type, so code navidation now becomes a little bit easier. Additionally I had to inspect and change locking pattern in the snapshot fucntion.

Overall locking schema how looks like this:

Screenshot from 2026-01-31 22-30-43

ToDo:

this PR only introduces new mechanism to one search request. Further PRs should apply it to all suitable read methods.

@generall generall force-pushed the faster-segment-holder-locks branch 2 times, most recently from 90e710f to 9e8323e Compare January 30, 2026 10:47
@generall generall force-pushed the faster-segment-holder-locks branch from 363a137 to f98b8f5 Compare January 30, 2026 18:45
* fix deadlock

* move LockedSegmentHolder into a dedicated file

* reorder locks for snapshot

* fmt

* upd comments
// Unproxy all segments
// Always do this to prevent leaving proxy segments behind
log::trace!("Unproxying all shard segments after function is applied");
let _update_guard = update_lock.blocking_write();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was incorrectly used. Logic inside try_unproxy_segment and unproxy_all_segments was usign 2-step propagation of updates, which assumes segment could change during the first stage, but this lock should have prevented it already on the first step.

New logic changes it into just a single step propagation with updates guard as an explicit parameter, so function can't be misused.

proxy_segment: LockedSegment,
) -> Result<RwLockUpgradableReadGuard<SegmentHolder>, RwLockUpgradableReadGuard<SegmentHolder>>
{
updates_guard: parking_lot::MutexGuard<'a, ()>, // Guarantee no updates are happening
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lock should guarantee there are no parallel updates happening

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making it a newtype could make it clearer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new type for guard? you mean like this pub type = ... ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean a wrapper type: https://doc.rust-lang.org/rust-by-example/generics/new_types.html

pub type doesn't create a new type, it creates merely an alias.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot suggest a type wrapper for updates_guard

Comment on lines 148 to +153
let mut write_segments = RwLockUpgradableReadGuard::upgrade(segments_lock);

// Batch 2: propagate changes to wrapped segment with segment holder write lock
// Propagate proxied changes to wrapped segment, take it out and swap with proxy
// Important: put the wrapped segment back with its original segment ID
let wrapped_segment = {
let mut proxy_segment = proxy_segment.write();
if let Err(err) = proxy_segment.propagate_to_wrapped() {
log::error!(
"Propagating proxy segment {segment_id} changes to wrapped segment failed, ignoring: {err}",
);
}
proxy_segment.wrapped_segment.clone()
};
let wrapped_segment = proxy_segment.read().wrapped_segment.clone();
write_segments.replace(segment_id, wrapped_segment).unwrap();

drop(updates_guard); // Release updates lock as soon as possible
Copy link
Member Author

@generall generall Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

section where reads are locked is very minimal now. Updates can hold in queue

@generall generall marked this pull request as ready for review February 1, 2026 13:46
coderabbitai[bot]

This comment was marked as resolved.

@generall generall requested a review from xzfc February 1, 2026 22:06
…izer.rs

Co-authored-by: Ivan Boldyrev <ivan.boldyrev@qdrant.com>
coderabbitai[bot]

This comment was marked as duplicate.

This comment was marked as resolved.

* Initial plan

* Add UpdatesGuard newtype wrapper for updates_guard

Co-authored-by: generall <1935623+generall@users.noreply.github.com>

* Fix linter issue - remove trailing space in doc comment

Co-authored-by: generall <1935623+generall@users.noreply.github.com>

* Remove unnecessary new() method, use tuple constructor directly

Co-authored-by: generall <1935623+generall@users.noreply.github.com>

* fix clippy

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: generall <1935623+generall@users.noreply.github.com>
Co-authored-by: generall <andrey@vasnetsov.com>
Co-authored-by: Jojii <15957865+JojiiOfficial@users.noreply.github.com>
@qdrant qdrant deleted a comment from coderabbitai bot Feb 3, 2026
@qdrant qdrant deleted a comment from coderabbitai bot Feb 3, 2026
Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see a benchmark on this one. I'm fine with triggering one in CI after merging.

@generall generall merged commit 44ac1f0 into dev Feb 3, 2026
15 checks passed
@generall generall deleted the faster-segment-holder-locks branch February 3, 2026 15:21
generall added a commit that referenced this pull request Feb 9, 2026
* non-locking retrieve

* optimization & write exclusive lock via upgradable_read

* faster segment holder locks debug (#8024)

* fix deadlock

* move LockedSegmentHolder into a dedicated file

* reorder locks for snapshot

* fmt

* upd comments

* Update lib/collection/src/collection_manager/optimizers/segment_optimizer.rs

Co-authored-by: Ivan Boldyrev <ivan.boldyrev@qdrant.com>

* Add UpdatesGuard type wrapper for updates_guard (#8031)

* Initial plan

* Add UpdatesGuard newtype wrapper for updates_guard

Co-authored-by: generall <1935623+generall@users.noreply.github.com>

* Fix linter issue - remove trailing space in doc comment

Co-authored-by: generall <1935623+generall@users.noreply.github.com>

* Remove unnecessary new() method, use tuple constructor directly

Co-authored-by: generall <1935623+generall@users.noreply.github.com>

* fix clippy

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: generall <1935623+generall@users.noreply.github.com>
Co-authored-by: generall <andrey@vasnetsov.com>

* Remove aliassed lifetime

Co-authored-by: Jojii <15957865+JojiiOfficial@users.noreply.github.com>

---------

Co-authored-by: Ivan Boldyrev <ivan.boldyrev@qdrant.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: generall <1935623+generall@users.noreply.github.com>
Co-authored-by: Tim Visée <tim+github@visee.me>
Co-authored-by: Jojii <15957865+JojiiOfficial@users.noreply.github.com>
@timvisee timvisee mentioned this pull request Feb 17, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants