
Don't lock SegmentHolder for the entire duration of read operations#8056

Merged
JojiiOfficial merged 5 commits into dev from reduce_segment_holder_locking_during_operations
Feb 6, 2026

Conversation


@JojiiOfficial JojiiOfficial commented Feb 4, 2026

Continuation of #8024

Benchmarks

Setup

Parallel updates and searches were run using the following commands:

# Upsert
bfb  -n 10M -d 512 --shards 3 --replication-factor 2 --on-disk-vectors true \
     --keywords 5000 --hnsw-m 0 --hnsw-payload-m 16 --tenants true -b 10 --timeout 60 --rps 100

# Search
bfb -n 230k -d 512 --skip-setup --search --keywords 5000 --rps 200 --measure-slow-requests true

The hardware was a single-node cloud cluster with 2 cores and 8 GB of RAM.

For the searches, a slightly modified version of BFB was used, mostly to measure slow requests in detail.
The changes can be found here: qdrant/bfb#97

Results

The results below are averages of 2-3 runs each, on Qdrant@1.16.3 and on this PR.

PR

--- QPS (averaged) ---
Min qps:	192.970664440867
Avg qps:	199.982515579988
Median qps:	199.999541695647
Max qps:	201.976920003441

--- Slow requests (averaged) ---
Total: 976
Min slow request time:     0.100097543
Avg slow request time:     0.221581551442721
Median slow request time:  0.1821884285
p95 slow request time:     0.512773123
p99 slow request time:     0.578455576
Max slow request time:     0.6067236445


Qdrant@1.16.3

--- QPS (averaged) ---
Min qps:	194.190540742037
Avg qps:	199.985118634619
Median qps:	199.999534991853
Max qps:	204.55869924178

--- Slow requests (averaged) ---
Total: 1463
Min slow request time:     0.100111962
Avg slow request time:     0.314250290654919
Median slow request time:  0.276687310333333
p95 slow request time:     0.650480541
p99 slow request time:     0.728994755
Max slow request time:     0.792417519666667

Furthermore:

  • On dev, the upsert command almost always timed out after ~11 minutes.
  • On this PR, it timed out after ~13-15 minutes.

These results show that this PR reduces slow requests during searches.

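The change itself follows a simple pattern: instead of holding the SegmentHolder read lock across the entire loop, clone the cheap Arc handles under a short lock, release it, and then take each segment's own lock independently. A minimal sketch with hypothetical simplified types (not Qdrant's actual API):

```rust
use std::sync::{Arc, RwLock};

// Hypothetical, simplified stand-ins for Qdrant's real types.
struct Segment {
    point_count: usize,
}

struct SegmentHolder {
    segments: Vec<Arc<RwLock<Segment>>>,
}

// Old shape: the holder stays read-locked while every segment is visited,
// so a writer waiting on the holder stalls behind the whole loop.
// New shape: read-lock the holder only long enough to clone the Arc
// handles, then visit each segment under its own lock.
fn total_points(holder: &RwLock<SegmentHolder>) -> usize {
    let segments: Vec<Arc<RwLock<Segment>>> = {
        let guard = holder.read().unwrap();
        guard.segments.clone() // cloning Arcs is cheap: refcount bumps only
    }; // holder read lock released here

    segments
        .iter()
        .map(|seg| seg.read().unwrap().point_count)
        .sum()
}

fn main() {
    let holder = RwLock::new(SegmentHolder {
        segments: vec![
            Arc::new(RwLock::new(Segment { point_count: 3 })),
            Arc::new(RwLock::new(Segment { point_count: 7 })),
        ],
    });
    assert_eq!(total_points(&holder), 10);
    println!("total = {}", total_points(&holder));
}
```

With this shape, a writer that needs the holder only waits out the brief handle-cloning section rather than the full per-segment read pass.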
@JojiiOfficial JojiiOfficial force-pushed the reduce_segment_holder_locking_during_operations branch from eb68f56 to 983e249 Compare February 4, 2026 09:59
@JojiiOfficial JojiiOfficial changed the title [WIP] Don't lock SegmentHolder for the entire duration of read operations Don't lock SegmentHolder for the entire duration of read operations Feb 5, 2026
@JojiiOfficial JojiiOfficial marked this pull request as ready for review February 5, 2026 18:59
@timvisee timvisee self-requested a review February 6, 2026 09:06
Comment on lines 1320 to 1325

@timvisee timvisee Feb 6, 2026


We need the same treatment here. This still read locks the segment holder for the duration of the for loop.

Calling size_info() should be relatively cheap, but there's another read lock on the segment itself.

Two problems:

  1. on long writes the inner read lock might hang for some time
  2. the sync read lock blocks the async runtime!

In my opinion we should clone the segments. Then use spawn_blocking and run the for loop with read locks in blocking context.
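The suggested follow-up — clone the segment handles, then run the lock-taking loop in a blocking context — could be sketched roughly like this. The types are hypothetical simplifications, and plain std::thread::spawn stands in for tokio::task::spawn_blocking so the sketch stays dependency-free:

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical simplified types, not Qdrant's actual API.
struct Segment {
    size_bytes: usize,
}
type LockedSegment = Arc<RwLock<Segment>>;

fn sum_size_info(holder: &RwLock<Vec<LockedSegment>>) -> usize {
    // 1. Clone the Arc handles under a short holder read lock.
    let segments: Vec<LockedSegment> = holder.read().unwrap().clone();

    // 2. Take the per-segment read locks on a dedicated thread, so a slow
    //    writer holding a segment lock cannot stall the caller's executor
    //    thread. In async code this would be tokio::task::spawn_blocking.
    thread::spawn(move || {
        segments
            .iter()
            .map(|seg| seg.read().unwrap().size_bytes)
            .sum()
    })
    .join()
    .unwrap()
}

fn main() {
    let holder = RwLock::new(vec![
        Arc::new(RwLock::new(Segment { size_bytes: 100 })),
        Arc::new(RwLock::new(Segment { size_bytes: 200 })),
    ]);
    assert_eq!(sum_size_info(&holder), 300);
    println!("total size = {}", sum_size_info(&holder));
}
```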

@JojiiOfficial (Contributor, Author)


Good catch! I fixed it in 34cf12f

In my opinion we should clone the segments. Then use spawn_blocking and run the for loop with read locks in blocking context.

I'll tackle this in a separate PR 👍🏻


@JojiiOfficial JojiiOfficial merged commit cf41580 into dev Feb 6, 2026
15 checks passed
@JojiiOfficial JojiiOfficial deleted the reduce_segment_holder_locking_during_operations branch February 6, 2026 10:32
@qdrant qdrant deleted a comment from coderabbitai bot Feb 9, 2026
generall added a commit that referenced this pull request Feb 9, 2026
…#8056)

* Don't lock SegmentHolder during read operations

* Add comment clarifying eager segment allocation

* review fixes

* Remove TODO

* Shorter locking of segment holder in calculate_local_shard_stats

---------

Co-authored-by: generall <andrey@vasnetsov.com>
@timvisee timvisee mentioned this pull request Feb 17, 2026