Fix stream records transfer data race, losing pending updates #8103

Merged
timvisee merged 7 commits into dev from
fix-stream-records-pending-queue-datarace
Feb 11, 2026

@timvisee timvisee commented Feb 11, 2026

A stream records transfer did not include pending items in the update queue (channel), so those pending operations were effectively lost. This problem seems to have existed for a very long time.

A specific scenario is required to trigger the problem: there must be pending operations in the queue, and they must remain in the queue until after the transfer has sent the points batch they affect. Bootstrapping a transfer takes time, and Qdrant <1.17.0 limited the queue to 200 operations, so this was unlikely to happen.

Since Qdrant 1.17 allows 1 million queued items (#8046), the problem has suddenly become much more visible.

This PR fixes the problem and adds a test to assert correct behavior.
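
To illustrate the idea behind the fix, here is a minimal sketch of a "plunger": a marker message pushed through the update queue, whose acknowledgement means that every update queued before it has been applied. The names `UpdateSignal`, `UpdateQueue`, `plunge` and `transfer_snapshot` are hypothetical and exist only for this example. It uses standard library threads and channels rather than the async update handler and oneshot channels in the actual code, so treat it as a model of the technique, not as the implementation in this PR.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Messages accepted by the (hypothetical) update worker.
enum UpdateSignal {
    /// Apply a point upsert; a point is modelled as just its ID.
    Upsert(u64),
    /// Marker message: the worker notifies the sender once it reaches it,
    /// which implies all earlier messages have been applied.
    Plunger(Sender<()>),
}

/// Hypothetical handle to an update queue and the storage it writes to.
struct UpdateQueue {
    sender: Sender<UpdateSignal>,
    storage: Arc<Mutex<Vec<u64>>>,
}

impl UpdateQueue {
    /// Start a worker thread that applies queued updates in FIFO order.
    fn spawn() -> Self {
        let (sender, receiver) = mpsc::channel();
        let storage = Arc::new(Mutex::new(Vec::new()));
        let worker_storage = Arc::clone(&storage);
        thread::spawn(move || {
            for signal in receiver {
                match signal {
                    UpdateSignal::Upsert(id) => worker_storage.lock().unwrap().push(id),
                    // Ignore the error if nobody waits for the plunger anymore.
                    UpdateSignal::Plunger(done) => {
                        let _ = done.send(());
                    }
                }
            }
        });
        Self { sender, storage }
    }

    /// Queue an update without waiting for it to be applied.
    fn upsert(&self, id: u64) {
        self.sender
            .send(UpdateSignal::Upsert(id))
            .expect("update worker is running");
    }

    /// Queue a plunger and return a receiver that is notified once it is processed.
    fn plunge(&self) -> Receiver<()> {
        let (done_tx, done_rx) = mpsc::channel();
        self.sender
            .send(UpdateSignal::Plunger(done_tx))
            .expect("update worker is running");
        done_rx
    }

    /// Read the points a transfer would send, after waiting for pending updates.
    fn transfer_snapshot(&self) -> Vec<u64> {
        // Without this wait, updates still sitting in the queue would be missed.
        self.plunge().recv().expect("worker dropped the plunger");
        self.storage.lock().unwrap().clone()
    }
}

fn main() {
    let queue = UpdateQueue::spawn();
    for id in 0..1_000 {
        queue.upsert(id);
    }
    let snapshot = queue.transfer_snapshot();
    assert_eq!(snapshot.len(), 1_000);
    println!("transferred {} points", snapshot.len());
}
```

In this sketch, removing the `plunge()` wait from `transfer_snapshot` reintroduces the race: the snapshot may be read while upserts are still queued, and those would never reach the transfer target.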

Tasks

  • Add integration test
  • Fix issue

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

@timvisee timvisee added bug Something isn't working release:1.17.0 labels Feb 11, 2026
@timvisee timvisee marked this pull request as ready for review February 11, 2026 14:40

///
/// Returns oneshot channel receiver that will be notified once the plunger operation is
/// processed.
pub async fn plunge_async(&self) -> CollectionResult<oneshot::Receiver<()>> {
@timvisee (Member, Author) commented:

I tried to polish this a bit more by abstracting away the receiver and just returning a future that needs to be resolved.

But that turns out to be quite a pain with lifetimes further down the line, so I skipped this effort for now.

@timvisee timvisee merged commit c63f9ee into dev Feb 11, 2026
20 of 21 checks passed
@timvisee timvisee deleted the fix-stream-records-pending-queue-datarace branch February 11, 2026 16:18
@tellet-q tellet-q mentioned this pull request Feb 12, 2026
9 tasks
timvisee added a commit that referenced this pull request Feb 13, 2026
* Add integration test to assert all queued updates are also transferred

* Plunge update queue in stream records transfers

* Migrate existing plunger usages to new plunge helper

* Skip test if not compiled with staging flag

* Only send delay operation when staging feature is enabled

* Reformat

* Fix review remarks
@timvisee timvisee mentioned this pull request Feb 17, 2026
5 tasks

4 participants