Fix shard snapshots IO error, proxy first then flush #7427

Merged
timvisee merged 1 commit into dev from snapshot-flush-segments-after-proxy
Oct 20, 2025

@timvisee (Member) commented Oct 20, 2025

Fixes #7404

When taking a shard snapshot, we proxy all segments and flush them.

It is important that we proxy first, and then flush. Proxying ensures that the segments won't change anymore. After that we can flush the last changes that were still pending.

If we do it the other way around, new changes might land in the segments after the flush but before we proxy them. The background flush task might then trigger another flush during the snapshot process, which conflicts with the snapshot on file IO.
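
To illustrate why the order matters, here is a minimal single-threaded sketch. It is not Qdrant's actual code: `Segment`, its fields, and both `snapshot_*` functions are hypothetical names made up for this example, and the concurrent write is simulated by a plain call placed between the two steps.

```rust
/// Hypothetical stand-in for a segment (not Qdrant's real type).
#[derive(Debug, Default)]
struct Segment {
    /// Changes applied in memory but not yet persisted.
    pending: Vec<u64>,
    /// Changes already persisted by a flush.
    flushed: Vec<u64>,
    /// Changes redirected away from the segment while it is proxied.
    redirected: Vec<u64>,
    /// Once set, the segment itself no longer accepts changes.
    proxied: bool,
}

impl Segment {
    /// Apply a change: it goes to the segment unless the segment is proxied.
    fn write(&mut self, op: u64) {
        if self.proxied {
            self.redirected.push(op);
        } else {
            self.pending.push(op);
        }
    }

    /// Freeze the segment so later changes are redirected.
    fn proxy(&mut self) {
        self.proxied = true;
    }

    /// Persist everything that is still pending.
    fn flush(&mut self) {
        self.flushed.append(&mut self.pending);
    }

    /// Safe to copy: frozen, with nothing left for a later flush to write.
    fn is_stable(&self) -> bool {
        self.proxied && self.pending.is_empty()
    }
}

/// Order described in this PR: proxy first, then flush.
fn snapshot_proxy_then_flush(seg: &mut Segment, concurrent_write: u64) -> bool {
    seg.proxy();
    seg.write(concurrent_write); // redirected, the segment stays unchanged
    seg.flush();
    seg.is_stable()
}

/// Reverse order: flush first, then proxy.
fn snapshot_flush_then_proxy(seg: &mut Segment, concurrent_write: u64) -> bool {
    seg.flush();
    seg.write(concurrent_write); // lands in the segment after the flush
    seg.proxy();
    seg.is_stable()
}
```

In this model, `snapshot_proxy_then_flush` leaves the segment with no pending changes, while `snapshot_flush_then_proxy` leaves one pending change behind. A later background flush would have to write that change to disk while the snapshot is still reading the same files.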

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

@timvisee timvisee changed the title from "Flush segments after proxy" to "Fix shard snapshots IO error, proxy first then flush" Oct 20, 2025
@timvisee timvisee force-pushed the snapshot-flush-segments-after-proxy branch from 2a8d12e to cead453 October 20, 2025 11:07
@timvisee timvisee marked this pull request as ready for review October 20, 2025 11:07
@timvisee timvisee requested a review from agourlay October 20, 2025 11:07
coderabbitai[bot]

This comment was marked as resolved.

@qdrant qdrant deleted a comment from coderabbitai bot Oct 20, 2025
@timvisee timvisee requested a review from generall October 20, 2025 11:55
@timvisee (Member, Author)

Status: @agourlay is running some tests on this locally

@agourlay (Member) left a comment

Survived 1h20 crashing workload ✔️

@timvisee timvisee merged commit 9b0cd22 into dev Oct 20, 2025
15 checks passed
@timvisee timvisee deleted the snapshot-flush-segments-after-proxy branch October 20, 2025 12:41
@timvisee timvisee mentioned this pull request Oct 20, 2025
9 tasks
timvisee added a commit that referenced this pull request Nov 14, 2025
@timvisee timvisee mentioned this pull request Nov 14, 2025

2 participants