Skip to content

Allow retaining more closed segments#6976

Merged
timvisee merged 11 commits intodevfrom
more-closed-segments
Aug 7, 2025
Merged

Allow retaining more closed segments#6976
timvisee merged 11 commits intodevfrom
more-closed-segments

Conversation

@KShivendu
Copy link
Member

@KShivendu KShivendu commented Aug 4, 2025

Introduces a new wal config called wal_retain_closed. It basically makes wal delta transfers (faster) more likely in large clusters over streaming records (slower). The default value is 1 to maintain backwards compatibility.

It can be configured using these methods:

  1. Env var export QDRANT__STORAGE__WAL__WAL_RETAIN_CLOSED=5

  2. Qdrant YAML Config

storage:
  wal:
    wal_retain_closed: 5

Or while creating the collection you can pass:

PUT /collections/{name}
{
  // ...
  "wal_config": {
    "wal_retain_closed": 5
  }
}

Note that it can't be modified after creating collections (same as other wal configs)

@coderabbitai

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

@KShivendu KShivendu changed the title Retain more closed segments Allow retaining more closed segments Aug 4, 2025
coderabbitai[bot]

This comment was marked as resolved.

Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works as I expect.

It currently is not possible for users to change this number in existing collections. It is persisted on collection creation, and our collection update API does not expose it. Maybe that's something we can add in a separate PR. I don't think it's an immediate problem and am fine by merging this as is.

Marked 'request changes' until we bump the WAL dependency.

Comment on lines +56 to +57
# Number of closed WAL segments to keep.
wal_retain_closed: 1
Copy link
Member

@timvisee timvisee Aug 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote to not add it to our config.yaml by default and to keep it a bit more hidden because it is quite specific. If people actively appear to use it, we can always expose it here later.

Wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine. Closed vs open segments is a bit deeper in terms of abstraction. So it's fine to not expose it right away. We can always do it later.

Cargo.toml Outdated
uuid = { version = "1.17", features = ["v4", "serde"] }
validator = { version = "0.18.1", features = ["derive"] }
wal = { git = "https://github.com/qdrant/wal.git", rev = "c4b26b9c0ccc0e06ba7391189e4c8eac051ca531" }
wal = { git = "https://github.com/qdrant/wal.git", branch = "max-closed-to-preserve" }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's switch this back to a commit hash once merged in qdrant/wal.

@agourlay
Copy link
Member

agourlay commented Aug 6, 2025

I think one caveat of this approach is that it makes snapshots larger by default, especially in setup with a high segments count.

Currently we have 2 WAL segments (1 open & 1 closed) = 32mb + 32mb = 64mb per data segment.

With "wal_retain_closed": 5 we would have (1open & 5 closed) = 32mb + 5*32mb = 192mb per data segment.

Not a blocker for sure but something to keep in mind during operations.

WDYT?

@timvisee
Copy link
Member

timvisee commented Aug 6, 2025

@agourlay the default is still just 1 closed segment, which is the same as before. Snapshots should therefore be exactly the same size.

The mention of export QDRANT__STORAGE__WAL__WAL_RETAIN_CLOSED=5 is purely an example of how it may be configured by a user.

If a user chooses to set a higher number of segments to maintain, then we keep more data. And yes, by definition, that means larger snapshots. But there's no way around that. In that case having more history is actually desired.

@KShivendu KShivendu requested a review from timvisee August 7, 2025 04:27
Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! 🙏

@timvisee timvisee merged commit 34e18eb into dev Aug 7, 2025
16 checks passed
@timvisee timvisee deleted the more-closed-segments branch August 7, 2025 11:44
timvisee pushed a commit that referenced this pull request Aug 11, 2025
* Retain more closed segments

* Add back rocksdb

* Update WalOptions/WalConfig across the code

* Use NonZeroUsize

* Expose via APIs

* Update gRPC docs

* Fix stoarge compat test

* recompile openapi.json with rocksdb

* default wal retain closed fn

* update openapi.json

* Use qdrant/wal latest commit and remove from config.yaml
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants