Skip to content

Limit number of Clocks#8105

Merged
timvisee merged 5 commits intodevfrom
limit-clocks-count
Feb 12, 2026
Merged

Limit number of Clocks#8105
timvisee merged 5 commits intodevfrom
limit-clocks-count

Conversation

@generall
Copy link
Member

@generall generall commented Feb 11, 2026

Clocks are used to track partial ordering of updates across whole cluster. Number of clocks limit the total update parallelism of individual peer, but this limitation seems to acceptable in order to prevent clogging state with too many clocks

@generall generall requested review from agourlay and ffuugoo February 11, 2026 14:14
@generall generall changed the title Limit count of Clocks Limit number of Clocks Feb 11, 2026
coderabbitai[bot]

This comment was marked as resolved.

Copy link
Contributor

@ffuugoo ffuugoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM. Maybe add explicit yield_now().await in between iterations.

@qdrant qdrant deleted a comment from coderabbitai bot Feb 11, 2026
@generall
Copy link
Member Author

async fn get_clock(&self) -> ClockGuard

there are about 100 usage in sync tests, it would be a huge refactoring

@coderabbitai

This comment was marked as resolved.

@ffuugoo

This comment was marked as resolved.

Comment on lines 1456 to +1463
// Release some kept clocks
kept_clocks.retain(|(keep_for, _)| *keep_for > 1);
kept_clocks.retain_mut(|(keep_for, _)| {
if *keep_for == 0 {
return false;
}
*keep_for -= 1;
true
});
Copy link
Member

@timvisee timvisee Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This artificial release mechanism in the test was broken, as it did not decrease the counter.

I've also swapped the number of rounds we keep locks for in the above two loops. Now the smaller loop keeps clocks for longer, while the bigger loop keeps locks for shorter. This way we stay below the specified maximum of 256 clocks and prevent the test panicking.

I believe it's a reasonable way to make the test happy.

@timvisee timvisee merged commit 538eb74 into dev Feb 12, 2026
15 checks passed
@timvisee timvisee deleted the limit-clocks-count branch February 12, 2026 09:56
timvisee added a commit that referenced this pull request Feb 13, 2026
* [manual] limit number of update clocks to 64 to prevent permanent overhead created by spikes

* explicit yield + fix tests

* Add explicit comment on why we yield async thread

* In resolve_wal_delta_randomized test fix clock cleanup and satisfy test

* Use tokio timeout

Co-authored-by: Roman Titov <ffuugoo@users.noreply.github.com>

---------

Co-authored-by: Tim Visée <tim+github@visee.me>
Co-authored-by: timvisee <tim@visee.me>
Co-authored-by: Roman Titov <ffuugoo@users.noreply.github.com>
match self.clock_set.lock().await.get_clock() {
Some(clock) => return clock,
// Prevent blocking async runtime with spinlock
None => yield_now().await,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have a condvar that we can await on.

Copy link
Member

@timvisee timvisee Feb 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is many clocks. We should not wait with a condvar on all of them.

Having a separate atomic or notification channel is possible, but it makes the structure more complicated.

We don't expect to hit this branch under normal circumstances. Only when an excessive amount of parallelism is used, which is a problem on its own.

@timvisee timvisee mentioned this pull request Feb 17, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants