Skip to content

refactor: add WaitGroup for waiting for a dynamic set of tasks #2046

Merged
thomas-zahner merged 24 commits into
lycheeverse:masterfrom
rina-forks:waitgroup
Feb 25, 2026
Merged

refactor: add WaitGroup for waiting for a dynamic set of tasks #2046
thomas-zahner merged 24 commits into
lycheeverse:masterfrom
rina-forks:waitgroup

Conversation

@katrinafyi

@katrinafyi katrinafyi commented Feb 14, 2026

Copy link
Copy Markdown
Member

This is a small step towards recursion. It will enable waiting for a dynamic number of tasks which can spawn other tasks, which allows for correct and deadlock-free termination in the presence of (eventual) recursion.

I think recursion is not an astoundingly difficult feature but, because lychee is async and stream, it needs the combination of a few parts which might be tricky to get right. This is one of those parts. In fact, I had two different deadlock bugs while working on this PR before reaching the final implementation. If this was all done in one big recursion PR, then I think these bugs would be a lot harder to find and fix.

The old implementation in #1603 used Arc<AtomicUsize> to implement this requirement. This new PR uses a RAII-based guard value to represent tasks, which should make it much easier to write correct code - it's all handled by the borrow checker and Drop, so it's impossible to forget to adjust the counter.

In terms of testing, changes the check.rs command to rely on WaitGroup for termination. You can read about this in the comments. This means that every run of lychee goes through the WaitGroup code and acts as a test for WaitGroup. There is a risk that if it is buggy, the impacts would be big. However, I think the unit tests are big enough that they're a decent exercise of the WaitGroup. My second deadlock bug was found by make test.

You can also see the docs at https://waitgroup.lychee-docs-katrinafyi.pages.dev/lychee_lib/waiter/

Related work:

threaded synchronisation primitives (like std::sync::Mutex). if we were
unlucky, these can block the tokio runtime. replace std::sync::Arc with
a system based on (an even thinner wrapper around)
tokio::mspc::sync::channel.
Comment thread lychee-bin/src/commands/check.rs
Comment thread lychee-lib/src/waiter.rs

@mre mre left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm.

@thomas-zahner, any thoughts before we merge this?

@thomas-zahner

Copy link
Copy Markdown
Member

Very cool PR! I don't see any problems with it.

@katrinafyi Could you add a test to waiter.rs to to test if recursion would really work that way? So that the unused part (// unused for now, but will be used for recursion eventually.) is covered and tested already. This would be great documentation and help me (and others) understand how this is used in a simpler scenario without lychee's additional complexity.

i thought quicksort would be easy...
the Infallible name only really makes sense for Result<T, Infallible>
@katrinafyi

Copy link
Copy Markdown
Member Author

@thomas-zahner That's a really interesting idea. I've added a (hopefully) simple example which you can see here: https://github.com/lycheeverse/lychee/compare/74046a9..b27a34f

@thomas-zahner thomas-zahner left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@katrinafyi Thank you, that's very cool! This helps my understanding quite a bit.

Comment thread lychee-lib/src/waiter.rs Outdated
Comment thread lychee-lib/src/waiter.rs Outdated
/// The given `waiter` will be used to detect when the work has finished and it will
/// close the channels. Additionally, `waiter` can be omitted to show that without
/// the [`WaitGroup`], the tasks would not terminate.
#[allow(dead_code)]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either move into mod test or annotate with #[cfg(test)], this should resolve the #[allow(dead_code)], right?

@katrinafyi katrinafyi Feb 18, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I put it outside of mod tests because I wanted it to be included in the docs.

I tried a few things and it's a bit tricky. We need compiling to work (1) normally, (2) in docs, and (3) in tests.

  • #[cfg(any(test, doc))] on the examples: dev-dependencies is not available in doc, so tokio-stream has to be in dependencies. but then, it's flagged as an unused crate in normal compilation.
  • putting inside mod test and making test module #[cfg(any(test, doc))] has the same problems

Do you have thoughts or other suggestions?

@thomas-zahner thomas-zahner Feb 23, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I personally would really like to not have additional dependencies or code that is tests and documentation but possibly ends up in the final binary. (though in reality it should be optimised away)

I think doc tests would solve the problem. They are tested and great for documentation. It would be possible to highlight only the interesting bits of the code. It even seems possible to use dev dependencies in there!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved it into doctest and it looks great and works well, thanks for the suggestion! I should try and use doctests more often I think.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome 🚀 yeah they are quite powerful and useful. Thanks for addressing my concerns.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit that I'm confused by #[cfg(all(test, not(doctest)))]. Why is it not(doctest)? I see that it seems like the only working version where clippy doesn't complain. But do you understand why?

@katrinafyi katrinafyi Feb 25, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit I didn't try other options very rigorously. But the thinking is that the unused crate warning appears when the crate is included but not used. Dev dependencies are enabled in test, and in doctest they get used.

So, test and not doctest is precisely the case where it's included but not used. That's the idea at least.

ETA: The not doctest is just adding a bit more precision. The line suppresses unused crate warning but in doctest, the crate is used so it should be unnecessary to suppress the warning.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aah, yes I see that makes sense. Thanks for the great PR!

Comment thread lychee-lib/src/waiter.rs Outdated
Comment thread lychee-lib/Cargo.toml Outdated
katrinafyi and others added 3 commits February 18, 2026 20:02
Co-authored-by: Thomas Zahner <thomas.zahner@protonmail.ch>
@katrinafyi katrinafyi closed this Feb 23, 2026
@katrinafyi

Copy link
Copy Markdown
Member Author

idk why that closed

@katrinafyi katrinafyi reopened this Feb 23, 2026
@thomas-zahner thomas-zahner merged commit 0f901b4 into lycheeverse:master Feb 25, 2026
7 checks passed
@mre mre mentioned this pull request Feb 25, 2026
@mre mre mentioned this pull request Apr 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants