
Make ThrottledIterator nonblocking#139988

Merged
DaveCTurner merged 6 commits into elastic:main from DaveCTurner:2025/12/24/nonblocking-ThrottledIterator
Dec 29, 2025
Conversation

@DaveCTurner
Member


Today `ThrottledIterator` may block in `run()` while waiting for another
thread to release the mutex on `iterator`, but this wait is unnecessary:
if some thread is already executing `run()` then all other threads
handling item completions can just release their permits and move on.
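The pattern this description outlines can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the actual `ThrottledIterator` code: all names here (`ThrottledRunnerSketch`, `processor`, `demo`) are invented, and the bookkeeping is simplified to a single `AtomicInteger` of free concurrency slots.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hedged sketch (invented names, simplified): `permits` counts free
// concurrency slots; completions only re-enter run() when it has finished.
class ThrottledRunnerSketch<T> {
    private final Iterator<T> iterator;   // only ever touched inside run()
    private final AtomicInteger permits;  // free slots
    private final Consumer<T> processor;  // completion must call onItemRelease()

    ThrottledRunnerSketch(Iterator<T> iterator, int maxConcurrency, Consumer<T> processor) {
        this.iterator = iterator;
        this.permits = new AtomicInteger(maxConcurrency);
        this.processor = processor;
    }

    void start() {
        run();
    }

    // Called when an item's (possibly asynchronous) processing completes.
    // getAndIncrement() returns the previous value, so the branch is taken
    // only when permits was 0, i.e. run() had already returned; the releasing
    // thread restarts it instead of blocking on a mutex.
    void onItemRelease() {
        if (permits.getAndIncrement() == 0) {
            run();
        }
    }

    // At most one thread loops here at a time: permits stays strictly
    // positive while the loop touches the iterator and only drops to 0 when
    // dispatching the final item, after which run() exits immediately.
    private void run() {
        while (iterator.hasNext()) {
            T item = iterator.next();
            if (permits.decrementAndGet() == 0) {
                processor.accept(item); // last free slot: dispatch and exit
                return;
            }
            processor.accept(item);     // slots remain: keep looping
        }
    }

    // Demo: 5 items, max concurrency 2, completions simulated by a queue.
    static int[] demo() {
        java.util.ArrayDeque<Runnable> pending = new java.util.ArrayDeque<>();
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxInFlight = new AtomicInteger();
        AtomicInteger seen = new AtomicInteger();
        ThrottledRunnerSketch<Integer> runner = new ThrottledRunnerSketch<>(
            List.of(1, 2, 3, 4, 5).iterator(),
            2,
            item -> {
                seen.incrementAndGet();
                maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                pending.add(inFlight::decrementAndGet);
            }
        );
        runner.start();
        while (pending.isEmpty() == false) {
            pending.poll().run();     // the item's work finishes...
            runner.onItemRelease();   // ...and its permit is released
        }
        return new int[] { seen.get(), maxInFlight.get(), inFlight.get() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("seen=" + r[0] + " maxInFlight=" + r[1] + " inFlight=" + r[2]);
    }
}
```

The key property: only the thread inside `run()` ever decrements `permits`, and it stops touching the iterator the moment the count reaches zero, so the completion path never needs to wait on a lock.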
@DaveCTurner DaveCTurner requested a review from a team as a code owner December 24, 2025 13:09
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Coordination (obsolete) Meta label for Distributed Coordination team. Obsolete. Please do not use. label Dec 24, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

refs.decRef();
private void onItemRelease() {
try {
if (permits.getAndIncrement() == 0) {
Member Author


This neatly removes the need for special recursion-detection: permits remains strictly positive until the very last line of run(), so the only way we can take this if() branch is if run() has finished its work.

Contributor

@henningandersen henningandersen left a comment


LGTM, neat


private void run() {
while (permits.tryAcquire()) {
do {
Contributor


I wonder if we can assert only one thread here at a time? Might be too close to the permits though to really make sense. Ok to skip if you do not find it useful.

At least adding a comment to run() that it is run in one thread at a time only would help reading this.

Member Author


Yes 549062f asserts that permits.get() is positive within the loop, which makes sense in its own right but also implies mutual exclusion.

@DaveCTurner DaveCTurner enabled auto-merge (squash) December 29, 2025 09:14
@DaveCTurner DaveCTurner merged commit 6bfad13 into elastic:main Dec 29, 2025
35 checks passed
rjernst pushed a commit to rjernst/elasticsearch that referenced this pull request Dec 29, 2025
Kubik42 pushed a commit to Kubik42/elasticsearch that referenced this pull request Dec 30, 2025
@repantis repantis added :Distributed/Distributed A catch all label for anything in the Distributed Area. Please avoid if you can. and removed :Distributed Coordination/Distributed labels Jan 28, 2026
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Mar 18, 2026
Silly test mistake: the throttle was supposed to limit the number of
fragments in the queue, but was closing the throttling ref after
submission rather than transmission so that the queue could grow without
bound sometimes.

It's been like this since the test was introduced in elastic#111933, but it
only started failing recently. I suspect until the performance
improvement in elastic#139988 landed it couldn't actually submit enough tasks
at once to cause a failure.

Closes elastic#143644
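The mistake this commit message describes can be illustrated with a minimal sketch. All names below (`FragmentQueueSketch`, `submitBuggy`, `submitFixed`, `acquireRef`) are invented for illustration and are not the actual test code; the point is only the ordering of the ref release relative to the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

// Hedged sketch: a throttle bounds work by counting open refs, so the ref
// must stay open until the fragment is transmitted, not merely submitted.
class FragmentQueueSketch {
    final Queue<Runnable> queue = new ArrayDeque<>();
    final AtomicInteger openRefs = new AtomicInteger(); // what the throttle "sees"

    // Stand-in for acquiring a throttling ref; closing it runs the Runnable.
    Runnable acquireRef() {
        openRefs.incrementAndGet();
        return openRefs::decrementAndGet;
    }

    // Buggy ordering: the ref is closed right after submission, so the
    // throttle stops counting the fragment while it is still queued and the
    // queue can grow without bound.
    void submitBuggy(Runnable transmit) {
        Runnable ref = acquireRef();
        queue.add(transmit);
        ref.run(); // too early
    }

    // Fixed ordering: the ref is closed only after transmission completes,
    // so the throttle keeps accounting for every queued fragment.
    void submitFixed(Runnable transmit) {
        Runnable ref = acquireRef();
        queue.add(() -> {
            try {
                transmit.run();
            } finally {
                ref.run();
            }
        });
    }

    void transmitAll() {
        Runnable r;
        while ((r = queue.poll()) != null) {
            r.run();
        }
    }

    public static void main(String[] args) {
        FragmentQueueSketch buggy = new FragmentQueueSketch();
        buggy.submitBuggy(() -> {});
        buggy.submitBuggy(() -> {});
        System.out.println("buggy: queued=" + buggy.queue.size() + " openRefs=" + buggy.openRefs.get());

        FragmentQueueSketch fixed = new FragmentQueueSketch();
        fixed.submitFixed(() -> {});
        fixed.submitFixed(() -> {});
        System.out.println("fixed: queued=" + fixed.queue.size() + " openRefs=" + fixed.openRefs.get());
        fixed.transmitAll();
        System.out.println("fixed after transmit: openRefs=" + fixed.openRefs.get());
    }
}
```

With the buggy ordering the throttle sees zero open refs despite two queued fragments; with the fix the count tracks the queue until transmission.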
DaveCTurner added a commit that referenced this pull request Mar 19, 2026
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Mar 19, 2026
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Mar 19, 2026
DaveCTurner added a commit that referenced this pull request Mar 19, 2026
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Mar 20, 2026
elasticsearchmachine pushed a commit that referenced this pull request Mar 23, 2026
elasticsearchmachine pushed a commit that referenced this pull request Mar 23, 2026
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Mar 30, 2026
On reading this class again a few months after elastic#139988 it took some time
to work this out. Adding a comment for the next person.
DaveCTurner added a commit that referenced this pull request Apr 7, 2026
mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request Apr 9, 2026

Labels

:Distributed/Distributed (A catch all label for anything in the Distributed Area. Please avoid if you can.)
>non-issue
Team:Distributed Coordination (obsolete) (Meta label for Distributed Coordination team. Obsolete. Please do not use.)
v9.4.0

4 participants