Do not execute listeners when progress is updated to the end of the range#108095
Conversation
…gress-is-updated-to-end
|
Pinging @elastic/es-distributed (Team:Distributed) |
…gress-is-updated-to-end
| */ | ||
| class ProgressListenableActionFuture extends PlainActionFuture<Long> { | ||
|
|
||
| private record PositionAndListener(Long position, ActionListener<Long> listener) {} |
There was a problem hiding this comment.
Nit: I believe position can be a primitive long?
henningandersen
left a comment
There was a problem hiding this comment.
LGTM.
This makes sense to do. I wonder if we should also separately try to progress the complete marker to allow the fast path read while filling the rest of the region with data, before the gap completes (i.e., based on progress signals)?
| */ | ||
| class ProgressListenableActionFuture extends PlainActionFuture<Long> { | ||
|
|
||
| private record PositionAndListener(Long position, ActionListener<Long> listener) {} |
There was a problem hiding this comment.
| private record PositionAndListener(Long position, ActionListener<Long> listener) {} | |
| private record PositionAndListener(long position, ActionListener<Long> listener) {} |
…gress-is-updated-to-end
|
Thanks Artem and Henning!
I was looking at something similar, yes. In the meanwhile I opened #109212. |
| if (progressValue == end) { | ||
| return; // reached the end of the range, listeners will be completed by {@link #onResponse(Long)} | ||
| } |
There was a problem hiding this comment.
Sorry for my ignorance. If a listener waits for a progressValue lower than the end, it will still be invoked before onResponse is called. Is this not a problem? If the listener also uses tryRead to access data, can it fail at SparseFileTracker#checkAvailable because complete is not updated? Or maybe it is not possible for such listener to go through the tryRead route? In fact, I am not even sure how a listener waiting for the entire range can subsequent go through the tryRead route. Any pointer is appreciated!
There was a problem hiding this comment.
Sorry for my ignorance.
No ignorance here, just good questions :)
If a listener waits for a progressValue lower than the end, it will still be invoked before onResponse is called. Is this not a problem?
Not really a problem for the current read operation; it waited for a given range to be available because it wasn't, and it will be executed as soon as it is available. It may be an issue for a following tryRead (if any), which may also go on the slow read path (because complete may not be updated at the time it executes) and will also waits for the range to be available or will execute immediately if the range is covered by the write progress. Note that Lucene often uses 1kb or 4kb buffers to reads files, so Lucene can usually make 1 or more reads before having to wait a bit more.
I'm looking at ways to optimize the complete field update in my spacetime. Last time we talked about this it was too costly to update it on every write on all ranges, but we can maybe find a good compromise by updating for every write only if the range to be written is garanteed to make the complete advances (that is, all ranges before the one we write are already available).
There was a problem hiding this comment.
A couple options spring to mind:
- Only update
completein assocation with firing listeners. Since if we tell nobody about the progress, there is no need to updatecomplete. - We can check that the start of the range/gap is
>= completeand updatecompleteto progress if it is.
There was a problem hiding this comment.
Thanks Henning, those are two good ideas which I had not thought of. Worth investigating.
There was a problem hiding this comment.
@henningandersen I created the draft #109247, let me know what you think 🙏🏻
…le failing Test required some adjustments now listeners are not completed on progress update if they are waiting up to the end of the gap. Relates elastic#108095 Closes elastic#109237
ProgressListenableActionFutureexecutes listeners once theprogressis updated to a value the listener is waiting for.But when a listener waits for the exact end of a range/gap, there is no need to execute it at the time the progress is updated to the end of the range. Instead we can leave the listener and it will be executed when the range/gap is completed (which should happen just after).
I'd like to propose this change because we can have read listeners that use
CacheFile#tryReadwhich relies onSparseFileTracker#checkAvailableand the volatileSparseFileTracker#completefield, which is only updated when the range is completed, just before executing listeners.If listeners are executed when the progress is updated to the end of the range, they may call
tryReadbefore theSparseFileTracker#completefield is updated, making the fast read path fails.