[serve] add ability to track child requests #53941
Merged
zcin merged 1 commit into ray-project:master on Jun 19, 2025
Pull Request Overview
This PR adds the ability to track child Serve requests so that when a parent request is cancelled, all its in-flight children are also cancelled.
- Introduce in-flight request bookkeeping in the request context and `ReplicaResult`.
- Update handle logic to catch `asyncio.CancelledError` and translate it into `RequestCancelledError` based on the cancellation flag.
- Adjust existing HTTP and iterator cancellation tests to use a URL helper and the new error naming.
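The bookkeeping idea in the overview can be illustrated with plain `asyncio`. This is only a minimal sketch under stated assumptions, not the code from this PR: the registry `_in_flight` and the helpers `track_child` and `cancel_children` are hypothetical names, and ordinary `asyncio.Task` objects stand in for Serve child requests.

```python
import asyncio
from collections import defaultdict
from typing import Dict, Set

# Hypothetical registry (not the PR's data structure):
# parent request ID -> set of in-flight child tasks.
_in_flight: Dict[str, Set[asyncio.Task]] = defaultdict(set)


def track_child(parent_request_id: str, task: asyncio.Task) -> None:
    """Register a child task and drop it from the registry once it finishes."""
    _in_flight[parent_request_id].add(task)

    def _cleanup(done: asyncio.Task) -> None:
        children = _in_flight.get(parent_request_id)
        if children is not None:
            children.discard(done)
            if not children:
                # Remove the empty entry so the registry does not grow forever.
                _in_flight.pop(parent_request_id, None)

    task.add_done_callback(_cleanup)


def cancel_children(parent_request_id: str) -> int:
    """Cancel every in-flight child of a parent; return how many were cancelled."""
    children = list(_in_flight.get(parent_request_id, ()))
    for child in children:
        child.cancel()
    return len(children)


async def _child(delay: float) -> str:
    await asyncio.sleep(delay)
    return "done"


async def demo():
    tasks = [asyncio.create_task(_child(10)) for _ in range(3)]
    for task in tasks:
        track_child("parent-1", task)
    await asyncio.sleep(0)  # let the children start running
    cancelled_count = cancel_children("parent-1")
    results = await asyncio.gather(*tasks, return_exceptions=True)
    were_cancelled = [isinstance(r, asyncio.CancelledError) for r in results]
    return cancelled_count, were_cancelled, dict(_in_flight)
```

Calling `asyncio.run(demo())` cancels all three children as soon as the parent is cancelled, and the done callbacks leave the registry empty afterwards. Cleaning up in a done callback means a child is removed whether it completes, fails, or is cancelled.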
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| python/ray/serve/tests/test_http_cancellation.py | Replace hardcoded URL with get_application_url and update expected cancellation tag name. |
| python/ray/serve/tests/test_handle_cancellation.py | Remove unnecessary assert inside pytest.raises for async iterator cancellation. |
| python/ray/serve/handle.py | Add try/except around __await__/__anext__ to rethrow RequestCancelledError when needed. |
| python/ray/serve/context.py | Add cancel_on_parent_request_cancel flag and helper functions to track in-flight requests. |
| python/ray/serve/_private/replica_result.py | Register and clean up in-flight child requests; adjust cancellation error mapping. |
| python/ray/serve/_private/replica.py | Extend _on_request_cancelled to cancel both pending and in-flight child requests. |
Comments suppressed due to low confidence (4)
python/ray/serve/_private/replica.py:896
- [nitpick] The parameter was renamed from `request_metadata` to `metadata`, which is inconsistent with other methods in this class. Consider aligning on one name (e.g., `request_metadata`) for clarity.
):
python/ray/serve/context.py:282
- [nitpick] These new helper functions lack type hints and docstrings. Adding a return type annotation and a brief docstring would improve readability and maintainability.
def _get_in_flight_requests(parent_request_id):
python/ray/serve/handle.py:282
- The removal of the `try/except` around the `wrap_future` call means `asyncio.CancelledError` will bypass the original cancellation handling in `_fetch_future_result_async`. Consider reinstating the catch to raise `RequestCancelledError` when `self._cancelled` is true, preserving the intended semantics.
self._replica_result = await asyncio.wrap_future(
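The translation this comment describes (raise a request-level error only when the response itself was cancelled) might look roughly like the following. It is a sketch under assumptions, not the handle code from this PR: `SketchResponse` is a hypothetical class, and `RequestCancelledError` is redefined locally as a stand-in so the example runs without Ray installed.

```python
import asyncio


class RequestCancelledError(Exception):
    """Local stand-in for the Serve exception, so the sketch runs without Ray."""

    def __init__(self, request_id: str):
        super().__init__(f"Request {request_id} was cancelled.")
        self.request_id = request_id


class SketchResponse:
    """Hypothetical response wrapper; must be created inside a running event loop."""

    def __init__(self, request_id: str, delay: float):
        self.request_id = request_id
        self._cancelled = False
        self._task = asyncio.create_task(asyncio.sleep(delay, result="ok"))

    def cancel(self) -> None:
        # Record that the cancellation was requested on this response.
        self._cancelled = True
        self._task.cancel()

    async def result(self) -> str:
        try:
            return await self._task
        except asyncio.CancelledError:
            if self._cancelled:
                # The response was cancelled explicitly: report it as such.
                raise RequestCancelledError(self.request_id) from None
            # Otherwise the awaiting task was cancelled: propagate unchanged.
            raise


async def demo():
    response = SketchResponse("req-1", delay=10)
    await asyncio.sleep(0)  # let the underlying task start
    response.cancel()
    try:
        await response.result()
    except RequestCancelledError as exc:
        return exc.request_id
    return None
```

Checking the flag before translating keeps the two cases distinct: an explicit `cancel()` on the response surfaces as `RequestCancelledError`, while cancellation of the surrounding task still propagates as a plain `asyncio.CancelledError`.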
python/ray/serve/_private/replica_result.py:112
- Catching `TaskCancelledError` and rethrowing a bare `asyncio.CancelledError` loses the original request ID context. It may be better to rethrow `RequestCancelledError(self._request_id)` here to maintain consistency in cancellation reporting.
raise asyncio.CancelledError()
abrarsheikh
reviewed
Jun 19, 2025
Comment on lines +282 to +284
    self._replica_result = await asyncio.wrap_future(
        self._replica_result_future
    )
Curious: why did we have to move the exception handling to the caller?
abrarsheikh
approved these changes
Jun 19, 2025
minerharry
pushed a commit
to minerharry/ray
that referenced
this pull request
Jun 27, 2025
## Why are these changes needed? Track child requests in Ray Serve. Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
elliot-barn
pushed a commit
that referenced
this pull request
Jul 2, 2025
## Why are these changes needed? Track child requests in Ray Serve. Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
Why are these changes needed?
Track child requests in Ray Serve.