[Serve] Provide async versions of serve.get_app_handle() and serve.get_deployment_handle() #44782
Description
serve.get_app_handle() and serve.get_deployment_handle(), along with their underlying method ServeControllerClient.get_handle(), let users dynamically obtain a handle to a Serve deployment: either the ingress deployment of an app or a specific deployment, depending on which API is used.
These methods make one or two network calls to the Serve Controller to gather information, but the calls are made synchronously (ray.get(...)), so they block the event loop when used in asynchronous code such as a FastAPI deployment acting as a dynamic ingress to other deployments. Async variants of these functions would be useful for async callers.
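To illustrate the problem and one possible interim workaround, here is a minimal sketch that offloads a blocking lookup to a worker thread with asyncio.to_thread so the event loop keeps serving other coroutines. It does not call Ray: blocking_get_handle is a hypothetical stand-in for serve.get_app_handle(), and get_handle_async is an illustrative name, not a proposed API. Whether the real Serve calls are safe to run from a worker thread is an assumption that would need checking.

```python
import asyncio
import time


def blocking_get_handle(app_name: str) -> str:
    """Hypothetical stand-in for serve.get_app_handle(app_name).

    The sleep simulates the synchronous round trip(s) to the Serve Controller.
    """
    time.sleep(0.2)
    return f"handle:{app_name}"


async def get_handle_async(app_name: str) -> str:
    """Run the blocking lookup in a worker thread (illustrative name only)."""
    return await asyncio.to_thread(blocking_get_handle, app_name)


async def demo() -> tuple[str, int]:
    """Fetch a handle while a heartbeat task counts event-loop ticks."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    handle = await get_handle_async("my_app")
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    # ticks > 0 means other coroutines kept running during the lookup.
    return handle, ticks
```

Calling blocking_get_handle directly inside demo() instead of get_handle_async would leave the heartbeat at zero ticks for the duration of the lookup, which is the behavior described above. Native async variants would avoid the extra thread hop entirely.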
I would be happy to make these changes, though I think I would need some guidance on naming conventions and whatnot :)
Use case
See the discussion at https://ray-distributed.slack.com/archives/CNCKBBRJL/p1713194071772759 for more details about our use case. TL;DR: we create handles dynamically at runtime and noticed that doing so was blocking other requests in our FastAPI app.