Allow configuring Serve control loop interval, add related docs by JoshKarpel · Pull Request #45063 · ray-project/ray

JoshKarpel · 2024-04-30T20:45:07Z

Why are these changes needed?

In our experiments, adjusting this value upward helps the Serve Controller keep up with a large number of autoscaling metrics pushes from a large number of DeploymentHandles. The loop body is blocking, so increasing the interval lets more other code run while the control loop isn't running. The cost is control loop responsiveness, since the loop doesn't run as often.
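As a rough illustration of the tradeoff (not the actual Serve Controller code), the sketch below models a loop whose body blocks the event loop and then sleeps for an interval. The function names and the 50 ms loop body duration are assumptions made up for this example. While the body runs, nothing else on the event loop can make progress, so the share of time left for other work (such as handling metrics pushes) grows as the interval grows.

```python
import asyncio
import time


async def control_loop(cycles: int, loop_body_s: float, interval_s: float) -> None:
    """Hypothetical stand-in for a control loop with a blocking body."""
    for _ in range(cycles):
        # Blocking work: no other coroutine can run during this call.
        time.sleep(loop_body_s)
        # Sleeping yields to the event loop, so other coroutines can run.
        await asyncio.sleep(interval_s)


def yielded_fraction(loop_body_s: float, interval_s: float) -> float:
    """Approximate share of wall-clock time available to other coroutines."""
    return interval_s / (loop_body_s + interval_s)


if __name__ == "__main__":
    asyncio.run(control_loop(cycles=3, loop_body_s=0.05, interval_s=0.1))
    for interval_s in (0.1, 0.5, 1.0):
        share = yielded_fraction(0.05, interval_s)
        print(f"interval={interval_s}s -> {share:.0%} of time for other code")
```

Under this simple model, a longer interval leaves a larger share of time for other coroutines, but each control loop cycle starts later, which is the responsiveness cost described above.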

Related issue number

Closes #44784 ... for now!

Checks

I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've added any new APIs to the API Reference. For example, if I added a
  method in Tune, I've added it in doc/source/tune/api/ under the
  corresponding .rst file.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

Signed-off-by: Josh Karpel <josh.karpel@gmail.com>


JoshKarpel · 2024-04-30T20:48:45Z

doc/source/serve/advanced-guides/performance.md


 You can set an end-to-end timeout for HTTP requests by setting the `request_timeout_s` in the `http_options` field of the Serve config. HTTP Proxies will wait for that many seconds before terminating an HTTP request. This config is global to your Ray cluster, and it cannot be updated during runtime. Use [client-side retries](serve-best-practices-http-requests) to retry requests that time out due to transient failures.
+
+### Give the Serve Controller more time to process requests


Took the liberty of adding a section here in case others run into the same issue. Please feel free to reword as desired, not sure what level of detail you want here :)

JoshKarpel · 2024-04-30T20:49:27Z

python/ray/serve/_private/constants.py

-# How often to call the control loop on the controller.
-CONTROL_LOOP_PERIOD_S = 0.1
+# How long to sleep between control loop cycles on the controller.
+CONTROL_LOOP_INTERVAL_S = float(os.getenv("RAY_SERVE_CONTROL_LOOP_INTERVAL_S", 0.1))


I thought INTERVAL made more sense than PERIOD as the name, since it's the time between cycles, not a target for when the next cycle starts.

way better :)
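For readers unfamiliar with the pattern in the new constant, here is a minimal sketch of how the value resolves. The helper `read_control_loop_interval_s` is hypothetical and exists only for this example; the environment variable name and the 0.1 default come from the diff above. Because the real constant is evaluated when the module is imported, the variable presumably needs to be set in the controller process's environment before Serve starts.

```python
import os
from typing import Mapping


def read_control_loop_interval_s(environ: Mapping[str, str] = os.environ) -> float:
    """Hypothetical helper mirroring the constant's parsing logic."""
    # Environment values are strings, so float() parses e.g. "0.5" -> 0.5.
    # When the variable is unset, the float default 0.1 passes through float().
    return float(environ.get("RAY_SERVE_CONTROL_LOOP_INTERVAL_S", 0.1))


if __name__ == "__main__":
    print(read_control_loop_interval_s({}))
    print(read_control_loop_interval_s({"RAY_SERVE_CONTROL_LOOP_INTERVAL_S": "0.5"}))
```

Note that a value that cannot be parsed as a float (for example `"fast"`) would raise a `ValueError` in this sketch.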

JoshKarpel · 2024-04-30T20:50:31Z

python/ray/serve/autoscaling_policy.py

        # Only actually scale the replicas if we've made this decision for
        # 'scale_up_consecutive_periods' in a row.
-        if decision_counter > int(config.upscale_delay_s / CONTROL_LOOP_PERIOD_S):
+        if decision_counter > int(config.upscale_delay_s / CONTROL_LOOP_INTERVAL_S):


Seems like the interval is used in a few other places to count control loop cycles - am I breaking some assumption by allowing it to be configurable to some larger value (e.g., does this still make sense if the loop interval is large)?

I don't believe so -- but @zcin should confirm

I don't think this breaks any assumptions. If the upscale delay is less than the control loop interval, then the interval that the controller sleeps for between cycles already inherently "covers" the required delay, so this code still makes sense.
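To make the arithmetic in this thread concrete, the sketch below evaluates the threshold expression from the diff, `int(config.upscale_delay_s / CONTROL_LOOP_INTERVAL_S)`, for a few intervals. The helper name and the 30-second delay are illustrative assumptions, not values taken from this PR.

```python
def required_consecutive_decisions(upscale_delay_s: float, interval_s: float) -> int:
    """Threshold that decision_counter must exceed before scaling up."""
    return int(upscale_delay_s / interval_s)


if __name__ == "__main__":
    upscale_delay_s = 30.0  # illustrative value
    for interval_s in (0.5, 2.0, 60.0):
        threshold = required_consecutive_decisions(upscale_delay_s, interval_s)
        print(f"interval={interval_s}s -> decision_counter must exceed {threshold}")
```

When the interval exceeds the delay (60 s versus 30 s here), the threshold truncates to 0, so the check `decision_counter > 0` passes on the first cycle that makes the decision. That matches the reasoning above: a single sleep between cycles is already longer than the required delay.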

edoakes

LGTM pending @zcin chiming in on the autoscaling question

edoakes · 2024-04-30T21:06:00Z

python/ray/serve/autoscaling_policy.py

        # Only actually scale the replicas if we've made this decision for
        # 'scale_up_consecutive_periods' in a row.
-        if decision_counter > int(config.upscale_delay_s / CONTROL_LOOP_PERIOD_S):
+        if decision_counter > int(config.upscale_delay_s / CONTROL_LOOP_INTERVAL_S):


I don't believe so -- but @zcin should confirm

edoakes · 2024-04-30T21:06:06Z

python/ray/serve/_private/constants.py

-# How often to call the control loop on the controller.
-CONTROL_LOOP_PERIOD_S = 0.1
+# How long to sleep between control loop cycles on the controller.
+CONTROL_LOOP_INTERVAL_S = float(os.getenv("RAY_SERVE_CONTROL_LOOP_INTERVAL_S", 0.1))


way better :)

JoshKarpel · 2024-04-30T23:47:02Z

Thanks for the quick reviews! Much appreciated!

JoshKarpel added 2 commits April 30, 2024 15:41

allow configuring CONTROL_LOOP_INTERVAL_S, add related docs

b6f261f

Signed-off-by: Josh Karpel <josh.karpel@gmail.com>

Merge branch 'master' into allow-configuring-serve-control-loop-interval

2c3c655

Signed-off-by: Josh Karpel <josh.karpel@gmail.com> Signed-off-by: Josh Karpel <josh.karpel@gmail.com>


JoshKarpel marked this pull request as ready for review April 30, 2024 20:50

JoshKarpel requested review from a team, GeneDer, akshay-anyscale, edoakes, shrekris-anyscale and zcin as code owners April 30, 2024 20:50

JoshKarpel mentioned this pull request Apr 30, 2024

[Serve] Improve scalability of Serve DeploymentHandles #44784

Closed

edoakes reviewed Apr 30, 2024


edoakes approved these changes Apr 30, 2024


edoakes merged commit 23d05bb into ray-project:master Apr 30, 2024

JoshKarpel deleted the allow-configuring-serve-control-loop-interval branch April 30, 2024 22:53
