[Data] ConcurrencyCapBackpressurePolicy - Only increase threshold#58023
Conversation
…e buildup Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request modifies the ConcurrencyCapBackpressurePolicy to only allow the backpressure threshold to increase, never decrease. This simplifies the policy by removing the logic for downward smoothing and idle resets. The changes in the core policy file are clear and correctly implement this new behavior. The associated tests have been updated to reflect this change, with the old downward smoothing test refactored to verify that the threshold does not decrease, and a new test added to explicitly check for threshold increases. My review identifies some misleading comments in the test files that should be corrected for clarity and maintainability.
| # Test downward adjustment (should be smoothed) | ||
| # threshold = max(10 + 4*1, 150) = 150, which is < 200, so should be smoothed | ||
| # Test that threshold is maintained when calculated threshold is lower | ||
| # threshold = max(10 + 4*1, 150) = 150, which is < 200, so should be maintained |
There was a problem hiding this comment.
The calculation in this comment is a bit misleading. It seems to use the initial _q_level_nbytes (10.0) and _q_level_dev (1.0) to calculate the threshold, but _update_queue_threshold updates these values using EWMA based on the new sample (q=150) before calculating the base threshold. This means the actual calculated threshold is different from 150. While the test's assertion is correct, the comment could be confusing for future maintenance. For better clarity, consider removing the specific calculation and just stating the expected behavior.
| policy._queue_history[mock_op] = deque([50, 60, 70, 80, 90, 100]) | ||
|
|
||
| # Test that threshold is increased when calculated threshold is higher | ||
| # threshold = max(50 + 4*20, 200) = 200, which is > 100, so should be updated |
There was a problem hiding this comment.
Similar to a previous comment, the calculation threshold = max(50 + 4*20, 200) = 200 in this comment doesn't account for the EWMA updates of level and dev that happen inside _update_queue_threshold. The actual calculated threshold will be different due to these updates. To avoid confusion, it would be clearer to either provide the full, accurate calculation (which might be too complex for a comment) or simplify the comment to describe the behavior without showing a potentially incorrect formula.
| # threshold = max(10 + 4*1, 150) = 150, which is < 200, so should be smoothed | ||
| self.assertLess(threshold3, 200) | ||
| self.assertGreaterEqual(threshold3, 150) | ||
| # threshold = max(10 + 4*1, 150) = 150, which is < 200, so should maintain 200 |
There was a problem hiding this comment.
The calculation threshold = max(10 + 4*1, 150) = 150 in this comment is misleading as it doesn't account for the EWMA updates for level and dev that occur within _update_queue_threshold. The actual calculated threshold is different. While the test's assertion is correct, this discrepancy can make the test logic harder to follow. Please consider updating the comment to be more accurate or simplifying it to just describe the expected outcome.
Training Benchmark results |
…licy_threshold_up_only Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com>
…y-project#58023) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] ConcurrencyCapBackpressurePolicy - Only increase threshold When `_update_queue_threshold` to adjust the queue threshold to cap concurrency based on current queued bytes, - Only allow increasing the threshold or maintaining it. - Cannot decrease threshold because the steady state of queued bytes is not known. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com> Signed-off-by: xgui <xgui@anyscale.com>
…y-project#58023) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] ConcurrencyCapBackpressurePolicy - Only increase threshold When `_update_queue_threshold` to adjust the queue threshold to cap concurrency based on current queued bytes, - Only allow increasing the threshold or maintaining it. - Cannot decrease threshold because the steady state of queued bytes is not known. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com>
…y-project#58023) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] ConcurrencyCapBackpressurePolicy - Only increase threshold When `_update_queue_threshold` to adjust the queue threshold to cap concurrency based on current queued bytes, - Only allow increasing the threshold or maintaining it. - Cannot decrease threshold because the steady state of queued bytes is not known. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
…y-project#58023) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] ConcurrencyCapBackpressurePolicy - Only increase threshold When `_update_queue_threshold` to adjust the queue threshold to cap concurrency based on current queued bytes, - Only allow increasing the threshold or maintaining it. - Cannot decrease threshold because the steady state of queued bytes is not known. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
…y-project#58023) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] ConcurrencyCapBackpressurePolicy - Only increase threshold When `_update_queue_threshold` to adjust the queue threshold to cap concurrency based on current queued bytes, - Only allow increasing the threshold or maintaining it. - Cannot decrease threshold because the steady state of queued bytes is not known. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com>
…y-project#58023) > Thank you for contributing to Ray! 🚀 > Please review the [Ray Contribution Guide](https://docs.ray.io/en/master/ray-contribute/getting-involved.html) before opening a pull request. >⚠️ Remove these instructions before submitting your PR. > 💡 Tip: Mark as draft if you want early feedback, or ready for review when it's complete. ## Description > Briefly describe what this PR accomplishes and why it's needed. ### [Data] ConcurrencyCapBackpressurePolicy - Only increase threshold When `_update_queue_threshold` to adjust the queue threshold to cap concurrency based on current queued bytes, - Only allow increasing the threshold or maintaining it. - Cannot decrease threshold because the steady state of queued bytes is not known. ## Related issues > Link related issues: "Fixes ray-project#1234", "Closes ray-project#1234", or "Related to ray-project#1234". ## Additional information > Optional: Add implementation details, API changes, usage examples, screenshots, etc. --------- Signed-off-by: Srinath Krishnamachari <srinath.krishnamachari@anyscale.com> Signed-off-by: Srinath Krishnamachari <68668616+srinathk10@users.noreply.github.com> Signed-off-by: peterxcli <peterxcli@gmail.com>
Description
[Data] ConcurrencyCapBackpressurePolicy - Only increase threshold
When
_update_queue_thresholdto adjust the queue threshold to cap concurrency based on current queued bytes,Related issues
Additional information