[Python] Fix AsyncIO Server maximum_concurrent_rpcs enforcement preventing negative active_rpcs count #41532
robinvd wants to merge 8 commits into grpc:master from robinvd:fix-concurrent-rpc-limiter-underflow
Conversation
Force-pushed c3055e0 to e60a413
…rent RPC limit The _ConcurrentRpcLimiter only increments _active_rpcs when a request is accepted, but decrease_once_finished() was always called for all requests. This caused the counter to go negative when rejected requests finished, effectively disabling the maximum_concurrent_rpcs limit. Only register the decrement callback for requests that were actually counted (when concurrency_exceeded is False).
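The accounting described in the commit message can be sketched in isolation. This is a minimal, hypothetical stand-in, not the actual gRPC implementation: the real `_ConcurrentRpcLimiter` registers the decrement as a task done-callback, while this sketch calls it directly to show why only accepted requests may decrement.

```python
class ConcurrentRpcLimiter:
    """Simplified sketch of the limiter pattern (names mirror the real
    _ConcurrentRpcLimiter, but this is not the gRPC source)."""

    def __init__(self, maximum_concurrent_rpcs):
        self._maximum_concurrent_rpcs = maximum_concurrent_rpcs
        self._active_rpcs = 0

    def check_before_request_call(self):
        # Returns True when the limit is hit (request rejected).
        # Only requests that are actually accepted are counted.
        concurrency_exceeded = (
            self._active_rpcs >= self._maximum_concurrent_rpcs
        )
        if not concurrency_exceeded:
            self._active_rpcs += 1
        return concurrency_exceeded

    def decrease_once_finished(self):
        self._active_rpcs -= 1


limiter = ConcurrentRpcLimiter(maximum_concurrent_rpcs=1)

accepted = not limiter.check_before_request_call()  # accepted, count -> 1
rejected = limiter.check_before_request_call()      # rejected, count stays 1

# Buggy behavior: decrementing for the rejected request too would drive
# the counter to -1, effectively disabling the limit.
# Fixed behavior: decrement only for the request that was counted.
if accepted:
    limiter.decrease_once_finished()                # count back to 0
print(limiter._active_rpcs)
```

With the fix, the counter returns to 0 after the accepted request finishes, instead of underflowing to -1.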
Force-pushed e60a413 to 5d5e25c
```diff
 rpc_task.add_done_callback(rpc_tasks.discard)

-if self._limiter is not None:
+if self._limiter is not None and not concurrency_exceeded:
```
Note for other reviewers:
If concurrency_exceeded is True above, active_rpcs was never incremented.
If concurrency_exceeded is False, active_rpcs was incremented, so we need to decrement it as well.
If it was not incremented, we should not decrement the active_rpcs count for the current RPC.
So this change looks correct to me.
Another point: we could instead add a check inside _decrease_active_rpcs_count, but that would be incorrect because RPCs can finish at different times, and the RPC that caused the active_rpcs count to increase should be the one that decreases it on finish.
I agree that this is a slightly confusing setup, which is likely why the original bug was there. But without a significant overhaul this seems to be the best diff.
/gemini review
Code Review
This pull request correctly fixes a counter underflow bug in _ConcurrentRpcLimiter that could bypass the concurrent RPC limit. The change ensures the counter is only decremented for requests that were actually accepted, which is the right approach. The newly added regression test is well-designed and effectively verifies the fix. I've included a couple of suggestions to enhance the clarity and maintainability of the new test code.
Co-authored-by: Ashesh Vidyut <134911583+asheshvidyut@users.noreply.github.com>
The Gemini suggestions seem fine to me; I've enabled
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@robinvd
…ate_task asyncio.create_task() expects a coroutine, but UnaryUnaryCall is only awaitable. Restore the coro_wrapper usage that was removed by a previous suggestion.
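The distinction in that commit message can be demonstrated with a small sketch. `FakeCall` is a hypothetical stand-in for `UnaryUnaryCall` (awaitable via `__await__`, but not a coroutine), and `coro_wrapper` mirrors the kind of adapter the commit restores:

```python
import asyncio


class FakeCall:
    """Hypothetical stand-in for UnaryUnaryCall: awaitable, not a coroutine."""

    def __await__(self):
        if False:
            yield  # makes __await__ a generator, i.e. a valid awaitable
        return "response"


async def coro_wrapper(awaitable):
    # Wraps any awaitable in a true coroutine that create_task() accepts.
    return await awaitable


async def main():
    error_name = None
    try:
        # asyncio.create_task() requires a coroutine; a plain awaitable
        # is rejected with a TypeError.
        asyncio.create_task(FakeCall())
    except TypeError as exc:
        error_name = type(exc).__name__
    # The wrapper makes the same awaitable schedulable as a task.
    result = await asyncio.create_task(coro_wrapper(FakeCall()))
    return error_name, result


error_name, result = asyncio.run(main())
print(error_name, result)  # TypeError response
```

`asyncio.ensure_future()` would also accept a bare awaitable, but keeping the explicit wrapper preserves the `create_task()` call site.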
Force-pushed 204657a to 85572c4
@asheshvidyut I've updated the code, and the type error should now be fixed! If there is anything else small, feel free to edit it yourself!
Suggested change:

```diff
-# Wait for all requests to fully complete
-await asyncio.sleep(test_constants.SHORT_TIMEOUT)
```
Is there a reason we need to sleep here? Didn't we wait on all requests to complete and even count their statuses?
This just adds 4 seconds of wait time:

```
# with the sleep
Ran 1 test in 8.042s
# without the sleep
Ran 1 test in 4.044s
```
@sergiitk I think this is fine. If we remove the sleep, the test might go flaky.
The reason is that when the client receives the OK status, the server might not yet have completed the scheduled job of decrementing the active RPC counter:

rpc_task.add_done_callback(self._decrease_active_rpcs_count)

The line above is scheduled on the event loop, but it may not have run yet by the time the client receives the reply.
The sleep makes sure wave 2 is successful. Without it, the wave 2 test would start immediately and impact the test flow.
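The underlying scheduling behavior can be reproduced in miniature. This is a self-contained sketch (the dict counter and names are stand-ins, not the gRPC internals): a done-callback on an already-finished task is only scheduled with `call_soon`, so the state it mutates is not updated until the loop gets another iteration.

```python
import asyncio

active_rpcs = {"count": 1}


def _decrease_active_rpcs_count(_task):
    active_rpcs["count"] -= 1


async def rpc():
    return "OK"


async def main():
    task = asyncio.create_task(rpc())
    await task  # the RPC is finished, like the client observing OK
    # On a finished task, add_done_callback() merely schedules the
    # callback via loop.call_soon(); it does not run it inline.
    task.add_done_callback(_decrease_active_rpcs_count)
    before = active_rpcs["count"]  # callback has not run yet
    await asyncio.sleep(0)         # yield one loop iteration
    after = active_rpcs["count"]   # now the callback has run
    return before, after


before, after = asyncio.run(main())
print(before, after)  # 1 0
```

This mirrors why the test needs the extra wait: completion of the RPC and the counter decrement happen on different loop iterations.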
Co-authored-by: Sergii Tkachenko <sergiitk@google.com>
…nting negative active_rpcs count (grpc#41532)

## Summary
The `_ConcurrentRpcLimiter` only increments `_active_rpcs` when a request is accepted, but `decrease_once_finished()` was always called for all requests. This caused the counter to go negative when rejected requests finished, effectively disabling the `maximum_concurrent_rpcs` limit.

## Fix
Only register the decrement callback for requests that were actually counted (when `concurrency_exceeded` is `False`).

Fixes grpc#41531
Closes grpc#41532
COPYBARA_INTEGRATE_REVIEW=grpc#41532 from robinvd:fix-concurrent-rpc-limiter-underflow 5047e25
PiperOrigin-RevId: 876177526
Summary

The `_ConcurrentRpcLimiter` only increments `_active_rpcs` when a request is accepted, but `decrease_once_finished()` was always called for all requests. This caused the counter to go negative when rejected requests finished, effectively disabling the `maximum_concurrent_rpcs` limit.

Fix

Only register the decrement callback for requests that were actually counted (when `concurrency_exceeded` is `False`).

Fixes #41531