[Python] Support observability in AsyncIO stack#41573
Conversation
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces observability support for the AsyncIO stack in gRPC Python, which is a great addition. The changes include adding metrics exporting, new examples, and a comprehensive test suite for the async functionality. The implementation also brings in experimental xDS support for the AsyncIO server. The code is well-structured and the new features are well-tested. I have a few suggestions to improve resource management in the new OpenTelemetry examples and tests by ensuring the MeterProvider is properly shut down. This will make the examples more robust and prevent potential resource leaks in the test suite.
|
|
6c065b2 to
e45891b
Compare
|
@sergiitk - investigating SIGSEGV issue. managed to reproduce it locally. |
asheshvidyut
left a comment
There was a problem hiding this comment.
I see some comments like these in the repo -
# TODO(xuanwn): Implement _registered_method after we have
# observability for Asyncio.
Please check if we need to update those with PR as well.
|
@asheshvidyut - registered methods will be done in a separate PR, as a follow-up. I have already implemented something locally on top of this PR, but it is not a full solution yet. |
Thanks. |
There was a problem hiding this comment.
Overall LGTM.
Investigated sync stack during review.
One important that to address is that in the tests for sync stack
https://github.com/grpc/grpc/blob/master/src/python/grpcio_tests/tests/observability/_open_telemetry_observability_test.py
I see there are more test cases - for example -
testRecordUnaryUnaryWithClientInterceptor, testRecordUnaryUnaryWithServerInterceptor, testRecordUnaryUnaryClientOnly
but in our test in file - src/python/grpcio_tests/tests_aio/observability/open_telemetry_observability_test.py
we have added one test case - test_record_unary_unary
Do you think we should add other cases as well? or is it fine?
|
FYI - I updated the description to include #39061 as the issue which would be fixed after this PR is merged. Also just noticed |
so I didn't bother duplicating that in the observability test suite. However, if you'd prefer to have more tests than in the current setup I can do it, not a problem. Just let me know, which tests you'd like to add. |
I decided to re-run CI to double check if that's occasional or permanent. edit: apparently edit#2: I ran failed test using a python native build (instead of bazel build). Not a single failure for 1000 iterations of |
Something is breaking in MacOS, both the tests which are failing for Python are in MacOS |
|
Seems like MacOS issue is gone after re-triggering CI jobs. Additionally registered methods support for AsyncIO will be tracked under #41796. PR is still in progress. |
Co-authored-by: Sergii Tkachenko <sergiitk@google.com>
|
The last "Bazel Basic Tests for Python (Local)" run failed with ======================================================================
FAIL: test_cancelled_watch_removed_from_watch_list (__main__.HealthServicerTest.test_cancelled_watch_removed_from_watch_list)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/954bb7512d44d20015390af6e76121c6/sandbox/processwrapper-sandbox/3444/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/health_check/health_servicer_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/unit/_test_base.py", line 31, in wrapper
return loop.run_until_complete(f(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.cache/bazel/_bazel_root/954bb7512d44d20015390af6e76121c6/execroot/com_github_grpc_grpc/external/python_3_11_x86_64-unknown-linux-gnu/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/root/.cache/bazel/_bazel_root/954bb7512d44d20015390af6e76121c6/sandbox/processwrapper-sandbox/3444/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/health_check/health_servicer_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/health_check/health_servicer_test.py", line 246, in test_cancelled_watch_removed_from_watch_list
self.assertTrue(queue.empty())
AssertionError: False is not trueNot sure if it's related, restarted the job. |
|
@sergiitk - I executed test myself with: I was not able to observe any issues for this test suite for 1000 runs: |
Closes grpc#39800 and grpc#39061 Work done: 1. Added metrics exporting mechanism for AsyncIO stack in Python 2. Added AsyncIO example with observability plugin enabled. Metrics are now correctly recorded (as per sync stack). Tested locally with Python 3.12. 3. Added AsyncIO observability test suite, to confirm metrics collection for all possible RPC's: * unary unary * unary stream * stream unary * stream stream Caveats for the current solution: 1. All AsyncIO RPCs currently behave as unregistered methods from a metrics pespective (`grpc.method` field set to `other`) 2. xDS Observability support in for AsyncIO stack is out of scope. ToDo's: 1. Once grpc#40556 is merged - tracing mechanism will be introduced for AsyncIO stack Closes grpc#41573 PiperOrigin-RevId: 882655402
Closes grpc#39800 and grpc#39061 Work done: 1. Added metrics exporting mechanism for AsyncIO stack in Python 2. Added AsyncIO example with observability plugin enabled. Metrics are now correctly recorded (as per sync stack). Tested locally with Python 3.12. 3. Added AsyncIO observability test suite, to confirm metrics collection for all possible RPC's: * unary unary * unary stream * stream unary * stream stream Caveats for the current solution: 1. All AsyncIO RPCs currently behave as unregistered methods from a metrics pespective (`grpc.method` field set to `other`) 2. xDS Observability support in for AsyncIO stack is out of scope. ToDo's: 1. Once grpc#40556 is merged - tracing mechanism will be introduced for AsyncIO stack Closes grpc#41573 PiperOrigin-RevId: 882655402
Closes grpc#39800 and grpc#39061 Work done: 1. Added metrics exporting mechanism for AsyncIO stack in Python 2. Added AsyncIO example with observability plugin enabled. Metrics are now correctly recorded (as per sync stack). Tested locally with Python 3.12. 3. Added AsyncIO observability test suite, to confirm metrics collection for all possible RPC's: * unary unary * unary stream * stream unary * stream stream Caveats for the current solution: 1. All AsyncIO RPCs currently behave as unregistered methods from a metrics pespective (`grpc.method` field set to `other`) 2. xDS Observability support in for AsyncIO stack is out of scope. ToDo's: 1. Once grpc#40556 is merged - tracing mechanism will be introduced for AsyncIO stack Closes grpc#41573 PiperOrigin-RevId: 882655402
Closes #39800 and #39061
Work done:
Caveats for the current solution:
grpc.methodfield set toother)ToDo's: