[Python] Support observability in AsyncIO stack #41573

Closed
Zgoda91 wants to merge 21 commits into grpc:master from Zgoda91:asyncio_observability

@Zgoda91 Zgoda91 commented Feb 6, 2026

Contributor

Closes #39800 and #39061

Work done:

  1. Added metrics exporting mechanism for AsyncIO stack in Python
  2. Added AsyncIO example with observability plugin enabled. Metrics are now correctly recorded (as per sync stack). Tested locally with Python 3.12.
  3. Added an AsyncIO observability test suite to confirm metrics collection for all RPC types:
  • unary unary
  • unary stream
  • stream unary
  • stream stream
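
For readers unfamiliar with the four RPC shapes listed above, here is a minimal sketch in plain asyncio. It does not use the gRPC API: the handler names and the `started` counter are hypothetical stand-ins for real servicer methods and an exported metric. The point is only that each shape should produce exactly one recorded call.

```python
import asyncio
from collections import Counter


async def _as_stream(items):
    """Turn a list into an async request stream."""
    for item in items:
        yield item


async def unary_unary(request, started):
    started["unary_unary"] += 1
    return request.upper()


async def unary_stream(request, started):
    started["unary_stream"] += 1
    for ch in request:
        yield ch


async def stream_unary(requests, started):
    started["stream_unary"] += 1
    return "".join([r async for r in requests])


async def stream_stream(requests, started):
    started["stream_stream"] += 1
    async for r in requests:
        yield r.upper()


async def exercise_all_shapes():
    """Drive each RPC shape once and return the per-shape call counts."""
    started = Counter()  # hypothetical stand-in for an exported metric
    await unary_unary("hi", started)
    _ = [c async for c in unary_stream("hi", started)]
    await stream_unary(_as_stream(["h", "i"]), started)
    _ = [r async for r in stream_stream(_as_stream(["h", "i"]), started)]
    return started
```

A test in this style would assert that the counter holds one entry per shape after `asyncio.run(exercise_all_shapes())`.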

Caveats for the current solution:

  1. All AsyncIO RPCs currently behave as unregistered methods from a metrics perspective (grpc.method field set to other)
  2. xDS Observability support for the AsyncIO stack is out of scope.
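
The first caveat can be illustrated roughly as follows. This is a sketch of the labeling rule only, assuming a hypothetical `registered_methods` set and an illustrative method name; it is not the code path the plugin actually uses.

```python
def method_label(full_method, registered_methods):
    """Return the value recorded for the grpc.method attribute.

    `registered_methods` is a hypothetical set of known method names.
    Anything outside it is collapsed into the single label "other".
    """
    return full_method if full_method in registered_methods else "other"


# With no registered methods (the current AsyncIO behavior described above),
# every call is reported under the same label.
label = method_label("helloworld.Greeter/SayHello", frozenset())
```

Collapsing unknown names into one label keeps metric cardinality bounded, at the cost of per-method detail until registered methods are supported.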

To-dos:

  1. Once "[Python] Open Telemetry tracing (part 1)" (#40556) is merged, a tracing mechanism will be introduced for the AsyncIO stack

@emil10001

Member

/gemini review

@sergiitk sergiitk added the "release notes: yes" label Feb 6, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces observability support for the AsyncIO stack in gRPC Python, which is a great addition. The changes include adding metrics exporting, new examples, and a comprehensive test suite for the async functionality. The implementation also brings in experimental xDS support for the AsyncIO server. The code is well-structured and the new features are well-tested. I have a few suggestions to improve resource management in the new OpenTelemetry examples and tests by ensuring the MeterProvider is properly shut down. This will make the examples more robust and prevent potential resource leaks in the test suite.
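
The shutdown suggestion above could look roughly like the sketch below. `StubMeterProvider` is a hypothetical stand-in so the snippet runs without extra dependencies; the assumption is that the real provider exposes a `shutdown()` method, as the OpenTelemetry SDK `MeterProvider` does. The RPC work is replaced by a placeholder.

```python
import asyncio
import contextlib


class StubMeterProvider:
    """Hypothetical stand-in for a metrics provider with a shutdown() method."""

    def __init__(self):
        self.is_shut_down = False

    def shutdown(self):
        self.is_shut_down = True


@contextlib.asynccontextmanager
async def managed_provider(provider):
    """Yield the provider and always shut it down, even if the body raises."""
    try:
        yield provider
    finally:
        provider.shutdown()


async def run_example(provider, fail=False):
    async with managed_provider(provider):
        await asyncio.sleep(0)  # placeholder for the actual RPCs
        if fail:
            raise RuntimeError("simulated RPC failure")
```

Putting the shutdown in a `finally` block means pending metrics still get a chance to flush when the example or test exits through an exception.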

Comment thread examples/python/observability/async_observability_greeter_client.py Outdated
Comment thread examples/python/observability/async_observability_greeter_server.py
@sergiitk

sergiitk commented Feb 6, 2026

Member

src/python/grpcio_tests/tests_aio/unit/call_test failed with segfault (exited with error code 139) SIGSEGV
https://btx.cloud.google.com/invocations/565877ef-5aca-4277-bbd7-e58317e77e14/targets/%2F%2Fsrc%2Fpython%2Fgrpcio_tests%2Ftests_aio%2Funit:call_test;config=979f3de4fea9e6f501133cb4aaee8415c218bebabbddb7a377e28a83409f0428/tests

@Zgoda91 Zgoda91 force-pushed the asyncio_observability branch from 6c065b2 to e45891b February 9, 2026 08:25
@Zgoda91

Zgoda91 commented Feb 9, 2026

Contributor Author

@sergiitk - I'm investigating the SIGSEGV issue. I managed to reproduce it locally.

@Zgoda91

Zgoda91 commented Feb 18, 2026

Contributor Author

@sergiitk - the fix for the occasional segfault was merged in #41630

@Zgoda91 Zgoda91 marked this pull request as ready for review February 19, 2026 16:16
@Zgoda91 Zgoda91 requested a review from sergiitk February 19, 2026 16:17

@asheshvidyut asheshvidyut left a comment

Member

I see some comments like these in the repo -

 # TODO(xuanwn): Implement _registered_method after we have
 # observability for Asyncio.

Please check whether we need to update those with this PR as well.

Comment thread src/python/grpcio/grpc/_cython/_cygrpc/aio/call.pyx.pxi Outdated
Comment thread src/python/grpcio/grpc/_cython/_cygrpc/observability.pyx.pxi Outdated
@Zgoda91

Zgoda91 commented Feb 26, 2026

Contributor Author

@asheshvidyut - registered methods will be done in a separate PR, as a follow-up.

I have already implemented something locally on top of this PR, but it is not a full solution yet.

@asheshvidyut

Member

Thanks.

@asheshvidyut asheshvidyut left a comment

Member

Overall LGTM.

Investigated sync stack during review.

One important thing to address is the coverage compared with the tests for the sync stack:
https://github.com/grpc/grpc/blob/master/src/python/grpcio_tests/tests/observability/_open_telemetry_observability_test.py

I see there are more test cases - for example -

testRecordUnaryUnaryWithClientInterceptor, testRecordUnaryUnaryWithServerInterceptor, testRecordUnaryUnaryClientOnly

but in our test in file - src/python/grpcio_tests/tests_aio/observability/open_telemetry_observability_test.py

we have added one test case - test_record_unary_unary

Do you think we should add the other cases as well, or is it fine as is?

Comment thread src/python/grpcio/grpc/aio/_server.py
Comment thread src/python/grpcio/grpc/_cython/_cygrpc/observability.pyx.pxi Outdated
@asheshvidyut

asheshvidyut commented Feb 27, 2026

Member

FYI - I updated the description to include #39061 as the issue which would be fixed after this PR is merged.

Also, I just noticed that third_party/xds is committed in this PR. Do we need it?

@Zgoda91

Zgoda91 commented Mar 2, 2026

Contributor Author

@asheshvidyut

  1. Regarding the third_party/xds - it was a mistake. I'll revert it. Thanks for spotting that!

  2. Regarding the missing UTs: I decided not to add a one-to-one copy of the tests found in the sync stack. My reasoning was that we are re-using an existing plugin, so I focused on proving that all possible AsyncIO client <-> server interactions work (metrics are collected) rather than covering every possible edge case. We already have comprehensive AsyncIO interceptor test suites, so I didn't duplicate them in the observability test suite. However, if you'd prefer more tests than in the current setup, I can add them. Just let me know which tests you'd like.

@Zgoda91

Zgoda91 commented Mar 3, 2026

Contributor Author

@asheshvidyut

  1. Distribution Tests Python MacOS failed on Compiling grpc_tools/_protoc_compiler.pyx. For some reason GCC was not able to compile upb_generator. For me it seems unrelated.
  2. Bazel Python MacOS Tests failed on //src/python/grpcio_tests/tests_aio/interop:local_interop_test bazel target, but I wasn't able to run the test suite locally

I decided to re-run CI to double check if that's occasional or permanent.

edit: apparently kokoro:run did not retrigger jobs as expected. Am I missing something?

edit#2: I ran the failed test using a native Python build (instead of the Bazel build). Not a single failure in 1000 iterations of the local_interop_test.py test suite.

@Zgoda91 Zgoda91 removed the kokoro:run label Mar 3, 2026
@asheshvidyut

Member

Something is breaking on macOS: both of the failing Python tests are macOS jobs.

@Zgoda91

Zgoda91 commented Mar 4, 2026

Contributor Author

@asheshvidyut

It seems the macOS issue is gone after re-triggering the CI jobs.

Additionally, registered methods support for AsyncIO will be tracked under #41796. PR is still in progress.

@sergiitk sergiitk changed the title from "[Python] Added observability support to AsyncIO stack" to "[Python] Support observability in AsyncIO stack" Mar 5, 2026
Comment thread examples/python/observability/async_observability_greeter_server.py Outdated
Comment thread examples/python/observability/async_observability_greeter_server.py Outdated
Comment thread src/python/grpcio/grpc/_cython/_cygrpc/observability.pyx.pxi
Comment thread src/python/grpcio/grpc/_cython/_cygrpc/observability.pyx.pxi Outdated
Comment thread examples/python/observability/async_observability_greeter_client.py Outdated
@sergiitk

sergiitk commented Mar 11, 2026

Member

The last "Bazel Basic Tests for Python (Local)" run failed with

======================================================================
FAIL: test_cancelled_watch_removed_from_watch_list (__main__.HealthServicerTest.test_cancelled_watch_removed_from_watch_list)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/954bb7512d44d20015390af6e76121c6/sandbox/processwrapper-sandbox/3444/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/health_check/health_servicer_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/unit/_test_base.py", line 31, in wrapper
    return loop.run_until_complete(f(*args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.cache/bazel/_bazel_root/954bb7512d44d20015390af6e76121c6/execroot/com_github_grpc_grpc/external/python_3_11_x86_64-unknown-linux-gnu/lib/python3.11/asyncio/base_events.py", line 654, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/root/.cache/bazel/_bazel_root/954bb7512d44d20015390af6e76121c6/sandbox/processwrapper-sandbox/3444/execroot/com_github_grpc_grpc/bazel-out/k8-fastbuild/bin/src/python/grpcio_tests/tests_aio/health_check/health_servicer_test.runfiles/com_github_grpc_grpc/src/python/grpcio_tests/tests_aio/health_check/health_servicer_test.py", line 246, in test_cancelled_watch_removed_from_watch_list
    self.assertTrue(queue.empty())
AssertionError: False is not true

Not sure if it's related, restarted the job.

@Zgoda91

Zgoda91 commented Mar 12, 2026

Contributor Author

@sergiitk - I ran the test myself with:

./tools/bazel test --config=python --cache_test_results=no --runs_per_test=1000 --test_timeout=300 //src/python/grpcio_tests/tests_aio/health_check/...

I was not able to observe any issues for this test suite for 1000 runs:

INFO: 1001 processes: 5 action cache hit, 2000 linux-sandbox.
INFO: Build completed successfully, 1001 total actions
//src/python/grpcio_tests/tests_aio/health_check:health_servicer_test    PASSED in 8.2s
  Stats over 1000 runs: max = 8.2s, min = 5.2s, avg = 6.1s, dev = 0.6s

Zgoda91 added a commit to Zgoda91/grpc that referenced this pull request Mar 22, 2026
Closes grpc#39800 and grpc#39061

Work done:
1. Added metrics exporting mechanism for AsyncIO stack in Python
2. Added AsyncIO example with observability plugin enabled. Metrics are now correctly recorded (as per sync stack). Tested locally with Python 3.12.
3. Added AsyncIO observability test suite, to confirm metrics collection for all possible RPC's:
* unary unary
* unary stream
* stream unary
* stream stream

Caveats for the current solution:
1. All AsyncIO RPCs currently behave as unregistered methods from a metrics perspective (`grpc.method` field set to `other`)
2. xDS Observability support for the AsyncIO stack is out of scope.

ToDo's:
1. Once grpc#40556 is merged - tracing mechanism will be introduced for AsyncIO stack

Closes grpc#41573

PiperOrigin-RevId: 882655402
asheshvidyut pushed a commit to asheshvidyut/grpc that referenced this pull request Mar 26, 2026
asheshvidyut pushed a commit to a-detiste/grpc that referenced this pull request Jun 10, 2026

Labels

lang/Python, release notes: yes

Development

Successfully merging this pull request may close these issues.

"grpc.server.*" metrics missing from grpc.aio after enabling reporting.

5 participants