Skip to content

[event_engine] Introduce a event_engine_poller_for_python experiment#40243

Closed
eugeneo wants to merge 12 commits intogrpc:masterfrom
eugeneo:python-no-backup-poller-experiment
Closed

[event_engine] Introduce a event_engine_poller_for_python experiment#40243
eugeneo wants to merge 12 commits intogrpc:masterfrom
eugeneo:python-no-backup-poller-experiment

Conversation

@eugeneo
Copy link
Contributor

@eugeneo eugeneo commented Jul 17, 2025

This commit introduces the event_engine_poller_for_python experiment, allowing for controlled rollout and testing of the EventEngine's Posix poller specifically within gRPC Python bindings.

Crucially, this change does not alter the default behavior for builds where GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER is not defined. In such configurations, the Posix EventEngine poller will continue to be enabled unconditionally, preserving existing functionality.

The primary impact of this change is when GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER is defined (e.g., in gRPC Python's build system). In this scenario, the enablement of the Posix poller transitions from being implicitly disabled to being configurable via the event_engine_poller_for_python experiment flag. This enables a controlled, experimental rollout of the EventEngine poller in environments that previously opted out of its direct instantiation.

@drfloob
Copy link
Member

drfloob commented Jul 23, 2025

Discussed yesterday: let's split this PR up into (1) a cleanup/code reorganization PR, and (2) the changes required to introduce the experiment. Bundling them together means a rollback of both if there would otherwise only need to be a rollback of one.

@eugeneo eugeneo force-pushed the python-no-backup-poller-experiment branch 3 times, most recently from c984ee6 to 8cae4bd Compare July 25, 2025 20:33
PosixError err;
int connect_errno;
PosixEventPoller* poller = poller_manager_.Poller();
CHECK(poller_manager_.has_value());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runtime check needed to be added 7 times, which feels like a code smell, and is error-prone if we forget to add it to any function (now or later). I know you're interested in removing PollerManager. If we're going to take this approach of having an optional PollerManager that is checked every time it's used, we need a short finite timeline before these checks are removed. Please add a TODO and explain what's needed before we can remove the need for this to be an optional type / remove the checks / delete the PollerManager.

OnConnectCallback on_connect, const ResolvedAddress& addr,
const EndpointConfig& args, MemoryAllocator memory_allocator,
Duration timeout) {
CHECK(poller_manager_.has_value());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this crash if the experiment is disabled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is supposed to crash. See CreateEndpointFromUnconnectedFdInternal. I just made it more explicit and sooner.

@eugeneo eugeneo force-pushed the python-no-backup-poller-experiment branch from 8cae4bd to b4d011b Compare July 29, 2025 19:40
@eugeneo eugeneo force-pushed the python-no-backup-poller-experiment branch 8 times, most recently from 754c44b to 648631f Compare August 4, 2025 21:21
@eugeneo eugeneo force-pushed the python-no-backup-poller-experiment branch from 964aec9 to 3a0acf9 Compare August 14, 2025 21:12
github-actions bot and others added 2 commits August 14, 2025 14:31
Co-authored-by: eugeneo <287917+eugeneo@users.noreply.github.com>
@eugeneo eugeneo assigned Vignesh2208 and unassigned drfloob Aug 15, 2025
@eugeneo eugeneo requested review from Vignesh2208 and removed request for markdroth August 15, 2025 17:54
Copy link
Contributor

@Vignesh2208 Vignesh2208 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. Left some minor comments

bool EventEngineExperimentDisabledForPython() {
#ifdef GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER
return true;
return !grpc_core::IsEventEnginePollerForPythonEnabled();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: change the comment of this function. It doesn't always return true if GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER is defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried using Gemini to improve the comment...

#ifdef GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER
bool poller_enabled = grpc_core::IsEventEnginePollerForPythonEnabled();
#else // GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER
bool poller_enabled = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rename local variable to event_engine_poller_enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


void grpc_client_channel_global_init_backup_polling() {
#ifndef GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER
#ifdef GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consider renaming this macro to GRPC_BUILD_FOR_PYTHON to make the intention clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to discuss with Python team

#endif // GRPC_PLATFORM_SUPPORTS_POSIX_POLLING
}
AfterForkInChild();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the check on IsEventEngineForkEnabled() removed ?

should there be an else if (poller_ != nullptr) { poller_->HandleForkInChild(); } here if the IsEventEngineForkEnabled() returned false ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a CHECK.
Fork handlers should not be called if the fork is not enabled

@copybara-service copybara-service bot closed this in 7949731 Sep 8, 2025
@sergiitk
Copy link
Member

sergiitk commented Sep 8, 2025

Looks like this broke grpc/core/master/macos/grpc_basictests_python reconnect unittest. Only MacOS, only python 3.9.

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/unittest/case.py", line 59, in testPartExecutor
    yield
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/unittest/case.py", line 592, in run
    self._callTestMethod(testMethod)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/Volumes/BuildData/tmpfs/altsrc/github/grpc/workspace_python_macos_opt_native/src/python/grpcio_tests/tests/unit/_reconnect_test.py", line 240, in test_reconnect
    self.fail(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/unittest/case.py", line 676, in fail
    raise self.failureException(msg)
AssertionError: [TEST] Recovery verification failed: didn't receive 3 good calls in sequence. Failed calls: 16 out of 16

https://source.cloud.google.com/results/invocations/c1ba0a1f-f81a-48d6-9047-45cc88f09e0b

eugeneo pushed a commit to eugeneo/grpc that referenced this pull request Sep 9, 2025
eugeneo pushed a commit to eugeneo/grpc that referenced this pull request Sep 9, 2025
asheshvidyut pushed a commit to asheshvidyut/grpc that referenced this pull request Sep 12, 2025
…rpc#40243)

This commit introduces the `event_engine_poller_for_python` experiment, allowing for controlled rollout and testing of the EventEngine's Posix poller specifically within gRPC Python bindings.

**Crucially, this change does *not* alter the default behavior for builds where `GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER` is *not* defined.** In such configurations, the Posix EventEngine poller will continue to be enabled unconditionally, preserving existing functionality.

The primary impact of this change is when `GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER` *is* defined (e.g., in gRPC Python's build system). In this scenario, the enablement of the Posix poller transitions from being implicitly disabled to being configurable via the `event_engine_poller_for_python` experiment flag. This enables a controlled, experimental rollout of the EventEngine poller in environments that previously opted out of its direct instantiation.

Closes grpc#40243

COPYBARA_INTEGRATE_REVIEW=grpc#40243 from eugeneo:python-no-backup-poller-experiment 17906e6
PiperOrigin-RevId: 804472655
@anniefrchz anniefrchz mentioned this pull request Oct 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants