Skip to content

ActiveSupport: Refactor Notifications::Fanout to support interleaved start/finish#43241

Closed
theojulienne wants to merge 1 commit intorails:mainfrom
theojulienne:notification-avoid-global-state
Closed

ActiveSupport: Refactor Notifications::Fanout to support interleaved start/finish#43241
theojulienne wants to merge 1 commit intorails:mainfrom
theojulienne:notification-avoid-global-state

Conversation

@theojulienne
Copy link
Copy Markdown
Contributor

@theojulienne theojulienne commented Sep 17, 2021

Summary

Currently the way the ActiveSupport::Notifications::Instrumenter API is written, the #start method is allowed to return state that is then returned in #finish_with_state to ensure instrumented timings and events are distributed to the same listeners, but it also presents an API that appears to support calling #start and #finish_with_state without any explicit ordering between multiple calls of these operations. However, the way that ActiveSupport::Notifications::Fanout is currently implemented, only the list of listeners is maintained using this state, while any state relating to the Evented, Timed, MonotonicTimed, and EventObject subscribers are stored using a stack in thread-local storage.

When instrumenting code that doesn't start and end within a single block, the public #start and #finish_with_state methods would seemingly support being interleaved, but in reality because of the usage of a thread-local stack they do not. This became a problem at GitHub during testing and instrumentation of IOPromises, but it would be the same for anything that needs to be instrumented that doesn't guarantee that start/finish are executed in the same (stack) order.

An example of code that would currently misbehave:

start1 = instrumenter.start "foo"
start2 = instrumenter.start "bar"
instrumenter.finish_with_state start1, "foo" # accidentally picks up the start time of start2/bar
instrumenter.finish_with_state start2, "bar" # accidentally picks up the start time of start1/foo

This PR attempts to simplify this logic to just be implemented once in Fanout rather than every subscriber class, and to ensure that state is used to store all intermediate state if it is provided, but also retains backwards compatibility of using a thread-local stack when it is not provided.

Other Information

The implementation in this PR allows continued usage of the #finish variant that relies on a thread-local stack. It does this by allowing the subscriber classes to return state, and storing that in a single stack, which is used if #finish is used rather than #finish_with_state. This means that code that uses #finish_with_state (including by extension #instrument) are guaranteed to get the correct state regardless of interleaving of events. Older code that might use #start and #finish will use the thread-local stack and will not support interleaving of events (retaining the current behaviour).

The listeners provided to Fanout can also now implement #finish_with_state and this will be preferenced over #finish. This allows backwards compatibility with custom listeners (as per the unchanged tests there), but allows custom class-based listeners to adopt the state as well, in case they want to have their state returned to them.

@jhawthorn
Copy link
Copy Markdown
Member

Thanks for opening this. It's definitely something we should solve.

I am concerned there will be other places not prepared for events to be out of order. For example AS::Subscriber seems to keep a thread local stack to group events. I'd love to remove this but we'll have to see how/where it is used https://github.com/rails/rails/blob/main/activesupport/lib/active_support/subscriber.rb#L135-L162

One thing I'd love to do, seeing as we're adjusting the public API a little, is use this as an opportunity to speed up and improve the notification handling. It's always bugged me how many duplicate Time objects we allocate and the levels of indirection (This is definitely outside the scope of what is needed here, but I'd like to try to spike on that this week).

Another thing I'm wondering is whether we could use the publish_event interface added in #41590. If we record the start and end on the event correctly before publishing it feels like we would avoid any interleaving issues (or need to support old thread-stack behaviour).

@jhawthorn
Copy link
Copy Markdown
Member

The listeners provided to Fanout can also now implement #finish_with_state and this will be preferenced over #finish. This allows backwards compatibility with custom listeners (as per the unchanged tests there), but allows custom class-based listeners to adopt the state as well, in case they want to have their state returned to them.

Having looked a bit more I'd rather we not support this (specifically arbitrary data from listeners). Allowing state other than plain Event objects limits the frameworks flexibility here and is going to require keeping track of duplicate data when there are many subscribers.

I'm wondering if a version of finish (finish_with_event?) which was passed an event rather than a payload could solve both the problem identified here, encourage better support of #41590, and speed up notifications.

@rails-bot
Copy link
Copy Markdown

rails-bot bot commented Jan 5, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Thank you for your contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants