Skip to content

Guard against subscriber exceptions in ActiveSupport::Notifications#43282

Merged
rafaelfranca merged 3 commits intorails:mainfrom
theojulienne:guard-against-instrumentation-exceptions
Oct 8, 2021
Merged

Guard against subscriber exceptions in ActiveSupport::Notifications#43282
rafaelfranca merged 3 commits intorails:mainfrom
theojulienne:guard-against-instrumentation-exceptions

Conversation

@theojulienne
Copy link
Copy Markdown
Contributor

Summary

Currently the fanout loops in ActiveSupport::Notifications::Fanout to send notifications to subscribers performs no exception handling, meaning any exceptions in a subscriber immediately break out of the notification loop.

This means notifications are delivered to an arbitrary subset of the listeners, depending on subscription order. Additionally, some of these listeners (especially the ones defined in the Fanout class) rely on a Thread.current stack to store information about start events for when finish events are later emitted. If an exception occurs during either the start or finish process in one listener, other listeners will unfortunately be left uncalled, leaving a consistency problem because the stack will still contain the elements that are complete but never had #finish called due to an earlier exception.

This PR improves this situation by ensuring that fanout events will always run on all listeners, with any exceptions accumulated and then wrapped in an InstrumentationSubscriberError. This means an erroring #finish subscriber (the most likely) will cause all working instrumentation to succeed, before a InstrumentationSubscriberError is finally raised for any broken ones. This ensures that the minimal number of subscribers are impacted, and in the case of the Thread.current stack, also ensures that the stack is cleaned up for all the Timed, MonotonicTimed, etc listener classes defined here.

Callers can catch InstrumentationSubscriberError and inspect the exceptions attribute to find the original exceptions from subscribers and handle them accordingly.

This PR touches the same code paths as #43241, and will be trivial to adjust for that PR by specifying the iteration type like iterate_guarding_exceptions(..., :each) and iterate_guarding_exceptions(..., :map), but I left this additional complexity out of this PR so it was simpler, and can update the other PR as needed if this one is accepted.

@theojulienne theojulienne force-pushed the guard-against-instrumentation-exceptions branch from dd862cc to 8afcbb6 Compare September 22, 2021 02:07
@theojulienne
Copy link
Copy Markdown
Contributor Author

theojulienne commented Sep 22, 2021

One open question here is whether to adjust the default implementation to be safer as is done in this PR (but requires changes to tests that rely on this behaviour, which may mean it's too expected), or whether to use an optional subclassed Fanout or an option to enable it. It would be pretty easy to leave the changes to use a method like #iterate_guarding_exceptions that a subclass could use to introduce this behaviour, which would make it optional to adopt.

That said, I do think the consistency issues with the unfortunate usage of Thread.current do suggest it may be worth changing behaviour.

@theojulienne theojulienne force-pushed the guard-against-instrumentation-exceptions branch from b17cc36 to f4a7942 Compare September 22, 2021 02:43

listeners.each do |s|
yield s
rescue => e
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not we use rescue Exception => e here? If our goal is make sure no exception will stop the chain I think it makes sense,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I think that does make sense, given the consistency issues would be equally bad regardless of exception type.

@jhawthorn
Copy link
Copy Markdown
Member

Thanks!

That said, I do think the consistency issues with the unfortunate usage of Thread.current do suggest it may be worth changing behaviour.

If we solve these with an approach like #43241 does that make this less necessary or is there still a need?

@theojulienne
Copy link
Copy Markdown
Contributor Author

If we solve these with an approach like #43241 does that make this less necessary or is there still a need?

I think there is an argument that it would be nice to see instrumenters work as independently as possible from each-other, regardless of whether the rails internals use a Thread.current or not.

In any case, #43241 still relies on Thread.current for backwards-compatibility and so would fall into the same trap, unless the path without state was fully deprecated.

@rafaelfranca rafaelfranca merged commit 5e1a039 into rails:main Oct 8, 2021
casperisfine pushed a commit to Shopify/rails that referenced this pull request Oct 28, 2021
…t-instrumentation-exceptions"

This reverts commit 5e1a039, reversing
changes made to 720f6b1.
casperisfine pushed a commit to Shopify/rails that referenced this pull request Nov 2, 2021
Ref: rails#43282
Ref: rails#43561

It can be legitimate for subscriber to want to bubble up some exception
to the caller, so wrapping it change the exception class which might break
the calling code expecting a specific error.

We can minimize this by only using InstrumentationSubscriberError
when more than one subscriber raised.
jhawthorn added a commit to jhawthorn/rails that referenced this pull request Oct 25, 2023
Previously in rails#43282, which shipped
with Rails 7.0, we added the guarantee that if an exception occurred
within an ActiveSupport::Notificaitons subscriber that the remaining
subscribers would still be run.

This was broken in rails#44469, where the
different types of subscribers were broken into groups by type for
performance. Although we would guard exceptions and allways run all (or
none) of the same group, that didn't work cross-group with different
types of subscriber.
yoones pushed a commit to yoones/rails that referenced this pull request Mar 6, 2025
Previously in rails#43282, which shipped
with Rails 7.0, we added the guarantee that if an exception occurred
within an ActiveSupport::Notificaitons subscriber that the remaining
subscribers would still be run.

This was broken in rails#44469, where the
different types of subscribers were broken into groups by type for
performance. Although we would guard exceptions and allways run all (or
none) of the same group, that didn't work cross-group with different
types of subscriber.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants