Fix leak of callbacks on canceled requests.#105
Conversation
We clear mErrorListener on cancel(), and mListener in all Request subclasses in the toolbox package, and synchronize all reads/writes of these fields. This ensures that we don't hold a reference to the listeners after a request is canceled. Custom subclasses of Request will need to do the same if they are concerned about this leak (which only lasts as long as the network request takes, in the worst case). Also fix some lingering thread-safety issues for any variable in Request which can be mutated during the processing of a request and which is read from multiple threads. This provides a somewhat stronger guarantee around callback invocation after cancel than we previously had (although a weaker one than we incorrectly advertised) - after calling cancel(), we won't deliver the callback as long as either cancel() was called on the same thread as the ResponseDelivery being used or if the Request subclass handles clearing its listener on cancel() correctly. Fixes google#15
| @@ -99,6 +103,7 @@ public interface Method { | |||
| private boolean mCanceled = false; | |||
uhager
left a comment
There was a problem hiding this comment.
I think I proposed synchronizing cancel and ResponseDelivered in a CL some time ago (and we have them overridden with synchronisation on our end), so I like these changes.
| */ | ||
| public boolean isCanceled() { | ||
| return mCanceled; | ||
| synchronized (mLock) { |
There was a problem hiding this comment.
I think we should have a separate lock object for this. I'm just worried that one thread is trying to cancel while another is handling the waiting requests from the RequestCompleteListener. Then the cancelling thread (which may be trying to cancel a whole bunch of requests) has to wait.
There was a problem hiding this comment.
What if we altered the notify methods to only grab the current NetworkRequestCompleteListener while holding the lock, while moving the dispatch itself out of the lock? Or is there a reason that the dispatch itself needs to be part of the same critical section?
There was a problem hiding this comment.
That would be better actually, since the RequestCompleteListener doesn't actually need the request, just the response, so there is no reason for the request to wait.
|
Comments addressed; PTAL. Thanks! |
| listener = mRequestCompleteListener; | ||
| } | ||
| if (listener != null) { | ||
| listener.onResponseReceived(this, response); |
There was a problem hiding this comment.
I should have thought of this earlier, but since the listener doesn't need the request, just the CacheKey, should we change this (and the NoUsableResponse below) to send only that, i.e. onResponseReceived(getCacheKey, response)?
There was a problem hiding this comment.
That seems reasonable, but I think it's out of the scope of this pull request. (And not something we urgently need to resolve before the next public release, since this API for network responses is package-private, and since I don't think it's harmful to pass the request on even if it's unused).
joebowbeer
left a comment
There was a problem hiding this comment.
I like that there are no open calls with locks held.
Shouldn't the overriding cancel methods share their parent's lock?
Adding getter methods for the guarded state might improve the code by moving the lock handling out of the main body.
|
|
||
| @Override | ||
| public void cancel() { | ||
| super.cancel(); |
There was a problem hiding this comment.
This leaves a window of opportunity for mListener to remain assigned while the state guarded by the Request lock has already been cleared.
One locked region needs to contain both sets of changes, right?
There was a problem hiding this comment.
That is possible, yes, but I don't see any problems it would cause, i.e. reasons this approach wouldn't meet the more tightly-defined guarantees I've noted in the Request#cancel Javadoc, or scenarios holding the same lock would fix.
Do you see something bad that can happen during this window?
There was a problem hiding this comment.
The contract is that the request is either live and the state is consistent, or the request is canceled and the state is cleared. You can't guarantee this using two disjoint locks.
There was a problem hiding this comment.
Whether the request is canceled or not is solely up to mCanceled. Other variables being mutated in cancel() have no bearing on the cancellation state. There is no guarantee that the listener is cleared at the same time as mCanceled, but I don't see why that matters.
There was a problem hiding this comment.
I'm just trying to make sense of the changes. I think it's confusing to have an identically-named lock mLock in the parent and in every subclass when they all essentially do the same thing. In ImageRequest, for example, if you're not going to share the lock, I think it might be more readable to remove the new lock and declare the listener volatile. That eliminates the red flags regarding deadlock as well as the casual reader's assumption that mLock is shared. However, my intuition is that the parent+subclass state should be managed by one lock.
There was a problem hiding this comment.
(I'm going to go ahead and merge this soon; if there are follow up concerns that require code changes, I can address them in a separate change).
The lock in each class is a private variable; they aren't considered part of the API of the class. Subclasses of Request don't have to worry about mLock at all. They only optionally have to worry about nulling their listener in cancel() if they care about preventing the leak, and doing so in a thread safe way to avoid an NPE in deliverResponse after a call to cancel(). They can use volatile if they like instead, or just not worry about nulling it at all and it'll work exactly the same as it does today.
Declaring the listener volatile doesn't change anything about thread-safety guarantees - it still would be modified outside of the lock held by Request - and I think it is debatable whether it is easier to read as volatile variables can be very tricky to reason about vs. a simple lock, and also not necessarily more performant especially if you have a bunch of them.
Let me try to lay out in more detail why it's fine for the locks to be separate. ExecutorDelivery checks if a request is canceled, and if not, it invokes deliverResponse/deliverError. This is inherently not thread-safe, because the request can be canceled after the check but before the invocation. So, there are two other things that can be done to make it thread-safe, which are now clearly documented in Request#cancel():
- Ensure cancel() is invoked on the same thread as ResponseDelivery, which is the main thread by default. In this case, the two will be serialized in some order, so explicit locking isn't needed.
- Have the custom listener null out the listener when overriding cancel() using its own thread-safe mechanism. In this case, it is guaranteed that once cancel() returns to the caller, the listeners will never be invoked, even if called from another thread. That is because both listeners are nulled out within some lock that is also grabbed when the listeners are read in deliverResponse/deliverError, although the lock is different in each case.
Now it's certainly possible that isCanceled() will return true to some other thread than the one calling cancel(), even though the listener hasn't been nullified yet, if it is called before cancel() has returned. But this does not impact the guarantees above, and I don't think the benefits of offering such a guarantee are worth the API complexity for typical use cases of cancel.
Overall, the only purposes to nullifying the listener in subclasses are to:
- Mitigate a leak of the activity by dropping references to it when no longer needed. As long as the listener isn't invoked after being nullified, this can happen at any time after cancel().
- Try to offer a stronger guarantee about not invoking the listener after cancel() on a separate thread from the ResponseDelivery. Currently there is no guarantee at all. With this change, as long as you have ensured that cancel() has returned control to the caller, the listener won't be invoked.
The current approach should accomplish these goals, as best as I can tell.
|
|
||
| @Override | ||
| public void cancel() { | ||
| super.cancel(); |
There was a problem hiding this comment.
One locked region needs to contain both sets of changes?
|
|
||
| @Override | ||
| public void cancel() { | ||
| super.cancel(); |
| // @CallSuper | ||
| public void cancel() { | ||
| mCanceled = true; | ||
| synchronized (mLock) { |
There was a problem hiding this comment.
Shouldn't same lock be used by subclasses overriding cancel?
Overriding cancel methods should hold the common lock when they call super.cancel() and then clear their own state?
Providing synchronized getters for the guarded state might improve the code.
| private final int mDefaultTrafficStatsTag; | ||
|
|
||
| /** Lock to guard state which can be mutated after a request is added to the queue. */ | ||
| private final Object mLock = new Object(); |
There was a problem hiding this comment.
Should be protected so it can be shared by subclasses?
There was a problem hiding this comment.
This would be a pretty rough API for Request subclasses to implement, and it would mean losing control of the lock so subclasses could end up doing all kinds of bad things with the lock held, leading to the concerns you mentioned earlier about open-ended locks.
Most importantly, though, I don't think it strengthens any guarantees as noted in my other comment.
There was a problem hiding this comment.
I don't think it is no more difficult to implement than what your subclasses have already implemented, and in some ways is simpler -- in that they don't have to create their own lock.
Designing for extension is difficult, and even more difficult when locks are involved. But I don't think you can mitigate by hiding the cancellation lock from subclasses -- that just prevents them from correctly maintaining the state.
|
|
||
| private final Listener<T> mListener; | ||
| /** Lock to guard mListener as it is cleared on cancel() and read on delivery. */ | ||
| private final Object mLock = new Object(); |
There was a problem hiding this comment.
Should share lock with Request.cancel?
| private final Listener<String> mListener; | ||
|
|
||
| /** Lock to guard mListener as it is cleared on cancel() and read on delivery. */ | ||
| private final Object mLock = new Object(); |
There was a problem hiding this comment.
Share lock with Request.cancel?
|
|
||
| private final Response.Listener<Bitmap> mListener; | ||
| /** Lock to guard mListener as it is cleared on cancel() and read on delivery. */ | ||
| private final Object mLock = new Object(); |
We clear mErrorListener on cancel(), and mListener in all Request
subclasses in the toolbox package, and synchronize all reads/writes of
these fields. This ensures that we don't hold a reference to the
listeners after a request is canceled. Custom subclasses of Request
will need to do the same if they are concerned about this leak (which
only lasts as long as the network request takes, in the worst case).
Also fix some lingering thread-safety issues for any variable in
Request which can be mutated during the processing of a request and
which is read from multiple threads.
This provides a somewhat stronger guarantee around callback invocation
after cancel than we previously had (although a weaker one than we
incorrectly advertised) - after calling cancel(), we won't deliver the
callback as long as either cancel() was called on the same thread as
the ResponseDelivery being used or if the Request subclass handles
clearing its listener on cancel() correctly.
Fixes #15