Improve request cancelation handling. #103

jpd236 · 2017-09-27T23:20:53Z

Provide stronger thread safety guarantees by locking access to any
variable which can be mutated during the processing of a request (i.e.
by the CacheDispatcher/NetworkDispatcher threads) and dispatching
response callbacks while holding the same lock that is held when
setting mCanceled. Unfortunately there's not much we can do about
alternative ResponseDelivery mechanisms, which are hopefully rare.
Add a cancel listener interface and use it to prevent RequestFuture
from blocking for the complete timeout (or indefinitely) if a request
is canceled.
Since listeners are often Android component classes like Activitys,
clear them upon cancellation. Unfortunately, this must be done in each
Request subclass to work, assuming the subclass follows the standard
pattern of holding the response listener as a member. But this enables
concerned apps to resolve this leak.

Fixes #15
Fixes #85

- Provide stronger thread safety guarantees by locking access to any variable which can be mutated during the processing of a request (i.e. by the CacheDispatcher/NetworkDispatcher threads) and dispatching response callbacks while holding the same lock that is held when setting mCanceled. Unfortunately there's not much we can do about alternative ResponseDelivery mechanisms, which are hopefully rare. - Add a cancel listener interface and use it to prevent RequestFuture from blocking for the complete timeout (or indefinitely) if a request is canceled. - Since listeners are often Android component classes like Activitys, clear them upon cancellation. Unfortunately, this must be done in each Request subclass to work, assuming the subclass follows the standard pattern of holding the response listener as a member. But this enables concerned apps to resolve this leak. Fixes google#15 Fixes google#85

googlebot · 2017-09-27T23:20:55Z

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.

If you've already signed a CLA, it's possible we don't have your GitHub username or you're using a different email address. Check your existing CLA data and verify that your email is set on your git commits.
If your company signed a CLA, they designated a Point of Contact who decides which employees are authorized to participate. You may need to contact the Point of Contact for your company and ask to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the project maintainer to go/cla#troubleshoot.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again.

jpd236 · 2017-09-27T23:27:28Z

@googlebot is apparently having issues right now; the above comment can be ignored and I'll request a re-check once it's resolved.

joebowbeer

I think this should be deferred to 1.0.2.

I suggest releasing a fix for the DiskBasedCache OOME/NAIE issues sooner rather than later -- with few additional (potentially destabilizing) changes.

jpd236 · 2017-09-27T23:36:06Z

I appreciate the concern, but given the volume of reports of the leak in #15 and duped reports, I'd like to get this resolved as well in the next release. I also find the bug with RequestFuture hanging forever in wait() to be pretty severe in nature - it caused thread pool exhaustion in one of our internal apps. The DiskBasedCache bug is severe as well when it does happen, but has been present for quite some time already.

This does happen to be the last planned change on the roadmap for 1.0.1, though. I wouldn't anticipate other changes being merged before release unless they fix regressions introduced since 1.0.0. So hopefully we'll have an RC build ready soon.

joebowbeer · 2017-09-27T23:34:42Z

src/main/java/com/android/volley/Request.java

@@ -56,6 +56,12 @@
        int PATCH = 7;
    }

+    /** Callback interface to notify listeners when a request has been canceled. */
+    public interface CancelListener {


The names seems odd. Ideas for a better name:

OnRequestCanceledListener

CancellationListener

OnCancelListener

CancellationListener makes sense - the intent is to be consistent with "ErrorListener", so the prefixes should both be nouns describing what happened.

joebowbeer · 2017-09-27T23:37:01Z

src/main/java/com/android/volley/Request.java

+        // To uphold the contract of cancel(), we hold the same lock here around both the check of
+        // isCanceled() as well as the actual response delivery. This ensures that after a call
+        // to cancel() returns, we will never deliver the response.
+        synchronized (mLock) {


The change from open calls (without lock held) to closed calls makes me nervous about deadlock.

joebowbeer

@jpd236 Then I suggest making the minimal change to alleviate the memory retention. Similar to the mcxiaoke fix, which has been in use for a while.

Table the lock changes for another release.

uhager · 2017-09-27T23:44:19Z

src/main/java/com/android/volley/Request.java

@@ -96,9 +105,11 @@
    private boolean mShouldCache = true;

    /** Whether or not this request has been canceled. */
+    // GuardedBy(mLock)


If you're OK with importing android.support.annotation.GuardedBy you could use @GuardedBy("mLock").

We don't have a dependency on the support lib yet, so for now I'm holding off on this. At some point later on we might add it to cover things like this, @VisibleForTesting, etc.

uhager · 2017-09-27T23:52:15Z

src/main/java/com/android/volley/Request.java

@@ -96,9 +105,11 @@
    private boolean mShouldCache = true;

    /** Whether or not this request has been canceled. */
+    // GuardedBy(mLock)
    private boolean mCanceled = false;



Instead of guarding this by a lock, would it be sufficient to make this volatile? Since we don't have a mechanism to un-cancel a request, we know that once isCancelled returns true, it will always return true, so we mainly have to make sure that once one thread cancels, all other threads see that.

It's not sufficient as it stands due to holding mLock around the response dispatching below; it might be sufficient if we change that, but I'm not sure that n volatile variables is necessarily more performant than a simple lock, and it's certainly more tricky to reason about :)

Looking at this a little closer, I see that synchronization is being added to methods that are likely to be called on the main thread. This should be avoided. In particular, Request.cancel() is waiting for a lock held by the ExecutorDelivery.

uhager · 2017-09-27T23:55:11Z

src/main/java/com/android/volley/Request.java

-        mCanceled = true;
+        synchronized (mLock) {
+            mCanceled = true;
+            onCanceledInternal();


Since we don't know what happens in onCancelled, should we maybe check here if the request is already cancelled and avoid double-cancellation? Especially if you make mCanceled volatile as suggested above rather than guarded, but I think it's safer either way.

That's a fair point; I could change this to avoid double-cancellation.

uhager · 2017-09-28T00:05:08Z

src/main/java/com/android/volley/ExecutorDelivery.java

-            if (mRunnable != null) {
-                mRunnable.run();
-            }
+            mRequest.deliverResponse(mResponse, mRunnable);


Could you call that something different, maybe handleResponse or prepareResponseDelivery of maybeDeliverResponse or something? It was a bit confusing to read for me since there are now 2 deliverResponse with different arguments and having different jobs.

Yeah, agreed this is confusing; I couldn't come up with a better name. If we end up keeping this (per my other comment) then maybe deliverResponseAndFinish() or something to indicate the additional work that this method is responsible for doing.

uhager · 2017-09-28T00:20:41Z

src/main/java/com/android/volley/Request.java

+        // To uphold the contract of cancel(), we hold the same lock here around both the check of
+        // isCanceled() as well as the actual response delivery. This ensures that after a call
+        // to cancel() returns, we will never deliver the response.
+        synchronized (mLock) {


That's a lot of stuff behind this lock, I'm not sure that's needed. I think instead we could add a check whether finish() has been called (addMarker() after finish() leads to IllegalStateException). Since this method (and finish) are package-private it is anyway not possible for users to access them without using ExecutorDelivery. And as long as ExecutorDelivery always runs them on the same thread, we can't have conflicts here as far as I can tell. What do you think?

@joebowbeer I acknowledge the concern, but I don't see a way to properly implement the guarantees that are advertised by cancel() without ensuring that the same lock is held for both cancel() and for the response dispatching itself. Without this lock, it's possible for someone to call cancel() and still have the callbacks called (if we've already checked isCanceled() here).

@uhager It doesn't seem to me like finish() is always called (in the response.intermediate case) so I'm not sure that works in every case, right? And the assumption about ExecutorDelivery running on the same thread is the one this locking is intended to resolve.

Another option might be to weaken the guarantee by stating that you must call cancel() from the same thread as the ResponseDelivery being used (typically the main thread) in order to guarantee that the callbacks aren't invoked. If you call cancel() from another thread, and the main thread is in the process of delivering responses, it may still end up delivering the response. In this case we could leave the dispatching as it was and just accept the risk of a post-cancel dispatch. However, this would mean needing to lock writes to the error listener / response listener.

@jjoslin - any thoughts?

AFAICT the same lock needs to be held during cancel() and response dispatching to uphold the cancel() contract as @jpd236 states. I feel like this is a reasonable thing to do but also acknowledge that it does create the potential for deadlock (although I think it will be rare in practice). I support/prefer the current implementation but I wouldn't be opposed to weakening the guarantee either.

jpd236

Thank you both for the prompt reviews!

Regarding keeping the scope small: I'm not sure what the other proposed fix would be but the memory retention is only one of the two issues being resolved here, the other being the RequestFuture cancellation bug. Note that just the minimal fix of clearing listeners in cancel() can only safely be done if cancel invocation is thread-safe.

I also think the ship has already sailed re: small bug fixes; the Apache header changes and soft TTL tweaks that recently landed are just as large if not more so. Volley's 1.0.0 release has been out for some time and I believe the vast majority of clients are not running into issues, so I don't see a strong reason to rush a release prematurely. We'll get there soon, and affected clients may use SNAPSHOT builds in the mean time (and can pin to a specific snapshot if they're concerned about picking up less-tested changes).

jpd236 · 2017-09-28T00:11:09Z

src/main/java/com/android/volley/Request.java

@@ -56,6 +56,12 @@
        int PATCH = 7;
    }

+    /** Callback interface to notify listeners when a request has been canceled. */
+    public interface CancelListener {


CancellationListener makes sense - the intent is to be consistent with "ErrorListener", so the prefixes should both be nouns describing what happened.

jpd236 · 2017-09-28T00:24:41Z

src/main/java/com/android/volley/ExecutorDelivery.java

-            if (mRunnable != null) {
-                mRunnable.run();
-            }
+            mRequest.deliverResponse(mResponse, mRunnable);


Yeah, agreed this is confusing; I couldn't come up with a better name. If we end up keeping this (per my other comment) then maybe deliverResponseAndFinish() or something to indicate the additional work that this method is responsible for doing.

jpd236 · 2017-09-28T00:25:08Z

src/main/java/com/android/volley/Request.java

@@ -96,9 +105,11 @@
    private boolean mShouldCache = true;

    /** Whether or not this request has been canceled. */
+    // GuardedBy(mLock)


We don't have a dependency on the support lib yet, so for now I'm holding off on this. At some point later on we might add it to cover things like this, @VisibleForTesting, etc.

jpd236 · 2017-09-28T00:25:56Z

src/main/java/com/android/volley/Request.java

@@ -96,9 +105,11 @@
    private boolean mShouldCache = true;

    /** Whether or not this request has been canceled. */
+    // GuardedBy(mLock)
    private boolean mCanceled = false;



It's not sufficient as it stands due to holding mLock around the response dispatching below; it might be sufficient if we change that, but I'm not sure that n volatile variables is necessarily more performant than a simple lock, and it's certainly more tricky to reason about :)

jpd236 · 2017-09-28T00:26:26Z

src/main/java/com/android/volley/Request.java

-        mCanceled = true;
+        synchronized (mLock) {
+            mCanceled = true;
+            onCanceledInternal();


That's a fair point; I could change this to avoid double-cancellation.

jpd236 · 2017-09-28T00:29:03Z

src/main/java/com/android/volley/Request.java

+        // To uphold the contract of cancel(), we hold the same lock here around both the check of
+        // isCanceled() as well as the actual response delivery. This ensures that after a call
+        // to cancel() returns, we will never deliver the response.
+        synchronized (mLock) {


@joebowbeer I acknowledge the concern, but I don't see a way to properly implement the guarantees that are advertised by cancel() without ensuring that the same lock is held for both cancel() and for the response dispatching itself. Without this lock, it's possible for someone to call cancel() and still have the callbacks called (if we've already checked isCanceled() here).

@uhager It doesn't seem to me like finish() is always called (in the response.intermediate case) so I'm not sure that works in every case, right? And the assumption about ExecutorDelivery running on the same thread is the one this locking is intended to resolve.

Another option might be to weaken the guarantee by stating that you must call cancel() from the same thread as the ResponseDelivery being used (typically the main thread) in order to guarantee that the callbacks aren't invoked. If you call cancel() from another thread, and the main thread is in the process of delivering responses, it may still end up delivering the response. In this case we could leave the dispatching as it was and just accept the risk of a post-cancel dispatch. However, this would mean needing to lock writes to the error listener / response listener.

@jjoslin - any thoughts?

uhager · 2017-09-28T00:56:54Z

@jpd236 I meant that ExecutorDelivery always sends deliverResponse on the same Handler. So deliverResponse can't be called simultaneously, since - as far as I can tell - no other class calls it and it's not accessible to users. Or does RequestFuture change that behaviour in ExecutorDelivery?
For an intermediate response, we want it to be allowed to be called again, just not after finish was called. So if 2 threads call ExecutorDelivery#postResponse with non-intermediate responses, only the first deliverResponse will run, the second one will see that finish has been called and return. If we allow users to call deliverResponse(response, runnable), or it is called somewhere else in Volley, then I agree, we need some locking.

uhager · 2017-09-28T01:51:58Z

Thinking about this some more, wouldn't it make more sense to synchronize access to mErrorListener? Where it is set to null when cancelled and then in deliverResponse around deliverError? Check if it's non-null and then synchronize using it?
I'm just worried that this is a very big block that is locked and can block a thread that tries to cancel the request for some time. At the very least we should use a dedicated lock for this rather than reusing the other one.

googlebot · 2017-09-28T02:27:34Z

CLAs look good, thanks!

jjoslin · 2017-09-29T18:08:49Z

src/main/java/com/android/volley/Request.java

+
+    private void onCanceledInternal() {
+        mErrorListener = null;
+        if (mCancelListener != null) {


I think you need to synchronize access to mCancelListener because setCancelListener() could be called on a different thread. Or pass it in through the ctor and make it volatile.

jjoslin · 2017-09-29T18:14:27Z

src/main/java/com/android/volley/Request.java

+    }
+
+    /** Sets a {@link CancelListener} which will be notified when the request is canceled. */
+    public void setCancelListener(CancelListener listener) {


Why not add another ctor to accommodate this?

jjoslin · 2017-09-29T18:17:34Z

src/main/java/com/android/volley/Request.java

+    }
+
+    private void onCanceledInternal() {
+        mErrorListener = null;


mErrorListener needs to be volatile (or access to it needs to be synchronized) to ensure getErrorListener() returns the correct value. Same for mCancelListener.

jjoslin · 2017-09-29T21:17:38Z

src/main/java/com/android/volley/Request.java

+    private void onCanceledInternal() {
+        mErrorListener = null;
+        if (mCancelListener != null) {
+            mCancelListener.onRequestCanceled();


The other callbacks run on the same thread as the ResponseDelivery being used whereas this runs on the calling thread. I realize in most cases those threads will be the same (main thread) but what do you think about providing a stronger guarantee about the thread onRequestCanceled() runs on?

jjoslin · 2017-09-29T21:34:16Z

src/main/java/com/android/volley/Request.java

+        // To uphold the contract of cancel(), we hold the same lock here around both the check of
+        // isCanceled() as well as the actual response delivery. This ensures that after a call
+        // to cancel() returns, we will never deliver the response.
+        synchronized (mLock) {


AFAICT the same lock needs to be held during cancel() and response dispatching to uphold the cancel() contract as @jpd236 states. I feel like this is a reasonable thing to do but also acknowledge that it does create the potential for deadlock (although I think it will be rare in practice). I support/prefer the current implementation but I wouldn't be opposed to weakening the guarantee either.

joebowbeer

Please remove any new locking from code that can be called from main thread.

joebowbeer · 2017-09-30T01:04:21Z

src/main/java/com/android/volley/Request.java

@@ -96,9 +105,11 @@
    private boolean mShouldCache = true;

    /** Whether or not this request has been canceled. */
+    // GuardedBy(mLock)
    private boolean mCanceled = false;



Looking at this a little closer, I see that synchronization is being added to methods that are likely to be called on the main thread. This should be avoided. In particular, Request.cancel() is waiting for a lock held by the ExecutorDelivery.

joebowbeer · 2017-09-30T01:30:17Z

src/main/java/com/android/volley/toolbox/RequestFuture.java

@@ -51,9 +52,10 @@
 *
 * @param <T> The type of parsed response this future expects.
 */
-public class RequestFuture<T> implements Future<T>, Response.Listener<T>,
-       Response.ErrorListener {
+public class RequestFuture<T> implements Future<T>, Response.Listener<T>, Response.ErrorListener,


I'm not sure how much difference this makes, but the pattern I like to use for classes that implement Future is to have them wrap an internal FutureTask instance to handle the synchronization. For example, see BackgroundTask in JCiP (Ch. 9 ). This makes it easier to get the synchronization correct, and makes it more apparent that it is.

jpd236 · 2017-10-03T18:26:45Z

OK - this PR seems too contentious to try and merge all at once :) Let me try to split it into more reasonable chunks so it's more manageable; I'll send them out separately.

First of all, thanks all for the detailed comments; I really do appreciate it.

For the leak, I need to think it through a bit more but I think you're probably right that we can just synchronize access to mErrorListener/mListener in Request and each of the toolbox subclasses, and then clear them on cancel. This doesn't guarantee that deliverResponse()/deliverError() won't be called after calling cancel(), but calling them should just no-op if the respective listener hasn't been nulled out, so we should be okay just updating the Javadoc to note this limitation - there are no guarantees if cancel() is called from a different thread than the response delivery. It also leaves this locking up to the Request subclass which isn't ideal, but there's not much we can do about that without holding our own lock for code we don't control.

Please note that grabbing a lock on the main thread should not be a problem; it should finish in microseconds. It's only an issue if that lock is contentious or if the work performed inside the lock can take a non-trivial amount of time. Certainly if it's safe to avoid it's better, as with anything on the main thread, but I don't think it's a concern here when necessary.

For the cancel listener - it'll be a bit more involved, but a possible option here is to add a postCancel to ResponseDelivery and plumb cancel events through there, so for our default implementation it'll run on the main thread. The main issue is that ResponseDelivery is an interface, so as we did with the Apache HTTP refactoring for HttpStack, we'd need to add a new interface and continue to support the old one (dropping cancellation events) for anyone using it. I may punt on this for a while given the complexity and uncertainty about the API, and instead just update RequestFuture to warn about using wait(0) in some form.

I think that covers the higher level open questions. For the nitty-gritty we can cover that on each PR.

jpd236 · 2017-10-03T19:14:34Z

Leak/thread-safety: #105

jpd236 requested a review from jjoslin September 27, 2017 23:20

joebowbeer reviewed Sep 27, 2017

View reviewed changes

joebowbeer suggested changes Sep 27, 2017

View reviewed changes

uhager reviewed Sep 28, 2017

View reviewed changes

jpd236 commented Sep 28, 2017

View reviewed changes

jjoslin reviewed Sep 29, 2017

View reviewed changes

joebowbeer suggested changes Sep 30, 2017

View reviewed changes

jpd236 mentioned this pull request Oct 3, 2017

RequestFuture#get blocks forever if an in-flight request is cancelled #85

Open

jpd236 mentioned this pull request Oct 10, 2019

RequestQueue#cancelAll race condition #299

Closed

Improve request cancelation handling. #103

Are you sure you want to change the base?

Improve request cancelation handling. #103

Conversation

jpd236 commented Sep 27, 2017

googlebot commented Sep 27, 2017

jpd236 commented Sep 27, 2017

joebowbeer left a comment

Choose a reason for hiding this comment

jpd236 commented Sep 27, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joebowbeer left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jpd236 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

uhager commented Sep 28, 2017

uhager commented Sep 28, 2017

googlebot commented Sep 28, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joebowbeer left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jpd236 commented Oct 3, 2017

jpd236 commented Oct 3, 2017