Added timeout for closing devfs sync http connections. #66152
Conversation
|
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat. Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
|
Were you ever able to repro? I'd rather know what changed between 1.17 and 1.20 and track down the bug, rather than just timing out. @zanderso was investigating this area with #54123. There was also a lot of asset work done this release. @jonahwilliams @dnfield |
|
Ah, that PR was something else. As for this PR, 4 minutes sounds like an OS-level connection timeout. We might be able to bring that in by setting HttpClient.connectionTimeout. However, client request output is buffered (buffer size 8KB): https://api.dart.dev/dev/2.10.0-146.0.dev/dart-io/HttpClientRequest/bufferOutput.html, so we might not be trying to connect and send anything until the request is closed. If we think the transfer is taking too long, we should put a timeout on the whole transfer instead. As @jmagman says, however, it would be nice to find the root cause before adding the timeout, but adding the timeout is not terrible here since we've already got a retry loop. |
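A minimal sketch of the connectionTimeout idea mentioned above, assuming dart:io's HttpClient (the 10-second value is illustrative, not from this discussion):

import 'dart:io';

// Sketch only: connectionTimeout bounds how long establishing the connection
// may take; it does not bound a request that hangs after the connection is
// already open, which is why a per-request timeout is still being discussed.
HttpClient makeClient() {
  return HttpClient()
    ..connectionTimeout = const Duration(seconds: 10); // illustrative value
}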
|
Should we be using platform-specific tools to upload binaries/assets to each device instead of HTTP? Once we have the devFS location, presumably we could push the files with a platform-specific file-transfer tool. |
|
Wrote up go/flutter-improving-devfs-reliability to discuss some options |
|
In general I think playing with the timeouts here is a bit scary. We don't really know how long it should take to sync an asset |
|
@jmagman @jonahwilliams @zanderso Yes, I was able to reproduce it. The bug is in closing the http request. It is a bug in the Dart http library; I've filed an issue. We already have timeouts for this call. They are 4 minutes long, which is dictated by Dart. This timeout is one that is more amenable to our users and is 50x the observed duration during normal operation. Closing the request happens after the data has already been sent, so we do know how long it should take. The call before close is the one that scales based on the size of the payload. Regardless, the devfs sync operation retries 10 times, so even if it somehow prematurely times out, it would have to hit the pathological case and prematurely time out 9 more times, which is highly unlikely. (In the bug reports and in my observation, the stalled asset has always succeeded on the second attempt.) |
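As a rough sketch of the retry-plus-timeout behavior described here (illustrative only; the function name, parameters, and values are assumptions, not the flutter_tools code):

import 'dart:async';
import 'dart:io';

// Each attempt gets a short timeout; the sync only fails outright if every
// attempt in the retry budget times out. (The real change also aborts the
// hung request; see the discussion further down.)
Future<HttpClientResponse> closeWithRetries(
  Future<HttpClientRequest> Function() makeRequest, {
  int maxRetries = 10, // matches the retry count mentioned above
  Duration perAttemptTimeout = const Duration(milliseconds: 500),
}) async {
  for (int attempt = 0; attempt < maxRetries; attempt += 1) {
    final HttpClientRequest request = await makeRequest();
    try {
      return await request.close().timeout(perAttemptTimeout);
    } on TimeoutException {
      // Fall through and retry with a fresh request.
    }
  }
  throw HttpException('devfs sync timed out after $maxRetries attempts');
}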
|
One other thing to consider is that even if Dart fixes the http code, we will still want an appropriate timeout. |
|
It's also possible that the bug is in the devfs server. Do we own that code? I'm thinking it might be closing the connection when it receives the payload, before returning a status code; that would match my observations. |
|
Yes, the devFS is part of the dart vm service: https://github.com/dart-lang/sdk/blob/master/sdk/lib/vmservice/devfs.dart |
|
there are a few associated files there, but do note that we use the HTTP API and not the JSON-RPC API |
|
I agree that fixing this may involve changing/adding timeouts, but given that we don't really seem to know the cause, I'm a bit concerned that we'd just be papering over some other issue. If you can reproduce the syncing issues fairly consistently, it seems like the best course of action would be to identify the actual error condition we're hitting on the device/devFS.
I was never able to get it to repro. I can help bisecting if I can figure out how to do so. |
Just to link it here, @jonahwilliams opened #66266 |
|
I was able to capture the error in pcap: the HTTP PUT is number 1547. The server never sends us a status code back for that PUT, despite the data having been sent. The retry is number 2086. Here are the logs (with my patch recovering). I'll add these details to the Dart issue as well.
That's usually how internet protocols work; that's why there is the "Robustness principle" that dictates that we be liberal in what we accept. That means we need to consider the fact that the server, now and in the future, may not return a status code to our PUT request. |
|
We control both the client and the server here. If there is an ambient bug in the devFS logic that is only being triggered on iOS, we need to track that down. |
|
I've been able to verify that request.response.close() isn't getting called in the pathological case, by instrumenting the server with the following patch:

--- a/sdk/lib/_internal/vm/bin/vmservice_server.dart
+++ b/sdk/lib/_internal/vm/bin/vmservice_server.dart
@@ -307,13 +307,16 @@ class Server {
fsPath = fsPathList[0];
}
}
- } catch (e) {/* ignore */}
+ } catch (e) {
+ print('vmservice_server: error $e');
+ }
String result;
try {
result = await _service.devfs.handlePutStream(fsName, fsPath, fsUri,
request.cast<List<int>>().transform(gzip.decoder));
} catch (e) {
+ print('vmservice_server: handlePutStream error $e');
request.response.statusCode = HttpStatus.internalServerError;
request.response.write(e);
request.response.close();
@@ -326,6 +329,7 @@ class Server {
request.response.write(result);
}
request.response.close();
+ print('vmservice_server: request close! ${fsUri ?? ''}');
return;
}
if (request.method != 'GET') {
|
When printing out the PUT messages, as well as when we close the response, we find that the server code never got the PUT request despite what the client thinks. |
|
I investigated using HttpClient.connectionTimeout and it didn't catch the problem case:

--- a/packages/flutter_tools/lib/src/devfs.dart
+++ b/packages/flutter_tools/lib/src/devfs.dart
@@ -220,7 +220,9 @@ class _DevFSHttpWriter {
: httpAddress = serviceProtocol.httpAddress,
_client = httpClient,
_osUtils = osUtils,
- _logger = logger;
+ _logger = logger {
+ _client.connectionTimeout = Duration(milliseconds:500);
+ }
final HttpClient _client;
final OperatingSystemUtils _osUtils;
@@ -273,7 +275,7 @@ class _DevFSHttpWriter {
// The contents has already been streamed, closing the request should
// not take long but we are experiencing hangs with it, see #63869.
final HttpClientResponse response =
- await request.close().timeout(const Duration(milliseconds: 500));
+ await request.close();
response.listen((_) {},
onError: (dynamic error) {
_logger.printTrace('error: $error');
|
I updated the PR so that it aborts the request in the case of a timeout. |
|
I did some further investigation into the root cause of the missing status code and filed a Dart bug here: dart-lang/sdk#43525. It seems like there might be something getting gobbled up in _HttpParser. The root cause wasn't established.
The docs for abort() say that this will cause HttpClientRequest.done to be completed with an error, which I would have expected to be propagated to the containing zone, skipping over the catch block below since it is an asynchronous error.
Even if that isn't what should happen, we can be on the safe side and put an await request.done after this line, which should definitely force the error into the catch block below.
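A sketch of the shape being suggested (illustrative, not the PR's exact diff): abort the hung request on timeout, then await its done future so the abort error is observed locally.

import 'dart:async';
import 'dart:io';

// Sketch only: abort() completes request.done with an error; awaiting done
// here attaches a listener, so the error is handled in this function instead
// of reaching the zone's unhandled-error handler.
Future<HttpClientResponse?> closeOrAbort(HttpClientRequest request) async {
  try {
    return await request.close().timeout(const Duration(milliseconds: 500));
  } on TimeoutException {
    request.abort();
    try {
      await request.done;
    } catch (_) {
      // Expected: the abort error lands here.
    }
    return null; // callers treat null as "retry"
  }
}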
I made the change. I tried to understand what the concern is and had a hard time figuring it out. Is your concern that there were some asynchronous operations in abort() that could be short-circuited by the TimeoutException?
Without awaiting the request's done Future, the TimeoutException would be rethrown and caught below as expected, but in addition, abort() will complete the done Future with an error (probably in a later microtask). When a Future is completed with an error, and it doesn't have an error handling callback attached with e.g. Future.catchError or isn't awaited, then the error skips over synchronous try {} catch {} blocks, and is passed to the unhandled exception callback of the containing Zone. We don't want that to happen here because that callback will report the exception to crash logging.
If you aren't seeing the abort() result in an error like that, then that behavior doesn't match the API docs...
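A small self-contained demo (not from the PR) of the behavior described above: an error on a Future that nothing listens to skips the synchronous catch block and surfaces in the zone's error handler instead.

import 'dart:async';

Future<void> main() async {
  await runZonedGuarded(() async {
    final Completer<void> done = Completer<void>();
    try {
      // Complete the future with an error in a later microtask and never
      // listen to it, mirroring what abort() does to HttpClientRequest.done.
      scheduleMicrotask(() => done.completeError(StateError('aborted')));
    } catch (error) {
      // Never reached: the error is asynchronous, so it skips this
      // synchronous catch block entirely.
      print('caught: $error');
    }
    await Future<void>.delayed(Duration.zero);
  }, (Object error, StackTrace stack) {
    // The unhandled error surfaces here, in the zone's error handler.
    print('zone handler saw: $error');
  });
}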
is passed to the unhandled exception callback of the containing Zone
Ah, ok. That's the piece I was missing, thanks. I was thinking that the abort exception would silently be ignored since there was no one to catch it.
zanderso
left a comment
lgtm
nit: no newline after '=', two space indent on continued line.
final HttpClientResponse response =
    await request.close().timeout(const Duration(milliseconds: 500));

Suggested:

final HttpClientResponse response = await request.close().timeout(
  const Duration(milliseconds: 500),
);
done
|
This pull request is not suitable for automatic merging in its current state.
|
'test',
fileSystem.currentDirectory,
fileSystem: fileSystem,
logger: BufferLogger.test(),
We probably want to take a look at the stdout and stderr from the BufferLogger to make sure their contents are as expected.
Done.
when(httpRequest.close()).thenAnswer((Invocation invocation) async {
  if (!didAbort) {
    while (!didAbort) {
      await new Future.delayed(const Duration(milliseconds: 10));
    }
    throw HttpException('aborted');
  }
  didRetry = true;
  return httpClientResponse;
});
when(httpRequest.abort()).thenAnswer((_) {
  didAbort = true;
});
We can use completers to avoid the polling loop. This also uses a completer for the result of close(), so that the same Future can be returned for httpRequest.done.
final Completer<void> abortCompleter = Completer<void>();
final Completer<HttpClientResponse> responseCompleter = Completer<HttpClientResponse>();
when(httpRequest.close()).thenAnswer((Invocation invocation) async {
  if (!didAbort) {
    await abortCompleter.future;
    unawaited(Future<void>(() async {
      responseCompleter.completeError(HttpException('aborted'));
    }));
    return responseCompleter.future;
  }
  didRetry = true;
  return httpClientResponse;
});
when(httpRequest.abort()).thenAnswer((_) {
  didAbort = true;
  abortCompleter.complete();
});
when(httpRequest.done).thenReturn(responseCompleter.future);
Done
|
Google Tests seem not to have run, and it looks like they failed again in CL 333390424. |
zanderso
left a comment
A negative test that makes sure the tool fails the right way when all 10 retries fail would also be good if we don't already have one of those.
<Completer<MockHttpClientResponse>>[hanger, succeeder];
succeeder.complete(MockHttpClientResponse());

when(httpRequest.close()).thenAnswer((Invocation invocation) async {
This doesn't need to be async, and I don't think you want to await the Future before returning it; the code under test should be doing that.
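For illustration, the non-async shape being suggested might look like this (using the identifiers from the quoted test; not a complete test):

// Return the Future directly; the code under test is what awaits it.
when(httpRequest.close()).thenAnswer((Invocation invocation) {
  return succeeder.future;
});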
Done.
return succeeder.future;
} else {
  // This branch shouldn't happen.
  assert(false);
assert(false) -> fail('This branch should not happen')
Done.
A negative test that makes sure the tool fails the right way when all 10 retries fail would also be good if we don't already have one of those.
If you don't mind, I'd rather prioritize getting this fix landed for now. It's affecting users and has run into plenty of roadblocks. I still have to investigate what's happening in "Google Tests" and I don't want to add the opportunity for more issues.
I added an issue for that test so we don't lose it: #66564
|
Correctness should be the primary concern here. Cherry picking something like this is especially risky, given that we don't really have integration test coverage for these sorts of failures - we'll be relying on crash reports and user issues to know if the fix worked. |
Are you talking about the test that exhausts the retry count? That isn't related to this change. This change has 100% test code coverage. As for integration tests, we actually do have them; they are in Google Test and I'm investigating them now. |
|
Two things:
|
Thanks. I don't want to be misunderstood: I'm not trying to shirk responsibility, I'm being economical with my time and respectful of our users' time. I have responsibilities beyond this PR. If a task is suggested, no matter how good the suggestion is, if it doesn't advance the fixing of this issue it needs to be prioritized against all the other responsibilities I have. In this case I don't see a reasonable theory that the suggested test will help Google Tests pass or cover an issue that could arise from the code change. If I'm missing something please let me know. I think it's a good idea; that's why I made the issue. |
|
The Mac tool_tests failure is an infra failure in LUCI; it passed on Cirrus. |
jmagman
left a comment
LGTM, let's merge this.
Hm I thought FRoB had run again, but it didn't? Do we know if those TAP tests are now passing? |
|
@jmagman I need a few more hours to investigate the Google Test failures that aren't showing up in the CI list. I don't think they are critical failures but I want to make sure. The one I investigated yesterday was failing because of extra stderr output which I can't reproduce locally. |
|
@gaaclarke Sorry to rush you, I had thought the Google tests were passing, but they just didn't run. |
|
I was able to reproduce the Google Test failure. |
|
I printed out the times those close functions take to finish, in milliseconds, along with the size of the payload in bytes. That's very different from the local ~10ms I was seeing. Notice that each one is larger than the last. I think this is an artifact of the fact that we are parallelizing the sending of the files while the server is handling them serially. That means payload size does indirectly affect the time it takes to wait for the status code (not in network cost, but in processing cost). It also means that retry attempts have a lower likelihood of timing out as the server load lessens (assuming some tasks converge). These tests are also running on server farms with emulated devices, so I expect their numbers to be higher than anyone would experience in any sane setup. This speaks to the need to find the root cause in the server even more. A safe number would be 10x the observed max, which would be 50 seconds; that is still better than the current 4-minute timeout, but it is hardly an acceptable pause to hit 20% of the time. |
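A generic sketch of the "limit in-flight uploads" idea (illustrative; the helper name and shape are assumptions, not the flutter_tools implementation):

// Run `action` over `items` with at most `maxInFlight` operations running at
// once. Single-threaded Dart means the shared `next` index needs no locking,
// since there is no await between reading and advancing it.
Future<void> forEachWithLimit<T>(
  List<T> items,
  int maxInFlight,
  Future<void> Function(T item) action,
) async {
  int next = 0;
  Future<void> worker() async {
    while (next < items.length) {
      final T item = items[next];
      next += 1;
      await action(item);
    }
  }

  await Future.wait<void>(
    List<Future<void>>.generate(maxInFlight, (_) => worker()),
  );
}

Capping the number of workers bounds how many PUTs queue up behind the server's serial handling, which keeps each close() wait closer to the single-upload case.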
updated the timeout documentation
reduced the max number of inflight uploads
|
@jonahwilliams @zanderso Please give this another look. I increased the timeout to 10 seconds based on my observations in Google Tests, trying to balance being safe with being considerate of users' time. I also reduced the max in-flight uploads to limit the variance in the time request.close takes (since it appears to be waiting on the server's serial handling of PUT requests). |
zanderso
left a comment
lgtm on green presubmits
|
Unfortunately, this had a side effect for developers deploying large apps over proxied WFH environments. There are legitimate cases where the sync takes over 10 seconds. We should either revert this or increase the timeout. I believe 60 seconds is reasonable. I would consider any dev cycle that takes over 60 seconds as unusable. See b/171005910. |
|
It should be safe to revert. The problem this was trying to solve was syncing assets during the initial connection; after #66742 we no longer need to sync any assets. |
|
Increasing to 60 seconds would be a better change though; then we're not relying on the OS-level timeout. |
|
#68488 increased the timeout to 1 minute. |
Description
HttpClientRequest.close is randomly hanging and then throwing a generic exception after 4 minutes. I've filed a bug against Dart and introduced a timeout in Flutter, since a request that fails to converge on close succeeds on the second attempt in my testing.
I was able to validate with local testing that this addresses the issue, and the 500ms timeout is well above the ~10ms I observed for close to succeed.
cc @jmagman @xster
Related Issues
#63869
Fixes #65402
Tests
I added the following tests:
none
Checklist
Before you create this PR, confirm that it meets all requirements listed below by checking the relevant checkboxes ([x]). This will ensure a smooth and quick review process.
I updated/added relevant documentation (doc comments with ///).
The analyzer (flutter analyze --flutter-repo) does not report any problems on my PR.
Breaking Change
Did any tests fail when you ran them? Please read Handling breaking changes.