💥 ☂️ Solution to fleet-wide Android emulator crashes on CI #153445

@matanlurey

Forked from #152769 as it's unrelated to my specific test.

As of 2024-08-14, we have systemic fleet-wide Android emulator crashes that appear unrelated to any specific test.

In nearly every case, the following occurs (note that every test uses a slightly different setup, so the logs are not uniform):

  1. The emulator successfully starts (as part of a CI task job)
  2. The test, typically using flutter drive ..., installs the test APK and starts executing
  3. An ambient print message appears in the console: getIsolate: (-32000) Service connection disposed
  4. The test (driver script) ends (usually you'll see adb: device offline in the logs by this point) with a failure

Confusingly, these tests run fine roughly 95% to 99% of the time.
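
The failure signature in the steps above could be checked mechanically when triaging logs. The following is a minimal sketch, not the tooling used by the infra team: the two marker strings come from this issue, while the function name `classify_log` and the three labels are hypothetical.

```python
# Marker strings quoted from the failure description in this issue.
SERVICE_DISPOSED = "Service connection disposed"
DEVICE_OFFLINE = "adb: device offline"


def classify_log(text: str) -> str:
    """Return a coarse, hypothetical label for one CI task log."""
    disposed = SERVICE_DISPOSED in text
    offline = DEVICE_OFFLINE in text
    if disposed and offline:
        # Both markers present: matches steps 3 and 4 above.
        return "emulator-crash-suspected"
    if disposed or offline:
        # Logs are not uniform, so a single marker is still worth a look.
        return "partial-signature"
    return "no-signature"


if __name__ == "__main__":
    sample = (
        "getIsolate: (-32000) Service connection disposed\n"
        "adb: device offline\n"
    )
    print(classify_log(sample))
```

Because the logs differ between tests, substring matching is deliberately loose here; a stricter check would need per-test knowledge that this issue does not provide.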


My plan this week is roughly the following:

  • Try LED builds of the Android emulator try-jobs to extract crash logs and twiddle the scripts
  • See what the memory usage of the emulator is locally (we have a max of 2 GB right now)
  • Meet with the maintainer of the Chromium Android emulator tools (which we use) on GVC
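
For the local memory check, one option is to sample the resident set size of the emulator process with `ps`. This is a rough sketch under assumptions: the helper names are hypothetical, the emulator PID has to be found separately, and the issue does not say whether the 2 GB figure is guest RAM or a host-side cap, so comparing host RSS against it is only illustrative.

```python
import subprocess
import time

# The 2 GB figure from this issue, expressed in KiB (the unit `ps` reports).
LIMIT_KIB = 2 * 1024 * 1024


def parse_rss_kib(ps_output: str) -> int:
    """Parse the output of `ps -o rss= -p PID`, a single integer in KiB."""
    return int(ps_output.strip())


def sample_rss_kib(pid: int) -> int:
    """Read the current RSS of `pid`; raises CalledProcessError if it exited."""
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_rss_kib(out)


def peak_rss_kib(pid: int, samples: int = 10, interval_s: float = 1.0) -> int:
    """Poll RSS `samples` times and return the largest value seen."""
    peak = 0
    for _ in range(samples):
        peak = max(peak, sample_rss_kib(pid))
        time.sleep(interval_s)
    return peak
```

Calling `peak_rss_kib(pid) > LIMIT_KIB` while a test runs would give a first indication of whether memory pressure is worth pursuing.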

/cc @reidbaker @zanderso @johnmccutchan @christopherfujino @chunhtai


TODOs

  • Consider removing --writable-system:

    https://flutter-review.googlesource.com/c/recipes/+/59180/1

    W 15:30:04.302    0.034s Main  Emulator will be slow to start, as "writable_system=True" but system snapshot not found.
  • Try meeting with the Chromium Android emulator team about any obvious next steps.

    ✅ Upgrading to the latest avd_cipd_version:
    Roll avd_cipd_verison to latest to use the crashreport tool. #153520

  • Try getting a copy of the emu-crash.db log entries, if any.

    Did not result in anything.

    During emulator startup, the console reads:

    INFO    | Storing crashdata in: /tmp/android-unknown/emu-crash-34.2.14.db, detection is enabled for process: 124450

    I wrote a small script to try and scrape this directory when a test crashes, but it resulted in nothing:

    Could not read file: /tmp/android-unknown/emu-crash-34.2.14.db/settings.dat

    It's possible that nothing is being logged, or that the logs are actually elsewhere, but this seems like a dead end for now.

  • Try running the same tests on Github Actions VM and see if I observe similar flakes:

    ✅ Ran overnight on a 15-minute cron; API v35 seems too unstable to use as a non-bringup task:

    https://github.com/matanlurey/flutter-renegade-gha

  • [ ] Try running a demo Android app (or even just the emulator itself) without any Flutter

  • [ ] Try SSHing into a bot and running steps manually to try and detect the crash
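
For the emu-crash.db item above, scraping the crash directory after a failed test could look roughly like the following. This is a sketch and not the script mentioned in that item: the function name `scrape_crash_dir` and the report format are hypothetical, and the directory path would be whatever the emulator prints at startup.

```python
from pathlib import Path


def scrape_crash_dir(root: str) -> dict[str, str]:
    """Map each file under `root` to its size, or to the reason it was unreadable."""
    base = Path(root)
    if not base.is_dir():
        return {root: "missing or not a directory"}
    report: dict[str, str] = {}
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        try:
            report[str(path)] = f"{len(path.read_bytes())} bytes"
        except OSError as err:
            # Record the failure instead of aborting, so one bad file
            # (such as an unreadable settings.dat) does not hide the rest.
            report[str(path)] = f"unreadable: {err.strerror}"
    return report


if __name__ == "__main__":
    # Path copied from the emulator startup message quoted in this issue.
    for name, status in scrape_crash_dir(
        "/tmp/android-unknown/emu-crash-34.2.14.db"
    ).items():
        print(name, status)
```

Recording per-file errors makes it easier to tell an empty crash database apart from one that exists but cannot be read by the CI user.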

Metadata

Labels

  • P1 (High-priority issues at the top of the work list)
  • c: crash (Stack traces logged to the console)
  • c: tech-debt (Technical debt, code quality, testing, etc.)
  • platform-android (Android applications specifically)
  • team-infra (Owned by Infrastructure team)

Status

Done