[ci] Copilot CI: Android device tests, emulator robustness, and snapshot fallback #34340

Merged
PureWeen merged 26 commits into main from copilot-ci-android on Mar 11, 2026
PureWeen commented Mar 4, 2026

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Description

This PR brings together three improvements to the Copilot CI infrastructure:

  1. Auto-trigger uitests and device-tests on darc-* branches — Major rework of ci-copilot.yml to support UI tests and device tests triggered on darc branches.

  2. Make emulator startup and provisioning more robust — Improvements to Start-Emulator.ps1 and provision.yml for more reliable Android emulator handling.

  3. Support fallback environment for snapshots — Changes to UITest.cs and VisualRegressionTester.cs to support snapshot environment fallback.

Changes

  • eng/pipelines/ci-copilot.yml — Reworked CI pipeline for Android device test support
  • .github/scripts/shared/Start-Emulator.ps1 — More robust emulator startup
  • eng/pipelines/common/provision.yml — Provisioning improvements
  • src/Controls/tests/TestCases.Shared.Tests/UITest.cs — Snapshot fallback support
  • src/TestUtils/src/VisualTestUtils/VisualRegressionTester.cs — Snapshot fallback support

Copilot AI review requested due to automatic review settings March 4, 2026 22:11
github-actions bot commented Mar 4, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34340

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34340"

Copilot AI left a comment

Pull request overview

This PR improves the Copilot CI infrastructure across three areas: (1) enabling darc-* branches to trigger UI tests and device tests, (2) making Android emulator startup more robust on Linux CI, and (3) adding snapshot environment fallback so iOS 26 tests can use ios baselines when ios-26 baselines are missing.

Changes:

  • Android emulator setup was substantially rewritten from PowerShell (Start-Emulator.ps1) to inline bash scripts to support Linux/KVM-based emulation on Azure Pipelines hosted Ubuntu agents
  • The ios-26 environment fallback to ios baselines is implemented in both VisualRegressionTester.cs and UITest.cs
  • provision.yml gains an Xcode version fallback (latest available) when the requested version isn't found

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| eng/pipelines/ci-copilot.yml | Major rework: Android uses Linux+KVM, cross-platform tool installs, smarter emulator health check/restart |
| .github/scripts/shared/Start-Emulator.ps1 | Removed DEVICE_UDID env var fallback for iOS; removed shutdown-other-simulators block; simplified success message |
| eng/pipelines/common/provision.yml | Xcode version failure converted to WARNING + fallback to latest available Xcode |
| src/Controls/tests/TestCases.Shared.Tests/UITest.cs | Added fallbackEnvironmentName logic for ios-26 → ios baseline fallback |
| src/TestUtils/src/VisualTestUtils/VisualRegressionTester.cs | New fallbackEnvironmentName parameter in VerifyMatchesSnapshot to check fallback baseline |
| eng/pipelines/ci-uitests.yml | Added darc-* to CI trigger branches |
| eng/pipelines/ci-device-tests.yml | Added darc-* to CI trigger branches |


displayName: 'Restart Android Emulator (Fresh)'
condition: eq('${{ parameters.Platform }}', 'android')
continueOnError: true
timeoutInMinutes: 15

Copilot AI Mar 4, 2026

The "Restart Android Emulator (Fresh)" step sets device_timeout=300 (5 minutes) and boot_timeout=600 (10 minutes), for a combined maximum wait of 900 seconds (15 minutes), which already equals the step's timeoutInMinutes: 15. Once the pre-wait operations (kill, sleep, ADB server restart) are added, the step will exceed its 15-minute timeout in the worst case, when a full restart is needed. Either increase timeoutInMinutes to at least 20, or reduce the internal timeouts (e.g., device_timeout=120 and boot_timeout=480).

Suggested change
- timeoutInMinutes: 15
+ timeoutInMinutes: 20
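
The arithmetic behind this suggestion can be checked mechanically. The following is a minimal sketch, not part of the pipeline: `fits_step_timeout` is a hypothetical helper, and the 60-second headroom used in the examples is an assumed stand-in for the kill, sleep, and ADB restart time, which the review comment does not quantify.

```shell
# Hypothetical helper: succeeds (exit 0) when the step timeout covers all
# internal waits plus the requested headroom, and fails (exit 1) otherwise.
# Arguments: step timeout in minutes, headroom in seconds, then each
# internal wait in seconds.
fits_step_timeout() {
  step_minutes=$1; headroom_seconds=$2; shift 2
  total=$headroom_seconds
  for wait_seconds in "$@"; do
    total=$((total + wait_seconds))
  done
  [ "$total" -le $((step_minutes * 60)) ]
}

# Examples (60 s headroom is an assumption):
#   fits_step_timeout 15 60 300 600   -> fails:    960 s needed, 900 s allowed
#   fits_step_timeout 20 60 300 600   -> succeeds: 960 s needed, 1200 s allowed
#   fits_step_timeout 15 60 120 480   -> succeeds: 660 s needed, 900 s allowed
```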

echo "##vso[task.logissue type=error]Copilot PR review failed. Check CopilotLogs artifact for details."
exit 1
fi
displayName: 'Check Copilot Result'

Copilot AI Mar 4, 2026

The condition: succeededOrFailed() was removed from the "Check Copilot Result" step. Without this condition, the step uses the default succeeded() condition and will be skipped if any prior step in the pipeline fails (e.g., the "Publish Build Logs" task fails because $(LogDirectory) is missing). This means a Copilot failure could be silently hidden: CopilotFailed is set to true but the Check step never runs. The condition: succeededOrFailed() should be restored to ensure this step always runs and the pipeline correctly fails when Copilot reports an error.

Suggested change
- displayName: 'Check Copilot Result'
+ displayName: 'Check Copilot Result'
+ condition: succeededOrFailed()

PureWeen force-pushed the copilot-ci-android branch 2 times, most recently from 7473e81 to 31359be on March 4, 2026 22:32
kubaflo and others added 23 commits March 9, 2026 19:19
Improve CI reliability for Android and iOS by hardening emulator/simulator startup and provisioning.

Key changes:
- .github/scripts/shared/Start-Emulator.ps1: remove auto-use of DEVICE_UDID from env and remove shutdown-of-other-simulators logic; shorten success message.
- eng/pipelines/ci-copilot.yml: refactor job pool selection (use Ubuntu for Android), create artifact dirs early, and make provisioning parameters conditional by platform.
  - Replace earlier PS-based Android emulator invocation with a cross-platform shell flow: fix KVM permissions, install dependencies, locate AVDs, start emulator directly, wait for adb and boot completion, export DEVICE_UDID, and tighten timeouts.
  - Add Linux/macOS Java home detection, Node.js/Appium/Copilot/GH installs and authentication flows, Appium driver installs, and PATH handling.
  - Add emulator health checks, restart/reuse logic, and more robust artifact collection and cleanup around the PR reviewer step; make sed usage portable between Linux and macOS.
  - Improve iOS simulator handling: select runtime priority (iOS 18/17/26), prefer iPhone Xs or fallback to iPhone 11 Pro, create device if missing, shut down other booted sims, and set DEVICE_UDID.
- eng/pipelines/common/provision.yml: when exact Xcode version isn't found, fall back to the latest available Xcode on the agent instead of failing.

Overall effect: more resilient CI runs across Ubuntu/macOS agents, fewer false failures from missing runtimes, AVDs, or incompatible Xcode versions, and improved tooling installation and auth handling for the PR reviewer agent.
…ProvisionAppium templates

Replace ~600 lines of manual Android setup with existing pipeline templates:
- enable-kvm.yml for KVM permissions (same as uitests/device-tests)
- provision.yml for SDK/JDK provisioning (same parameters as ui-tests-steps.yml)
- Cake android.cake --target=boot for emulator lifecycle (same as device-tests)
- UseNode@1 + ProvisionAppium for Appium install (same as ui-tests-steps.yml)
- Remove iOS-specific code (Android-only scope)
- Remove redundant environment verification and emulator restart steps

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace ~400 lines of manual emulator management with existing templates:
- Add enable-kvm.yml for Linux KVM permissions
- Align provision.yml params with ui-tests-steps.yml
- Replace inline AVD/SDK/emulator boot with Cake boot (android.cake --target=boot)
- Replace brew Node.js with UseNode@1 task
- Replace npm Appium with ProvisionAppium via Provisioning.csproj
- Remove redundant Verify Build Environment step
- Remove Restart Android Emulator (Fresh) step
- Add APPIUM_HOME and LogDirectory variables

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Provision.yml sets ANDROID_SDK_ROOT but doesn't add platform-tools to PATH.
Cake boot finds adb internally via the SDK root, but subsequent steps need
it in PATH too. Also persist the PATH for later steps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Android device-tests and uitests both run on Linux with KVM-accelerated
emulation. Switch ci-copilot to the same so enable-kvm.yml actually runs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cake --target=boot fires the emulator process and returns immediately.
Device-tests and uitests use --target=testOnly/uitest which waits internally,
but ci-copilot needs the emulator ready before the agent runs. Add explicit
wait for device, boot_completed, and package manager.
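
A rough sketch of the explicit wait this commit describes, assuming a POSIX shell step. `wait_until`, `boot_completed`, and `package_manager_ready` are hypothetical names, and the timeouts in the usage comment are placeholders; the actual script in ci-copilot.yml may differ.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or roughly TIMEOUT_SECONDS of sleeping
# have accumulated. Returns 0 on success, 1 on timeout.
wait_until() {
  timeout_seconds=$1; interval_seconds=$2; shift 2
  elapsed=0
  while ! "$@"; do
    if [ "$elapsed" -ge "$timeout_seconds" ]; then
      return 1
    fi
    sleep "$interval_seconds"
    elapsed=$((elapsed + interval_seconds))
  done
  return 0
}

# Probe: succeeds once Android reports that boot has finished.
boot_completed() {
  [ "$(adb shell getprop sys.boot_completed 2>/dev/null | tr -d '\r')" = "1" ]
}

# Probe: succeeds once the package manager answers queries.
package_manager_ready() {
  adb shell pm path android 2>/dev/null | grep -q '^package:'
}

# Intended usage after the emulator process has been launched
# (timeouts are placeholders):
#   adb wait-for-device
#   wait_until 600 5 boot_completed
#   wait_until 120 5 package_manager_ready
```

Keeping the probes separate from the polling loop makes each readiness condition easy to log or time out independently.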

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On Linux hosted agents, adb is not in PATH by default. Use ANDROID_SDK_ROOT
(set by provision.yml) with fallback to the standard hosted image path.
Also persist the PATH addition for subsequent steps.
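
One way to express this, as a sketch only: `platform_tools_dir` and `persist_adb_path` are hypothetical names, and the fallback SDK root is passed in as an argument because the commit message does not state the hosted image path. `##vso[task.prependpath]` is the Azure Pipelines logging command that prepends a directory to PATH for subsequent steps.

```shell
# Print the platform-tools directory that should contain adb.
# $1 is a fallback SDK root, used only when ANDROID_SDK_ROOT is unset or empty.
platform_tools_dir() {
  sdk_root=${ANDROID_SDK_ROOT:-$1}
  printf '%s/platform-tools\n' "$sdk_root"
}

# Add the directory to PATH for the current step, then emit the logging
# command that asks the agent to prepend it for later steps as well.
persist_adb_path() {
  dir=$(platform_tools_dir "$1")
  PATH="$dir:$PATH"
  export PATH
  echo "##vso[task.prependpath]$dir"
}
```

Both halves are needed: the `export` only affects the current step, while the logging command only takes effect in steps that run afterwards.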

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cake's Teardown kills the emulator when the boot target exits. The
uitest/testOnly targets keep it alive but also run tests. Since we just
need a running emulator for the Copilot agent, create AVD and launch
emulator directly — provision.yml already installed the SDK and images.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hosted ubuntu-22.04 agents have ~4.2GB free in home directory, but the
default userdata partition needs 7.3GB. Use -partition-size 2048 to
reduce to 2GB which is sufficient for testing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The -partition-size flag doesn't override config.ini. Modify config.ini
directly after AVD creation to set disk.dataPartition.size=2048m, which
fits within the ~4.2GB free on hosted ubuntu agents.
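
A minimal sketch of such a config.ini edit, assuming the file consists of `key=value` lines (optionally with spaces around the `=`). `set_ini_value` is a hypothetical helper, not the code used in the pipeline.

```shell
# set_ini_value FILE KEY VALUE
# Removes any existing assignment for KEY and appends "KEY=VALUE" at the end.
set_ini_value() {
  file=$1; key=$2; value=$3
  tmp="$file.tmp.$$"
  # Keep every line whose key (text before the first '=', whitespace
  # stripped) differs from KEY. The comparison is exact, so the dots in
  # keys such as disk.dataPartition.size are not treated as wildcards.
  awk -F= -v k="$key" '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' \
    "$file" > "$tmp"
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Intended usage after AVD creation (the path is a placeholder):
#   set_ini_value "$AVD_DIR/config.ini" disk.dataPartition.size 2048m
```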

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The google_apis_playstore system image for API 30 needs 7.3GB for the
userdata partition, but hosted ubuntu agents only have ~4.2GB free.
Remove unnecessary tools (CodeQL, Go, Python, Chromium, extra .NET SDKs,
Swift) to free ~10GB before launching the emulator.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The emulator boots successfully but adb doesn't see it. Restart the
adb server fresh before launching emulator, use adb wait-for-device
instead of polling, and capture emulator log for debugging.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Newer gh CLI versions on ubuntu-22.04 require read:org scope which the
token may not have. Using GH_TOKEN env var bypasses scope validation
while still authenticating. Falls back to gh auth login if needed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GNU sed (Linux) uses 'sed -i' while BSD sed (macOS) uses 'sed -i ""'.
Also re-triggered pipeline with unquoted parameters to fix PRNumber
receiving 'Platform=android' as part of its value.
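
A common way to hide the GNU/BSD difference is a small wrapper. This is a sketch under the assumption that probing with `sed --version` is acceptable (GNU sed supports it, BSD sed rejects it); `sed_inplace` is a hypothetical name and the pipeline may handle the difference another way.

```shell
# sed_inplace SED_ARGS... FILE
# Runs an in-place sed edit with the -i form the local sed expects.
sed_inplace() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"        # GNU sed: -i takes no separate argument
  else
    sed -i '' "$@"     # BSD sed: -i requires a (possibly empty) backup suffix
  fi
}

# Example:
#   sed_inplace 's/old/new/' settings.txt
```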

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
On Linux, UseNode@1 installs to /opt/hostedtoolcache/node/*/x64/bin/
which may not persist to pwsh subprocesses. Explicitly prepend the
copilot binary directory to PATH via ##vso[task.prependpath].

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ##vso[task.prependpath] from the install step should propagate,
but pwsh subprocess may not inherit it. Find the copilot binary and
add its directory to PATH explicitly in the reviewer step.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PATH exported from bash doesn't reliably propagate to pwsh subprocess
on Linux hosted agents. Create a symlink in /usr/local/bin which is
universally on PATH. Also use -NoProfile to prevent pwsh profiles
from interfering with PATH resolution.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
copilot --version outputs to stderr, so '2>$null' discards the output
and the check always fails on Linux. Use Get-Command (reliable) and
merge stderr with '2>&1' when capturing the version string.
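
The fix itself is in PowerShell (Get-Command plus `2>&1`). The same two ideas, shown here as a POSIX shell analog purely for illustration: check that the command exists separately from reading its output, and merge stderr into stdout when capturing. `tool_version` is a hypothetical helper.

```shell
# tool_version COMMAND
# Fails if COMMAND cannot be found; otherwise prints whatever
# "COMMAND --version" writes, whether it goes to stdout or stderr.
tool_version() {
  command -v "$1" >/dev/null 2>&1 || return 1
  "$1" --version 2>&1
}

# Example:
#   version=$(tool_version copilot) || echo "copilot not found"
```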

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot CLI now requires Node.js v24+. UseNode@1 was pinned to 20.3.1
which causes 'GitHub Copilot CLI requires Node.js v24 or higher'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When emulator -list-avds returns a single AVD, PowerShell stores it as
a string not an array. $avdList[0] then returns the first character
('E' from 'Emulator_30') instead of the full name. Fix by wrapping
with @() and [string[]] to force array type.

Also fix selection regex: 'API.*30' didn't match 'Emulator_30' since
there's no 'API' prefix. Changed to match '30' anywhere in the name.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Pre-authorize ADB keys before boot (mirrors android.cake HandleVirtualDevice)
- Restart ADB server at 90s boot wait (mirrors android.cake PrepareDevice)
- Disable all animations (window, transition, animator) to reduce CPU load
- Set infinite screen timeout and stay-awake to prevent screen lock
- Wake screen and dismiss system dialogs after boot
- Clear logcat buffer for clean agent logs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The emulator sits idle 15-30 min while Node/Appium/CLI are installed.
SystemUI can ANR during this idle period on low-resource CI agents.

Adds a dedicated warmup step right before the reviewer agent:
- Wakes screen and dismisses any ANR/system dialogs
- Opens/closes Settings to exercise the system
- Verifies device responsiveness before handing off to agent
- Also adds stability wait after boot to fix 'device offline' errors

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
P1: Add retry logic for Android APK install in Build-AndDeploy.ps1.
On ADB0010/broken-pipe failure (transient API 30 bug), uninstall
existing packages, restart ADB, and retry once.

P2: Add ANR dialog dismissal in BuildAndRunHostApp.ps1 before each
test run. Checks dumpsys window for lingering ANR dialogs and
force-dismisses with HOME+BACK. Also enhances the pipeline warmup
step with a two-pass dismiss loop and explicit ANR detection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen force-pushed the copilot-ci-android branch from 727764d to 055a5d4 on March 10, 2026 00:21
github-actions bot and others added 2 commits March 10, 2026 20:44
Emulator sometimes starts but ADB shows offline for >120s on hosted
agents (~25% of failed builds). This was a fatal failure with no recovery.

Changes:
- Wrap emulator launch + adb wait-for-device in a retry loop (2 attempts)
- On retry: stop emulator process, restart ADB, relaunch
- Add retryCountOnTaskFailure: 1 on the AVD step as belt-and-suspenders
- Add ADB key re-authorization every 60s during boot wait (matches Cake)

This gives up to 4 chances in total to get a working emulator:
2 script-level attempts x 2 step-level runs (the original run plus one retry).
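
The script-level half of this can be sketched as a generic retry wrapper. `retry`, `reset_adb`, and `launch_and_wait` are hypothetical names; the actual loop in ci-copilot.yml may be structured differently. With `retryCountOnTaskFailure: 1` the step itself runs at most twice, so two script-level attempts per run give the four chances mentioned above.

```shell
# retry MAX_ATTEMPTS RECOVER_CMD COMMAND [ARGS...]
# Runs COMMAND up to MAX_ATTEMPTS times. After each failed attempt except
# the last, RECOVER_CMD is run to reset state before trying again.
retry() {
  max_attempts=$1; recover=$2; shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt of $max_attempts failed" >&2
    if [ "$attempt" -lt "$max_attempts" ]; then
      "$recover"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Intended usage (both functions are placeholders):
#   reset_adb()       { stop the emulator process, restart the ADB server; }
#   launch_and_wait() { start the emulator, then adb wait-for-device; }
#   retry 2 reset_adb launch_and_wait
```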

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Copilot reviewer agent on PR #25036 (Nested Flex Layouts) ran 5
try-fix attempts across models and hit the 180-minute timeout during
Round 2 cross-pollination. Doubling to 360 minutes to accommodate
complex PRs that need multiple fix attempts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>