[ci] Copilot CI: Android device tests, emulator robustness, and snapshot fallback#34340
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 34340
Or
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 34340"
Pull request overview
This PR improves the Copilot CI infrastructure across three areas: (1) enabling darc-* branches to trigger UI tests and device tests, (2) making Android emulator startup more robust on Linux CI, and (3) adding snapshot environment fallback so iOS 26 tests can use ios baselines when ios-26 baselines are missing.
Changes:
- Android emulator setup was substantially rewritten from PowerShell (Start-Emulator.ps1) to inline bash scripts to support Linux/KVM-based emulation on Azure Pipelines hosted Ubuntu agents
- The ios-26 environment fallback to ios baselines is implemented in both VisualRegressionTester.cs and UITest.cs
- provision.yml gains an Xcode version fallback (latest available) when the requested version isn't found
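The ios-26 to ios fallback described above can be sketched as a small lookup. This is only an illustration, not the C# code in VisualRegressionTester.cs: the function name `resolve_baseline` and the directory layout `<root>/<environment>/<name>.png` are assumptions made for the example.

```shell
# Hypothetical layout: <snapshots_root>/<environment>/<name>.png
# Prints the baseline to compare against: the primary environment's image
# if it exists, otherwise the fallback environment's image.
resolve_baseline() {
  root=$1 name=$2 env_name=$3 fallback_env=$4
  primary="$root/$env_name/$name.png"
  if [ -f "$primary" ]; then
    printf '%s\n' "$primary"
    return 0
  fi
  if [ -n "$fallback_env" ] && [ -f "$root/$fallback_env/$name.png" ]; then
    printf '%s\n' "$root/$fallback_env/$name.png"
    return 0
  fi
  return 1  # no baseline in either environment
}
```

With this shape, `resolve_baseline snapshots Button ios-26 ios` would pick the ios-26 image when one has been recorded and silently use the ios image otherwise.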
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| eng/pipelines/ci-copilot.yml | Major rework: Android uses Linux+KVM, cross-platform tool installs, smarter emulator health check/restart |
| .github/scripts/shared/Start-Emulator.ps1 | Removed DEVICE_UDID env var fallback for iOS; removed shutdown-other-simulators block; simplified success message |
| eng/pipelines/common/provision.yml | Xcode version failure converted to WARNING + fallback to latest available Xcode |
| src/Controls/tests/TestCases.Shared.Tests/UITest.cs | Added fallbackEnvironmentName logic for ios-26 → ios baseline fallback |
| src/TestUtils/src/VisualTestUtils/VisualRegressionTester.cs | New fallbackEnvironmentName parameter in VerifyMatchesSnapshot to check fallback baseline |
| eng/pipelines/ci-uitests.yml | Added darc-* to CI trigger branches |
| eng/pipelines/ci-device-tests.yml | Added darc-* to CI trigger branches |
eng/pipelines/ci-copilot.yml
Outdated
displayName: 'Restart Android Emulator (Fresh)'
condition: eq('${{ parameters.Platform }}', 'android')
continueOnError: true
timeoutInMinutes: 15
The "Restart Android Emulator (Fresh)" step sets device_timeout=300 (5 min) and boot_timeout=600 (10 min), for a combined maximum wait of 900 seconds (15 minutes), which is exactly the step's timeoutInMinutes: 15. Once the pre-wait operations (kill, sleep, ADB server restart) are added, the worst case, where a full restart is needed, will exceed the 15-minute step timeout. Either increase timeoutInMinutes to at least 20, or reduce the internal timeouts (e.g., device_timeout=120 and boot_timeout=480).
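The arithmetic behind this comment can be checked with a few lines of shell. This is a sketch rather than anything from the pipeline: the function names are made up, and the 2-minute headroom for the kill/sleep/ADB-restart work is an assumed figure.

```shell
# Round a number of seconds up to whole minutes.
to_minutes() { echo $(( ($1 + 59) / 60 )); }

# Worst-case wait inside the script, in minutes (device wait + boot wait).
worst_case_minutes() { to_minutes $(( $1 + $2 )); }

# Succeeds only if the internal waits plus `headroom` minutes of pre-wait
# work still fit inside the step's timeoutInMinutes.
fits_step_timeout() {
  device_timeout=$1 boot_timeout=$2 step_minutes=$3 headroom=$4
  needed=$(( $(worst_case_minutes "$device_timeout" "$boot_timeout") + $headroom ))
  [ "$needed" -le "$step_minutes" ]
}
```

Under that assumption, `fits_step_timeout 300 600 15 2` fails, while both suggested fixes pass: `fits_step_timeout 300 600 20 2` and `fits_step_timeout 120 480 15 2`.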
Suggested change:
- timeoutInMinutes: 15
+ timeoutInMinutes: 20
echo "##vso[task.logissue type=error]Copilot PR review failed. Check CopilotLogs artifact for details."
exit 1
fi
displayName: 'Check Copilot Result'
The condition: succeededOrFailed() was removed from the "Check Copilot Result" step. Without this condition, the step uses the default succeeded() condition and will be skipped if any prior step in the pipeline fails (e.g., the "Publish Build Logs" task fails because $(LogDirectory) is missing). This means a Copilot failure could be silently hidden: CopilotFailed is set to true but the Check step never runs. The condition: succeededOrFailed() should be restored to ensure this step always runs and the pipeline correctly fails when Copilot reports an error.
Suggested change:
- displayName: 'Check Copilot Result'
+ displayName: 'Check Copilot Result'
+ condition: succeededOrFailed()
7473e81 to 31359be
Improve CI reliability for Android and iOS by hardening emulator/simulator startup and provisioning.
Key changes:
- .github/scripts/shared/Start-Emulator.ps1: remove auto-use of DEVICE_UDID from env and remove shutdown-of-other-simulators logic; shorten success message.
- eng/pipelines/ci-copilot.yml: refactor job pool selection (use Ubuntu for Android), create artifact dirs early, and make provisioning parameters conditional by platform.
- Replace earlier PS-based Android emulator invocation with a cross-platform shell flow: fix KVM permissions, install dependencies, locate AVDs, start emulator directly, wait for adb and boot completion, export DEVICE_UDID, and tighten timeouts.
- Add Linux/macOS Java home detection, Node.js/Appium/Copilot/GH installs and authentication flows, Appium driver installs, and PATH handling.
- Add emulator health checks, restart/reuse logic, and more robust artifact collection and cleanup around the PR reviewer step; make sed usage portable between Linux and macOS.
- Improve iOS simulator handling: select runtime priority (iOS 18/17/26), prefer iPhone Xs or fallback to iPhone 11 Pro, create device if missing, shut down other booted sims, and set DEVICE_UDID.
- eng/pipelines/common/provision.yml: when exact Xcode version isn't found, fall back to the latest available Xcode on the agent instead of failing.
Overall effect: more resilient CI runs across Ubuntu/macOS agents, fewer false failures from missing runtimes, AVDs, or incompatible Xcode versions, and improved tooling installation and auth handling for the PR reviewer agent.
…ProvisionAppium templates
Replace ~600 lines of manual Android setup with existing pipeline templates:
- enable-kvm.yml for KVM permissions (same as uitests/device-tests)
- provision.yml for SDK/JDK provisioning (same parameters as ui-tests-steps.yml)
- Cake android.cake --target=boot for emulator lifecycle (same as device-tests)
- UseNode@1 + ProvisionAppium for Appium install (same as ui-tests-steps.yml)
- Remove iOS-specific code (Android-only scope)
- Remove redundant environment verification and emulator restart steps
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace ~400 lines of manual emulator management with existing templates:
- Add enable-kvm.yml for Linux KVM permissions
- Align provision.yml params with ui-tests-steps.yml
- Replace inline AVD/SDK/emulator boot with Cake boot (android.cake --target=boot)
- Replace brew Node.js with UseNode@1 task
- Replace npm Appium with ProvisionAppium via Provisioning.csproj
- Remove redundant Verify Build Environment step
- Remove Restart Android Emulator (Fresh) step
- Add APPIUM_HOME and LogDirectory variables
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Provision.yml sets ANDROID_SDK_ROOT but doesn't add platform-tools to PATH. Cake boot finds adb internally via the SDK root, but subsequent steps need it in PATH too. Also persist the PATH for later steps. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Android device-tests and uitests both run on Linux with KVM-accelerated emulation. Switch ci-copilot to the same so enable-kvm.yml actually runs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cake --target=boot fires the emulator process and returns immediately. Device-tests and uitests use --target=testOnly/uitest which waits internally, but ci-copilot needs the emulator ready before the agent runs. Add explicit wait for device, boot_completed, and package manager. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
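The explicit wait this commit describes could look roughly like the polling helper below. It is a sketch, not the pipeline's script: `wait_until` and `boot_completed` are names invented here, and only the `adb shell getprop sys.boot_completed` probe is a standard adb call.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or roughly TIMEOUT_SECONDS of sleeping
# has passed (time spent inside COMMAND itself is not counted).
wait_until() {
  timeout=$1 interval=$2
  shift 2
  elapsed=0
  while ! "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$(( $elapsed + $interval ))
  done
}

# Probe: succeeds once Android reports that boot has finished.
boot_completed() {
  [ "$(adb shell getprop sys.boot_completed 2>/dev/null | tr -d '\r')" = "1" ]
}
```

A step might then run `adb wait-for-device && wait_until 600 5 boot_completed`, and a similar probe could cover the package manager check.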
On Linux hosted agents, adb is not in PATH by default. Use ANDROID_SDK_ROOT (set by provision.yml) with fallback to the standard hosted image path. Also persist the PATH addition for subsequent steps. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
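One way to express the ANDROID_SDK_ROOT fallback and the PATH persistence is sketched below. The default path `/usr/local/lib/android/sdk` is an assumption about the hosted Ubuntu image (the commit does not name it), and the function names are illustrative.

```shell
# Assumed location of the Android SDK on hosted Ubuntu images.
DEFAULT_SDK_ROOT=/usr/local/lib/android/sdk

# Prints the platform-tools directory, preferring ANDROID_SDK_ROOT when set.
platform_tools_dir() {
  printf '%s/platform-tools\n' "${ANDROID_SDK_ROOT:-$DEFAULT_SDK_ROOT}"
}

add_adb_to_path() {
  dir=$(platform_tools_dir)
  PATH="$dir:$PATH"
  export PATH                          # visible to the rest of this step
  echo "##vso[task.prependpath]$dir"   # Azure Pipelines: persist for later steps
}
```

The logging command only takes effect for subsequent steps, which is why the sketch also exports PATH for the current one.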
Cake's Teardown kills the emulator when the boot target exits. The uitest/testOnly targets keep it alive but also run tests. Since we just need a running emulator for the Copilot agent, create AVD and launch emulator directly — provision.yml already installed the SDK and images. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Hosted ubuntu-22.04 agents have ~4.2GB free in home directory, but the default userdata partition needs 7.3GB. Use -partition-size 2048 to reduce to 2GB which is sufficient for testing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The -partition-size flag doesn't override config.ini. Modify config.ini directly after AVD creation to set disk.dataPartition.size=2048m, which fits within the ~4.2GB free on hosted ubuntu agents. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
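Editing the AVD's config.ini after creation might be done with a helper like this one. It is a sketch under assumptions: `set_ini_value` is a made-up name, the file is treated as plain `key=value` lines, and the path to config.ini is left to the caller.

```shell
# set_ini_value FILE KEY VALUE
# Rewrites the line for KEY if present, otherwise appends KEY=VALUE.
# Uses a temp file instead of sed -i so it behaves the same on Linux and macOS.
set_ini_value() {
  file=$1 key=$2 value=$3
  tmp=$(mktemp)
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "="; found = 0 }
    {
      name = $1
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # tolerate "key = value" spacing
      if (name == k) { print k "=" v; found = 1 } else { print }
    }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For the change described here the call would be along the lines of `set_ini_value "$avd_dir/config.ini" disk.dataPartition.size 2048m`, where `$avd_dir` is a placeholder for the AVD directory.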
The google_apis_playstore system image for API 30 needs 7.3GB for the userdata partition, but hosted ubuntu agents only have ~4.2GB free. Remove unnecessary tools (CodeQL, Go, Python, Chromium, extra .NET SDKs, Swift) to free ~10GB before launching the emulator. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The emulator boots successfully but adb doesn't see it. Restart the adb server fresh before launching emulator, use adb wait-for-device instead of polling, and capture emulator log for debugging. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Newer gh CLI versions on ubuntu-22.04 require read:org scope which the token may not have. Using GH_TOKEN env var bypasses scope validation while still authenticating. Falls back to gh auth login if needed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GNU sed (Linux) uses 'sed -i' while BSD sed (macOS) uses 'sed -i ""'. Also re-triggered pipeline with unquoted parameters to fix PRNumber receiving 'Platform=android' as part of its value. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
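A common way to hide the GNU/BSD difference is a small wrapper that probes for GNU sed first. This is a sketch of that idiom, not necessarily what the pipeline does; `sed_inplace` is a name chosen for the example.

```shell
# sed_inplace SED_ARGS... FILE
# GNU sed accepts --version; BSD sed (macOS) rejects it, so the probe
# tells the two apart.
sed_inplace() {
  if sed --version >/dev/null 2>&1; then
    sed -i "$@"        # GNU sed: -i takes no separate argument
  else
    sed -i '' "$@"     # BSD sed: -i requires a (possibly empty) backup suffix
  fi
}
```

Callers then write `sed_inplace 's/old/new/' file` on either platform.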
On Linux, UseNode@1 installs to /opt/hostedtoolcache/node/*/x64/bin/ which may not persist to pwsh subprocesses. Explicitly prepend the copilot binary directory to PATH via ##vso[task.prependpath]. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ##vso[task.prependpath] from the install step should propagate, but pwsh subprocess may not inherit it. Find the copilot binary and add its directory to PATH explicitly in the reviewer step. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PATH exported from bash doesn't reliably propagate to pwsh subprocess on Linux hosted agents. Create a symlink in /usr/local/bin which is universally on PATH. Also use -NoProfile to prevent pwsh profiles from interfering with PATH resolution. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
copilot --version outputs to stderr, so '2>$null' discards the output and the check always fails on Linux. Use Get-Command (reliable) and merge stderr with '2>&1' when capturing the version string. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
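The fix itself is PowerShell, but the same pitfall is easy to reproduce in shell. In this sketch `fake_cli` is a stand-in for a tool that prints its version on stderr, and `command -v` plays the role that Get-Command plays in the commit.

```shell
# Hypothetical CLI that writes its version to stderr instead of stdout.
fake_cli() { echo "1.2.3" >&2; }

# Existence check that does not depend on the tool's output at all.
has_tool() { command -v "$1" >/dev/null 2>&1; }

discarded=$(fake_cli 2>/dev/null)   # empty: stderr was thrown away
merged=$(fake_cli 2>&1)             # captures the version: stderr folded into stdout
```

Checking for the tool and reading its version are kept separate, so a tool that prints to stderr is no longer mistaken for a missing one.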
Copilot CLI now requires Node.js v24+. UseNode@1 was pinned to 20.3.1 which causes 'GitHub Copilot CLI requires Node.js v24 or higher'. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When emulator -list-avds returns a single AVD, PowerShell stores it as
a string not an array. $avdList[0] then returns the first character
('E' from 'Emulator_30') instead of the full name. Fix by wrapping
with @() and [string[]] to force array type.
Also fix selection regex: 'API.*30' didn't match 'Emulator_30' since
there's no 'API' prefix. Changed to match '30' anywhere in the name.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
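The selection rule (match 30 anywhere in the name) can be sketched in shell, where output is processed line by line and a single AVD cannot be mistaken for a string to index into. The name `pick_avd` and the fallback to the first listed AVD are assumptions for illustration; the actual fix is the PowerShell @() and [string[]] wrapping described above.

```shell
# pick_avd "NEWLINE_SEPARATED_NAMES" PATTERN
# Prints the first name containing PATTERN, else the first name overall.
pick_avd() {
  names=$1 pattern=$2
  match=$(printf '%s\n' "$names" | grep -F -- "$pattern" | head -n 1)
  if [ -n "$match" ]; then
    printf '%s\n' "$match"
  else
    printf '%s\n' "$names" | head -n 1
  fi
}
```

Typical use would be `pick_avd "$(emulator -list-avds)" 30`, which returns the whole name Emulator_30 whether the list has one entry or several.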
- Pre-authorize ADB keys before boot (mirrors android.cake HandleVirtualDevice)
- Restart ADB server at 90s boot wait (mirrors android.cake PrepareDevice)
- Disable all animations (window, transition, animator) to reduce CPU load
- Set infinite screen timeout and stay-awake to prevent screen lock
- Wake screen and dismiss system dialogs after boot
- Clear logcat buffer for clean agent logs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
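The post-boot hardening listed above maps onto a handful of standard adb commands. The exact commands in the PR are not shown here, so the sketch below uses commonly used equivalents; `prepare_device` is a hypothetical name, and the ADB key handling is left out.

```shell
prepare_device() {
  # Disable all three animation scales to cut CPU load.
  for scale in window_animation_scale transition_animation_scale animator_duration_scale; do
    adb shell settings put global "$scale" 0
  done
  # Keep the screen on: stay awake while powered, with a very long screen-off timeout.
  adb shell svc power stayon true
  adb shell settings put system screen_off_timeout 2147483647
  # Wake the screen and close any system dialogs left over from boot.
  adb shell input keyevent KEYCODE_WAKEUP
  adb shell am broadcast -a android.intent.action.CLOSE_SYSTEM_DIALOGS
  # Start the agent with an empty logcat buffer.
  adb logcat -c
}
```

The CLOSE_SYSTEM_DIALOGS broadcast is restricted on newer Android releases, so this shape fits the API 30 emulator discussed in this PR better than later images.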
The emulator sits idle 15-30 min while Node/Appium/CLI are installed. SystemUI can ANR during this idle period on low-resource CI agents.
Adds a dedicated warmup step right before the reviewer agent:
- Wakes screen and dismisses any ANR/system dialogs
- Opens/closes Settings to exercise the system
- Verifies device responsiveness before handing off to agent
- Also adds stability wait after boot to fix 'device offline' errors
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
P1: Add retry logic for Android APK install in Build-AndDeploy.ps1. On ADB0010/broken-pipe failure (transient API 30 bug), uninstall existing packages, restart ADB, and retry once.
P2: Add ANR dialog dismissal in BuildAndRunHostApp.ps1 before each test run. Checks dumpsys window for lingering ANR dialogs and force-dismisses with HOME+BACK.
Also enhances the pipeline warmup step with a two-pass dismiss loop and explicit ANR detection.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
727764d to 055a5d4
Emulator sometimes starts but ADB shows offline for >120s on hosted agents (~25% of failed builds). This was a fatal failure with no recovery.
Changes:
- Wrap emulator launch + adb wait-for-device in a retry loop (2 attempts)
- On retry: stop emulator process, restart ADB, relaunch
- Add retryCountOnTaskFailure: 1 on the AVD step as belt-and-suspenders
- Add ADB key re-authorization every 60s during boot wait (matches Cake)
This gives up to 4 chances total to get a working emulator: 2 script-level retries x 2 step-level retries.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
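The script-level retry can be sketched as a generic wrapper. The names `retry`, and in the usage note `reset_adb` and `launch_and_wait`, are hypothetical; the real step would plug in its own launch and cleanup commands.

```shell
# retry ATTEMPTS CLEANUP_COMMAND COMMAND [ARGS...]
# Runs COMMAND up to ATTEMPTS times, calling CLEANUP_COMMAND between tries.
retry() {
  attempts=$1 cleanup=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n of $attempts failed" >&2
    if [ "$n" -lt "$attempts" ]; then
      "$cleanup"   # e.g. stop the emulator process and restart ADB
    fi
    n=$(( $n + 1 ))
  done
  return 1
}
```

With `retry 2 reset_adb launch_and_wait` inside the step and retryCountOnTaskFailure: 1 on the step itself, the launch is tried at most 2 x 2 = 4 times, matching the count given in the commit message.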
The Copilot reviewer agent on PR #25036 (Nested Flex Layouts) ran 5 try-fix attempts across models and hit the 180-minute timeout during Round 2 cross-pollination. Doubling to 360 minutes to accommodate complex PRs that need multiple fix attempts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Note
Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!
Description
This PR brings together three improvements to the Copilot CI infrastructure:
- Auto-trigger uitests and device-tests on darc-* branches: Major rework of ci-copilot.yml to support UI tests and device tests triggered on darc branches.
- Make emulator startup and provisioning more robust: Improvements to Start-Emulator.ps1 and provision.yml for more reliable Android emulator handling.
- Support fallback environment for snapshots: Changes to UITest.cs and VisualRegressionTester.cs to support snapshot environment fallback.
Changes
- eng/pipelines/ci-copilot.yml: Reworked CI pipeline for Android device test support
- .github/scripts/shared/Start-Emulator.ps1: More robust emulator startup
- eng/pipelines/common/provision.yml: Provisioning improvements
- src/Controls/tests/TestCases.Shared.Tests/UITest.cs: Snapshot fallback support
- src/TestUtils/src/VisualTestUtils/VisualRegressionTester.cs: Snapshot fallback support