Fix flaky boot issues by adding a retry parameter #563

Merged
dotdoom merged 8 commits into futureware-tech:main from nilsreichardt:fix-boot-issue
Feb 5, 2026

nilsreichardt (Contributor) commented Feb 4, 2026

As reported in #548, this GitHub Action is sometimes flaky: the Simulator occasionally never finishes booting. As a workaround, this PR adds a boot timeout and retries, which can be configured with the following parameters:

  • boot_timeout_seconds (default: 360): Maximum number of seconds to wait for the Simulator to finish booting (0 disables the timeout)
  • boot_retries (default: 2): Number of times to retry booting when waiting for the Simulator to finish booting fails. Setting this to 2 will result in 3 attempts: one normal attempt and two retries.
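
The semantics of the two parameters can be illustrated with a minimal sketch. This is not the action's actual implementation: `bootWithRetries`, `BootOptions`, and `WaitForBoot` are hypothetical names, and the real wait (for example a call that blocks until the Simulator reports it has booted) is injected as a plain function so the retry logic stays self-contained.

```typescript
// Hypothetical sketch of the retry semantics described above.
// Only the two option meanings come from the PR description;
// every name in this block is illustrative.
interface BootOptions {
  bootTimeoutSeconds: number; // mirrors boot_timeout_seconds; 0 disables the timeout
  bootRetries: number; // mirrors boot_retries; extra attempts after the first one
}

// Injected wait: returns true if the Simulator finished booting within the
// given timeout. A timeout of undefined means "wait without a limit".
type WaitForBoot = (timeoutSeconds: number | undefined) => boolean;

// Returns the number of attempts that were needed, or throws if every
// attempt (one normal attempt plus bootRetries retries) failed.
function bootWithRetries(wait: WaitForBoot, options: BootOptions): number {
  const timeout =
    options.bootTimeoutSeconds > 0 ? options.bootTimeoutSeconds : undefined;
  const maxAttempts = options.bootRetries + 1;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (wait(timeout)) {
      return attempt;
    }
  }
  throw new Error(
    `Simulator did not finish booting after ${maxAttempts} attempts`
  );
}
```

With the defaults (`bootTimeoutSeconds: 360`, `bootRetries: 2`), a wait that fails twice and then succeeds makes this sketch return 3, and a wait that never succeeds is called three times before the error is thrown.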

Before this PR, the action failed in 6 of 20 runs (see logs). With the changes from this PR, it failed in 0 of 40 runs (see these logs and these logs).

The retry was triggered in this run: https://github.com/nilsreichardt/integration_test_problem/actions/runs/21669542187/job/62473614621

Closes #548

dotdoom (Member) commented Feb 4, 2026

Thank you, LGTM! I'll resolve the conflicts that resulted from my earlier PR and merge this.

dotdoom merged commit 6a2cef2 into futureware-tech:main on Feb 5, 2026
3 checks passed
github-merge-queue (bot) pushed a commit to SharezoneApp/sharezone-app that referenced this pull request on Feb 5, 2026
gemini-code-assist (bot) mentioned this pull request on Mar 1, 2026 (merged)
olerass added a commit to rainbow-me/rainbow that referenced this pull request on Apr 22, 2026:
The `Ensure Simulator is fully booted` step in `ios-e2e.yml` polls `xcrun simctl list | grep "Booted"` with a 120-second timeout. The `simulator-action` that creates the simulator already supports a `wait_for_boot: true` input that uses Apple's `simctl bootstatus` internally, which is the supported mechanism for detecting simulator readiness. The custom loop was added in #6713 (Aug 2025) during the GH Actions migration with no documented rationale; presumably the action's native wait was simply overlooked at the time.

We noticed the grep-based detection is fragile while working on the iOS 26 SDK migration. An iOS 26.4 simulator booted in ~36 seconds, yet our custom loop timed out at 120s because the `grep 'Booted'` pattern didn't match reliably during the boot transition. Swapping to the action's native wait fixed it cleanly in that context.

This change backports the combined swap to develop's current iPhone 16 / iOS 18.5 setup. Behavior should be equivalent on green runs and more robust on slow boots.

@janicduplessis pointed out on this PR that the built-in wait has historically been flaky: `simctl bootstatus` sometimes returns while the simulator isn't actually ready, or hangs outright. We observed that hang pattern on one shard here. To combat this, the change also bumps the simulator-action from v4 to v5, which addresses this exact class of flake. v5 ships `boot_timeout_seconds: 360` and `boot_retries: 2` as defaults (see [#563](futureware-tech/simulator-action#563)), so a hung `bootstatus` call now bails after 6 minutes and is retried up to twice before the step fails. No explicit config is needed on our side.

Ref FEPLAT-81.

Development

Successfully merging this pull request may close these issues.

Action is flaky, any resolution?

2 participants