feat(perps): fingerprint-gated build cache for agentic preflight #30565

Merged: abretonc7s merged 16 commits into main from feat/perps/expo-agentic-start on May 22, 2026

@abretonc7s abretonc7s commented May 22, 2026

Description

Adds a fingerprint-gated build cache to scripts/perps/agentic/preflight.sh so farmslot dispatches stop paying for redundant yarn setup + pod install --repo-update + xcodebuild cycles when the native dep graph hasn't changed.

Why: farmslot today hardcodes --clean --wallet-setup, forcing ~15–20 min per dispatch even when only TypeScript changed. Investigation showed @expo/fingerprint (already a dep) + scripts/generate-fingerprint.js (already wired) can deterministically detect "no native change" — but the preflight path didn't use them.

Approach:

  • New --mode <auto|fast|rebuild-native|clean> flag on preflight.sh. Legacy --clean/--rebuild continue to work unchanged.
  • Two-tier cache: shared $MM_BUILD_CACHE_DIR (default ~/Library/Caches/mm-mobile-builds on macOS, ~/.cache/mm-mobile-builds on Linux) keyed by fingerprint, plus a per-worktree .agent/build-cache/<plat>/installed.json sidecar.
  • Per-fingerprint flock wraps the full decide + install + build + store region, so two peer worktrees at the same fingerprint produce exactly one xcodebuild/gradle invocation. The second worker installs the artifact the first published.
  • installed.json records both fingerprint AND target (sim UDID / adb serial); the fast-path skip requires both to match, so a recorded build on one sim won't false-hit on another.
  • pod install --repo-update is now opt-in (only --mode clean); plain pod install runs first with a one-shot --repo-update retry on failure.
  • --mode fast is strict: missing cache OR failed cache install hard-fails instead of silently rebuilding.
  • --check-only stays read-only: a cache-decision that would mutate the sim/device exits with an explanatory failure instead.
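
The decision rules in the list above can be condensed into one small function. This is a minimal sketch under stated assumptions, not the code in build-cache.sh: the function name `decide_action`, its 0/1 inputs, and the printed action names are hypothetical, and only the auto and fast modes are covered. Treating a cache miss under --check-only as a failure is also an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the cache decision for --mode auto and --mode fast.
#   $1 mode             auto | fast
#   $2 installed_match  1 if installed.json matches BOTH fingerprint and target
#   $3 cache_hit        1 if the shared cache holds an artifact for this fingerprint
#   $4 check_only       1 if --check-only was passed
# Prints one of: skip | install-from-cache | build | fail
decide_action() {
  local mode="$1" installed_match="$2" cache_hit="$3" check_only="$4"

  case "$mode" in
    auto|fast) ;;
    *) echo "mode not covered by this sketch: $mode" >&2; return 2 ;;
  esac

  # Fast path: this target already runs a build made at this fingerprint.
  if [ "$installed_match" = 1 ]; then echo skip; return 0; fi

  if [ "$cache_hit" = 1 ]; then
    # Installing mutates the sim/device, which --check-only forbids.
    if [ "$check_only" = 1 ]; then echo fail; else echo install-from-cache; fi
    return 0
  fi

  # Cache miss: fast is strict; auto falls back to a native build.
  # (Assumption: a build is also refused under --check-only.)
  if [ "$mode" = fast ] || [ "$check_only" = 1 ]; then echo fail; else echo build; fi
}
```

For example, `decide_action auto 0 1 0` prints `install-from-cache`, while `decide_action fast 0 0 0` prints `fail`.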

All new helpers live in scripts/perps/agentic/lib/build-cache.sh. Idempotent test suites (unit + real-sim e2e) live next to it.

Boundary: change is fully contained under scripts/perps/agentic/ — no root package.json shortcuts, no perps-out-of-scope files. Callers invoke bash scripts/perps/agentic/preflight.sh --mode <…> directly.

Scenario                                                    | Today      | After
Same worktree, app already installed at this fp on this sim | ~15–20 min | ~30–45 s
Peer worktree, shared cache hit                             | ~15–20 min | ~60–90 s
Native diff (Podfile/native module)                         | ~15–20 min | ~5–8 min (first worker only; rest cache-install)
Cold host / --mode clean                                    | ~15–20 min | unchanged (escape hatch)

Farmslot side (switching projects/metamask-mobile-farm/project.json to --mode auto) is intentionally deferred to a follow-up: the --mode flag must land on main first because farmslot clones MM fresh per dispatch.

Changelog

CHANGELOG entry: null

Related issues

Fixes: N/A (developer tooling improvement; supports the farmslot dispatch loop)

Manual testing steps

Two idempotent test scripts ship with the change. Both safely stash and restore any pre-existing .agent/build-cache.

Feature: build-cache lib + preflight --mode plumbing

  Scenario: unit suite passes
    Given a clean checkout of this branch
    When I run "bash scripts/perps/agentic/lib/test-build-cache.sh"
    Then 26 PASS lines print and exit code is 0
    And "ALL TESTS PASSED" is printed
    And re-running the command immediately also exits 0 (idempotent)

  Scenario: real-simulator cache-hit recognition (Path 1)
    Given a booted iOS simulator with MetaMask installed
    When I run "bash scripts/perps/agentic/lib/test-preflight-cache-e2e.sh"
    Then preflight logs "Cache: installed app matches fingerprint <hash>"
    And the build branch is skipped (no pod install / xcodebuild)
    And MetaMask remains installed on the simulator (no destructive ops)

  Scenario: --mode fast is strict
    Given a worktree with no cached build for the current fingerprint
    And no MetaMask installed on the simulator at the right fingerprint
    When I run "bash scripts/perps/agentic/preflight.sh --platform ios --mode fast"
    Then preflight exits non-zero with "Mode 'fast' but no cached build for fp <hash>"

  Scenario: --check-only stays read-only
    Given a shared cache hit for the current fingerprint
    And no MetaMask installed at that fingerprint
    When I run "bash scripts/perps/agentic/preflight.sh --platform ios --mode auto --check-only"
    Then preflight exits non-zero with a clear "cache hit available, but --check-only forbids install" message
    And the simulator state is unchanged

  Scenario: legacy --clean path is unchanged
    Given an existing worktree
    When I run "bash scripts/perps/agentic/preflight.sh --platform ios --clean --wallet-setup"
    Then preflight prints "Mode: clean (yarn setup → pod --repo-update → build)"
    And executes the same path as before this PR

Direct invocation examples (no yarn shortcuts; perps-scoped):

bash scripts/perps/agentic/preflight.sh --platform ios --mode auto --wallet-setup     # fingerprint-gated reuse
bash scripts/perps/agentic/preflight.sh --platform ios --mode fast --wallet-setup     # fail loud if no cached/installed build
bash scripts/perps/agentic/preflight.sh --platform ios --clean --wallet-setup         # legacy clean rebuild (unchanged)

Screenshots/Recordings

N/A — script-only change, no UI surface.

Before

N/A

After

N/A

Pre-merge author checklist

  • I've followed MetaMask Contributor Docs and MetaMask Mobile Coding Standards.
  • I've completed the PR template to the best of my ability
  • I've included tests if applicable (26 unit + real-sim e2e suite in scripts/perps/agentic/lib/)
  • I've documented my code using JSDoc format if applicable (shell — header comments + README section)
  • I've applied the right labels on the PR

Performance checks (if applicable)

  • I've tested on Android — N/A (developer preflight; same code path executes on both platforms; covered by mode flag + cache lib unit tests; Android-real-sim e2e left to a separate Linux farmslot host)
  • I've tested with a power user scenario — N/A (no runtime / wallet path touched)
  • I've instrumented key operations with Sentry traces for production performance metrics — N/A (developer tooling, never ships to production)

Pre-merge reviewer checklist

  • I've manually tested the PR (e.g. pull and build branch, run the app, test code being changed).
  • I confirm that this PR addresses all acceptance criteria described in the ticket it closes and includes the necessary testing evidence such as recordings and/or screenshots.

Note

Medium Risk
Medium risk because it changes preflight.sh control flow for iOS/Android build/install decisions and adds cross-worktree caching + locking; mistakes could cause stale builds to be reused or unexpected install/build skips.

Overview
Adds a fingerprint-gated shared native build cache to scripts/perps/agentic/preflight.sh, allowing iOS .app and Android .apk artifacts to be reused across worktrees when the @expo/fingerprint hash matches, with per-fingerprint locking to serialize build/store.

Introduces --mode <auto|fast|rebuild-native|clean> to control cache usage and rebuild strictness, updates CocoaPods behavior to avoid --repo-update except in clean (with a one-shot retry on failure), and tightens read-only semantics for --check-only (no installs/adb reverse/Metro wallet steps; fails if cache would mutate state).

Adds the new lib/build-cache.sh helper library (artifact paths, memoized fingerprinting, installed sidecar tracking, pruning, flock/mkdir locks), plus smoke/e2e shell tests and README docs describing the modes and cache semantics.

Reviewed by Cursor Bugbot for commit fc1b85d. Bugbot is set up for automated code reviews on this repo.

Add --mode <auto|fast|rebuild-native|clean> to preflight.sh with a
two-tier build cache keyed by @expo/fingerprint. Cache lives in
$MM_BUILD_CACHE_DIR (default ~/Library/Caches/mm-mobile-builds)
plus a per-worktree installed.json sidecar at .agent/build-cache/.

Why: farmslot dispatches every task with --clean, forcing full
yarn setup + pod install --repo-update + xcodebuild (~15-20 min)
even when native deps are identical. auto mode reuses an existing
build whenever the fingerprint matches, dropping warm dispatches
to under 90s. Parallel worktrees at the same fingerprint share one
artifact via flock — only the first builds, the rest install from
cache.

Modes:
  auto           skip build on fingerprint hit; pod install w/o
                 --repo-update; falls back to retry with --repo-update
                 on failure
  fast           fail loud if no cached/installed app
  rebuild-native skip yarn setup, force native rebuild
  clean          preserves current --clean semantics (escape hatch)

Includes idempotent test suites:
  scripts/perps/agentic/lib/test-build-cache.sh         (26 unit tests)
  scripts/perps/agentic/lib/test-preflight-cache-e2e.sh (real-sim e2e)
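
The auto-mode CocoaPods behaviour listed under Modes above (plain pod install first, one retry with --repo-update on failure) could look roughly like this. A minimal sketch: the function name `pod_install_with_retry` is hypothetical, and the real preflight.sh may structure the retry differently.

```bash
#!/usr/bin/env bash
# Hypothetical helper: try the cheap install first, refresh the spec repo
# only if that fails. `pod` is the CocoaPods CLI.
pod_install_with_retry() {
  if pod install; then
    return 0
  fi
  echo "pod install failed; retrying once with --repo-update" >&2
  # One-shot retry: the exit status of this call is the final result.
  pod install --repo-update
}
```

The retry is deliberately not a loop: a second failure is returned to the caller, so a genuinely broken Podfile still fails the run.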
@github-actions

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@abretonc7s abretonc7s marked this pull request as ready for review May 22, 2026 02:24
@abretonc7s abretonc7s requested a review from a team as a code owner May 22, 2026 02:24
@abretonc7s abretonc7s added the team-perps and no-changelog labels May 22, 2026
…wnership)

Drop a:auto:{ios,android} and a:fast:{ios,android} from root package.json
so the change is fully contained inside scripts/perps/agentic/. Callers
invoke preflight.sh directly with --mode auto|fast|rebuild-native|clean.
README updated to show the direct invocation.
@abretonc7s abretonc7s enabled auto-merge May 22, 2026 02:27
Comment thread scripts/perps/agentic/preflight.sh
Comment thread scripts/perps/agentic/lib/build-cache.sh Outdated
Comment thread scripts/perps/agentic/lib/build-cache.sh Outdated
Comment thread scripts/perps/agentic/preflight.sh
…ast-fail

Round 1 review feedback:
- B1: --check-only is now read-only on cache hits. Guarded the cache install
  with !$CHECK_ONLY for both iOS and Android; in check-only the script fails
  loud naming the fp mismatch instead of mutating the sim/device.
- B2: cache install failure used to leave APP_INSTALLED stale after the
  uninstall; reset to 0 before install attempt so a failed install can't
  silently skip the rebuild.
- B3: --mode fast now hard-fails when a cached artifact exists but its
  install fails, instead of falling through to a native build.
- B4: per-fingerprint flock now wraps the full cache-decide + build + store
  region, not only the post-build store. Two peer worktrees at the same
  fingerprint now produce exactly one xcodebuild/gradle invocation; the
  second installs from the cache published by the first.
- B5: installed.json target is now validated against SIM_TARGET / ADB_TARGET
  before the fast-path skip, so a recorded fp from a different sim no
  longer false-hits.
- N1: bc_prune now uses a portable mtime helper (BSD stat -f → GNU stat -c
  fallback), keeping LRU correct on Linux farmslot hosts.
- N2: bc_has_artifact rejects empty .app dirs (missing Info.plist) and
  zero-byte .apk files so a half-written or aborted store can't be hit.
- N3: tests no longer call GNU `timeout`; a portable `_capture_for` helper
  uses a backgrounded watchdog so the suites run on base macOS.
- N4: bc_record_install and bc_store_artifact now write JSON through jq
  with --arg, escaping arbitrary paths/targets correctly.
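
Items N1 and N2 above describe two small portability and safety helpers. The sketch below shows one way they could be written; the names `file_mtime` and `has_artifact` are hypothetical stand-ins, not the actual bc_* functions.

```bash
#!/usr/bin/env bash
# Portable mtime in epoch seconds: try BSD/macOS `stat -f %m`, fall back to
# GNU `stat -c %Y`. GNU stat treats -f as "filesystem status" and can print
# non-numeric text, so the BSD result is only trusted if it is a plain number.
file_mtime() {
  local out
  out=$(stat -f %m "$1" 2>/dev/null) || out=""
  case "$out" in
    ''|*[!0-9]*) stat -c %Y "$1" ;;
    *)           printf '%s\n' "$out" ;;
  esac
}

# Reject half-written artifacts: an .app bundle must contain Info.plist,
# an .apk must be a non-empty file.
has_artifact() {
  local path="$1"
  case "$path" in
    *.app) [ -d "$path" ] && [ -f "$path/Info.plist" ] ;;
    *.apk) [ -s "$path" ] ;;
    *)     return 1 ;;
  esac
}
```

An LRU prune can then sort cache entries by `file_mtime` and skip anything `has_artifact` rejects.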
…trictness, check-only read-only

Round 2 review feedback:
- B1: auto mode now forces a fresh build when the installed app's fingerprint
  or target doesn't match and no valid cache artifact is available. Previously
  APP_INSTALLED stayed >0 in this branch, so the build gate skipped and the
  stale app passed straight through. Now we reset APP_INSTALLED=0 on every
  fall-through path (cache miss, lock-acquire failure) for both iOS and Android.
- B2: --check-only is now genuinely read-only. Adds an early exit before the
  Metro / CDP / wallet steps and gates the Android `adb reverse` setup. Cache
  hit branches that would mutate the sim/device already fail with --check-only.
- B3: --mode fast now fails loud on every cache-infrastructure failure path:
  fingerprint compute failure, lock-acquire failure, cache miss, and cache
  install failure — instead of falling back to a stale-app skip or a native
  build.
- N1: README now documents the macOS mkdir-mutex fallback (flock is not part of
  the base macOS install) and how to recover from a leftover `<fp>.lock.d` after `kill -9`.
- N2: bc_fingerprint memoizes into a non-exported shell var (BC_FINGERPRINT_MEMO),
  so a stale value inherited from a parent process can no longer pin the wrong
  cache key.
- N3: New test asserts --mode fast hard-fails when the fingerprint command
  cannot be run (covers the cache-infrastructure-failure paths).
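
The mkdir-mutex fallback mentioned in N1 relies on `mkdir` being atomic: exactly one caller can create a given directory. A minimal sketch, with hypothetical function names and an arbitrary retry budget (the real lock helpers may differ):

```bash
#!/usr/bin/env bash
# lock_acquire <lock_dir> [max_tries]
# Creates <lock_dir> as the lock; returns 1 if it is still held after
# max_tries attempts. A `kill -9` leaves the directory behind, which is why
# a leftover <fp>.lock.d has to be removed by hand.
lock_acquire() {
  local lock_dir="$1" tries="${2:-50}"
  while ! mkdir "$lock_dir" 2>/dev/null; do
    tries=$((tries - 1))
    if [ "$tries" -le 0 ]; then return 1; fi
    sleep 0.1
  done
  return 0
}

# lock_release <lock_dir>
lock_release() {
  rmdir "$1" 2>/dev/null
}
```

A caller would wrap the whole decide + build + store region between `lock_acquire "$cache_dir/$fp.lock.d"` and `lock_release`, where `cache_dir` and `fp` are the caller's own variables.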
@abretonc7s abretonc7s marked this pull request as draft May 22, 2026 02:55
auto-merge was automatically disabled May 22, 2026 02:55

Pull request was converted to draft

Comment thread scripts/perps/agentic/lib/build-cache.sh
…se --clean+--check-only

Round 3 review feedback (both actionable nits handled):
- N1: --check-only summary no longer claims "App installed and at the right
  fingerprint" when the fingerprint was not actually verified. Tracks a
  CHECK_ONLY_FP_VERIFIED flag set only by the Path-1 cache hit, then prints
  one of two messages based on whether verification actually happened.
- N2: --check-only combined with --mode clean / legacy --clean now fails
  loud at arg-parse time instead of briefly running yarn setup before the
  late short-circuit. The two flags are inherently contradictory (clean is
  destructive, check-only is read-only).
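
The arg-parse check from N2 can be illustrated as follows. This is a sketch only: `parse_preflight_args` is a hypothetical name, the parser ignores every flag it does not know, and defaulting MODE to auto is a simplification of the real default-mode behaviour.

```bash
#!/usr/bin/env bash
# Hypothetical parser: maps legacy --clean onto MODE=clean and rejects the
# contradictory --check-only + clean combination before any work starts.
# Sets the globals MODE and CHECK_ONLY; returns 2 on a usage error.
parse_preflight_args() {
  MODE=auto
  CHECK_ONLY=0
  while [ $# -gt 0 ]; do
    case "$1" in
      --mode)
        if [ $# -lt 2 ]; then echo "--mode needs a value" >&2; return 2; fi
        MODE="$2"; shift ;;
      --clean)      MODE=clean ;;
      --check-only) CHECK_ONLY=1 ;;
      *)            ;;  # other flags are ignored in this sketch
    esac
    shift
  done

  case "$MODE" in
    auto|fast|rebuild-native|clean) ;;
    *) echo "unknown --mode: $MODE" >&2; return 2 ;;
  esac

  if [ "$CHECK_ONLY" = 1 ] && [ "$MODE" = clean ]; then
    echo "--check-only cannot be combined with clean (read-only vs destructive)" >&2
    return 2
  fi
}
```

Validating after the loop means the order of `--clean` and `--check-only` on the command line does not matter.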
…ode wording

- README auto-mode pod install row now reflects actual behaviour: pod install
  runs on every native rebuild (no --repo-update, with a one-shot retry).
- preflight default-mode help text now describes the cache-gated behaviour
  inherited from --mode auto (without the fail-loud guarantee).
Bugbot caught that R2's de-export of BC_FINGERPRINT_MEMO made the memo
useless across `FP=$(bc_fingerprint)` calls: every command substitution
runs the function in a fresh subshell, and an in-memory shell variable
disappears with the subshell. The post-build store then re-ran
`node scripts/generate-fingerprint.js`, which (if Podfile.lock or some
other fingerprinted file changed during pod install) could publish the
artifact under a different fingerprint than the one whose lock was
acquired, defeating the mutual-exclusion guarantee.

Switch the memo to a file at /tmp/bc-fp-$$ (PID is stable across bash
subshells); preflight wipes the file at startup via bc_fingerprint_reset_memo
so a leftover from an unrelated prior process with the same PID can't
pin the wrong cache key. Tests also reset on entry/exit.
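
The subshell problem described above is easy to reproduce in isolation. The demo below uses hypothetical names and a temp directory instead of /tmp/bc-fp-$$; `expensive` stands in for the fingerprint command and logs each run to a file so the calls can be counted from the parent shell.

```bash
#!/usr/bin/env bash
work_dir=$(mktemp -d)

# Stand-in for the slow fingerprint computation; records every invocation.
expensive() { echo x >> "$work_dir/calls"; echo "fp-123"; }

# Variant A: memo kept in a shell variable.
fp_var_memo() {
  [ -n "${FP_MEMO:-}" ] || FP_MEMO=$(expensive)
  echo "$FP_MEMO"
}

# Variant B: memo kept in a file.
fp_file_memo() {
  [ -s "$work_dir/memo" ] || expensive > "$work_dir/memo"
  cat "$work_dir/memo"
}

# Each $(...) runs the function in a fresh subshell, so FP_MEMO set inside
# it never reaches the parent and the second call recomputes.
a1=$(fp_var_memo); a2=$(fp_var_memo)
var_calls=$(wc -l < "$work_dir/calls" | tr -d ' ')

: > "$work_dir/calls"
b1=$(fp_file_memo); b2=$(fp_file_memo)
file_calls=$(wc -l < "$work_dir/calls" | tr -d ' ')

echo "variable memo: $var_calls computations; file memo: $file_calls"
rm -rf "$work_dir"
```

This prints `variable memo: 2 computations; file memo: 1`, which is the difference that let the post-build store see a different fingerprint than the one the lock was taken for.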

@geositta geositta left a comment

I found two PR-introduced behavior changes that should likely be fixed before merge:

  1. Default mode no longer preserves the existing “skip native build when the app is already installed” behavior because it now enters the cache validation path.
  2. Cache-hit installs uninstall the app outside --clean / --wallet-setup, which removes local app state during normal reuse.

Comment thread scripts/perps/agentic/preflight.sh
Comment thread scripts/perps/agentic/preflight.sh Outdated
R5 flagged the previous /tmp/bc-fp-$$ memo as symlink-attackable: a
local attacker who knows the next bash PID can pre-create a symlink at
that path and our `>` redirection would clobber an arbitrary file the
attacker has write access to through the symlink target.

Switch to a private 0700 dir created by `mktemp -d` at preflight startup
and exported as BC_MEMO_DIR so command-substitution subshells inherit
the same path. bc_fingerprint refuses to trust the memo file unless it
is a regular file (not a symlink) inside our private dir, and falls
back to per-call computation when BC_MEMO_DIR is unset (lib used
stand-alone). Tests clean up the dir on exit.
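
A sketch of the private-directory memo described above, under stated assumptions: `memo_init`, `memo_fingerprint`, `MEMO_DIR`, and `compute_fp` are hypothetical names, and `compute_fp` is a fixed stand-in for the real fingerprint command.

```bash
#!/usr/bin/env bash
# Stand-in for the real (slow) fingerprint computation.
compute_fp() { echo "fp-demo"; }

# Create a private directory (mktemp -d makes it mode 0700) and export the
# path so command-substitution subshells and child processes see it.
memo_init() {
  MEMO_DIR=$(mktemp -d) || return 1
  export MEMO_DIR
}

memo_fingerprint() {
  # Used stand-alone (no memo dir): compute on every call.
  if [ -z "${MEMO_DIR:-}" ] || [ ! -d "$MEMO_DIR" ]; then
    compute_fp
    return
  fi
  local f="$MEMO_DIR/fingerprint"
  # Trust the memo only if it is a regular, non-empty file and not a symlink.
  if [ -f "$f" ] && [ ! -L "$f" ] && [ -s "$f" ]; then
    cat "$f"
    return
  fi
  # Never write through a symlink planted at the memo path.
  if [ -L "$f" ]; then
    compute_fp
    return
  fi
  compute_fp | tee "$f"
}
```

Because the path is unpredictable and the directory is owner-only, an attacker cannot pre-create the memo file the way they could with a PID-based name in /tmp.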
…nto inherited dir

R6 demonstrated that an inherited BC_MEMO_DIR was being rm -rf'd by
bc_fingerprint_reset_memo:
  BC_MEMO_DIR=/tmp/victim bash -c '. build-cache.sh; bc_fingerprint_reset_memo'
  → /tmp/victim deleted

Fix: bc_memo_init drops a sentinel file `.bc_memo_owner` inside its
mktemp -d dir. bc_memo_cleanup only recurses rm -rf when the sentinel
is present, so an inherited or attacker-controlled path is left
untouched. bc_fingerprint_reset_memo unsets BC_MEMO_DIR after the
guarded cleanup and asks bc_memo_init for a fresh dir.

New test asserts the attack scenario: with BC_MEMO_DIR pointing at a
foreign temp dir, calling bc_fingerprint_reset_memo must NOT delete
that dir or its contents.
…o longer enough)

R7 showed the on-disk `.bc_memo_owner` sentinel was forgeable: anyone with
write access to a victim dir could pre-create the marker, then BC_MEMO_DIR
injection would trick reset_memo into deleting the victim.

Replace the sentinel with a non-exported shell variable BC_MEMO_DIR_OWNED.
Set only by bc_memo_init in the shell that created the dir; child
processes that inherit BC_MEMO_DIR through `export` cannot also inherit
the ownership flag, so the cleanup path refuses to recurse rm -rf
into an attacker-supplied path. Hardens against both the R6 attack
(plain inherited path) and the R7 attack (inherited path with a forged
on-disk sentinel).

Test now reproduces both attack shapes and asserts neither victim dir
is deleted.
@github-actions github-actions Bot added size-XL and removed size-L labels May 22, 2026
Bash imports any exported env var as a regular shell var on startup, so
the previous "non-exported flag" defense was bypassable: a parent could
run `BC_MEMO_DIR=/tmp/victim BC_MEMO_DIR_OWNED=1 preflight.sh` and the
child shell would see both, treat the inherited dir as owned, and
recurse a destructive delete on reset_memo.

Fix: bc_fingerprint_reset_memo now `unset`s both BC_MEMO_DIR_OWNED and
BC_MEMO_DIR before calling bc_memo_init, without ever calling
bc_memo_cleanup on the inherited values. Any pre-existing claim from
env is discarded; only ownership we set ourselves in this shell, after
mktemp -d returns, will ever satisfy the cleanup guard.

New test reproduces the R8 attack (BC_MEMO_DIR=victim plus forged env
BC_MEMO_DIR_OWNED=1) and asserts the victim dir is preserved.
…hip-aware helper

R9 found two remaining paths that delete an inherited BC_MEMO_DIR:
1) Direct `bc_memo_init` followed by `bc_memo_cleanup` (no reset_memo
   in between) trusted the inherited OWNED flag, because the previous
   defense only sanitized inside bc_fingerprint_reset_memo.
2) The test EXIT cleanup blindly removed BC_MEMO_DIR if set, which would
   destroy an attacker-supplied path on early test failure.

Library fix: build-cache.sh now `unset`s BC_MEMO_DIR + BC_MEMO_DIR_OWNED
at source time (before any function is defined or called), so every
code path through the lib starts from a known-clean state. Only
ownership set by bc_memo_init running in this shell, after that unset,
is ever trusted.

Test fix: cleanup now delegates to bc_memo_cleanup (which honors the
ownership flag) instead of a raw destructive delete.

Two new tests reproduce both R9 attack shapes (direct init+cleanup,
EXIT cleanup on inherited path).
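
The end state of the R6 to R9 hardening (discard inherited values at source time, trust only ownership set after mktemp -d in the same shell) can be reproduced with a stand-in library. Everything below is a hypothetical sketch: the file name memo-lib.sh and the MEMO_* names are illustrative, not the real build-cache.sh symbols.

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
victim="$tmp/victim"
mkdir "$victim"
echo keep > "$victim/data"

# Stand-in library, written to a temp file so it can be sourced by a child shell.
cat > "$tmp/memo-lib.sh" <<'EOF'
# Source time: drop any claim inherited from the environment.
unset MEMO_DIR MEMO_DIR_OWNED
memo_init() {
  MEMO_DIR=$(mktemp -d) || return 1
  MEMO_DIR_OWNED=1   # set only here, after mktemp -d succeeded
  export MEMO_DIR    # the path is shared; the ownership flag is not exported
}
memo_cleanup() {
  # Delete only a directory this shell created itself.
  if [ "${MEMO_DIR_OWNED:-}" = 1 ] && [ -n "${MEMO_DIR:-}" ]; then
    rm -rf "$MEMO_DIR"
  fi
  unset MEMO_DIR MEMO_DIR_OWNED
}
EOF

shell="${BASH:-sh}"   # re-use the current shell binary for the child processes

# Attack shape: the parent forges both the path and the ownership flag.
MEMO_DIR="$victim" MEMO_DIR_OWNED=1 "$shell" -c ". '$tmp/memo-lib.sh'; memo_cleanup"

# Legitimate use: init, report the path, then clean up our own directory.
own_dir=$("$shell" -c ". '$tmp/memo-lib.sh'; memo_init; echo \"\$MEMO_DIR\"; memo_cleanup")

if [ -f "$victim/data" ]; then victim_status=preserved; else victim_status=deleted; fi
if [ -d "$own_dir" ]; then own_status=leaked; else own_status=removed; fi
echo "victim: $victim_status, own dir: $own_status"
rm -rf "$tmp"
```

The run prints `victim: preserved, own dir: removed`: the forged environment values are gone before any function can read them, while a directory created by `memo_init` is still cleaned up.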
CI's check-pr-max-lines fails > 1000 changed lines. The diff was at
1089 after the R6-R9 security hardening + tests. Trim:

- Consolidate the five inherited-BC_MEMO_DIR attack tests into a
  parameterized _memo_attack helper (R6/R7/R8/R9A/R9B). Same coverage,
  fewer lines.
- Compress verbose multi-paragraph comments in preflight.sh and
  build-cache.sh (the design rationale stays in commit messages).
- Combine consecutive `unset` calls.

No functional change. All 32 unit tests + the real-sim e2e still pass.
@github-actions github-actions Bot added size-L and removed size-XL labels May 22, 2026
@abretonc7s abretonc7s marked this pull request as ready for review May 22, 2026 04:06
…uninstall)

Reviewer flagged that the cache-hit install path was destroying wallet/
app container data even when neither --clean nor --wallet-setup was
requested, because it ran `simctl uninstall` / `adb uninstall` before
installing the cached artifact. origin/main only uninstalled under
DO_CLEAN || DO_WALLET_SETUP.

`simctl install` and `adb install -r` both reinstall the bundle in
place and keep the existing container by default, so the preemptive
uninstall was unnecessary. Drop it on both platforms. On install
failure we reset APP_INSTALLED=0 so the build branch still fires
(or --mode fast still hard-fails).

The default-mode cache opt-in flagged in the second comment is kept
deliberately and documented as a wanted behavior change in
preflight.sh's mode banner and the README.
@abretonc7s abretonc7s enabled auto-merge May 22, 2026 04:26

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 1e9ee30.

Comment thread scripts/perps/agentic/preflight.sh
geositta previously approved these changes May 22, 2026
…evice

Bugbot found that the Android build branch guarded the --wallet-setup
uninstall on APP_INSTALLED > 0, but the new cache-decision phase zeroes
APP_INSTALLED on cache miss even when the app is still physically on
the device. The result: --wallet-setup runs adb install -r over the
old container, preserving stale wallet data.

Replace the stale-count guard with a live `adb shell pm list packages`
check, independent of the cache-decision state. iOS already used an
unguarded uninstall, no fix needed there.
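
A live package check of the kind described above might look like this. A minimal sketch: `android_pkg_installed` is a hypothetical name, and the serial and package id are supplied by the caller rather than taken from preflight.sh.

```bash
#!/usr/bin/env bash
# Ask the device whether a package is installed right now, instead of
# trusting a counter computed earlier in the script.
#   $1 = adb serial, $2 = package id
android_pkg_installed() {
  local serial="$1" pkg="$2"
  # `pm list packages <filter>` prints lines such as "package:<id>" for every
  # package whose name contains the filter, so match the full line exactly.
  # tr strips carriage returns that some adb versions append.
  adb -s "$serial" shell pm list packages "$pkg" 2>/dev/null \
    | tr -d '\r' \
    | grep -Fqx "package:$pkg"
}
```

The exact-line match matters because the filter is a substring match: a debug or flavored variant of the same app id would otherwise count as a hit.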
@github-actions

🔍 Smart E2E Test Selection

  • Selected E2E tags: None (no tests recommended)
  • Selected Performance tags: None (no tests recommended)
  • Risk Level: low
  • AI Confidence: 97%

E2E Test Selection:
All 5 changed files are exclusively within scripts/perps/agentic/ — a developer/CI tooling directory for the agentic workflow toolkit. The changes introduce:

  1. A fingerprint-gated build cache system (lib/build-cache.sh) for reusing native builds across worktrees
  2. A new --mode flag for preflight.sh (auto|fast|rebuild-native|clean) to control build behavior
  3. Unit tests and E2E tests for the build cache system
  4. Updated README documentation

None of these changes touch:

  • Any app source code (no React Native components, screens, or hooks)
  • Any controllers or Engine code
  • Any navigation or routing logic
  • Any user-facing features or flows
  • Any Detox E2E test files
  • Any CI workflow files that trigger Detox tests

These are purely developer tooling scripts using a separate CDP-based testing approach (not Detox). The changes improve build efficiency for the agentic workflow but have zero impact on the app's behavior, UI, or any E2E test scenarios. No Detox E2E tags are warranted.

Performance Test Selection:
No app code was changed. The changes are purely to developer/CI build tooling scripts (preflight.sh, build-cache.sh) that manage how the app is built and deployed to simulators/emulators. These scripts have no impact on app runtime performance, rendering, data loading, or any user-facing flows. No performance tests are warranted.

@abretonc7s abretonc7s added the skip-sonar-cloud and skip-e2e labels May 22, 2026
@abretonc7s abretonc7s added this pull request to the merge queue May 22, 2026
Merged via the queue into main with commit c06187a May 22, 2026
192 of 226 checks passed
@abretonc7s abretonc7s deleted the feat/perps/expo-agentic-start branch May 22, 2026 10:38
@github-actions github-actions Bot locked and limited conversation to collaborators May 22, 2026
@metamaskbotv2 metamaskbotv2 Bot added the release-7.80.0 label May 22, 2026

Labels

  • no-changelog: indicates no external-facing user changes, therefore no changelog documentation needed
  • release-7.80.0: issue or pull request that will be included in release 7.80.0
  • size-L
  • skip-e2e: skip E2E test jobs
  • skip-sonar-cloud: only used for bypassing Sonar Cloud when failures are not relevant to the changes
  • team-perps: Perps team

3 participants