ci(INFRA-3597): Phase 5 — Namespace APK fingerprint cache and artifact validation#29886
… hit detection

Namespace Cache Volumes are not key-based like cirruslabs/cache, so APK reuse on the Namespace path requires a different mechanism. This adds:
- APK output dirs (prod/flask) to nscloud-cache-action paths so built APKs persist across runs on Namespace volumes
- A marker file (.e2e-apk-cache-marker) that records the key inputs (ref, build_type, cache_generation, fingerprint, Gradle hash) after a successful build
- A check step before the build gate that compares the marker to current inputs -- if they match AND both APKs exist, the gate reports needs-native-build=false and the repack path runs instead
- The marker is recorded only after a successful native build

The current (Cirrus/GH) path is unchanged -- find-reusable-build, cirruslabs/cache, and the existing gate logic all remain gated on runner_provider != namespace.

Adapted from Borislav Grigorov's initial approach (9855ab9) to work with main's refactored gate pattern (find-reusable-build + gate step). Phase 5 of INFRA-3597 / parent epic INFRA-3511.

Co-authored-by: Cursor <cursoragent@cursor.com>
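The marker comparison described in this commit could look roughly like the sketch below. This is an illustration only, not the workflow's actual step: the function names `marker_content` and `check_apk_cache`, the `key=value` marker layout, and the argument order are all assumptions. Only the marker file name, the list of key inputs, and the `needs-native-build` output come from the commit message.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: render the marker content from the key inputs.
# Args: ref build_type cache_generation fingerprint gradle_hash
marker_content() {
  printf 'ref=%s\nbuild_type=%s\ncache_generation=%s\nfingerprint=%s\ngradle_hash=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical check: report a cache hit only when the stored marker matches
# the expected content AND both APK files exist.
# Args: marker_file expected_content prod_apk flask_apk
check_apk_cache() {
  local marker_file="$1" expected="$2" prod_apk="$3" flask_apk="$4"
  if [[ -f "$marker_file" && -f "$prod_apk" && -f "$flask_apk" ]] \
     && [[ "$(cat "$marker_file")" == "$expected" ]]; then
    echo "needs-native-build=false"
  else
    echo "needs-native-build=true"
  fi
}
```

In a workflow step the printed line would typically be appended to `$GITHUB_OUTPUT`; any mismatch or missing file falls through to the native build, which keeps the failure mode conservative.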
…rence

Mounting APK output dirs directly as nscloud cache volumes prevents Gradle from deleting/recreating them during packageProdRelease:

  Unable to delete directory 'android/app/build/outputs/apk/prod/release'

Move APK cache to ~/.namespace-apk-cache/ (a dedicated staging area on the cache volume). On cache hit, APKs are copied FROM staging TO the Gradle output dirs. After a successful build, APKs are copied TO staging and the marker is recorded. This avoids mount interference while preserving APK reuse across Namespace runs.

Co-authored-by: Cursor <cursoragent@cursor.com>
…stence

The separate ~/.namespace-apk-cache/ path was not persisting between Namespace runs despite being listed in nscloud-cache-action. The cache volume grew (184MB -> 501MB) but the staging dir contents were empty on the second run.

Move staging to $GRADLE_USER_HOME/apk-cache/, which is a subdirectory of an already-persisted cache path (GRADLE_USER_HOME/caches and /wrapper are confirmed to persist). This avoids relying on a standalone new path that nscloud may not handle correctly.

Co-authored-by: Cursor <cursoragent@cursor.com>
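The copy-to-staging / copy-from-staging flow from the two commits above can be sketched as follows. The helper names `stage_apks` and `restore_apks` and the per-flavor subdirectory layout are hypothetical; only the `$GRADLE_USER_HOME/apk-cache/` location and the direction of the copies are taken from the commit messages.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Staging area under an already-persisted path (location from the commit message).
# The fallback to ~/.gradle is an assumption for running the sketch locally.
STAGING_DIR="${GRADLE_USER_HOME:-${HOME:-/tmp}/.gradle}/apk-cache"

# Hypothetical helper: after a successful build, copy APKs TO staging.
# Args: gradle_output_dir flavor
stage_apks() {
  local out_dir="$1" flavor="$2"
  mkdir -p "$STAGING_DIR/$flavor"
  cp "$out_dir"/*.apk "$STAGING_DIR/$flavor/"
}

# Hypothetical helper: on a cache hit, copy APKs FROM staging back into the
# Gradle output dir, which stays a plain directory Gradle can delete/recreate.
# Args: gradle_output_dir flavor
restore_apks() {
  local out_dir="$1" flavor="$2"
  mkdir -p "$out_dir"
  cp "$STAGING_DIR/$flavor"/*.apk "$out_dir/"
}
```

Keeping the Gradle output directory unmounted is the point of the design: only the staging directory lives on the cache volume, so the "Unable to delete directory" failure should not recur.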
Co-authored-by: Cursor <cursoragent@cursor.com>
Add nsc cache gradle setup step that generates an init script with workspace-scoped short-term credentials. The script is placed in $GRADLE_USER_HOME/init.d/ so it auto-loads on every ./gradlew invocation without modifying scripts/build.sh. This enables cross-run Gradle task reuse: first run populates the remote cache (slow), subsequent runs reuse cached build outputs (fast). Works alongside the existing nscloud-cache-action for dependency downloads (local cache volumes for JARs/plugins). Per Namespace docs and updated INFRA-3597 acceptance criteria. Co-authored-by: Cursor <cursoragent@cursor.com>
…caching

Replace custom GRADLE_USER_HOME/caches and /wrapper paths with the native cache: gradle support in nscloud-cache-action. The native framework support handles Gradle dependency paths automatically and may use a different persistence mechanism that resolves the cache miss issue we've been seeing with custom path entries.

This works alongside the Gradle remote build cache (nsc cache gradle setup): Cache Volumes handle downloaded dependencies (JARs, plugins), remote cache handles compilation outputs and task results.

Co-authored-by: Cursor <cursoragent@cursor.com>
… gradle

The native cache: gradle targets ~/.gradle/ but our GRADLE_USER_HOME is /home/runner/_work/.gradle/ -- different paths. Gradle writes deps to the custom location but the native cache mounts at the default, so deps are never cached and Maven Central returns 429 on every cold run.

Add both: cache: gradle for the default ~/.gradle (future-proofing) plus explicit GRADLE_USER_HOME/caches and /wrapper for our custom location. Belt and suspenders until GRADLE_USER_HOME is standardized.

Co-authored-by: Cursor <cursoragent@cursor.com>
Per Namespace team guidance: add maven cache mode alongside gradle to ensure Maven dependency downloads (including plugins from Maven Central) are retained in ~/.m2/repository across runs. This avoids repeated bulk downloads that trigger HTTP 429 rate limiting from repo.maven.apache.org on cold-cache builds. Even after Namespace ships their in-house Maven mirrors, this local caching remains beneficial as it skips downloads entirely. Co-authored-by: Cursor <cursoragent@cursor.com>
…esting Co-authored-by: Cursor <cursoragent@cursor.com>
…testing Co-authored-by: Cursor <cursoragent@cursor.com>
…te limits Co-authored-by: Cursor <cursoragent@cursor.com>
…id E2E on namespace

Mount the same cache volume paths (Gradle, Maven, Yarn, node_modules, apk-cache) in E2E shards so the post-step commit preserves the build job's cached data instead of overwriting it with an empty state.

Limit Android E2E on namespace to SmokeAccounts (1 shard) to validate cache volume persistence end-to-end before enabling the full matrix. iOS E2E remains skipped on namespace.

Co-authored-by: Cursor <cursoragent@cursor.com>
Only mount absolute Gradle paths in the E2E shard so its post-step commit does not overwrite the build job's heavier node_modules/yarn cache with a lighter fresh-install version. Co-authored-by: Cursor <cursoragent@cursor.com>
- Guard Namespace APK cache check with force-builds override so force-builds label/tag triggers a fresh build on namespace too.
- Remove hardcoded phase5/cache-and-artifacts branch from Gradle remote cache write policy. Only main, release/*, and stable/* branches can push.
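As a rough illustration of the branch-based write policy, the selection could be expressed as a small shell function. The name `push_flag_for_branch` is hypothetical, and this sketch only looks at the branch name; how the real workflow detects fork PRs is not shown in this PR text and is not modeled here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: map a branch name to the remote-cache push flag.
# Only main, release/* and stable/* may write; everything else is read-only.
push_flag_for_branch() {
  case "$1" in
    main|release/*|stable/*) echo "--push=true" ;;
    *)                       echo "--push=false" ;;
  esac
}
```

A step might call it as `push_flag_for_branch "$GITHUB_REF_NAME"` and hand the result to the Gradle remote cache setup. The exact `case` match on `main` means look-alike branch names such as `mainline` stay read-only.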
Match find-reusable-build: empty fingerprint omits source identity from the marker; do not run Namespace APK cache hit logic in that case. Co-authored-by: Cursor <cursoragent@cursor.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 7ad2e5d.
- Skip APK cache marker recording when source-fingerprint is empty to avoid writing markers that can never be matched.
- Guard iOS .metamask actions/cache restore with runner_provider != namespace on both native-build and reuse-hit paths to prevent conflict with nscloud-cache-action symlinks on macOS.
🔍 Smart E2E Test Selection
⏭️ Smart E2E selection skipped - skip-smart-e2e-selection label found. All E2E tests pre-selected.
# in sync with the length of matrix.shard
- run: yarn test:unit --shard=${{ matrix.shard }}/10 --forceExit --silent --coverageReporters=json --json --outputFile=tests/results/unit-test-results-${{ matrix.shard }}.json
# Namespace Linux: cap Jest workers to reduce cgroup OOM SIGKILL without tuning heap.
- run: yarn test:unit --shard=${{ matrix.shard }}/10${{ inputs.runner_provider == 'namespace' && ' --maxWorkers=50%' || '' }} --forceExit --silent --coverageReporters=json --json --outputFile=tests/results/unit-test-results-${{ matrix.shard }}.json
--maxWorkers=50%
do we know the runtime impact of this on namespace CI runs?




Description
INFRA-3597 Phase 5 — Cache and Artifact Architecture for Namespace runner migration. Replaces fragile caches without changing build-output contracts, covering all cache families across Android and iOS builds and E2E tests.
Changes:
Android Cache Architecture
- cache: gradle + cache: maven via nscloud-cache-action in build-android-e2e.yml and run-e2e-workflow.yml
- nsc cache gradle setup with branch-based write policy (--push=false for PR/fork branches)
- $GRADLE_USER_HOME/apk-cache/: cache hit skips full build
- nscloud-cache-action in all relevant workflows
- metamask-android-build cache volume (single tag per Namespace recommendation for best convergence)

iOS Cache Architecture
- cache: cocoapods in build-ios-e2e.yml and run-e2e-workflow.yml
- cache: xcode in build-ios-e2e.yml (replaces cirruslabs/cache on Namespace)
- ~/Library/Detox path in iOS E2E nscloud-cache-action
- node_modules, ios/vendor/bundle, ~/.cocoapods/repos excluded from explicit cache paths: on macOS, nscloud-cache-action uses symlinks, which break Xcode ScanDependencies and Ruby/Bundler require chains

Cache Write Policy
- main, release/*, stable/* can push (--push=true); PR/fork branches read-only (--push=false)

Infrastructure Fixes
- Skip actions/cache steps in setup-e2e-env when runner_provider == 'namespace' (Android system image, Yarn, Bundler, CocoaPods specs)
- /opt/android-sdk/system-images/... from nscloud-cache-action paths (pre-baked in Dockerfile base image, permission denied on bind-mount)
- --maxWorkers=50% on Namespace unit shards to reduce OOM SIGKILL risk

Rollback Safety
- inputs.runner_provider == 'namespace'
- runner_provider=current path unchanged and validated

Acceptance Criteria Status
- .metamask converted
- node_modules tarball preserved
- fail-on-cache-miss not removed
- nscloud-checkout-action not adopted (INFRA-3628)

Validation Runs
Changelog
CHANGELOG entry: null
Related issues
Fixes: INFRA-3597 (parent epic INFRA-3511)
Manual testing steps
- ci.yml with runner_provider=namespace: all builds and E2E tests should pass (except known flakes)
- ci.yml with runner_provider=current: confirms existing Cirrus/GitHub runner path is unaffected

Screenshots/Recordings
N/A — CI infrastructure PR.
Note
Medium Risk
Touches CI build/test gating and caching behavior across Android/iOS and E2E workflows; misconfiguration could cause cache poisoning/misses or skipped builds that break downstream tests.
Overview
Introduces Namespace-runner cache configuration via namespacelabs/nscloud-cache-action for Android (Gradle/Maven + apk-cache) and iOS (CocoaPods/Xcode + Detox cache in E2E runner), and skips redundant actions/cache restores/saves in setup-e2e-env when runner_provider == 'namespace'.

Adds a Namespace-only Android APK fingerprint cache (marker + stored APKs under ${GRADLE_USER_HOME}/apk-cache) that can short-circuit the native build path, plus Namespace Gradle remote build cache setup with branch-based push policy.

Adjusts CI stability on Namespace Linux by appending --maxWorkers=50% to sharded Jest unit runs to reduce OOM kills.

Reviewed by Cursor Bugbot for commit 9fc9d14. Bugbot is set up for automated code reviews on this repo.