feat(infra): install bwrap AppArmor profile + bubblewrap on runner EC2 #725
Draft
G4614 wants to merge 1 commit into
The runner AMI is Ubuntu 24.04 (noble), which ships kernel.apparmor_restrict_unprivileged_userns=1 but does NOT ship the `bwrap-userns-restrict` AppArmor profile that Ubuntu 25.04+ includes. Without that profile, bwrap (used by BoxLite's jailer for sandbox isolation, the SecurityOptions::strict path) is DENIED the userns capability and every box fails with "Timeout waiting for guest ready / VM subprocess exited before guest became ready".

The runner user-data now does two things to support BoxLite's security option on this host:

1. apt-get install bubblewrap so /usr/bin/bwrap exists (the AppArmor profile is scoped to that path; bundled bwrap from bubblewrap-sys would land at an arbitrary cache path that the profile can't match).
2. Write /etc/apparmor.d/bwrap-userns-restrict with the same profile Ubuntu 25.04+ ships, then apparmor_parser -r to load it.

The kernel restriction stays on globally; only /usr/bin/bwrap gets the userns + capability allowances it needs.

Reference: docs/faq.md "Ubuntu 24.04: Timeout waiting for guest ready" §Fix A (Option A: targeted, recommended).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
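The two user-data steps can be sketched as a small TypeScript helper that emits the shell fragment. This is a minimal sketch, not the code in `sst.config.ts`: the function name `bwrapAppArmorUserData` is hypothetical, and the profile text is deliberately taken as a parameter (copied from an Ubuntu 25.04+ host) rather than reproduced here.

```typescript
// Hypothetical helper; the real buildRunnerUserData() is not shown on this page.
// profileText is assumed to be the bwrap-userns-restrict profile copied
// verbatim from an Ubuntu 25.04+ host.
function bwrapAppArmorUserData(profileText: string): string {
  const profilePath = "/etc/apparmor.d/bwrap-userns-restrict";
  return [
    "# Install the distro bwrap so the binary lives at /usr/bin/bwrap,",
    "# the path the AppArmor profile is scoped to.",
    "apt-get install -y bubblewrap",
    "",
    "# Write the profile, then (re)load it.",
    // Quoted heredoc delimiter: the shell must not expand anything in the profile.
    `tee ${profilePath} >/dev/null <<'APPARMOR_EOF'`,
    profileText.trimEnd(),
    "APPARMOR_EOF",
    `apparmor_parser -r ${profilePath}`,
  ].join("\n");
}
```

Both steps are safe to repeat: `apt-get install` is a no-op when the package is already present, and `apparmor_parser -r` replaces a profile that is already loaded.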
DorianZheng (Member) reviewed on Jun 10, 2026:

> `apt-get install -y /tmp/mount-s3.deb`
> `rm -f /tmp/mount-s3.deb`
> `# Ubuntu 24.04 ships kernel.apparmor_restrict_unprivileged_userns=1 but does NOT`

Do we need to install this profile when we install bubblewrap from apt?
DorianZheng added a commit that referenced this pull request on Jun 10, 2026:
…follow-up) (#726)

## What

Regenerates the committed API clients against the post-#715 (merged A2 + MVP) API surface. This is the follow-up that #715's merge commit explicitly deferred:

> ⚠️ **CI will be red until generated clients are regenerated** against the merged API surface … **Generated clients now carry ZERO diff in this PR** (reset to main in `f9ea0730`); regenerate upstream against the merged API surface.

Since that merge, the **API client drift** check fails on every PR touching `apps/**` (e.g. #725's run 8 minutes after the merge). This PR turns it green again.

## Content

**Commit 2: the regen (`apps/libs/api-client`, `apps/api-client-go`, 231 files).** Pure `openapi-generator` 7.23.0 output, zero hand edits, produced with the exact `api-client-drift.yml` recipe (pinned generator via `openapitools.json`, NestJS spec boot with local Redis, GNU sed for the postprocess script). `analytics-api-client` and `toolbox-api-client` regenerated to **zero diff** (already current since #721/#723).

Surface delta (mirrors the A2 + MVP API changes):

- **removed:** snapshots / docker-registry / build / backup / archive-lifecycle / quota / usage-overview endpoints and models; `BoxState` build states (`pending_build`, `build_failed`, …); `write:snapshots` + `delete:snapshots` permission values; `listBoxesPaginated`'s `snapshots` filter param
- **added:** `SystemRole`, `UpdateOrganizationName` (+ `PATCH /organizations/{organizationId}/name`), admin overview/observability models

**Commit 1: prek lint unblock (34 deleted lines, dashboard).** The Sandbox→Box rename left `LEGACY_*` route enum members byte-identical to the canonical ones: 4 pre-existing `@typescript-eslint/no-duplicate-enum-values` errors at HEAD that fail the repo's prek pre-commit hook (`make lint:fix`) for *every* local commit. The legacy routes are unreachable (identical paths, canonical registrations precede them), so this deletes them plus the orphaned `LegacyBoxRedirect`. No behavior change. Included here because nothing can be committed locally until it lands.

## Verification

- `go build ./...` passes in `apps/api-client-go` (standalone), `apps/common-go`, `apps/otel-collector/exporter`.
- The **API client drift** check on this PR is the canonical byte-for-byte proof.

## Known follow-up (intentionally split)

Per review preference, this PR is generated code only. Three consumers still reference removed APIs and will not compile against the new clients until the prepared follow-up PR lands (branched on top of this one):

- `apps/cli`: Dockerfile-build flow (`CreateBuildInfo`, `BOXSTATE_BUILD_FAILED`/`PENDING_BUILD`, `--dockerfile`/`--context`, MCP `buildInfo` arg, `pkg/minio`)
- `apps/dashboard`: Registries page + registry hooks, usage-overview wiring in Spending/Limits, `templates` filter arg
- `apps/libs/sdk-typescript`: `Box.buildInfo`/`backupState`/`backupCreatedAt`, `getBuildLogsUrl`

No CI workflow compiles these consumers on PR today (the drift check is the only `apps/**` gate), so this PR is green-mergeable; the follow-up restores local builds. Note that `apps/runner` has a **pre-existing** unrelated compile failure on main (`boxlite.WithPort` undefined in `pkg/boxlite`), which is out of scope here.
## Bug

The runner EC2 AMI is Ubuntu 24.04 (noble). Ubuntu 24.04 ships `kernel.apparmor_restrict_unprivileged_userns=1` but does NOT ship the `bwrap-userns-restrict` AppArmor profile that Ubuntu 25.04+ includes. Without that profile, bwrap (used by BoxLite's jailer for sandbox isolation, `SecurityOptions::strict`) is DENIED the userns capability and every box fails with the "Timeout waiting for guest ready" error.

So BoxLite's security option (`SecurityOptions::strict`, default for prod) is unusable on every freshly-provisioned runner: boxes only start if you use `SecurityOptions::development()` (disable jailer) or manually patch the host.

## Fix

`apps/infra/sst.config.ts` user-data: two additions to `buildRunnerUserData()`:

- `apt-get install -y bubblewrap` so `/usr/bin/bwrap` exists. The AppArmor profile is scoped to that exact path; the runner's fallback bundled-bwrap (from `bubblewrap-sys`) extracts to an arbitrary cache path that the profile can't match.
- Write `/etc/apparmor.d/bwrap-userns-restrict` with the same profile Ubuntu 25.04+ ships (`bwrap` + `unpriv_bwrap`), then `apparmor_parser -r` to load it.

The global kernel restriction (`apparmor_restrict_unprivileged_userns=1`) stays on; only `/usr/bin/bwrap` gets the userns + capability allowances it needs. This is `docs/faq.md` "Ubuntu 24.04: Timeout waiting for guest ready" §Fix A (the targeted, recommended option).

Why not the alternatives:

- `sysctl -w kernel.apparmor_restrict_unprivileged_userns=0` (FAQ Fix B): turns the kernel restriction off for the whole host instead of only for `/usr/bin/bwrap`.
- `SecurityOptions::development()` (FAQ Fix C): disables the jailer.

## Test plan: manual deploy verification

Can't unit-test cloud-init from the repo. Verify on next `pulumi up`:

- `pulumi up` on a stack with `RUNNERS=1` (or more) brings up Runner EC2(s) with the new user-data.
- `ssh runner` → `cat /var/log/runner-setup.log` shows the AppArmor profile write + `apparmor_parser -r` exit 0.
- `aa-status | grep bwrap` shows `bwrap` (the new profile) loaded.
- Start a box (`strict` security profile). It reaches "guest ready" without the 30s timeout.
- `dmesg | grep apparmor | grep bwrap` no longer shows `DENIED` lines for `comm="bwrap" capability=8`.

The user-data only re-runs on instance replacement; the Runner's `ignoreChanges: ["userDataBase64"]` is intentional (in-place upgrades go through `scripts/deploy/runner-update-binary.sh`). So this change lands on the NEXT runner replacement. For existing runners, the same AppArmor profile + bwrap install must be applied via SSM Run Command. Suggested follow-up: a one-shot SSM doc that idempotently applies the same `tee` + `apparmor_parser` block on existing runners.

🤖 Generated with Claude Code
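The `aa-status` and `dmesg` checks in the test plan could be scripted once their output has been captured from the runner. The sketch below is illustrative only: both function names are hypothetical, and it assumes `aa-status` prints one loaded profile name per indented line and that AppArmor denials appear in the kernel log with `apparmor="DENIED"`.

```typescript
// Hypothetical helpers; inputs are the raw stdout of `aa-status` and `dmesg`
// captured on the runner (for example over SSH).

// True when a profile named exactly "bwrap" appears in the aa-status listing.
function bwrapProfileLoaded(aaStatusOutput: string): boolean {
  return aaStatusOutput
    .split("\n")
    .some((line) => line.trim().split(/\s+/)[0] === "bwrap");
}

// Returns the kernel log lines where AppArmor denied something to bwrap.
// An empty result corresponds to the last test-plan item passing.
function bwrapDenials(dmesgOutput: string): string[] {
  return dmesgOutput
    .split("\n")
    .filter((line) => line.includes('apparmor="DENIED"') && line.includes('comm="bwrap"'));
}
```

A deploy check would pass when `bwrapProfileLoaded(...)` is true and `bwrapDenials(...)` is empty for log lines written after the profile was loaded; older denials remain in the kernel ring buffer until it rotates.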