Commit a6aa483

fix: route crabbox proof through brokered aws
1 parent b21d3e5 commit a6aa483

4 files changed

Lines changed: 128 additions & 75 deletions

.agents/skills/crabbox/SKILL.md

Lines changed: 62 additions & 63 deletions
@@ -1,6 +1,6 @@
 ---
 name: crabbox
-description: Use Crabbox for OpenClaw remote validation across Linux, macOS, Windows, and WSL2. Default to Blacksmith Testbox for broad Linux proof; includes direct Blacksmith and owned AWS/Hetzner fallback notes when Crabbox fails.
+description: Use Crabbox for OpenClaw remote validation across Linux, macOS, Windows, and WSL2. Default to the repo Crabbox config, use brokered AWS for normal broad proof, and keep Blacksmith Testbox as an explicit opt-in or outage diagnostic path.
 ---
 
 # Crabbox
@@ -9,9 +9,15 @@ Use Crabbox when OpenClaw needs remote Linux proof for broad tests, CI-parity
 checks, secrets, hosted services, Docker/E2E/package lanes, warmed reusable
 boxes, sync timing, logs/results, cache inspection, or lease cleanup.
 
-Default backend: `blacksmith-testbox`. The separate `blacksmith-testbox` skill
-has been removed; this skill owns both the normal Crabbox path and the direct
-Blacksmith fallback playbook.
+Default backend: the repo `.crabbox.yaml`, currently brokered AWS. Do not
+override it to Blacksmith unless the user explicitly asks for Blacksmith proof,
+the task is specifically about Testbox behavior, or AWS/brokered Crabbox is the
+broken layer.
+
+Blacksmith Testbox is a delegated fallback, not the default router. If a
+Blacksmith run queues, fails capacity, fails auth, or cannot allocate, stop
+after one real attempt and switch to the repo default or report the blocker.
+Do not retry Blacksmith in a loop.
 
 ## First Checks
 
@@ -28,9 +34,10 @@ pnpm crabbox:run -- --help | sed -n '1,120p'
 
 - OpenClaw scripts prefer `../crabbox/bin/crabbox` when present. The user PATH
 shim can be stale.
-- Check `.crabbox.yaml` for repo defaults, but override provider explicitly.
-Even if config still says AWS, maintainer validation should normally pass
-`--provider blacksmith-testbox`.
+- Check `.crabbox.yaml` for repo defaults and honor them. For normal Linux
+validation, omit `--provider` so the wrapper uses brokered AWS.
+- Pass `--provider blacksmith-testbox` only for explicit Blacksmith/Testbox
+work or a deliberate comparison.
 - If a warm direct-provider lease smells stale, retry with `--full-resync`
 (alias `--fresh-sync`) before replacing the lease. This resets the remote
 workdir, skips the fingerprint fast path, reseeds Git when possible, and
@@ -54,7 +61,8 @@ pnpm crabbox:run -- --help | sed -n '1,120p'
 ## macOS And Windows Targets
 
 Use these only when the task needs an existing non-Linux host. OpenClaw broad
-validation still defaults to `blacksmith-testbox`.
+Linux validation uses the repo Crabbox config unless a provider is explicitly
+requested.
 
 Crabbox supports static SSH targets:
 
@@ -75,7 +83,7 @@ Crabbox supports static SSH targets:
 with `../crabbox/bin/crabbox run --help`, config/flag tests, and the Crabbox
 Go test suite.
 
-## Default Blacksmith Backend
+## Default Brokered AWS Backend
 
 Use this for `pnpm check`, `pnpm check:changed`, `pnpm test`,
 `pnpm test:changed`, Docker/E2E/live/package gates, or anything likely to fan
@@ -84,11 +92,7 @@ out across many Vitest projects.
 Changed gate:
 
 ```sh
-pnpm crabbox:run -- --provider blacksmith-testbox \
---blacksmith-org openclaw \
---blacksmith-workflow .github/workflows/ci-check-testbox.yml \
---blacksmith-job check \
---blacksmith-ref main \
+pnpm crabbox:run -- \
 --idle-timeout 90m \
 --ttl 240m \
 --timing-json \
@@ -99,11 +103,7 @@ pnpm crabbox:run -- --provider blacksmith-testbox \
 Full suite:
 
 ```sh
-pnpm crabbox:run -- --provider blacksmith-testbox \
---blacksmith-org openclaw \
---blacksmith-workflow .github/workflows/ci-check-testbox.yml \
---blacksmith-job check \
---blacksmith-ref main \
+pnpm crabbox:run -- \
 --idle-timeout 90m \
 --ttl 240m \
 --timing-json \
@@ -114,11 +114,7 @@ pnpm crabbox:run -- --provider blacksmith-testbox \
 Focused rerun:
 
 ```sh
-pnpm crabbox:run -- --provider blacksmith-testbox \
---blacksmith-org openclaw \
---blacksmith-workflow .github/workflows/ci-check-testbox.yml \
---blacksmith-job check \
---blacksmith-ref main \
+pnpm crabbox:run -- \
 --idle-timeout 90m \
 --ttl 240m \
 --timing-json \
@@ -128,19 +124,18 @@ pnpm crabbox:run -- --provider blacksmith-testbox \
 
 Read the JSON summary. Useful fields:
 
-- `provider`: should be `blacksmith-testbox`
-- `leaseId`: `tbx_...`
-- `syncDelegated`: should be `true`
+- `provider`: should normally be `aws`
+- `leaseId`: `cbx_...`
+- `syncDelegated`: should normally be `false`
 - `commandPhases`: populated when the command prints `CRABBOX_PHASE:<name>`
 - `commandMs` / `totalMs`
 - `exitCode`
 
-Crabbox should stop one-shot Blacksmith Testboxes automatically after the run.
-Verify cleanup when a run fails, is interrupted, or the command output is
-unclear:
+Crabbox should stop one-shot AWS leases automatically after the run. Verify
+cleanup when a run fails, is interrupted, or the command output is unclear:
 
 ```sh
-blacksmith testbox list
+../crabbox/bin/crabbox list --provider aws
 ```
 
 ## Observability Flags
@@ -331,13 +326,13 @@ Interactive CLI/onboarding:
 
 ## Reuse And Keepalive
 
-For most Blacksmith-backed Crabbox calls, one-shot is enough. Use reuse only
-when you need multiple manual commands on the same hydrated box.
+For most Crabbox calls, one-shot is enough. Use reuse only when you need
+multiple manual commands on the same hydrated box.
 
 If Crabbox returns a reusable id or you intentionally keep a lease:
 
 ```sh
-pnpm crabbox:run -- --provider blacksmith-testbox --id <tbx_id> --no-sync --timing-json --shell -- "pnpm test <path>"
+pnpm crabbox:run -- --id <cbx_id-or-slug> --no-sync --timing-json --shell -- "pnpm test <path>"
 ```
 
 Stop boxes you created before handoff:
@@ -386,14 +381,16 @@ WebVNC portal, and opens the portal. Keep browsers windowed for human QA; use
 ## If Crabbox Fails
 
 Keep the fallback narrow. First decide whether the failure is Crabbox itself,
-Blacksmith/Testbox, repo hydration, sync, or the test command.
+the brokered AWS lease, Blacksmith/Testbox, repo hydration, sync, or the test
+command.
 
 Fast checks:
 
 ```sh
 command -v crabbox
 ../crabbox/bin/crabbox --version
-crabbox run --provider blacksmith-testbox --help | sed -n '1,140p'
+pnpm crabbox:run -- --help | sed -n '1,140p'
+../crabbox/bin/crabbox doctor
 command -v blacksmith
 blacksmith --version
 blacksmith testbox list
@@ -403,36 +400,36 @@ Common Crabbox-only failures:
 
 - Provider missing or old CLI: use `../crabbox/bin/crabbox` from the sibling
 repo, or update/install Crabbox before retrying.
-- Bad local config: pass `--provider blacksmith-testbox` plus explicit
-`--blacksmith-*` flags instead of relying on `.crabbox.yaml`.
-- Slug/claim confusion: use the raw `tbx_...` id, or run one-shot without
-`--id`.
+- Bad local config: inspect `.crabbox.yaml`, `crabbox config show`, and
+`crabbox whoami`; normal OpenClaw proof should use brokered AWS without
+asking for cloud keys.
+- Slug/claim confusion: use the raw `cbx_...` / `tbx_...` id, or run one-shot
+without `--id`.
 - Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the
 printed Actions URL. Large sync warnings now include top source directories
 by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect
 those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH
 timeouts now print `next_action=` hints; follow them, usually `--full-resync`
 first and a fresh lease second.
-- Cleanup uncertainty: run `blacksmith testbox list` and stop only boxes you
+- Cleanup uncertainty: run `crabbox list --provider aws`; for explicit
+Blacksmith runs, use `blacksmith testbox list` and stop only boxes you
 created.
-- Testbox queued/capacity pressure: do not convert a broad changed gate or full
-suite into local `OPENCLAW_LOCAL_CHECK_MODE=throttled pnpm ...`. Leave the
-remote lane queued, switch to a narrower targeted local check, or stop and
-report the capacity blocker.
+- Testbox queued/capacity pressure: do not retry Blacksmith repeatedly. Rerun
+once without `--provider` so `.crabbox.yaml` routes to brokered AWS, or report
+the Blacksmith blocker if Testbox itself is the requested proof.
 
-If Crabbox cannot dispatch, sync, attach, or stop but Blacksmith itself works,
-first try the same command through the repo wrapper with `--debug` and
-`--timing-json`:
+If brokered AWS cannot dispatch, sync, attach, or stop, retry once with
+`--debug` and `--timing-json`:
 
 ```sh
-pnpm crabbox:run -- --provider blacksmith-testbox --debug --timing-json -- \
+pnpm crabbox:run -- --debug --timing-json -- \
 CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed
 ```
 
 Full suite:
 
 ```sh
-pnpm crabbox:run -- --provider blacksmith-testbox --debug --timing-json -- \
+pnpm crabbox:run -- --debug --timing-json -- \
 CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test
 ```
 
@@ -451,9 +448,10 @@ Raw Blacksmith footguns:
 - Treat `blacksmith testbox list` as cleanup diagnostics, not a shared reusable
 queue.
 
-Escalate to owned AWS/Hetzner only when Blacksmith is down, quota-limited,
-missing the needed environment, or owned capacity is the explicit goal. Use the
-Owned Cloud Fallback section below.
+Use Blacksmith only when the task is specifically about Testbox, brokered AWS
+is unavailable, or an explicit comparison is needed. If Blacksmith is down or
+quota-limited, do not keep probing it; stay on brokered AWS and note the
+delegated-provider outage.
 
 ## Blacksmith Backend Notes
 
@@ -489,13 +487,14 @@ Important Blacksmith footguns:
 blacksmith auth login --non-interactive --organization openclaw
 ```
 
-## Owned Cloud Fallback
+## Brokered AWS
 
-Use AWS/Hetzner only when Blacksmith is down, quota-limited, missing the needed
-environment, or owned capacity is explicitly the goal.
+Use AWS for normal OpenClaw remote proof. The repo `.crabbox.yaml` already
+selects brokered AWS, so omit `--provider` unless you are testing a different
+provider deliberately.
 
 ```sh
-pnpm crabbox:warmup -- --provider aws --class beast --market on-demand --idle-timeout 90m
+pnpm crabbox:warmup -- --class beast --market on-demand --idle-timeout 90m
 pnpm crabbox:hydrate -- --id <cbx_id-or-slug>
 pnpm crabbox:run -- --id <cbx_id-or-slug> --timing-json --shell -- "env NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
 pnpm crabbox:stop -- <cbx_id-or-slug>
@@ -519,8 +518,8 @@ crabbox whoami
 - If broker auth is missing, run `crabbox login --url https://crabbox.openclaw.ai --provider aws`.
 - If the CLI asks for `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or AWS
 profile setup during normal OpenClaw validation, assume the agent selected
-the wrong path. Use brokered `crabbox login`, `--provider blacksmith-testbox`,
-or an existing brokered lease before asking the user for cloud credentials.
+the wrong path. Use brokered `crabbox login` or an existing brokered lease
+before asking the user for cloud credentials.
 - Ask for AWS keys only for explicit direct-provider/account administration,
 not for normal brokered OpenClaw proof.
 - Trusted automation may still use
@@ -533,8 +532,7 @@ macOS config lives at:
 ```
 
 It should include `broker.url`, `broker.token`, and usually `provider: aws`
-for owned-cloud lanes. Do not let that config override the OpenClaw default
-when Blacksmith proof is requested; pass `--provider blacksmith-testbox`.
+for OpenClaw lanes. Let that config drive normal validation.
 
 ### Interactive Desktop / WebVNC
 
@@ -572,14 +570,15 @@ Use `--market spot|on-demand` only on AWS warmup/one-shot runs.
 ## Failure Triage
 
 - Crabbox cannot find provider: verify `../crabbox/bin/crabbox --help` lists
-`blacksmith-testbox`; update Crabbox before falling back.
+the provider selected by `.crabbox.yaml`; update Crabbox before falling back.
 - Hydration stuck or failed: open the printed GitHub Actions run URL and inspect
 the hydration step.
 - Sync failed: rerun with `--debug`; check changed-file count and whether the
 checkout is dirty.
 - Command failed: rerun only the failing shard/file first. Do not rerun a full
 suite until the focused failure is understood.
-- Cleanup uncertain: `blacksmith testbox list`; stop owned `tbx_...` leases you
+- Cleanup uncertain: `crabbox list --provider aws`; for explicit Blacksmith
+runs, use `blacksmith testbox list` and stop owned `tbx_...` leases you
 created.
 - Crabbox broken but Blacksmith works: use the direct Blacksmith fallback above,
 then file/fix the Crabbox issue.
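
Reviewer sketch (not part of the commit): the "Read the JSON summary" guidance in the updated skill can be checked mechanically. The Python below is a minimal illustration that compares a `--timing-json` summary against the expectations stated for the default brokered AWS route. The field names `provider`, `leaseId`, `syncDelegated`, and `exitCode` come from the skill text; the function name, the flat JSON shape, and the sample values are assumptions made for illustration and are not Crabbox's documented output format.

```python
import json


def check_default_route_summary(summary: dict) -> list[str]:
    """Return human-readable warnings; an empty list means nothing looked off."""
    warnings = []
    # The skill says the provider should normally be `aws` on the default route.
    if summary.get("provider") != "aws":
        warnings.append(f"provider is {summary.get('provider')!r}, expected 'aws'")
    # Brokered leases are described as `cbx_...`; `tbx_...` indicates Testbox.
    lease_id = str(summary.get("leaseId", ""))
    if not lease_id.startswith("cbx_"):
        warnings.append(f"leaseId {lease_id!r} does not start with 'cbx_'")
    # Delegated sync is the Blacksmith path, so it should normally be false here.
    if summary.get("syncDelegated") is not False:
        warnings.append("syncDelegated is not false")
    if summary.get("exitCode") != 0:
        warnings.append(f"command exited with {summary.get('exitCode')!r}")
    return warnings


# Hypothetical sample; real summaries may carry more fields or a different shape.
sample = json.loads(
    '{"provider": "aws", "leaseId": "cbx_example", "syncDelegated": false, "exitCode": 0}'
)
print(check_default_route_summary(sample))
```

A non-empty result on a run that was meant to use the repo default is a hint that `--provider` was overridden somewhere, which is the situation this commit is trying to prevent.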

.agents/skills/openclaw-testing/SKILL.md

Lines changed: 10 additions & 6 deletions
@@ -23,7 +23,9 @@ Prove the touched surface first. Do not reflexively run the whole suite.
 - normal source checkout, tests only: `pnpm test:changed`
 - normal source checkout, one failing file: `pnpm test <path-or-filter> -- --reporter=verbose`
 - Codex worktree or linked/sparse checkout, one/few explicit files: `node scripts/run-vitest.mjs <path-or-filter>`
-- Codex worktree or linked/sparse checkout, changed gates or anything broad: `node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox ... --shell -- "pnpm check:changed"`
+- Codex worktree or linked/sparse checkout, changed gates or anything broad:
+`node scripts/crabbox-wrapper.mjs run ... --shell -- "pnpm check:changed"`
+and let `.crabbox.yaml` choose the provider
 - workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available)
 - docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested
 2. Reproduce narrowly before fixing.
@@ -44,11 +46,13 @@ Prove the touched surface first. Do not reflexively run the whole suite.
 `node scripts/run-vitest.mjs` for tiny local proof, `node
 scripts/crabbox-wrapper.mjs` for Testbox, and `git commit --no-verify` only
 after the relevant remote or node-wrapper proof is already clean.
-- For Blacksmith Testbox proof, use Crabbox first. `pnpm crabbox:run -- --provider
-blacksmith-testbox --timing-json -- <command...>` warms, claims, syncs, runs,
-reports, and cleans up one-shot boxes. Reuse only an id/slug created in this
-operator session; `blacksmith testbox list` is diagnostics only, not a shared
-work queue.
+- For remote proof, use Crabbox first and omit `--provider` unless a specific
+provider is being tested. The repo Crabbox config routes normal broad proof to
+brokered AWS. Blacksmith Testbox is explicit opt-in; if it queues, fails
+capacity, or cannot allocate, retry once through the default Crabbox route or
+report the Testbox blocker. Reuse only an id/slug created in this operator
+session; `blacksmith testbox list` is diagnostics only, not a shared work
+queue.
 
 ## Local Test Shortcuts
 
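
Reviewer sketch (not part of the commit): the routing rule introduced above can be written down as a small decision helper. The Python below is a hedged illustration of the policy as worded in the two skill files: omit `--provider` by default, opt in to Blacksmith only for a named reason, and never loop on a failed Blacksmith attempt. The reason labels, function names, and returned step names are hypothetical; only the `--provider blacksmith-testbox` flag itself appears in the source.

```python
from typing import Optional

# Hypothetical labels for the three override reasons the skill text accepts.
BLACKSMITH_REASONS = {"user-requested", "testbox-behavior", "aws-broken"}


def provider_args(reason: Optional[str] = None) -> list[str]:
    """Extra CLI args for the provider choice; empty means .crabbox.yaml decides."""
    if reason in BLACKSMITH_REASONS:
        return ["--provider", "blacksmith-testbox"]
    return []


def next_step(provider_flags: list[str], attempt_failed: bool, testbox_is_the_goal: bool) -> str:
    """Apply the one-attempt rule after a single real run."""
    if not attempt_failed:
        return "done"
    if provider_flags and not testbox_is_the_goal:
        # Blacksmith queued or could not allocate: fall back to the repo default once.
        return "rerun-once-without-provider"
    if provider_flags:
        # Testbox itself was the requested proof, so the failure is the finding.
        return "report-testbox-blocker"
    # Default brokered AWS route failed: follow the skill's triage steps instead.
    return "triage-default-route"


print(provider_args())                                   # default route, no override
print(next_step(provider_args("user-requested"), True, False))
```

The point of keeping the default case as an empty argument list is that the repo `.crabbox.yaml` stays the single source of truth for the provider.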
CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ Docs: https://docs.openclaw.ai
 
 ### Changes
 
+- Maintainer tooling: route Crabbox skill defaults through the repo brokered AWS config, leaving Blacksmith Testbox as an explicit opt-in instead of the broad-proof default.
 - CLI/onboarding: localize the setup wizard and bundled channel setup flows for English, Simplified Chinese, and Traditional Chinese. (#80645) Thanks @GaosCode.
 - Agents/skills: cache hydrated `resolvedSkills` across warm gateway turns while keying reuse by the redacted effective config, reducing redundant skill snapshot rebuilds without crossing config-gated skill boundaries. (#81451) Thanks @solodmd.
 