openclaw
diff --git a/.agents/skills/autoreview/SKILL.md b/.agents/skills/autoreview/SKILL.md
Lines changed: 7 additions & 0 deletions
diff --git a/.agents/skills/channel-message-flows/SKILL.md b/.agents/skills/channel-message-flows/SKILL.md
Lines changed: 44 additions & 0 deletions
diff --git a/.agents/skills/openclaw-docker-e2e-authoring/SKILL.md b/.agents/skills/openclaw-docker-e2e-authoring/SKILL.md
Lines changed: 64 additions & 0 deletions
diff --git a/.agents/skills/openclaw-mac-release/SKILL.md b/.agents/skills/openclaw-mac-release/SKILL.md
Lines changed: 95 additions & 0 deletions
diff --git a/.agents/skills/openclaw-testing/SKILL.md b/.agents/skills/openclaw-testing/SKILL.md
Lines changed: 3 additions & 1 deletion
diff --git a/.agents/skills/telegram-crabbox-e2e-proof/SKILL.md b/.agents/skills/telegram-crabbox-e2e-proof/SKILL.md
Lines changed: 24 additions & 14 deletions
@@ -136,3 +136,10 @@ Include:
 - the clean review result from the final helper/review run, or why a remaining finding was consciously rejected
 
 Do not run another Codex review solely to improve the final report wording. If the final helper run exited 0 and produced no accepted/actionable findings, report that exact run as clean.
+
+## PR / CI Closeout
+
+- Prefer direct run/job APIs after CI starts: `gh run view <run-id> --json jobs`; use PR rollup only for final mergeability.
+- After rebase, compare `origin/main..HEAD`; drop CI-fix commits already upstream before pushing.
+- For prompt snapshot CI failures, prove/generate with Linux Node 24 before rerunning the failed job.
+- Update PR body once near the final head unless proof labels are missing or stale enough to block CI.
@@ -0,0 +1,44 @@
+---
+name: channel-message-flows
+description: "Use when previewing local channel message flow fixtures."
+---
+
+# Channel Message Flows
+
+Use this from the OpenClaw repo root to send canned channel preview flows while iterating on message UX. These are real sends/edits/deletes against the configured channel target.
+
+## Telegram
+
+Native Telegram `sendMessageDraft` tool progress, then a final answer:
+
+```bash
+node --import tsx scripts/dev/channel-message-flows.ts \
+  --channel telegram \
+  --target <telegram-chat-id> \
+  --flow working-final \
+  --duration-ms 20000
+```
+
+Thinking preview, then a final answer:
+
+```bash
+node --import tsx scripts/dev/channel-message-flows.ts \
+  --channel telegram \
+  --target <telegram-chat-id> \
+  --flow thinking-final
+```
+
+## Options
+
+- `--account <accountId>`: Telegram account id when not using the default.
+- `--thread-id <id>`: Telegram forum topic/message thread id.
+- `--delay-ms <ms>`: Override preview update cadence.
+- `--duration-ms <ms>`: Simulated working duration for `working-final`.
+- `--final-text <text>`: Override the durable final message.
+
+## Notes
+
+- `--target` is the numeric Telegram chat id.
+- `working-final` exercises native Telegram `sendMessageDraft` with a static `Working` status and sample tool progress.
+- `thinking-final` exercises a formatted `Thinking` reasoning preview that clears before the final answer.
+- Only `--channel telegram` is implemented for now.
@@ -0,0 +1,64 @@
+---
+name: openclaw-docker-e2e-authoring
+description: "Author OpenClaw Docker E2E and live provider Docker lanes."
+---
+
+# OpenClaw Docker E2E Authoring
+
+Use this when adding or changing Docker E2E lanes, release-path Docker tests,
+or live-provider Docker proof.
+
+## Lane Choice
+
+- Deterministic Docker: fake the dependency/server and assert the exact runtime
+  contract crossing the boundary.
+- Live Docker: use real provider credentials/model only when user-visible
+  behavior needs the real service.
+- Prefer both when they prove different risks: deterministic for byte/payload
+  routing, live for actual provider behavior.
+
+## Authoring Rules
+
+- Test-only helpers live in `test/helpers` or `scripts/e2e/lib/<lane>/`, not
+  `src/**`, unless production imports them.
+- Package-installed app runs from `/app`; mount only explicit harness/helper
+  paths read-only.
+- Fake servers should log boundary requests as JSONL and clients should assert
+  the real dependency payload, not just process success.
+- Add the package script and `scripts/lib/docker-e2e-scenarios.mjs` lane in the
+  same change.
+- If a lane installs a plugin from npm, default the spec via env so published
+  and local override paths are both testable.
+
+## Media And Vision
+
+- Expected answer must exist only in pixels or provider output being tested.
+- Use neutral filenames, neutral prompts, and no metadata leaks.
+- Random bitmap/OCR tokens reuse the repo OCR-safe alphabet `24567ACEF` unless
+  the test owns a stronger glyph set.
+- Make the expected answer unique per run when proving real image
+  understanding.
+
+## `chat.send` E2E
+
+- Require `chat.send` to return `status: "started"` and a string `runId`.
+- Wait for completion with `agent.wait`.
+- Assert final user-visible text via `chat.history` when event ordering is not
+  the behavior under test.
+- Keep originating channel/account metadata only when the bug path needs queued
+  inbound/channel context.
+
+## Verification
+
+Run the smallest proof that covers the touched lane:
+
+```bash
+pnpm exec oxfmt --write <changed files>
+node --check <new .mjs files>
+bash -n <new .sh files>
+node scripts/run-vitest.mjs test/scripts/docker-e2e-plan.test.ts
+OPENCLAW_SKIP_DOCKER_BUILD=1 pnpm test:docker:<lane>
+```
+
+For real-provider lanes, run the matching live Docker script after deterministic
+Docker is green. Finish with `$autoreview` before commit/PR.
@@ -0,0 +1,95 @@
+---
+name: openclaw-mac-release
+description: "Run or recover OpenClaw macOS release signing, notarization, appcast, and asset promotion."
+---
+
+# OpenClaw Mac Release
+
+Use with `$openclaw-release-maintainer`, `$openclaw-release-ci`, and `$one-password` when stable macOS assets, private mac preflight, notarization, appcast promotion, or mac release recovery is involved.
+
+## Credentials
+
+- Canonical ASC item: vault `Molty`, title `API Key - App Store Connect - Personal - Release`.
+- Fields: `private_key_p8`, `key_id`, `issuer_id`.
+- Current known good key id: `AKVLXW849T`.
+- Legacy mirror: vault `Private`, title `API Key - App Store Connect - Personal`; keep it synced for older refs.
+- Stale/revoked key symptom: `xcrun notarytool submit` fails with `HTTP status code: 401. Unauthenticated`.
+- Validate candidate ASC credentials with `xcrun notarytool history` before setting GitHub secrets.
+
+## 1Password
+
+- Use `$one-password`: all `op` work inside one persistent tmux session, no secret output.
+- Prefer `OP_SERVICE_ACCOUNT_TOKEN` from `~/.profile` for Molty reads.
+- Do not assume `MOLTY_OP_SERVICE_ACCOUNT_TOKEN` is alive; it has previously pointed at a deleted service account.
+- If a service token fails, run status-only checks: token present/length and `op whoami`; never print token values.
+- If desktop app auth is needed but Touch ID is unavailable, set `OP_BIOMETRIC_UNLOCK_ENABLED=false` for the manual `op account add --signin` path.
+
+## GitHub Secrets
+
+Target private repo environment: `openclaw/releases-private`, env `mac-release`.
+
+Set only after local notary auth validation:
+
+- `APP_STORE_CONNECT_API_KEY_P8`
+- `APP_STORE_CONNECT_KEY_ID`
+- `APP_STORE_CONNECT_ISSUER_ID`
+
+Do not update these from mixed sources. All three ASC fields must come from the same 1Password item.
+
+## Workflow Shape
+
+- Public release branch may carry mac-only packaging fixes after the stable tag/npm are already live.
+- Use `source_ref=release/YYYY.M.D` for private mac preflight/validation when building that branch variation.
+- Keep `tag=vYYYY.M.D` pointing at the original stable release commit.
+- Real mac publish must reuse:
+  - a successful private mac preflight run for the same tag/source SHA
+  - a successful private mac validation run for the same tag/source SHA
+- If preflight source SHA differs from tag SHA, validation must also use the same `source_ref`; promotion rejects mismatched proof.
+
+## Notarization
+
+- OpenClaw uses `scripts/notarize-mac-artifact.sh`.
+- `xcrun notarytool submit` should use `--no-s3-acceleration`; accelerated upload can surface misleading 401s even when `notarytool history` succeeds.
+- If signing succeeds but notarization fails immediately with 401, check ASC key freshness first.
+- If notarization stays in progress for several minutes after key-file write, that is normal Apple wait time; do not edit blindly.
+
+## Dispatch
+
+Private preflight:
+
+```bash
+gh workflow run openclaw-macos-publish.yml --repo openclaw/releases-private --ref main \
+  -f tag=vYYYY.M.D \
+  -f source_ref=release/YYYY.M.D \
+  -f preflight_only=true \
+  -f smoke_test_only=false \
+  -f allow_late_calver_recovery=false \
+  -f public_release_branch=release/YYYY.M.D
+```
+
+Private validation for a branch-variation preflight:
+
+```bash
+gh workflow run openclaw-macos-validate.yml --repo openclaw/releases-private --ref main \
+  -f tag=vYYYY.M.D \
+  -f source_ref=release/YYYY.M.D
+```
+
+Real publish:
+
+```bash
+gh workflow run openclaw-macos-publish.yml --repo openclaw/releases-private --ref main \
+  -f tag=vYYYY.M.D \
+  -f preflight_only=false \
+  -f smoke_test_only=false \
+  -f preflight_run_id=<successful-preflight-run> \
+  -f validate_run_id=<successful-validation-run> \
+  -f allow_late_calver_recovery=false \
+  -f public_release_branch=release/YYYY.M.D
+```
+
+## Verify
+
+- `gh release view vYYYY.M.D --repo openclaw/openclaw` lists the zip, dmg, and dSYM zip assets, and the release is neither a draft nor a prerelease.
+- Public `main` `appcast.xml` points at `OpenClaw-YYYY.M.D.zip`.
+- Appcast entry has `sparkle:version`, `sparkle:shortVersionString`, length, and `sparkle:edSignature`.
@@ -27,7 +27,7 @@ Prove the touched surface first. Do not reflexively run the whole suite.
      use the Crabbox wrapper with the provider that matches the proof surface.
      For maintainer heavy `pnpm` gates, that is usually delegated Blacksmith
      Testbox through Crabbox, e.g. `node scripts/crabbox-wrapper.mjs run
-     --provider blacksmith-testbox ... -- pnpm check:changed`. For direct AWS
+--provider blacksmith-testbox ... -- pnpm check:changed`. For direct AWS
      Crabbox proof, omit `--provider` and let `.crabbox.yaml` choose AWS.
    - workflow-only: `git diff --check`, workflow syntax/lint (`actionlint` when available)
    - docs-only: `pnpm docs:list`, docs formatter/lint only if docs tooling changed or requested
@@ -131,6 +131,8 @@ gh run view <run-id> --job <job-id> --log
 - Check exact SHA. Ignore newer unrelated `main` unless asked.
 - For cancelled same-branch runs, confirm whether a newer run superseded it.
 - Fetch full logs only for failed or relevant jobs.
+- Prefer `gh run view <run-id> --json jobs` over PR rollup while debugging; rollup can be stale/noisy.
+- For `prompt:snapshots:check` failures, treat Linux Node 24 as CI truth. If macOS passes but CI drifts, reproduce in a Linux Node 24 container or Testbox, commit that generated output, then rerun.
 
 ## GitHub Release Workflows
 
 
@@ -17,7 +17,8 @@ artifact bundle. The runner leases the shared burner account from Convex.
 Run from the OpenClaw repo and branch under test:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- start \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" start \
   --tdlib-url http://artifacts.openclaw.ai/tdlib-v1.8.0-linux-x64.tgz \
   --output-dir .artifacts/qa-e2e/telegram-user-crabbox/pr-review
 ```
@@ -39,7 +40,8 @@ For deterministic visual repros, put the exact mock-model reply in a file and
 pass it to `start`:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- start \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" start \
   --tdlib-url http://artifacts.openclaw.ai/tdlib-v1.8.0-linux-x64.tgz \
   --mock-response-file .artifacts/qa-e2e/telegram-user-crabbox/reply.txt \
   --output-dir .artifacts/qa-e2e/telegram-user-crabbox/pr-review
@@ -55,29 +57,31 @@ For visual proof, first send or identify a bottom marker message, then open the
 group/topic directly by message id:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- view \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" view \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json \
   --message-id <message-id>
 ```
 
 This uses Telegram Desktop directly with `tg://privatepost`, not `xdg-open`.
 It also resizes Telegram to `650x1000` at the tested desktop position so
-Telegram switches to single-chat mode with no left chat list or right info
-pane. Do not press Escape after this; Escape can close the selected chat.
+the crop can isolate the chat pane even if Telegram keeps a split/sidebar
+layout. Do not press Escape after this; Escape can close the selected chat.
 
 Bottom behavior matters:
 
 - deep-linking to the newest message keeps Telegram pinned to the bottom, so
   later messages appear live in the recording
 - deep-linking to an older message does not auto-scroll to new arrivals; link
   again to the newest/final marker instead of clicking the down-arrow
-- `650px` is the largest tested clean width; `660px` switches Telegram back to
-  split/sidebar layout
+- the cropped GIF intentionally uses the chat pane, not the whole desktop or
+  whole Telegram window
 
 Send as the real Telegram user:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- send \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" send \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json \
   --text /status
 ```
@@ -87,7 +91,8 @@ For slash commands, omit the bot username; the runner targets the SUT bot.
 Run arbitrary commands on the Crabbox:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- run \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" run \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json \
   -- bash -lc 'source /tmp/openclaw-telegram-user-crabbox/env.sh && python3 /tmp/openclaw-telegram-user-crabbox/user-driver.py transcript --limit 20 --json'
 ```
@@ -106,14 +111,16 @@ python3 /tmp/openclaw-telegram-user-crabbox/user-driver.py probe --text '@{sut}
 Capture the current desktop without ending the session:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- screenshot \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" screenshot \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json
 ```
 
 Check lease state and get the WebVNC command:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- status \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" status \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json
 ```
 
@@ -122,7 +129,8 @@ pnpm qa:telegram-user:crabbox -- status \
 Always finish or explicitly keep the box:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- finish \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" finish \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json \
   --preview-crop telegram-window
 ```
@@ -150,7 +158,8 @@ Attach only the useful visual artifact to the PR unless logs are needed. The
 runner is GIF-only by default:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- publish \
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" publish \
   --session .artifacts/qa-e2e/telegram-user-crabbox/pr-review/session.json \
   --pr <pr-number> \
   --summary 'Telegram real-user Crabbox session motion GIF'
@@ -189,7 +198,8 @@ experiments unless those artifacts are explicitly needed.
 For a fast one-shot check, use:
 
 ```bash
-pnpm qa:telegram-user:crabbox -- --text /status
+proof_cmd="${OPENCLAW_TELEGRAM_USER_PROOF_CMD:-openclaw-telegram-user-crabbox-proof}"
+"$proof_cmd" --text /status
 ```
 
 This is a start/send/finish shortcut. Prefer the held session for PR review,