Commit 25e5c0b

docs(qa): document whatsapp live qa coverage
1 parent 7df9de1 commit 25e5c0b

1 file changed

Lines changed: 62 additions & 30 deletions

docs/concepts/qa-e2e-automation.md

@@ -48,6 +48,7 @@ script aliases; both forms are supported.
 | `qa telegram` | Live transport lane against a real private Telegram group. |
 | `qa discord` | Live transport lane against a real private Discord guild channel. |
 | `qa slack` | Live transport lane against a real private Slack channel. |
+| `qa whatsapp` | Live transport lane against real WhatsApp Web accounts. |
 | `qa mantis` | Before and after verification runner for live transport bugs, with Discord status-reactions evidence, Crabbox desktop/browser smoke, and Slack-in-VNC smoke. See [Mantis](/concepts/mantis) and [Mantis Slack Desktop Runbook](/concepts/mantis-slack-desktop-runbook). |

 ## Operator flow
@@ -168,15 +169,16 @@ decision still comes from the Discord REST oracle.

 CI uses the same command surface in `.github/workflows/qa-live-transports-convex.yml`. Scheduled and default manual runs execute the fast Matrix profile with live frontier credentials, `--fast`, and `OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS=3000`. Manual `matrix_profile=all` fans out into the five profile shards so the exhaustive catalog can run in parallel while keeping one artifact directory per shard.

-For transport-real Telegram, Discord, and Slack smoke lanes:
+For transport-real Telegram, Discord, Slack, and WhatsApp smoke lanes:

 ```bash
 pnpm openclaw qa telegram
 pnpm openclaw qa discord
 pnpm openclaw qa slack
+pnpm openclaw qa whatsapp
 ```

-They target a pre-existing real channel with two bots (driver + SUT). Required env vars, scenario lists, output artifacts, and the Convex credential pool are documented in [Telegram, Discord, and Slack QA reference](#telegram-discord-and-slack-qa-reference) below.
+They target a pre-existing real channel with two bots or accounts (driver + SUT). Required env vars, scenario lists, output artifacts, and the Convex credential pool are documented in [Telegram, Discord, Slack, and WhatsApp QA reference](#telegram-discord-slack-and-whatsapp-qa-reference) below.

 For a full Slack desktop VM run with VNC rescue, run:

@@ -276,10 +278,10 @@ coverage helpers, and scenario-selection helper from
 | Telegram | x | x | x | | | | | | | x | |
 | Discord | x | x | x | | | | | | | | x |
 | Slack | x | x | x | x | x | x | x | x | | | |
+| WhatsApp | x | x | | x | x | x | | | x | x | |

 This keeps `qa-channel` as the broad product-behavior suite while Matrix,
-Telegram, and future live transports share one explicit transport-contract
-checklist.
+Telegram, and other live transports share one explicit transport-contract checklist.

 For a disposable Linux VM lane without bringing Docker into the QA path, run:

@@ -308,25 +310,25 @@ guest: env-based provider keys, the QA live provider config path, and
 `CODEX_HOME` when present. Keep `--output-dir` under the repo root so the guest
 can write back through the mounted workspace.

-## Telegram, Discord, and Slack QA reference
+## Telegram, Discord, Slack, and WhatsApp QA reference

-Matrix has a [dedicated page](/concepts/qa-matrix) because of its scenario count and Docker-backed homeserver provisioning. Telegram, Discord, and Slack are smaller - a handful of scenarios each, no profile system, against pre-existing real channels - so their reference lives here.
+Matrix has a [dedicated page](/concepts/qa-matrix) because of its scenario count and Docker-backed homeserver provisioning. Telegram, Discord, Slack, and WhatsApp run against pre-existing real transports, so their reference lives here.

 ### Shared CLI flags

 These lanes register through `extensions/qa-lab/src/live-transports/shared/live-transport-cli.ts` and accept the same flags:

-| Flag | Default | Description |
-| ------------------------------------- | --------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
-| `--scenario <id>` | - | Run only this scenario. Repeatable. |
-| `--output-dir <path>` | `<repo>/.artifacts/qa-e2e/{telegram,discord,slack}-<timestamp>` | Where reports/summary/observed messages and the output log are written. Relative paths resolve against `--repo-root`. |
-| `--repo-root <path>` | `process.cwd()` | Repository root when invoking from a neutral cwd. |
-| `--sut-account <id>` | `sut` | Temporary account id inside the QA gateway config. |
-| `--provider-mode <mode>` | `live-frontier` | `mock-openai` or `live-frontier` (legacy `live-openai` still works). |
-| `--model <ref>` / `--alt-model <ref>` | provider default | Primary/alternate model refs. |
-| `--fast` | off | Provider fast mode where supported. |
-| `--credential-source <env\|convex>` | `env` | See [Convex credential pool](#convex-credential-pool). |
-| `--credential-role <maintainer\|ci>` | `ci` in CI, `maintainer` otherwise | Role used when `--credential-source convex`. |
+| Flag | Default | Description |
+| ------------------------------------- | -------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
+| `--scenario <id>` | - | Run only this scenario. Repeatable. |
+| `--output-dir <path>` | `<repo>/.artifacts/qa-e2e/<transport>-<timestamp>` | Where reports/summary/observed messages and the output log are written. Relative paths resolve against `--repo-root`. |
+| `--repo-root <path>` | `process.cwd()` | Repository root when invoking from a neutral cwd. |
+| `--sut-account <id>` | `sut` | Temporary account id inside the QA gateway config. |
+| `--provider-mode <mode>` | `live-frontier` | `mock-openai` or `live-frontier` (legacy `live-openai` still works). |
+| `--model <ref>` / `--alt-model <ref>` | provider default | Primary/alternate model refs. |
+| `--fast` | off | Provider fast mode where supported. |
+| `--credential-source <env\|convex>` | `env` | See [Convex credential pool](#convex-credential-pool). |
+| `--credential-role <maintainer\|ci>` | `ci` in CI, `maintainer` otherwise | Role used when `--credential-source convex`. |

 Each lane exits non-zero on any failed scenario. `--allow-failures` writes artifacts without setting a failing exit code.
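
The shared flags in the hunk above could be combined roughly as follows. This is a minimal sketch, not part of the commit: `run_lane` and `QA_RUNNER` are hypothetical helper names introduced here for illustration, and only flags named in the table (`--provider-mode`, `--scenario`, `--allow-failures`) appear in the commented invocations. The wrapper passes the lane's exit status through unchanged, so a caller can gate on the documented non-zero exit for failed scenarios.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of openclaw): QA_RUNNER and run_lane are
# illustration-only names. QA_RUNNER defaults to the documented entry point
# and can be overridden, for example with "echo", to dry-run the command line.
QA_RUNNER=${QA_RUNNER:-"pnpm openclaw"}

run_lane() {
  local transport=$1
  shift
  # Unquoted on purpose so "pnpm openclaw" splits into two words.
  # The function's exit status is the exit status of the lane itself.
  # shellcheck disable=SC2086
  $QA_RUNNER qa "$transport" "$@"
}

# Strict run: any failed scenario yields a non-zero exit status.
#   run_lane whatsapp --provider-mode mock-openai --scenario whatsapp-canary
#
# Artifact-collection run: --allow-failures keeps the exit status at zero.
#   run_lane whatsapp --scenario whatsapp-canary --allow-failures
```

Setting `QA_RUNNER=echo` before calling `run_lane` prints the assembled command instead of running it, which is a cheap way to check flag spelling before touching a live transport.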

@@ -688,22 +690,52 @@ Required env when `--credential-source env`:

 Optional:

-- `OPENCLAW_QA_WHATSAPP_GROUP_JID` enables `whatsapp-mention-gating`.
+- `OPENCLAW_QA_WHATSAPP_GROUP_JID` enables group scenarios such as
+`whatsapp-mention-gating` and `whatsapp-group-allowlist-block`.
 - `OPENCLAW_QA_WHATSAPP_CAPTURE_CONTENT=1` keeps message bodies in
 observed-message artifacts.

-Scenarios (`extensions/qa-lab/src/live-transports/whatsapp/whatsapp-live.runtime.ts`):
-
-- `whatsapp-canary`
-- `whatsapp-pairing-block`
-- `whatsapp-mention-gating`
-- `whatsapp-approval-exec-native` - opt-in native WhatsApp exec approval
-scenario. Requests an exec approval through the gateway, verifies the
-WhatsApp message has native reaction approval affordances, resolves it, and
-verifies the resolved WhatsApp follow-up.
-- `whatsapp-approval-plugin-native` - opt-in native WhatsApp plugin approval
-scenario. Enables exec and plugin approval forwarding together, then verifies
-the same pending/resolved native WhatsApp path.
+Scenario catalog (`extensions/qa-lab/src/live-transports/whatsapp/whatsapp-live.runtime.ts`):
+
+- Baseline and group gating: `whatsapp-canary`, `whatsapp-pairing-block`,
+`whatsapp-mention-gating`, `whatsapp-top-level-reply-shape`,
+`whatsapp-restart-resume`, `whatsapp-group-allowlist-block`.
+- Native commands: `whatsapp-help-command`, `whatsapp-status-command`,
+`whatsapp-commands-command`, `whatsapp-tools-compact-command`,
+`whatsapp-whoami-command`, `whatsapp-context-command`,
+`whatsapp-native-new-command`.
+- Reply and final-output behavior: `whatsapp-tool-only-usage-footer`,
+`whatsapp-reply-to-message`, `whatsapp-reply-context-isolation`,
+`whatsapp-reply-delivery-shape`, `whatsapp-stream-final-message-accounting`.
+- Inbound media and structured messages: `whatsapp-inbound-image-caption`,
+`whatsapp-audio-preflight`, `whatsapp-inbound-structured-messages`,
+`whatsapp-group-audio-gating`. These send real WhatsApp image, audio,
+document, location, contact, and sticker events through the driver.
+- Outbound Gateway and message action coverage:
+`whatsapp-outbound-media-matrix`,
+`whatsapp-outbound-document-preserves-filename`, `whatsapp-outbound-poll`,
+`whatsapp-message-actions`.
+- Access-control coverage: `whatsapp-access-control-dm-open`,
+`whatsapp-access-control-dm-disabled`, `whatsapp-access-control-group-open`,
+`whatsapp-access-control-group-disabled`, `whatsapp-group-allowlist-block`.
+- Native approvals: `whatsapp-approval-exec-deny-native`,
+`whatsapp-approval-exec-native`, `whatsapp-approval-exec-reaction-native`,
+`whatsapp-approval-plugin-native`.
+- Status reactions: `whatsapp-status-reactions`.
+
+The catalog currently contains 35 scenarios. The `live-frontier` default lane is
+kept small at 8 scenarios for fast smoke coverage. The `mock-openai` default
+lane runs 29 deterministic scenarios through the real WhatsApp transport while
+mocking only model output. Approval scenarios and a few heavier/blocking checks
+remain explicit by scenario id.
+
+The WhatsApp QA driver observes structured live events (`text`, `media`,
+`location`, `reaction`, and `poll`) and can actively send media, polls,
+contacts, locations, and stickers. QA Lab imports that driver through the
+`@openclaw/whatsapp/api.js` package surface instead of reaching into private
+WhatsApp runtime files. Message content is redacted by default. Outbound
+poll and upload-file coverage run through deterministic gateway `poll` and
+`message.action` calls instead of model-prompt-only tool invocation.

 Output artifacts:
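
As a cross-check on the WhatsApp scenario catalog added in the last hunk, the stated total of 35 can be reproduced from the ids as listed. The sketch below copies the ids group by group from the added lines; `scenario_ids`, `listed`, and `unique` are names introduced here for illustration only. `whatsapp-group-allowlist-block` appears under both "Baseline and group gating" and "Access-control coverage", so the list has one more entry than the number of distinct scenarios.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Scenario ids copied from the catalog bullets, in the order they are listed.
# Illustration only: this is not how the runtime file defines its catalog.
scenario_ids() {
  cat <<'EOF'
whatsapp-canary
whatsapp-pairing-block
whatsapp-mention-gating
whatsapp-top-level-reply-shape
whatsapp-restart-resume
whatsapp-group-allowlist-block
whatsapp-help-command
whatsapp-status-command
whatsapp-commands-command
whatsapp-tools-compact-command
whatsapp-whoami-command
whatsapp-context-command
whatsapp-native-new-command
whatsapp-tool-only-usage-footer
whatsapp-reply-to-message
whatsapp-reply-context-isolation
whatsapp-reply-delivery-shape
whatsapp-stream-final-message-accounting
whatsapp-inbound-image-caption
whatsapp-audio-preflight
whatsapp-inbound-structured-messages
whatsapp-group-audio-gating
whatsapp-outbound-media-matrix
whatsapp-outbound-document-preserves-filename
whatsapp-outbound-poll
whatsapp-message-actions
whatsapp-access-control-dm-open
whatsapp-access-control-dm-disabled
whatsapp-access-control-group-open
whatsapp-access-control-group-disabled
whatsapp-group-allowlist-block
whatsapp-approval-exec-deny-native
whatsapp-approval-exec-native
whatsapp-approval-exec-reaction-native
whatsapp-approval-plugin-native
whatsapp-status-reactions
EOF
}

# Count every listed id, then count distinct ids after de-duplication.
listed=$(scenario_ids | wc -l | tr -d ' ')
unique=$(scenario_ids | sort -u | wc -l | tr -d ' ')

echo "listed=${listed} unique=${unique}"
```

The distinct count matches the "35 scenarios" figure in the added text. The diff does not say which ids make up the 8-scenario `live-frontier` default or the 29-scenario `mock-openai` default, so those subsets are not reconstructed here.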
