The autopilot-cycle handler always ran ALL_PHASES regardless of job data. This caused production stalls when the embed phase had a large backlog (17K+ stale chunks) that exceeded the 30-minute job timeout. Every 5-min cycle would start, hit the embed wall, stall, and get force-killed — creating an infinite stall loop that kept the queue perpetually unhealthy. The fix validates job.data.phases against ALL_PHASES (preventing injection) and forwards the selected phases to runCycle(). Callers can now submit fast cycles (lint+backlinks+sync+extract) on a 5-min cron and run embed separately with a longer timeout during off-peak hours. If phases is omitted, not an array, or filters to empty, behavior is unchanged (all phases run). Tests: 4 new cases covering phase restriction, invalid name filtering, empty array fallback, and non-array type safety.
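The validation and fallback rules in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/commands/jobs.ts`: the helper name `selectPhases` is hypothetical, and the `ALL_PHASES` list here contains only the phase names mentioned in this conversation, so the real list in `src/core/cycle.ts` may differ.

```typescript
// Assumption: only the phase names mentioned in this PR; the real ALL_PHASES may hold more.
const ALL_PHASES = ["lint", "backlinks", "sync", "extract", "embed"] as const;
type Phase = (typeof ALL_PHASES)[number];

// Set lookup: only exact, known phase names can pass through.
const KNOWN_PHASES: ReadonlySet<string> = new Set<string>(ALL_PHASES);

// Hypothetical helper: turn untrusted job data into a safe phase list.
function selectPhases(raw: unknown): Phase[] {
  // Omitted or non-array input keeps the prior behavior (all phases run).
  if (!Array.isArray(raw)) return [...ALL_PHASES];
  // Drop anything that is not a known phase name.
  const valid = raw.filter(
    (p): p is Phase => typeof p === "string" && KNOWN_PHASES.has(p),
  );
  // If nothing survives filtering, fall back to all phases.
  return valid.length > 0 ? valid : [...ALL_PHASES];
}

console.log(selectPhases(["lint", "backlinks"])); // restricted run
console.log(selectPhases(["BAD"]));               // falls back to all phases
console.log(selectPhases("lint"));                // non-array, falls back to all phases
```

Falling back to the full list rather than rejecting the job keeps existing callers working unchanged, which matches the "behavior is unchanged" guarantee stated in the commit message.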
The regression guard sliced the first 500 chars after `worker.register('autopilot-cycle'`
and asserted `signal: job.signal` was present. The phase-validation block added in
787ec7d pushed the signal arg past that boundary, so CI test shard 3 failed even
though the handler still propagates the signal correctly. Bump the window to 2000.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
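To show why the 500-character window stopped matching, here is a small sketch of a source-window regression guard. The helper name `windowAfter` and the fake handler source are hypothetical; the actual test in `test/cycle-abort.test.ts` may be structured differently.

```typescript
// Hypothetical helper: return up to `size` characters of `source` that follow `marker`.
function windowAfter(source: string, marker: string, size: number): string {
  const start = source.indexOf(marker);
  if (start === -1) throw new Error(`marker not found: ${marker}`);
  const from = start + marker.length;
  return source.slice(from, from + size);
}

const marker = "worker.register('autopilot-cycle'";

// Fake handler source: about 700 characters of filler stand in for the new validation block.
const fakeSource =
  marker +
  ", async (job) => {\n" +
  "// validation\n".repeat(50) +
  "  await runCycle({ signal: job.signal });\n});";

// The narrow window ends before the signal argument; the wider one reaches it.
const foundIn500 = windowAfter(fakeSource, marker, 500).includes("signal: job.signal");
const foundIn2000 = windowAfter(fakeSource, marker, 2000).includes("signal: job.signal");
console.log({ foundIn500, foundIn2000 });
```

A fixed character window is simple but brittle: any code added between the registration call and the asserted argument can push the argument out of range, which is what happened here.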
Note autopilot-cycle phases passthrough fix on the src/commands/jobs.ts key-files annotation so future readers know the handler honors job.data.phases (validated against ALL_PHASES) as of v0.22.10.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
garrytan added a commit that referenced this pull request on Apr 30, 2026
PR #506 claims v0.22.15, PR #521 claims v0.22.10, intermediate slots (.11/.12/.13/.14) are claimed by other open PRs. v0.22.16 is the next clean PATCH slot. v0.23.0 is claimed by PR #462 so MINOR isn't free. This release fits the 0.22.x train; v0.23.0 lands when #462 ships. Updates VERSION, package.json, CHANGELOG.md header, TODOS.md follow-up labels. Code is unchanged.
garrytan added a commit that referenced this pull request on Apr 30, 2026
…arness (#522)

* feat: hermeticity migration, every $GBRAIN_HOME write site honors the env override

configDir() in src/core/config.ts already implemented $GBRAIN_HOME as a parent-dir override (returns <override>/.gbrain), but ~12 consumers built paths from os.homedir() directly and bypassed it. Critically, loadConfig/saveConfig themselves used a private getConfigDir() that ignored the env. Fixed.

Migrated every write site to gbrainPath(): fail-improve, validator-lint, cycle lock, shell-audit, backpressure-audit, sync-failures, integrity logs, integrations heartbeat, init pglite path, migrate-engine manifest, import checkpoint, v0_13_1 rollback, v0_14_0 host-work. Read-side host-detection in init.ts (~/.claude / ~/.openclaw probes) intentionally NOT migrated; that's a v1.1 follow-up under a separate $GBRAIN_HOST_HOME override.

Adds gbrainPath(...segments) sugar plus path validation: $GBRAIN_HOME must be absolute and contain no '..' segments (throws GbrainHomeInvalidError).

test/gbrain-home-isolation.test.ts proves write-isolation across all migrated sites. test/migrations-v0_14_0.test.ts updated to use $GBRAIN_HOME instead of the old HOME-swap pattern.

Closes part of the claw-test E2E harness preconditions (D13 + D21).

* feat: gbrain friction {log,render,list,summary}, agent friction reporter

Append-only JSONL writer at $GBRAIN_HOME/friction/<run-id>.jsonl. Schema is a flat extension of StructuredAgentError (D20), one envelope shape across both agent-emitted entries and harness-wrapped command failures. Run-id resolves from --run-id > $GBRAIN_FRICTION_RUN_ID > 'standalone'.

Subcommands stay ≤30 LOC each; core lives in src/core/friction.ts (writer + reader + renderer + redactor). render --redact (default for md output) strips $HOME / $CWD to placeholders so reports paste safely in PRs/issues.

Severity: confused | error | blocker | nit. Kind: friction | delight (D7) | phase-marker | interrupted. Readers tolerate malformed lines (skip + warn).

40 unit tests; this is the channel the claw-test harness writes to and that agents emit through during live-mode runs.

* feat: gbrain claw-test, end-to-end fresh-install friction harness

Two modes: scripted (CI gate, no agent) and --live (real agent subprocess). Phases: setup → install_brain (gbrain init --pglite) → import (--no-embed) → query → extract all --source fs → verify (gbrain doctor --json, asserts status==='ok' and progress.jsonl phase coverage).

AgentRunner interface + registry: the interface stays narrow (detect, invoke, optional postInstallHook). v1 ships only OpenClawRunner; the registry pattern lets v1.1 land hermes/codex as ~50-line additions without refactoring callers. OpenClaw invocation: 'openclaw agent --local --agent <name> --message <brief>' matching test/e2e/skills.test.ts (NOT --prompt-file, which doesn't exist).

transcript-capture: spawns child with piped stdio, async-drains via fs.createWriteStream + 'drain' events so 256KB+ bursts don't stall the child (D17 backpressure). Writes <run>/transcript.jsonl with schema_version + ts + channel + byte_offset + bytes_b64. Friction entries' transcript_offset field references byte offsets here so render --transcripts can resolve back.

progress-tail: parses gbrain's --progress-json events out of child stderr. Phase verification asserts each scenario.expected_phases entry (dotted names like import.files, extract.links_fs, doctor.db_checks) saw at least one event from the actual command, which proves the COMMAND ran, not that the agent obeyed prompts.

seed-pglite: ~50 LOC SQL replay primitive for the upgrade-from-v0.18 scenario. Existing migration helpers (test/e2e/helpers.ts) are Postgres-only; PGLite has no equivalent. seedPglite opens a fresh PGLite, executes each statement individually (errors name the failing one), then disconnects so gbrain init can take over and walk forward.

53 unit tests covering registry selection, runner detection, multi-byte UTF-8 chunk-boundary safety, PIPE buffer drain, scenario load+validate, progress event parsing, and SQL splitter.

* feat: claw-test scenario fixtures + friction-protocol skills convention

Two scenarios ship in v1: fresh-install and upgrade-from-v0.18. Each is a self-contained directory: brain/ (markdown pages), BRIEF.md (live-mode prompt), expected.json (scripted-mode assertions), scenario.json (kind, expected_phases, optional from_version + seed paths). Schema is owned by src/core/claw-test/ scenarios.ts.

upgrade-from-v0.18 ships scaffolded; seed/dump.sql is the v1.1 follow-up (needs a real v0.18-shape PGLite dump; seed/README.md documents the gen procedure). The harness gracefully no-ops the seed phase when dump.sql is absent.

skills/_friction-protocol.md is a cross-cutting convention skill (like _brain-filing-rules.md). Tells agents when to call gbrain friction log and how to choose severity. Skills the claw-test exercises will gain a > Convention: callout pointing here in a v1.1 sweep.

13 unit tests for the scenario loader + 'shipped scenarios load cleanly' for both.

* feat: register gbrain claw-test + gbrain friction; CLAUDE.md + llms sync

Wires both commands into src/cli.ts CLI_ONLY allow-list and adds dispatch in handleCliOnly so neither command requires a brain engine connection.

CLAUDE.md gains entries for src/commands/{friction,claw-test}.ts + src/core/claw-test/ + skills/_friction-protocol.md, and a Commands section listing all 8 new gbrain claw-test ... and gbrain friction ... invocations with the v0.23 marker. Documents the GBRAIN_HOME write-isolation contract and the v1 caveat (read-side host-fingerprint detection deferred to v1.1).

llms.txt + llms-full.txt regenerated via 'bun run build:llms' so the committed generator-output gate passes.

test/e2e/claw-test.test.ts is the scripted-mode E2E. Builds a tiny shim that delegates to 'bun run src/cli.ts' (NOT bun --compile, which doesn't bundle PGLite's runtime assets), points the harness at it via GBRAIN_BIN_OVERRIDE, runs --scenario fresh-install end-to-end. Asserts exit 0, zero error/blocker friction. Includes a deliberate-break test that proves the friction signal fires when a phase command rejects.

test/claw-test-cli.test.ts covers shipped-scenario load + agent registry + OpenClawRunner detection (relative-path / .. / missing-bin guards) + the GBRAIN_FRICTION_RUN_ID env handoff between harness and friction CLI.

Closes the v0.23 claw-test E2E feature.

* chore: bump version and changelog (v0.24.0)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(tests): typecheck failures + spawnWithCapture timeout headroom in CI

Three CI fixes after PR #522 landed:

1. test/agent-runner.test.ts:89: UnavailableRunner.invoke() returns Promise<void> by default but the AgentRunner contract requires Promise<InvokeResult>. Annotate the throw-only invoke explicitly so tsc sees the contract is satisfied (the throw makes the body unreachable as far as the return type is concerned).

2. test/seed-pglite.test.ts: bun:test signature is test(name, fn, timeoutMs: number), not test(name, opts: {timeout}, fn). The {timeout: 30_000} object form was a guess that tsc on bun 1.3.13 rejects. Move the 30s cap to the trailing positional number arg on each PGLite-using test.

3. test/transcript-capture.test.ts: `spawnWithCapture > timeout fires SIGTERM/SIGKILL` blew the 10s outer cap on the GitHub runner. Two fixes: (a) use `exec sleep` so the child we spawn IS sleep, so SIGTERM goes directly to it with no `/bin/sh` fork-vs-exec process-group ambiguity that could orphan the sleep and force the SIGKILL grace path. (b) bump outer cap to 30s for headroom even when the runner is slow and SIGKILL after the 5s grace is what actually ends the child.

* chore: rebump to v0.22.16 (next free 0.22.x patch slot per queue)

PR #506 claims v0.22.15, PR #521 claims v0.22.10, intermediate slots (.11/.12/.13/.14) are claimed by other open PRs. v0.22.16 is the next clean PATCH slot. v0.23.0 is claimed by PR #462 so MINOR isn't free. This release fits the 0.22.x train; v0.23.0 lands when #462 ships.

Updates VERSION, package.json, CHANGELOG.md header, TODOS.md follow-up labels. Code is unchanged.

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
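The `$GBRAIN_HOME` rules mentioned in the hermeticity commit (absolute path, no `..` segments, `.gbrain` appended to the override) could look roughly like the sketch below. It is an illustration under assumptions, not the code in `src/core/config.ts`: the function signatures take the override and home directory as parameters for testability, an empty override is treated as unset, and `path.posix` is used only to keep the example platform-independent.

```typescript
import * as path from "node:path";

// Name taken from the commit message; the real class may carry more detail.
class GbrainHomeInvalidError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "GbrainHomeInvalidError";
  }
}

// Hypothetical signature: resolve the config directory from an optional override.
function resolveGbrainDir(override: string | undefined, homeDir: string): string {
  // Assumption: an unset or empty override falls back to <home>/.gbrain.
  if (override === undefined || override === "") {
    return path.posix.join(homeDir, ".gbrain");
  }
  if (!path.posix.isAbsolute(override)) {
    throw new GbrainHomeInvalidError(`$GBRAIN_HOME must be absolute: ${override}`);
  }
  if (override.split("/").includes("..")) {
    throw new GbrainHomeInvalidError(`$GBRAIN_HOME must not contain '..': ${override}`);
  }
  // The override is a parent directory; the config dir is <override>/.gbrain.
  return path.posix.join(override, ".gbrain");
}

// Hypothetical sugar: build a path under the resolved config directory.
function gbrainPath(
  override: string | undefined,
  homeDir: string,
  ...segments: string[]
): string {
  return path.posix.join(resolveGbrainDir(override, homeDir), ...segments);
}

console.log(gbrainPath("/tmp/sandbox", "/home/user", "friction", "run-1.jsonl"));
```

Routing every write site through one helper like this is what makes a write-isolation test possible: a test can point the override at a temporary directory and assert that nothing lands under the real home directory.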
Summary
The `autopilot-cycle` Minion handler in `src/commands/jobs.ts` now forwards `job.data.phases` to `runCycle()`. Previously the array was accepted via `MinionJobInput.params` but discarded before dispatch, so every cycle ran the full 6-phase pipeline regardless of caller intent. This bit hard in production when `embed` accumulated a 17K-chunk backlog and every 5-minute autopilot cron submitted a job that timed out at 30 min.
Behavior:
- `phases: ["lint","backlinks"]` → runs only those two phases (canonical order, not caller order; `runCycle` keys on `phases.includes(...)`).
- `phases: ["BAD"]` → all names filtered, falls back to default 6 phases (caller's input was unrecoverable).
- `phases: []` or non-array → falls back to default 6 phases (prior behavior preserved).
- `phases: undefined` → default 6 phases (unchanged).
- Validation is against `ALL_PHASES` from `src/core/cycle.ts` (set lookup, no injection surface).
Test Coverage
`test/handlers.test.ts` under `autopilot-cycle handler — phase passthrough`:
- `job.data.phases restricts which phases run`: valid phases forwarded
- `invalid phase names in job.data.phases are filtered out`: bogus names dropped
- `empty phases array falls back to all phases`: same as no phases
- `non-array phases value is ignored`: string `"lint"` ignored
The `test/cycle-abort.test.ts` search window was widened (500 → 2000 chars) so it still finds `signal: job.signal` after the new validation block was added between `worker.register('autopilot-cycle', …)` and the `runCycle(…)` call.
Pre-Landing Review
Manual review of the 9-LoC production change:
`phases` at all.
Adversarial Review
Independent subagent review (fresh context, no checklist bias). Findings ranked:
- Duplicate phase names (e.g. `["embed","embed"]`) are not deduped before forwarding. Low blast radius: `runCycle` keys on `phases.includes(...)` so each phase runs at most once. Tidiness fix (`[...new Set(filtered)]`) is harmless but not required for v0.22.10.
- Two `await import('../core/cycle.ts')` calls in close succession. Bun's module cache makes the second a no-op lookup, but they could be combined into one destructured import for tidiness.
None blocking. All three are filed mentally as future cleanup; the production fix is correct and the test suite is comprehensive.
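For reference, the dedupe tidiness fix mentioned in the review could be as small as the sketch below. The function name `dedupePhases` is hypothetical, and `Array.from(new Set(...))` is used as an equivalent of the `[...new Set(filtered)]` form quoted in the review.

```typescript
// Hypothetical helper: remove repeated phase names, keeping first-seen order.
function dedupePhases(filtered: readonly string[]): string[] {
  // A Set keeps insertion order, so the first occurrence of each name wins.
  return Array.from(new Set(filtered));
}

console.log(dedupePhases(["embed", "lint", "embed"]));
```

Because `runCycle` keys on `phases.includes(...)`, this only tidies what is forwarded; it would not change which phases run.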
Eval Results
No prompt-related files changed — evals skipped.
Greptile Review
No Greptile comments on the PR.
Plan Completion
No plan file — this is a direct response to a production incident. The CHANGELOG entry stands in for a plan: production observation → root cause → fix → tests.
Verification Results
`bun test test/cycle-abort.test.ts test/handlers.test.ts` → 15 pass, 0 fail locally.
CI on prior commit (d95f1d2):
- `test (1)`, `test (2)`, `test (3)`, `test (4)`: pass
- `Tier 1 (Mechanical)`: pass (after one rerun; flake unrelated to this PR)
- `gitleaks`: pass
TODOS
No items to mark complete (this PR is a hotfix from production observation, not a planned TODO).
Documentation
`CLAUDE.md`: appended a v0.22.10 (#521) note to the `src/commands/jobs.ts` key-files annotation. The `autopilot-cycle` handler now forwards `job.data.phases` to `runCycle`, validated against `ALL_PHASES` from `src/core/cycle.ts`, with invalid names filtered and empty/missing arrays falling back to the default 6-phase cycle.
Test plan
- `bun test test/cycle-abort.test.ts test/handlers.test.ts` passes (15/15)
- `Tier 1 (Mechanical)` passes after rerun (flake)
- `CLAUDE.md` annotation reflects the new behavior
🤖 Generated with Claude Code