v0.28.3 feat(recipes): restart-sweep — detect dropped Telegram messages after gateway restarts#675
Merged
garrytan merged 5 commits into garrytan:master on May 6, 2026
…way restarts
Adds a tool to detect Telegram messages dropped during OpenClaw gateway restarts by analyzing session state patterns.
Features:
- Detects sessions with abortedLastRun flag (primary heuristic)
- Identifies timing gaps (active before restart, silent after)
- Configurable alert modes (Telegram, stdout)
- Environment-based configuration
- Comprehensive test suite
- PII-scrubbed for public use
The tool addresses webhook message loss that occurs when the gateway restarts while messages are in-flight. Unlike long-polling, webhooks cannot replay missed messages, making this detection crucial for production reliability.
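The two heuristics listed above could look roughly like the sketch below. It is illustrative only and is not the shipped script: the session shape (key, abortedLastRun, lastActivityAt), the function name findSuspectSessions, and the 10-minute activity window are assumptions made for the example. Only the abortedLastRun flag and the "active before restart, silent after" idea come from the commit message.

```javascript
// Hypothetical session shape: { key, abortedLastRun, lastActivityAt } (ISO string).
// Only abortedLastRun is named in the PR; the other fields are assumed.
const ACTIVE_WINDOW_MS = 10 * 60 * 1000; // assumed "active before restart" window

function findSuspectSessions(sessions, restartTime, { aggressive = false } = {}) {
  const restartMs = new Date(restartTime).getTime();
  const suspects = [];
  for (const s of sessions) {
    if (s.abortedLastRun) {
      // Primary heuristic: the last run was cut off.
      suspects.push({ key: s.key, reason: "abortedLastRun" });
      continue;
    }
    if (!aggressive) continue;
    // Timing-gap heuristic: active shortly before the restart, silent after it.
    const lastMs = new Date(s.lastActivityAt).getTime();
    const activeBefore = lastMs <= restartMs && restartMs - lastMs <= ACTIVE_WINDOW_MS;
    if (activeBefore) suspects.push({ key: s.key, reason: "timing-gap" });
  }
  return suspects;
}
```

The timing-gap branch is opt-in in this sketch because a session that simply went quiet looks the same as one that lost a message.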
…script
Reshape the directory-shaped recipes/restart-sweep/ into a single
self-contained recipes/restart-sweep.md with the (fixed) script inlined
as a fenced code block. The recipe loader at integrations.ts:445-485 only
discovers *.md, so the directory shape was invisible.
Eight script fixes:
1. Newline double-escape ('\\n' → '\n') at 8 sites
2. Hard-coded /tmp/ paths → ~/.gbrain/integrations/restart-sweep/ (honors
GBRAIN_HOME); bootstrap-log path env-overridable via OPENCLAW_BOOTSTRAP_LOG
3. exec() of interpolated string → execFile with argv array (no shell)
4. Idempotency: loadAlerted/saveAlerted helpers, atomic tmp+rename, corrupt-
JSON recovery, 30-day prune
5. Aggressive heuristic gated behind OPENCLAW_RESTART_SWEEP_AGGRESSIVE=1
(default OFF — false-positive prone during quiet periods)
6. Old directory shape removed
7. Env reads moved from module top-level to constructor (fixes the import-
time-snapshot bug that made tests semantically bogus)
8. Cooldown layer keyed on (sessionKey, lastAlertedAt) with 6h re-alert
threshold — prevents re-alerting forever when the bootstrap log is
missing and restartTime is synthesized fresh each run
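Fixes 4 and 8 fit together as one small state file. The sketch below is a minimal illustration under stated assumptions, not the inlined script: loadAlerted and saveAlerted are named in the commit message, but their signatures, the on-disk layout ({ sessionKey: { lastAlertedAt } }), and the shouldAlert helper are hypothetical. It models only the cooldown layer and leaves out the (sessionKey, restartTimeIso) key.

```javascript
const fs = require("node:fs");
const path = require("node:path");

const PRUNE_AFTER_MS = 30 * 24 * 60 * 60 * 1000; // 30-day prune, per the commit message
const COOLDOWN_MS = 6 * 60 * 60 * 1000;          // 6h re-alert threshold, per the commit message

// Assumed layout: { [sessionKey]: { lastAlertedAt: ISO string } }.
function loadAlerted(file, now = Date.now()) {
  let data;
  try {
    data = JSON.parse(fs.readFileSync(file, "utf8"));
  } catch {
    return {}; // missing or corrupt file: recover with an empty record
  }
  if (data === null || typeof data !== "object" || Array.isArray(data)) return {};
  const kept = {};
  for (const [key, entry] of Object.entries(data)) {
    const t = Date.parse(entry && entry.lastAlertedAt);
    // Drop entries that are unparseable or older than the prune window.
    if (Number.isFinite(t) && now - t <= PRUNE_AFTER_MS) kept[key] = entry;
  }
  return kept;
}

function saveAlerted(file, alerted) {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(alerted, null, 2));
  // rename() replaces the target atomically on POSIX when both paths share a filesystem.
  fs.renameSync(tmp, file);
}

function shouldAlert(alerted, sessionKey, now = Date.now()) {
  const entry = alerted[sessionKey];
  if (!entry) return true;
  const t = Date.parse(entry.lastAlertedAt);
  return !Number.isFinite(t) || now - t >= COOLDOWN_MS;
}
```

Because the cooldown is keyed on the session alone, it still holds when restartTime is synthesized fresh each run.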
Recipe body adds a Cron environment troubleshooting section with the
wrapper-script pattern (set -a; source .env; set +a; exec node ...) plus
explicit PATH= line for the cron entry. Plus a TODO line pointing at
docs/guides/plugin-handlers.md as the v2 upgrade path (registered Minion
handler in the openclaw repo for queue-backed idempotency).
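Cron starts jobs with a sparse environment, so the wrapper has to export the .env values before node starts. On the script side, fix 7 reads those values in the constructor instead of at import time. A minimal sketch of that pattern follows. The class name MessageSweepDetector and the env variable names appear in this PR, but the field names are hypothetical, and treating GBRAIN_HOME as a replacement for ~/.gbrain is an assumption.

```javascript
const os = require("node:os");
const path = require("node:path");

class MessageSweepDetector {
  // env is injectable so tests can pass a plain object; it defaults to process.env.
  constructor(env = process.env) {
    // Read at construction time, so values set after the module was imported are seen.
    // Assumption: GBRAIN_HOME stands in for ~/.gbrain.
    const home = env.GBRAIN_HOME || path.join(os.homedir(), ".gbrain");
    this.stateDir = path.join(home, "integrations", "restart-sweep");
    this.aggressive = env.OPENCLAW_RESTART_SWEEP_AGGRESSIVE === "1"; // default OFF
    this.bootstrapLog = env.OPENCLAW_BOOTSTRAP_LOG || null;
  }
}
```

With module-level reads, a test that sets process.env after import would still see the values snapshotted at import time, which is the bug the fix describes.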
Tests: 27 bun:test cases (12 ported + 14 new + 1 sentinel-shape guard).
The extractor anchors on <!-- restart-sweep:script --> sentinel and salts
the tmp filename to bypass the ESM import cache. A separate test asserts
the sentinel itself is present so future doc edits dropping it fail loud.
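The extractor idea described above can be sketched as follows. This is an assumption-laden illustration, not the code in test/restart-sweep.test.ts: the sentinel string comes from the commit message, while extractScript, saltedTmpPath, and the "first fence after the sentinel" rule are hypothetical.

```javascript
const crypto = require("node:crypto");
const os = require("node:os");
const path = require("node:path");

const SENTINEL = "<!-- restart-sweep:script -->";
const FENCE = "`".repeat(3); // built at runtime so this snippet contains no literal fence

// Returns the body of the first fenced code block that follows the sentinel.
function extractScript(markdown) {
  const at = markdown.indexOf(SENTINEL);
  if (at === -1) throw new Error(`sentinel ${SENTINEL} not found`);
  const open = markdown.indexOf(FENCE, at);
  if (open === -1) throw new Error("no fenced code block after sentinel");
  const openLineEnd = markdown.indexOf("\n", open);
  if (openLineEnd === -1) throw new Error("unterminated fence line");
  const bodyStart = openLineEnd + 1;
  const close = markdown.indexOf("\n" + FENCE, openLineEnd);
  if (close === -1) throw new Error("fenced code block is not closed");
  return markdown.slice(bodyStart, close);
}

// A unique path per call gives import() a fresh module URL, so a cached copy is not reused.
function saltedTmpPath(prefix = "restart-sweep") {
  const salt = crypto.randomBytes(6).toString("hex");
  return path.join(os.tmpdir(), `${prefix}-${salt}.mjs`);
}
```

Throwing when the sentinel is missing matches the intent of the separate guard test: a doc edit that drops the marker fails loudly instead of silently testing nothing.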
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- README.md: add restart-sweep row to "Getting Data In" recipes table
- CLAUDE.md: add test/restart-sweep.test.ts to the unit-test inventory
- llms-full.txt: regenerated via bun run build:llms
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brings in v0.28.1 (zombie process reaping, /health timeout, engine disconnect idempotency, PR garrytan#637).
Conflicts resolved:
- VERSION → 0.28.3 (ours; newer than master's 0.28.1)
- package.json → version 0.28.3 (matches VERSION)
- CHANGELOG.md → kept v0.28.3 entry above master's v0.28.1 entry; both full entries preserved with their own ### Itemized changes sections
Post-merge actions:
- bun install (no dep changes)
- bun run build:llms (regenerated llms-full.txt to pick up master's CLAUDE.md additions for v0.28.1)
- bun run test (3,876 pass / 0 fail) + verify (clean) + typecheck (clean)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan added a commit that referenced this pull request on May 7, 2026
….28.6
Master shipped three v0.28.x patch releases without the takes feature while v0.28-release was in flight:
- v0.28.1: zombie process accumulation + health endpoint timeout (#637)
- v0.28.3: restart-sweep - detect dropped Telegram messages (#675)
- v0.28.4: skillify cross-modal eval quality gate (#674)
Master's v0.28.0 slot was consumed without the takes layer ever landing, so this release ships the original takes feature as v0.28.6 (skipping v0.28.5 to leave space for any in-flight master patches).
The migration orchestrator file (v0_28_0.ts) and migration skill doc (skills/migrations/v0.28.0.md) keep their original version keys; those identify the migration version, not the release version.
Conflicts resolved:
- VERSION → 0.28.6 (was 0.28.0; master had 0.28.4)
- package.json → 0.28.6 (auto-merged ai-sdk deps from master's v0.27)
- CHANGELOG.md → renamed top entry "## [0.28.0]" → "## [0.28.6]" with date 2026-05-06; rebuilt the "To take advantage of" block (was truncated by stale === markers from a prior merge); preserved master's v0.28.4/v0.28.3/v0.28.1 entries beneath
- src/cli.ts auto-merged (CLI_ONLY has providers + takes/think both)
Verified post-merge:
- bun run verify: PASS (privacy + jsonb + progress + test-isolation + wasm + admin-build + typecheck)
- 133 tests pass: migrate + apply-migrations + takes-engine + takes-fence
- migrations v37 (takes) + v38 (access_tokens_permissions) apply cleanly on top of master's v35 (auto-RLS) + v36 (subagent persistence)
Summary
Reshape PR #675's recipes/restart-sweep/ directory into a single self-contained recipes/restart-sweep.md recipe with the (fixed) script inlined as a fenced code block. Apply 8 code-quality fixes, port + extend the test suite to bun:test (12 ported + 14 new = 26 cases + 1 sentinel guard = 27 total).
Why land it as a recipe, not a default behavior: restart-sweep is host-specific to OpenClaw + Telegram + webhook mode. CLAUDE.md is explicit that host-specific operational tooling lives as plugin handlers in the host's own repo, not in gbrain core. So it ships as an opt-in recipe alongside twilio-voice-brain, email-to-brain, etc.: discoverable via gbrain integrations list, only "configured" when the user sets the OpenClaw envs. The recipe body documents the v2 upgrade path: registered Minion handler in the openclaw repo against gbrain/minions (see docs/guides/plugin-handlers.md).
Commits:
1. feat(recipes): reshape restart-sweep into single .md recipe + harden script
   The meat of the change. New recipes/restart-sweep.md with frontmatter (no expect_exit_code, schema doesn't have it) + agent-facing setup body + inlined ~325-line script with 8 fixes. New test/restart-sweep.test.ts with 27 bun:test cases anchored on a <!-- restart-sweep:script --> sentinel comment. Old recipes/restart-sweep/ directory deleted.
2. chore: bump version and changelog (v0.28.3)
   VERSION + package.json + CHANGELOG entry written in the GStack release-summary voice.
3. docs: sync README + CLAUDE.md for v0.28.3 restart-sweep recipe
   README's recipes table gets the new row, CLAUDE.md's test inventory gets the new test annotation, llms-full.txt regenerated via bun run build:llms.
Test Coverage
Coverage gate: PASS (100%).
Pre-Landing Review
Already cleared via /plan-eng-review (6 issues, all 6 resolved with recommended option) and /codex consult mode (8 findings, all 8 resolved). The plan file at ~/.claude/plans/figure-out-if-we-eager-coral.md carries the full review trace.
Codex caught 2 silent-correctness bugs the eng review missed:
- The (sessionKey, restartTimeIso) key changes every run when the bootstrap log is missing, so the same stale session re-alerts forever. Fixed by adding a (sessionKey, lastAlertedAt) cooldown layer with 6h re-alert threshold.
- Env reads happened at module top level, so tests that set process.env after import were semantically bogus. Fixed by moving env reads into the MessageSweepDetector constructor.
Eval Results
No prompt-related files changed — evals skipped.
Greptile Review
No Greptile comments on the PR.
Scope Drift
CLEAN. Branch intent: reshape PR #675's recipe shape + apply the 8 code fixes + add proper bun:test coverage. Delivered: same. No files outside recipes/restart-sweep.{md,mjs}, test/restart-sweep.test.ts, or the doc-sync targets.
Plan Completion
- … recipes/restart-sweep.md (D2)
- … expect_exit_code in command health_check (D1)
- … alerted.json (D3)
- … loadAlerted (D4)
12 plan items, 12 done. 0 deferred.
Verification Results
- bun test test/restart-sweep.test.ts → 27 pass / 0 fail
- gbrain integrations show restart-sweep → renders cleanly
- gbrain integrations test recipes/restart-sweep.md → frontmatter validates
- gbrain integrations doctor (with OPENCLAW_OWNER_IDS=test OPENCLAW_TELEGRAM_GROUP=-100) → all 3 health checks pass
- bun run typecheck → clean
- bun run verify → all 7 pre-test gates pass (privacy, jsonb, progress, test-isolation, wasm, admin-build, typecheck)
- bun run test → 3,929 pass / 0 fail across 8 parallel shards + serial pass
TODOS
No TODO items completed in this PR.
Documentation
Updated three files to sync with v0.28.3:
- README.md: added Restart Sweep to the "Getting Data In" recipes table
- CLAUDE.md: added test/restart-sweep.test.ts annotation to the unit-test inventory
- llms-full.txt: regenerated via bun run build:llms
Test plan
- bun test test/restart-sweep.test.ts (27 pass / 0 fail)
- bun run verify (privacy + jsonb + progress + test-isolation + wasm + admin-build + typecheck: all pass)
- bun run test (3,929 pass / 0 fail, no regressions)
- gbrain integrations show restart-sweep renders cleanly
- gbrain integrations test recipes/restart-sweep.md frontmatter validates
- gbrain integrations doctor restart-sweep (with envs set): all 3 health checks pass
🤖 Generated with Claude Code