fix(health): use runtime snapshot for channel summaries by BingqingLyu · Pull Request #713 · BingqingLyu/openclaw

BingqingLyu · 2026-04-27T09:46:44Z

Summary

Problem: openclaw health --json rebuilt channel summaries from config plus probe, but did not feed in the live gateway channel runtime snapshot.
Why it matters: Telegram could show running: false, lastStartAt: null, and tokenSource: "none" in health while channels status and live traffic showed the same account running normally.
What changed: thread the live runtime snapshot into health refreshes, build per-account health snapshots through the normal channel snapshot builder, invalidate cached health snapshots when the runtime state is newer than the cached summary, and guard the staleness check so channels that omit lastStartAt (WhatsApp, Zalo) do not cause perpetual cache invalidation.
What did NOT change (scope boundary): no channel runtime logic, probe behavior, or channels.status output path was changed.
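The staleness guard described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type names, the `ts` field, and `isRuntimeNewerThanCache` are hypothetical, and only `running` and `lastStartAt` come from the description.

```typescript
// Hypothetical shapes: only `running` and `lastStartAt` are named in the PR text.
interface AccountRuntime {
  running?: boolean;
  lastStartAt?: number | null; // epoch ms; some channels never set this
}

interface CachedHealth {
  ts: number; // assumed: epoch ms at which the cached summary was built
  accounts: Record<string, AccountRuntime>;
}

// Returns true when the live runtime looks newer than the cached summary.
function isRuntimeNewerThanCache(
  cached: CachedHealth,
  runtime: Record<string, AccountRuntime>,
): boolean {
  for (const [accountId, live] of Object.entries(runtime)) {
    const summary = cached.accounts[accountId];
    // The account started running after the summary was cached.
    if (live.running === true && summary?.running !== true) return true;
    // Guard: compare timestamps only when the channel actually reports one,
    // so channels that omit lastStartAt never look permanently stale.
    if (typeof live.lastStartAt === "number" && live.lastStartAt > cached.ts) {
      return true;
    }
  }
  return false;
}
```

With a guard of this shape, an account that is running but reports no `lastStartAt` invalidates the cache at most once, because the refreshed summary then records `running: true`.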

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

User-visible / Behavior Changes

openclaw health --json now reflects live channel runtime fields when the gateway has them, instead of falling back to config/probe-only summaries for channels like Telegram.

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS 26.3.1
Runtime/container: local gateway, npm global openclaw@2026.3.13 for before-state; patched checkout on this branch for after-state
Model/provider: not model-specific
Integration/channel (if any): Telegram
Relevant config (redacted): ~/.openclaw/openclaw.json5 with channels.telegram.botToken

Steps

Configure Telegram and start the local gateway.
Confirm Telegram traffic is working.
Compare openclaw health --json with openclaw channels status --json.

Expected

health should agree with the live runtime state for fields like running, lastStartAt, mode, and tokenSource.
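One way to make this comparison mechanical is to diff the four fields for a single account. The sketch below assumes the per-account objects have already been extracted from each command's JSON output; the exact JSON layout is not shown in this PR, so `diffAccountFields` and the sample values are purely illustrative.

```typescript
type AccountFields = Record<string, unknown>;

// The fields named in the expected behavior above.
const COMPARED_FIELDS = ["running", "lastStartAt", "mode", "tokenSource"] as const;

// Returns the names of the fields on which the two views disagree.
// A missing field and an explicit null are treated as the same value.
function diffAccountFields(health: AccountFields, status: AccountFields): string[] {
  return COMPARED_FIELDS.filter(
    (field) => (health[field] ?? null) !== (status[field] ?? null),
  );
}

// Illustrative values only, loosely based on the mismatch described in this PR.
const mismatched = diffAccountFields(
  { running: false, lastStartAt: null, tokenSource: "none" },
  { running: true, lastStartAt: 1700000000000, tokenSource: "config" },
);
// mismatched -> ["running", "lastStartAt", "tokenSource"]
```

An empty result means the two views agree on every compared field.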

Actual

health reported Telegram as stopped / tokenSource: "none" while channels status reported the same account running from config.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Screenshot: before

Screenshot: after

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: reproduced the mismatch on the installed 2026.3.13 build; traced the issue into getHealthSnapshot; confirmed the health cache could return a summary older than the live Telegram runtime; added regression tests for stale runtime-backed cache invalidation and for channels that omit lastStartAt; ran the targeted tests plus pnpm build; and captured a real before/after shell repro showing the mismatch on the installed build and agreement on this patched branch, run from a proper git worktree.
Edge cases checked: health refresh still works without a runtime provider; runtime provider is cleared on gateway shutdown; cached health snapshots are bypassed when runtime running or lastStartAt is newer than the cache; channels that omit lastStartAt (WhatsApp, Zalo) no longer cause perpetual cache invalidation.
What you did not verify: I did not run a full multi-channel regression beyond Telegram for this cache invalidation path.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps:

Failure Recovery (if this breaks)

How to disable/revert this change quickly: revert commits 49415a3, 710e3f1, 8dff28d, and f30ce95
Files/config to restore: src/gateway/server-methods/health.ts, src/gateway/server/health-state.ts, src/gateway/server/health-state.test.ts, src/commands/health.ts
Known bad symptoms reviewers should watch for: health stops reflecting live channel runtime fields after channels connect, or gateway shutdown leaves a stale health runtime provider behind

Risks and Mitigations

Risk: health summaries now depend on the gateway runtime snapshot being available during refresh.
- Mitigation: the new code keeps the old behavior when no runtime snapshot provider exists, and only bypasses cached health when runtime state is clearly newer than the cache.
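The fallback in this mitigation can be pictured as an optional provider that the gateway registers on startup and clears on shutdown. The names below (`setHealthRuntimeProvider`, `buildSummary`, `ChannelSummary`) are hypothetical and only sketch the idea; they are not the identifiers used in the changed files.

```typescript
// Hypothetical shapes, for illustration only.
interface RuntimeFields {
  running?: boolean;
  lastStartAt?: number | null;
}
interface ChannelSummary extends RuntimeFields {
  configured: boolean;
}
type RuntimeProvider = () => Record<string, RuntimeFields>;

let runtimeProvider: RuntimeProvider | undefined;

// Registered when the gateway starts; called with undefined on shutdown
// so that no stale provider is left behind.
function setHealthRuntimeProvider(provider: RuntimeProvider | undefined): void {
  runtimeProvider = provider;
}

// Overlays live runtime fields when a provider exists; otherwise returns
// the config/probe-based summary unchanged (the old behavior).
function buildSummary(base: ChannelSummary, accountId: string): ChannelSummary {
  const live = runtimeProvider?.()[accountId];
  return live ? { ...base, ...live } : base;
}
```

Because the provider is optional, a health refresh in a process without a running gateway takes the same path as before the change.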

AI-assisted: yes.


0xble added 5 commits March 14, 2026 21:27

fix(health): use runtime snapshot for channel summaries

52b3cd9

fix(health): preserve omitted probe snapshots

e90ca71

Keep probe results in health summaries when plugin snapshot builders omit the probe field, add regression coverage for that path, and harden health-state test cleanup via afterEach.
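A minimal sketch of what preserving an omitted probe could look like, assuming the snapshot builder returns a plain object that may or may not carry a `probe` field. The helper name `withProbe` is hypothetical and is not taken from this commit.

```typescript
// If the builder's snapshot has no probe value, attach the probe result
// that health already collected; otherwise leave the snapshot untouched.
function withProbe<T extends { probe?: unknown }>(snapshot: T, probe: unknown): T {
  if (snapshot.probe !== undefined || probe === undefined) {
    return snapshot;
  }
  return { ...snapshot, probe };
}
```

A snapshot that already reports its own probe result wins over the one passed in, so builders that do set the field keep full control of it.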

fix(health): invalidate stale runtime cache

14d73d5

fix(health): avoid perpetual cache invalidation for channels without lastStartAt

203d51a

style: apply oxfmt formatting

61c402f

BingqingLyu wants to merge 5 commits into main from fork-pr-46527-fix-health-telegram-runtime
