fix: integrate OpenClaw Gateway health signals #957

Merged: ashione merged 4 commits into main from auto/gateway-startup-optimization on May 1, 2026

@ashione ashione commented May 1, 2026

Summary

  • Add a main-process Gateway capability monitor that prefers OpenClaw-native system-presence, health, status, channels.status, and doctor.memory.* signals over stderr string matching.
  • Route OpenClaw health and presence events through dedicated Gateway events, host events, preload allowlists, and the renderer Gateway store.
  • Keep Gateway readiness tied to process/transport/core RPC readiness while treating memory, Dreams, and channel failures as capability degradation.
  • Preserve the existing prelaunch maintenance cache work and extend diagnostics/spec docs for the OpenClaw-native triage flow.
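The split between readiness and capability degradation described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `GatewaySignals`, `GatewaySummary`, `SignalState`, and `summarizeGateway` are hypothetical names, and the real monitor in electron/gateway/capability-monitor.ts may model these signals differently.

```typescript
// Hypothetical types for illustration; none of these names come from the PR diff.
type SignalState = 'ok' | 'failed' | 'unknown';

interface GatewaySignals {
  // Core signals: these gate readiness.
  processRunning: boolean;
  transportConnected: boolean;
  coreRpc: SignalState;
  // Capability signals: failures here only mark the Gateway as degraded.
  memory: SignalState;
  dreams: SignalState;
  channels: SignalState;
}

interface GatewaySummary {
  ready: boolean;
  degraded: string[];
}

function summarizeGateway(signals: GatewaySignals): GatewaySummary {
  // Readiness depends only on process, transport, and core RPC state.
  const ready =
    signals.processRunning &&
    signals.transportConnected &&
    signals.coreRpc === 'ok';

  // Capability failures are reported separately and never flip `ready`.
  const capabilities: Array<[string, SignalState]> = [
    ['memory', signals.memory],
    ['dreams', signals.dreams],
    ['channels', signals.channels],
  ];
  const degraded = capabilities
    .filter(([, state]) => state === 'failed')
    .map(([name]) => name);

  return { ready, degraded };
}
```

Under this sketch, a Gateway whose memory and channel checks fail still reports `ready: true`, with `degraded` listing `memory` and `channels`.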

Test plan

  • pnpm exec vitest run tests/unit/gateway-event-dispatch.test.ts tests/unit/gateway-events.test.ts tests/unit/gateway-manager-diagnostics.test.ts tests/unit/gateway-ready-fallback.test.ts tests/unit/dreams-page.test.tsx tests/unit/channel-routes.test.ts tests/unit/harness-specs.test.ts
  • pnpm run typecheck
  • pnpm run lint:check (passes with existing src/pages/Chat/ChatInput.tsx:selectedSkill warning)
  • pnpm run build:vite
  • pnpm run comms:replay
  • pnpm run comms:compare
  • pnpm exec playwright test tests/e2e/openclaw-dreams.spec.ts tests/e2e/channels-health-diagnostics.spec.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce4a1d2b16

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +132 to +137
    task();
    cache.tasks[taskName] = {
      key: cacheKey,
      updatedAt: new Date().toISOString(),
    };
    writeCache(cachePath, cache);

P1: Only persist cache when maintenance task succeeds

This marks a task as cached immediately after invoking task(), but several wrapped maintenance routines (e.g., plugin install/upgrade and cleanup paths in config-sync.ts) catch and log their own failures instead of throwing. In that case a transient failure is still recorded as a successful run, so the next startup sees a cache hit and skips retrying the maintenance, leaving stale or missing plugins or runtime-deps state until some unrelated cache key change occurs.
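One way to address this, sketched under assumptions: have each wrapped task report success explicitly, and write the cache entry only when it does. `TaskOutcome`, `MaintenanceCache`, `runCachedTask`, and the `persist` callback are hypothetical names; `persist` stands in for the `writeCache(cachePath, cache)` call in the quoted diff, and the actual tasks in config-sync.ts may not return a result today.

```typescript
// Hypothetical shapes; the PR's real cache types are not shown in the review.
type TaskOutcome = { ok: boolean };

interface CacheEntry {
  key: string;
  updatedAt: string;
}

interface MaintenanceCache {
  tasks: Record<string, CacheEntry>;
}

function runCachedTask(
  cache: MaintenanceCache,
  taskName: string,
  cacheKey: string,
  task: () => TaskOutcome,
  persist: (cache: MaintenanceCache) => void,
): 'hit' | 'ran' | 'failed' {
  // Cache hit: the same key already completed successfully.
  if (cache.tasks[taskName]?.key === cacheKey) return 'hit';

  let outcome: TaskOutcome;
  try {
    outcome = task();
  } catch {
    // A thrown error counts as a failure, same as an explicit { ok: false }.
    outcome = { ok: false };
  }

  // Do not record the task on failure, so the next startup retries it.
  if (!outcome.ok) return 'failed';

  cache.tasks[taskName] = {
    key: cacheKey,
    updatedAt: new Date().toISOString(),
  };
  persist(cache);
  return 'ran';
}
```

With this shape, a routine that catches its own errors has to return `{ ok: false }` for the retry to happen; routines that still swallow failures silently would need that change first.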

Useful? React with 👍 / 👎.

@ashione ashione changed the title from "fix: reduce Gateway prelaunch maintenance cost" to "fix: integrate OpenClaw Gateway health signals" on May 1, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a82cafcc4c

Comment thread electron/gateway/capability-monitor.ts Outdated
Comment on lines +112 to +115
    rpcRouter: params.status.gatewayReady === true || this.lastCoreProbe?.ok === true
      ? 'ready'
      : this.lastCoreProbe?.ok === false
        ? 'blocked'

P2: Let failed core probes override stale ready status

rpcRouter is currently derived with gatewayReady === true || lastCoreProbe?.ok === true, so once gatewayReady has ever been set to true, later failed core probes (lastCoreProbe.ok === false, e.g. repeated system-presence timeouts) can no longer surface as blocked. In that state, diagnostics endpoints will continue reporting core.rpcRouter: "ready" even when the router is actually failing, which makes incident triage and restart decisions inaccurate.
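A minimal sketch of the reordering this comment suggests: check for a failed probe before consulting `gatewayReady`. `RouterState`, `ProbeResult`, and `deriveRpcRouter` are hypothetical names, and the `'unknown'` fallback is an assumption, since the quoted diff is cut off before its final branch.

```typescript
// Hypothetical names; the fallback value 'unknown' is assumed, not taken from the diff.
type RouterState = 'ready' | 'blocked' | 'unknown';

interface ProbeResult {
  ok: boolean;
}

function deriveRpcRouter(
  gatewayReady: boolean | undefined,
  lastCoreProbe?: ProbeResult,
): RouterState {
  // A failed core probe wins, even if gatewayReady was set earlier.
  if (lastCoreProbe?.ok === false) return 'blocked';
  // Otherwise either signal is enough to report the router as ready.
  if (gatewayReady === true || lastCoreProbe?.ok === true) return 'ready';
  return 'unknown';
}
```

This ordering assumes `lastCoreProbe` always reflects the most recent probe. If `gatewayReady` can be refreshed after an older failed probe, comparing timestamps of the two signals would likely be needed instead.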

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 20280f4a06

Comment thread src/stores/gateway.ts
    ));
    unsubscribers.push(subscribeHostEvent('gateway:health', (payload) => {
      const current = get().health;
      set({ health: { ...(current ?? { ok: true }), ok: true, openclawHealth: payload } });

P2: Preserve health failure state on gateway health events

This handler unconditionally sets health.ok to true whenever a gateway:health event arrives, which can overwrite a real unhealthy state (for example after a failed /api/gateway/health check) and make downstream UI/diagnostics read the gateway as healthy until another explicit poll runs. Because the payload is stored as openclawHealth but never consulted for ok, any degraded/failed health signal from the event stream is effectively masked.
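As an illustration of the suggested behavior, the handler could compute the next health state with a pure function that keeps an existing failure and honors a failing payload. This is a sketch under assumptions: `GatewayHealth`, `readPayloadOk`, and `applyHealthEvent` are hypothetical names, and it assumes the event payload may carry a boolean `ok` field, which the review does not confirm.

```typescript
// Hypothetical store shape; only `ok` and `openclawHealth` appear in the quoted diff.
interface GatewayHealth {
  ok: boolean;
  openclawHealth?: unknown;
}

// Assumption: the payload may expose a boolean `ok`. Returns undefined otherwise.
function readPayloadOk(payload: unknown): boolean | undefined {
  if (typeof payload === 'object' && payload !== null && 'ok' in payload) {
    const ok = (payload as { ok?: unknown }).ok;
    return typeof ok === 'boolean' ? ok : undefined;
  }
  return undefined;
}

function applyHealthEvent(
  current: GatewayHealth | undefined,
  payload: unknown,
): GatewayHealth {
  const payloadOk = readPayloadOk(payload);
  // With no prior state, default to healthy, as the original handler does.
  const previousOk = current?.ok ?? true;
  return {
    ...(current ?? {}),
    // A failing payload marks the state unhealthy; otherwise keep the previous value.
    ok: payloadOk === false ? false : previousOk,
    openclawHealth: payload,
  };
}
```

In the store, the handler body would then reduce to something like `set({ health: applyHealthEvent(get().health, payload) })`. Whether a healthy event should also clear an earlier failure is a separate design decision; this sketch leaves recovery to the explicit health poll.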

Useful? React with 👍 / 👎.

@ashione ashione merged commit 8c9b4ea into main May 1, 2026
7 checks passed
@ashione ashione deleted the auto/gateway-startup-optimization branch May 1, 2026 13:49
@dongsheng123132 dongsheng123132 mentioned this pull request May 2, 2026
2 tasks