fix(gateway): pin channel registry at startup to survive registry swaps#53944

Merged
steipete merged 7 commits into
openclaw:mainfrom
affsantos:fix/pin-channel-registry
Mar 25, 2026

Conversation

@affsantos

@affsantos affsantos commented Mar 24, 2026

Contributor

Summary

  • Problem: getChannelPlugin() returns undefined after a runtime registry swap, causing Channel is unavailable: <channel> errors on all outbound message delivery — the message tool, subagent completion announces, and cron delivery. The channel is known (isKnownChannel() passes via the hardcoded CHANNEL_IDS array) but the plugin backing it is gone from the live registry.
  • Why it matters: On a production Slack gateway, upgrading to 2026.3.22 introduced 38 Channel is unavailable: slack errors/day and 72 subagent announce failures/day — up from zero on the previous version. Reproduced and confirmed still present on 2026.3.23-2. Users see dropped replies and undelivered subagent results with no feedback. The issue affects all channels (Slack, Telegram, WhatsApp, Discord) — see Related Issues below.
  • Root cause: Non-primary loadOpenClawPlugins() calls (config-schema reads, provider registry lookups, maybeBootstrapChannelPlugin fallback) can replace the active plugin registry via setActivePluginRegistry(). The replacement registry may have a different channel plugin set — or none at all — because it was built with different loader options. getChannelPlugin() in channels/plugins/registry.ts resolves against the live requireActivePluginRegistry(), so it immediately loses access to the channels that were present at gateway boot.
  • What changed: Added a pinned channel registry (3 new functions in runtime.ts) mirroring the existing pinActivePluginHttpRouteRegistry pattern. The channel registry is pinned at gateway startup in server-runtime-state.ts and released on shutdown. resolveCachedChannelPlugins() now reads from requireActivePluginChannelRegistry() instead of requireActivePluginRegistry(), so channel resolution is immune to mid-flight registry swaps.
  • What did NOT change (scope boundary): No changes to plugin loading, config reload, channel setup/teardown, message delivery pipeline, or the outbound channel resolution fallback chain. The resolveOutboundChannelPlugin bootstrap path still works as before — it just no longer needs to fire because the pinned registry already has the channels.
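The pinning behaviour summarised above can be sketched as follows. This is a minimal illustration, not the code in runtime.ts: the function names and the channelRegistry / channelRegistryPinned fields come from this PR, while the `PluginRegistry` shape, the module-level `state` object, and the Map-based channel lookup are assumptions made for the example.

```typescript
// Hypothetical registry shape; the real PluginRegistry in OpenClaw is richer.
type PluginRegistry = { channels: Map<string, { id: string }> };

// Assumed module-level state: one live registry plus a separately tracked
// channel registry that can be pinned.
const state: {
  activeRegistry: PluginRegistry | null;
  channelRegistry: PluginRegistry | null;
  channelRegistryPinned: boolean;
} = { activeRegistry: null, channelRegistry: null, channelRegistryPinned: false };

function setActivePluginRegistry(registry: PluginRegistry): void {
  state.activeRegistry = registry;
  // While unpinned, the channel registry tracks the live registry.
  if (!state.channelRegistryPinned) state.channelRegistry = registry;
}

function pinActivePluginChannelRegistry(registry: PluginRegistry): void {
  state.channelRegistry = registry;
  state.channelRegistryPinned = true;
}

function releasePinnedPluginChannelRegistry(registry: PluginRegistry): void {
  // No-op unless the caller passes the registry that is currently held.
  if (state.channelRegistry !== registry) return;
  state.channelRegistryPinned = false;
  state.channelRegistry = state.activeRegistry;
}

function requireActivePluginChannelRegistry(): PluginRegistry {
  if (!state.channelRegistry) {
    state.channelRegistry = state.activeRegistry ?? { channels: new Map() };
  }
  return state.channelRegistry;
}

function getChannelPlugin(id: string): { id: string } | undefined {
  return requireActivePluginChannelRegistry().channels.get(id);
}

// Gateway boot: load the full registry and pin its channels.
const bootRegistry: PluginRegistry = {
  channels: new Map([["slack", { id: "slack" }]]),
};
setActivePluginRegistry(bootRegistry);
pinActivePluginChannelRegistry(bootRegistry);

// A non-primary load swaps in a registry that carries no channels.
setActivePluginRegistry({ channels: new Map() });

// The lookup goes through the pinned registry, not the swapped-in one.
const afterSwap = getChannelPlugin("slack");
```

Without the `pinActivePluginChannelRegistry` call, the second `setActivePluginRegistry` would replace the channel registry as well, and the same lookup would come back `undefined`.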

🤖 AI-assisted

  • Marked as AI-assisted
  • Degree of testing: fully tested (7 new tests + 613 existing channel/outbound/plugin tests pass)
  • I understand what the code does
  • Bot review conversations will be resolved

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Channel plugins registered at gateway startup are no longer evicted by subsequent plugin registry swaps. The message tool, subagent announces, cron delivery, and all other outbound paths that resolve channels via getChannelPlugin() will consistently find the channel plugin as long as the gateway is running.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Linux (GCE, Debian 12) / macOS
  • Runtime: Node 22
  • OpenClaw: 2026.3.22 through 2026.3.23-2 (confirmed on both)
  • Channel: Slack (Socket Mode), also affects Telegram, WhatsApp, Discord

Steps

  1. Start a gateway with Slack (or any channel) enabled
  2. Trigger a non-primary plugin registry load (e.g., Control UI fetches config schema, or a subagent spawns and bootstraps plugins)
  3. Send a message to the agent, or have a subagent complete and try to announce

Expected

  • Channel plugin is found, message is delivered

Actual (before fix)

  • getChannelPlugin("slack") returns undefined
  • [tools] message failed: Channel is unavailable: slack
  • Subagent completion direct announce failed: Error: Unknown channel: slack
  • After 3 retries with exponential backoff, the announce is permanently dropped
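The gap between "known" and "available" can be illustrated with a small sketch. Only isKnownChannel, CHANNEL_IDS, and the channel names come from this PR; the array contents, the `describeLookup` helper, and the Map standing in for the live registry are assumptions for illustration.

```typescript
// Assumed contents; the real CHANNEL_IDS array lives in the OpenClaw source.
const CHANNEL_IDS = ["slack", "telegram", "whatsapp", "discord"] as const;
type ChannelId = (typeof CHANNEL_IDS)[number];
type ChannelPlugin = { id: ChannelId };

// Passes for any hardcoded id, regardless of what the registry holds.
function isKnownChannel(id: string): id is ChannelId {
  return CHANNEL_IDS.some((known) => known === id);
}

// Hypothetical helper: classify a lookup against whatever registry is live.
function describeLookup(
  id: string,
  liveChannels: Map<string, ChannelPlugin>,
): "ok" | "not-known" | "known-but-unavailable" {
  if (!isKnownChannel(id)) return "not-known";
  return liveChannels.has(id) ? "ok" : "known-but-unavailable";
}

const bootChannels = new Map<string, ChannelPlugin>([["slack", { id: "slack" }]]);
// Stand-in for a registry built with different loader options.
const swappedChannels = new Map<string, ChannelPlugin>();

const beforeSwap = describeLookup("slack", bootChannels);
const afterRegistrySwap = describeLookup("slack", swappedChannels);
```

The "known-but-unavailable" case is the one reported here: the id check succeeds, but no plugin backs it in the registry being consulted.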

Evidence from production

Before (2026.3.22, no pin) — 7-day daily error count:

Mon: 0    Tue: 0    Wed: 0    Thu: 0    Fri: 0    Sat: 0    Sun: 20    Mon: 38
# First error appeared Sunday 15:43 UTC — immediately after 2026.3.22 upgrade

Log pattern (cascading announce failures):

12:35:38 [ws] res ✗ agent errorCode=UNAVAILABLE errorMessage=Error: Unknown channel: slack
12:35:38 Subagent completion direct announce failed for run <id>: Error: Unknown channel: slack
12:35:40 Subagent completion direct announce failed for run <id>: Error: Unknown channel: slack  # retry 2
12:35:44 Subagent completion direct announce failed for run <id>: Error: Unknown channel: slack  # retry 3
# → permanently lost, parent session never notified
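The timestamps in this log (12:35:38, 12:35:40, 12:35:44) are consistent with three attempts separated by 2 s and 4 s. A rough sketch of such a retry loop follows. The 2000 ms base, the doubling schedule, and the names `retryDelaysMs`, `announceWithRetry`, and `sleep` are assumptions inferred from the log, not taken from the OpenClaw source.

```typescript
// Delays between attempts, assuming a doubling schedule with a 2000 ms base
// (inferred from the log timestamps, not confirmed by the source).
function retryDelaysMs(maxAttempts: number, baseMs = 2000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < maxAttempts - 1; i++) delays.push(baseMs * 2 ** i);
  return delays;
}

// Hypothetical wrapper: try `announce` up to maxAttempts times, then give up.
async function announceWithRetry(
  announce: () => Promise<void>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  const delays = retryDelaysMs(maxAttempts);
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await announce();
      return true;
    } catch {
      // Wait before the next attempt; after the last failure there is none.
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  return false; // every attempt failed, so the announce is dropped
}
```

Because the failure here comes from registry state rather than a transient network fault, every attempt in such a loop hits the same missing plugin, which is why retries alone do not recover the announce.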

Affected session types:

  • slack:channel:* — channel thread replies
  • slack:direct:* — DM replies
  • subagent:* — subagent completion announces
  • openai:* — MCP/API sessions attempting Slack delivery

Tests

7 new tests covering pin/release/swap lifecycle:

✓ returns the active registry when not pinned
✓ preserves pinned channel registry across setActivePluginRegistry calls
✓ updates channel registry on swap when not pinned
✓ release restores live-tracking behavior
✓ release is a no-op when the pinned registry does not match
✓ requireActivePluginChannelRegistry creates a registry when none exists
✓ resetPluginRuntimeStateForTest clears channel pin

Full test run: 7 new + 613 existing channel/outbound/plugin tests pass.

Co-Authored: Mário Sousa (@mariosousa-finn)

@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime size: S labels Mar 24, 2026
@greptile-apps

greptile-apps Bot commented Mar 24, 2026

Contributor

Greptile Summary

This PR fixes a production regression (38 Channel is unavailable errors/day on Edgar/FINN) by pinning the channel plugin registry at gateway startup, preventing subsequent non-primary loadOpenClawPlugins() calls from evicting the channels that were active at boot. The approach directly mirrors the existing pinActivePluginHttpRouteRegistry pattern and is the minimal, correct fix for the root cause.

Key changes:

  • src/plugins/runtime.ts: Adds channelRegistry / channelRegistryPinned state fields and four new functions (pinActivePluginChannelRegistry, releasePinnedPluginChannelRegistry, getActivePluginChannelRegistry, requireActivePluginChannelRegistry), mirroring the existing HTTP-route pin API exactly.
  • src/channels/plugins/registry.ts: Single-line switch from requireActivePluginRegistry to requireActivePluginChannelRegistry in resolveCachedChannelPlugins, making all getChannelPlugin / listChannelPlugins calls resolve against the pinned startup registry.
  • src/gateway/server-runtime-state.ts: Pin is set before the try block and released in both the success and error paths — no pin leak on startup failure.
  • src/plugins/runtime.channel-pin.test.ts: Seven well-scoped tests covering the full pin/release/swap lifecycle.
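The pin-before-try arrangement described for server-runtime-state.ts might look roughly like the sketch below. It is not the actual file: `createRuntimeState` and `buildServers` are hypothetical names, the pin/release functions are stand-ins that only track pins in a Set, and the HTTP-route pin that the real release function also handles is left out.

```typescript
type PluginRegistry = { name: string };

// Stand-ins for the real pin/release functions; here they only track pins.
const pinned = new Set<PluginRegistry>();
function pinActivePluginChannelRegistry(registry: PluginRegistry): void {
  pinned.add(registry);
}
function releasePinnedPluginChannelRegistry(registry: PluginRegistry): void {
  pinned.delete(registry);
}

// Hypothetical startup function; `buildServers` represents work that can throw.
function createRuntimeState(pluginRegistry: PluginRegistry, buildServers: () => void) {
  pinActivePluginChannelRegistry(pluginRegistry); // pin before the try block
  try {
    buildServers();
    return {
      // Success path: the caller invokes this on shutdown.
      releasePluginRouteRegistry: () => {
        releasePinnedPluginChannelRegistry(pluginRegistry);
      },
    };
  } catch (err) {
    // Error path: release immediately so a failed startup leaves no pin behind.
    releasePinnedPluginChannelRegistry(pluginRegistry);
    throw err;
  }
}
```

On success the pin outlives the function and is released by the returned callback; on failure it is released before the error propagates.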

One minor point: The returned releasePluginRouteRegistry function now also releases the channel registry pin, but its name still only references routes. Renaming or adding a brief comment would improve future readability.

Confidence Score: 5/5

  • Safe to merge — targeted, well-tested bugfix that follows an established pattern with no behaviour changes outside channel resolution.
  • The implementation is correct and complete: the pin/release lifecycle is symmetric, all code paths in server-runtime-state.ts release the pin (success and error), the cache-version check in resolveCachedChannelPlugins still works correctly with the pinned registry, and 7 new tests plus 613 existing tests validate the change. The only finding is a P2 naming issue on releasePluginRouteRegistry which is non-blocking.
  • No files require special attention.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/server-runtime-state.ts
Line: 233-237

Comment:
**`releasePluginRouteRegistry` now releases channel pin too**

The returned state property `releasePluginRouteRegistry` silently expanded its scope: it now releases both the HTTP route registry pin and the channel registry pin. The name only reflects route-registry concerns, so a future reader may not realise that calling (or skipping) this function also controls channel resolution stability.

Consider renaming the property to `releasePluginRegistryPins` (or similar) to reflect its broadened responsibility, or add a short inline comment noting the dual release:

```suggestion
      releasePluginRouteRegistry: () => {
        // Releases both pinned HTTP-route and channel registries set at startup.
        releasePinnedPluginHttpRouteRegistry(params.pluginRegistry);
        releasePinnedPluginChannelRegistry(params.pluginRegistry);
      },
```


Reviews (1): Last reviewed commit: "fix(gateway): pin channel registry at st..."

Comment thread src/gateway/server-runtime-state.ts
@affsantos affsantos force-pushed the fix/pin-channel-registry branch from 94c1055 to dd5c9d8 Compare March 24, 2026 20:28

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 313587dcbd

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/channels/plugins/registry.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 9643f6be38

Comment thread src/gateway/server-runtime-state.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 40566892b1

Comment thread src/gateway/server-runtime-state.ts Outdated
@affsantos affsantos force-pushed the fix/pin-channel-registry branch from 4056689 to 17bb1eb Compare March 24, 2026 22:40

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 17bb1eb402

Comment thread src/gateway/server-runtime-state.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 2064d98d99

Comment thread src/plugins/runtime.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: c2a2d23780

Comment thread src/plugins/runtime.ts
@obviyus obviyus self-assigned this Mar 25, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 60b8e1b05e

Comment thread src/gateway/server-runtime-state.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: a1f1bb5ee0

Comment thread src/gateway/server-runtime-state.ts Outdated
affsantos and others added 6 commits March 25, 2026 10:00

Channel plugin resolution fails with 'Channel is unavailable: <channel>'
after the active plugin registry is replaced at runtime. The root cause is
that getChannelPlugin() resolves against the live registry snapshot, which
is replaced when non-primary registry loads (e.g., config-schema reads)
call loadOpenClawPlugins(). If the replacement registry does not carry the
same channel entries, outbound message delivery and subagent announce
silently break.

This mirrors the existing pinActivePluginHttpRouteRegistry pattern: the
channel registry is pinned at gateway startup and released on shutdown.
Subsequent setActivePluginRegistry calls no longer evict the channel
snapshot, so getChannelPlugin() always resolves against the registry that
was active when the gateway booted.

Address Greptile review: releasePluginRouteRegistry now releases both
HTTP-route and channel registry pins. Added comment for clarity.

When preferSetupRuntimeForChannelPlugins is active, gateway boot performs
two plugin loads: a setup-runtime pass and a full reload after listen.
The initial pin captured the setup-entry snapshot. The deferred reload now
re-pins so getChannelPlugin() resolves against the full implementations.
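
The two-pass boot described in the last commit message can be sketched like this. The `mode` field, the `loadRegistry` helper, and the single-variable pin are assumptions for illustration; only pinActivePluginChannelRegistry and getChannelPlugin are names from this PR, and the sketch assumes that pinning again replaces the previous pin.

```typescript
type ChannelPlugin = { id: string; mode: "setup" | "full" };
type PluginRegistry = { channels: Map<string, ChannelPlugin> };

let pinnedChannelRegistry: PluginRegistry | null = null;

// Assumed semantics: pinning again replaces the previous pin.
function pinActivePluginChannelRegistry(registry: PluginRegistry): void {
  pinnedChannelRegistry = registry;
}

function getChannelPlugin(id: string): ChannelPlugin | undefined {
  return pinnedChannelRegistry?.channels.get(id);
}

// Hypothetical loader standing in for the two plugin-load passes.
function loadRegistry(mode: "setup" | "full"): PluginRegistry {
  return { channels: new Map<string, ChannelPlugin>([["slack", { id: "slack", mode }]]) };
}

// Pass 1: setup-runtime entries, pinned at startup.
pinActivePluginChannelRegistry(loadRegistry("setup"));
const duringSetup = getChannelPlugin("slack")?.mode;

// Pass 2: full reload after listen. Without this re-pin, lookups would keep
// resolving against the setup-entry snapshot.
pinActivePluginChannelRegistry(loadRegistry("full"));
const afterReload = getChannelPlugin("slack")?.mode;
```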
@obviyus obviyus force-pushed the fix/pin-channel-registry branch from a1f1bb5 to 3daf74c Compare March 25, 2026 04:30
@steipete steipete merged commit 2aaea9f into openclaw:main Mar 25, 2026
8 checks passed
@steipete

Contributor

Landed via temp rebase onto main.

  • Gate: pnpm check && pnpm build && pnpm test && pnpm check:docs
  • Land commit: 91cd017
  • Merge commit: 2aaea9f

Thanks @affsantos!

netandreus pushed a commit to netandreus/openclaw that referenced this pull request Mar 25, 2026
npmisantosh pushed a commit to npmisantosh/openclaw that referenced this pull request Mar 25, 2026
fuller-stack-dev pushed a commit to fuller-stack-dev/openclaw that referenced this pull request Mar 25, 2026
jacobtomlinson pushed a commit to jacobtomlinson/openclaw that referenced this pull request Mar 25, 2026
godlin-gh pushed a commit to YouMindInc/openclaw that referenced this pull request Mar 27, 2026
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
daniel-rudaev pushed a commit to D1DX/openclaw-1 that referenced this pull request May 11, 2026
goweii pushed a commit to goweii/openclaw that referenced this pull request May 24, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Labels

gateway Gateway runtime size: XL

3 participants