Skip to content

fix(line): use getRegistry callback in plugin HTTP handler to prevent stale-route 404#34646

Closed
joelnishanth wants to merge 2 commits into
openclaw:mainfrom
joelnishanth:fix/line-webhook-route-404-issue-34631
Closed

fix(line): use getRegistry callback in plugin HTTP handler to prevent stale-route 404#34646
joelnishanth wants to merge 2 commits into
openclaw:mainfrom
joelnishanth:fix/line-webhook-route-404-issue-34631

Conversation

@joelnishanth

Copy link
Copy Markdown
Contributor

Summary

  • Problem: After a plugin config change (e.g. enabling/disabling a plugin via the Control UI), loadOpenClawPlugins is called again with a different cache key, creating a new PluginRegistry and calling setActivePluginRegistry(newRegistry). The gateway HTTP handler (createGatewayPluginRequestHandler) still holds a reference to the original registry from startup. The LINE channel then re-registers /line/webhook on the new active registry, but all incoming requests are matched against the old (now empty) registry — returning 404 until a full gateway restart.
  • Why it matters: LINE webhooks silently fail — messages are dropped, no sessions created — whenever the plugin registry diverges from the HTTP handler's captured snapshot. Only a full gateway restart recovers the route.
  • What changed: createGatewayPluginRequestHandler now accepts a getRegistry callback (() => PluginRegistry) instead of a pinned registry object, and resolves the live registry on every request. shouldEnforceGatewayAuthForPluginPath in server-runtime-state.ts is updated to call requireActivePluginRegistry() inline as well.
  • What did NOT change: Route registration/unregistration logic in monitorLineProvider, registerPluginHttpRoute, channel lifecycle, auth enforcement behaviour.

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Gateway / orchestration
  • Integrations

Linked Issue/PR

User-visible / Behavior Changes

/line/webhook (and any other plugin-registered webhook path) remains reachable after plugin config edits without requiring a gateway restart.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: macOS (LaunchAgent gateway)
  • Runtime/container: Node 22+
  • Integration/channel: LINE

Steps

  1. Start gateway with LINE channel configured.
  2. Enable or disable any plugin via openclaw config set plugins.X.enabled true/false (triggers loadOpenClawPlugins with a new cache key).
  3. curl -i http://127.0.0.1:18789/line/webhook

Expected

  • HTTP 200 (webhook handler responds)

Actual (before fix)

  • HTTP 404 until openclaw gateway restart

Evidence

  • Failing test/log before + passing after (WIP — tests to be added)

Human Verification (required)

  • Verified scenarios: tracing the registry lifecycle through loadOpenClawPluginssetActivePluginRegistrycreateGatewayPluginRequestHandler in code review; confirmed that the captured registry reference goes stale on any cache-key change.
  • Edge cases checked: cache-hit path (same key) — no behaviour change; cache-miss path (config change) — handler now always reads the current active registry.
  • What you did not verify: live end-to-end on a real LINE bot (pending).

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Failure Recovery (if this breaks)

  • How to disable/revert: revert src/gateway/server/plugins-http.ts and src/gateway/server-runtime-state.ts to the captured-registry pattern.
  • Known bad symptoms: plugin HTTP routes not matched at all (would be caught immediately in smoke tests).

Risks and Mitigations

  • Risk: requireActivePluginRegistry() called on every HTTP request adds a negligible Map-lookup overhead.
    • Mitigation: The lookup is O(1) via a module-level singleton; no measurable latency impact.

Joel Nishanth · offlyn.AI

… stale-route 404

When a plugin config change triggers loadOpenClawPlugins with a different
cache key, setActivePluginRegistry replaces the active registry with a new
object. createGatewayPluginRequestHandler was capturing the original registry
at startup, so any routes re-registered by channels (e.g. /line/webhook) on
the new registry were invisible to the HTTP handler — returning 404 until a
full gateway restart.

Fix: accept a getRegistry callback instead of a pinned registry reference so
the handler always resolves routes from the current active registry. Also
update the shouldEnforcePluginGatewayAuth call in server-runtime-state.ts
to use requireActivePluginRegistry() inline for the same reason.

Adds a regression test (stale-registry swap) that fails without the fix and
passes with it.

Closes openclaw#34631
@greptile-apps

greptile-apps Bot commented Mar 4, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a stale-registry bug in the gateway's plugin HTTP handler by replacing the pinned PluginRegistry reference captured at startup with a getRegistry: () => PluginRegistry callback that resolves the current active registry on every request. A matching regression test is added to plugins-http.test.ts that verifies routes become reachable immediately after a registry swap without a restart.

  • plugins-http.ts: Clean, minimal interface change — registry parameter replaced by getRegistry callback; all route-matching and error-handling logic is unchanged.
  • server-runtime-state.ts: Both handlePluginRequest and shouldEnforcePluginGatewayAuth now call requireActivePluginRegistry() live. However, the pluginRegistry parameter in createGatewayRuntimeState's signature is now unused — it's still declared, still passed by the caller in server.impl.ts, but never read inside the function body. This dead parameter should be cleaned up to avoid misleading future maintainers.
  • plugins-http.test.ts: All existing tests migrated to the getRegistry API; new regression test directly validates the fix.

Confidence Score: 4/5

  • Safe to merge — the fix is correct and well-tested; one dead parameter should be cleaned up.
  • The core change is small, focused, and directly targeted at the described bug. The callback pattern is the idiomatic way to avoid stale closures, the new regression test exercises the exact failure mode, and there are no changes to route-matching or auth logic. Score is 4 rather than 5 solely because params.pluginRegistry is now a dead parameter still present in the public function signature, which is a maintenance concern but not a runtime risk.
  • src/gateway/server-runtime-state.ts — the pluginRegistry parameter in createGatewayRuntimeState is now unused in the function body and should be removed from the signature (and its call-site in server.impl.ts) in a follow-up.

Last reviewed commit: bc713dd

@greptile-apps

greptile-apps Bot commented Mar 4, 2026

Copy link
Copy Markdown
Contributor
Additional Comments (1)

src/gateway/server-runtime-state.ts, line 55
pluginRegistry parameter is now dead code

After this change, params.pluginRegistry is never referenced anywhere in the function body — both handlePluginRequest and shouldEnforcePluginGatewayAuth resolve the registry via requireActivePluginRegistry() directly. The parameter is still declared in the signature and still passed by the caller in server.impl.ts:553, but it has no effect.

Leaving it in place risks misleading future maintainers into thinking it still influences the handler's behavior. Consider removing it from the type signature (and the corresponding call-site) in a follow-up, or at minimum adding a // no longer used — registry is read via requireActivePluginRegistry() comment to make the intent clear.

@bmendonca3

Copy link
Copy Markdown
Contributor

Objective status snapshot:

  • Merge state is currently UNSTABLE.
  • The Bun unit-test lane is failing while several other checks are still running.

Actionable next steps:

  1. Please paste the first failing stack trace from checks (bun, test, pnpm canvas:a2ui:bundle && bunx vitest run --config vitest.unit.config.ts) so maintainers can separate infra noise from a real regression quickly.
  2. Add/point to a regression test that proves route resolution works after registry refresh (the stale-route 404 path this PR targets).
  3. Once the Bun lane is green, this should be straightforward to re-evaluate.

@joelnishanth

Copy link
Copy Markdown
Contributor Author

The checks (bun, test, ...) failure is a pre-existing repo-wide CI issue — oven-sh/setup-bun@v2 404s on bun@1.3.9+cf6cdbbba before any tests run. Tracked in #34637. The equivalent Node.js test run passes. This PR's code changes are not the cause.

Joel Nishanth · offlyn.AI

@thewilloftheshadow thewilloftheshadow added the r: too-many-prs Auto-close: author has more than twenty active PRs. label Mar 6, 2026
@openclaw-barnacle

Copy link
Copy Markdown

Closing this PR because the author has more than 10 active PRs in this repo. Please reduce the active PR queue and reopen or resubmit once it is back under the limit. You can close your own PRs to get back under the limit.

@openclaw-barnacle openclaw-barnacle Bot closed this Mar 6, 2026
@joelnishanth

Copy link
Copy Markdown
Contributor Author

Reopening. Only PR for issue #34631. CI passing, ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gateway Gateway runtime r: too-many-prs Auto-close: author has more than twenty active PRs. size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Title: LINE webhook route intermittently returns 404 until gateway restart (2026.3.2)

3 participants