mattermost: fix /oc_* slash command 503 from duplicate slash-state instance#68089
mattermost: fix /oc_* slash command 503 from duplicate slash-state instance#68089allenliang2022 wants to merge 4 commits into
Conversation
Greptile SummaryThis PR fixes a real, user-impacting bug where all Mattermost Confidence Score: 5/5Safe to merge — the change is a single correct bug fix with no logic, security, or data-integrity regressions. The only finding is P2: the static import widens the mattermost channel entry's eager import graph by pulling in No files require special attention beyond the P2 import-topology note on Prompt To Fix All With AIThis is a comment left during a code review.
Path: extensions/mattermost/index.ts
Line: 11
Comment:
**Eager import pulls heavy operational modules into the channel entry**
The static import of `./slash-route-api.js` is the right fix for module-identity correctness, but it widens the mattermost channel entry's cold-start import graph. `slash-route-api.ts` re-exports from `slash-state.ts`, which has top-level value imports of `createSlashCommandHttpHandler` from `slash-http.ts` — and `slash-http.ts` brings in `model-picker.js`, `reply-delivery.js`, `client.js`, `monitor-auth.js`, `openclaw/plugin-sdk/ssrf-runtime`, and `openclaw/plugin-sdk/browser-security-runtime`. Previously none of these loaded until the channel runtime was started; now they load even on the lightweight `cli-metadata` registration path.
Per the SDK guidance ("keep public SDK entrypoints cheap at module load; if a helper is only needed on async paths, prefer a narrow `*.runtime` subpath"), the preferred shape would be a thin `slash-route.runtime.ts` that static-imports only `./slash-state.js` (preserving single-instance identity) while keeping `slash-http.ts` and its deps behind the lazy boundary. The PR description already proposes this alternative, so flagging it for maintainer judgment rather than as a blocker.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "mattermost: fix /oc_* slash command 503 ..." | Re-trigger Greptile |
| // route would keep returning 503 "Slash commands are not yet initialized" even | ||
| // after activation succeeds. See SDK guidance: do not mix static and dynamic | ||
| // imports for the same runtime surface. | ||
| import { registerSlashCommandRoute } from "./slash-route-api.js"; |
There was a problem hiding this comment.
Eager import pulls heavy operational modules into the channel entry
The static import of ./slash-route-api.js is the right fix for module-identity correctness, but it widens the mattermost channel entry's cold-start import graph. slash-route-api.ts re-exports from slash-state.ts, which has top-level value imports of createSlashCommandHttpHandler from slash-http.ts — and slash-http.ts brings in model-picker.js, reply-delivery.js, client.js, monitor-auth.js, openclaw/plugin-sdk/ssrf-runtime, and openclaw/plugin-sdk/browser-security-runtime. Previously none of these loaded until the channel runtime was started; now they load even on the lightweight cli-metadata registration path.
Per the SDK guidance ("keep public SDK entrypoints cheap at module load; if a helper is only needed on async paths, prefer a narrow *.runtime subpath"), the preferred shape would be a thin slash-route.runtime.ts that static-imports only ./slash-state.js (preserving single-instance identity) while keeping slash-http.ts and its deps behind the lazy boundary. The PR description already proposes this alternative, so flagging it for maintainer judgment rather than as a blocker.
Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/mattermost/index.ts
Line: 11
Comment:
**Eager import pulls heavy operational modules into the channel entry**
The static import of `./slash-route-api.js` is the right fix for module-identity correctness, but it widens the mattermost channel entry's cold-start import graph. `slash-route-api.ts` re-exports from `slash-state.ts`, which has top-level value imports of `createSlashCommandHttpHandler` from `slash-http.ts` — and `slash-http.ts` brings in `model-picker.js`, `reply-delivery.js`, `client.js`, `monitor-auth.js`, `openclaw/plugin-sdk/ssrf-runtime`, and `openclaw/plugin-sdk/browser-security-runtime`. Previously none of these loaded until the channel runtime was started; now they load even on the lightweight `cli-metadata` registration path.
Per the SDK guidance ("keep public SDK entrypoints cheap at module load; if a helper is only needed on async paths, prefer a narrow `*.runtime` subpath"), the preferred shape would be a thin `slash-route.runtime.ts` that static-imports only `./slash-state.js` (preserving single-instance identity) while keeping `slash-http.ts` and its deps behind the lazy boundary. The PR description already proposes this alternative, so flagging it for maintainer judgment rather than as a blocker.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
Codex review: needs changes before merge. Latest ClawSweeper review: 2026-05-23 04:39 UTC / May 23, 2026, 12:39 AM ET. Workflow note: Future ClawSweeper reviews update this same comment in place. How this review workflow works
Summary Reproducibility: yes. by source inspection and contributor proof. Current main still loads the route registrar through the bundled-entry loader while activation imports the same state through native ESM, matching the live 503 reports and the PR's before/after terminal evidence. PR rating Rank-up moves:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. Real behavior proof Risk before merge
Maintainer options:
Next step before merge Security Review findings
Review detailsBest possible solution: Land this PR or an equivalent Mattermost plugin fix after refreshing it for current main, repairing the dual-load fixture, and rerunning focused slash-state plus changed-check proof. Do we have a high-confidence way to reproduce the issue? Yes, by source inspection and contributor proof. Current main still loads the route registrar through the bundled-entry loader while activation imports the same state through native ESM, matching the live 503 reports and the PR's before/after terminal evidence. Is this the best way to solve the issue? Mostly yes. The shared-state split keeps the route loader lazy while making both loader paths observe one account state map, but the PR must be refreshed for current main before it is merge-ready. Label justifications:
Full review comments:
Overall correctness: patch is incorrect Acceptance criteria:
What I checked:
Likely related people:
Codex review notes: model gpt-5.5, reasoning high; reviewed against d7a078f1962b. |
|
Any updates on this? |
d39a2ef to
d3a6c9a
Compare
d3a6c9a to
4bd7742
Compare
4bd7742 to
2d6d560
Compare
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
ClawSweeper PR egg 🔥 Warming up: real-behavior proof passed; findings, security review, or rank-up moves are still in progress. Hatch commandComment Hatchability rules:
What is this egg doing here?
|
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
This assigned pull request has been automatically marked as stale after being open for 27 days. |
|
Closing due to inactivity. |
Summary
/oc_model,/oc_status,/oc_help, …) always return503 Service Unavailablewith{"response_type":"ephemeral","text":"Slash commands are not yet initialized. Please try again in a moment."}, even after the log confirmsmattermost: slash commands activated for account default (8 commands).extensions/mattermost/index.tsnow importsregisterSlashCommandRoutestatically from./slash-route-api.jsinstead of resolving it through the sync jiti loaderloadBundledEntryExportSync. That makes the plugin-registration path and the monitor path share a single ESM instance ofslash-state.ts, so the HTTP route closure andactivateSlashCommandssee the sameaccountStatesmap.slash-state.ts,monitor-slash.ts,monitor.ts, the HTTP handler, slash command registration flow, or any other extension. Just the single import inextensions/mattermost/index.ts. Slack'sextensions/slack/index.tshas a similar-looking dual-path import, but its route handler does not close over module state (the handler usesawait import("./handler.runtime.js")inside), so Slack is not affected by the same bug and is explicitly out of scope here.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause
extensions/mattermost/index.ts::registerFullusedloadBundledEntryExportSync(import.meta.url, { specifier: "./slash-route-api.js", exportName: "registerSlashCommandRoute" })to load the route registrar. On macOS/Linux that helper resolves through jiti (seesrc/plugin-sdk/channel-entry-contract.ts:nodeRequireis only tried inside the Win32 branch, everywhere else falls through togetJiti(modulePath)(modulePath)).slash-route-api.tsre-exportsregisterSlashCommandRoutefrom./src/mattermost/slash-state.js. Loading it via jiti causes jiti to re-evaluateslash-state.tsinside its own module registry, producing a second live instance of the module. Meanwhilesrc/mattermost/monitor-slash.tsandsrc/mattermost/monitor.ts(and therefore the channel runtime loaded viaawait import("./channel.runtime-*.js")) import./slash-state.jsthrough native ESM, giving them instance A.The result: two separate
accountStatesmaps.activateSlashCommands/deactivateSlashCommandsmutate the ESM copy → the log line "slash commands activated for account default (N commands)" fires on the ESM instance.registerSlashCommandRoute'srouteHandlerclosure closes over the jiti copy'saccountStates→accountStates.sizestays0for the lifetime of the process → every callback returns503 "Slash commands are not yet initialized. Please try again in a moment.".This matches the guidance in
src/plugin-sdk/CLAUDE.md:The fix is the straightforward application of that rule:
index.tsshould import./slash-route-api.jsstatically soslash-state.tsloads through native ESM exactly once, shared with the monitor / channel-runtime path.Missing detection / guardrail
There is a good unit test for the token-routing logic in
extensions/mattermost/src/mattermost/slash-state.test.ts, but it only exercises one import ofslash-state, so it cannot detect the dual-instance bug. I am happy to add a regression test that specifically simulates loadingslash-route-api.jsthrough jiti alongside the native ESM import and asserts that activation flows through to the route handler — happy to do it in this PR if maintainers prefer. I did not include it in the initial patch because simulating the jiti boundary in Vitest is a bit more involved than the fix itself, and I wanted the minimal-risk diff to land first.Contributing context
The same
loadBundledEntryExportSyncpattern is used inextensions/feishu/index.ts,extensions/nostr/index.ts,extensions/qqbot/index.ts,extensions/slack/index.ts,extensions/zalouser/index.ts. Slack is safe because its handler doesawait import("./handler.runtime.js")inside the request closure instead of closing over module state. I did not audit the others in depth; if similar closures-over-module-state exist there, they would have the same failure mode and warrant a follow-up. Flagging for visibility, not proposing to change them in this PR.Regression Test Plan
extensions/mattermost/src/mattermost/slash-state.dual-load.test.ts(new) — would loadslash-route-api.tsthrough a jiti instance andslash-state.tsthrough native ESM in the same test, activate commands on the ESM side, then drive an HTTP request against the jiti-registered route and assert that it does not return 503.loadBundledEntryExportSyncobserves activations performed via the monitor's static import ofslash-state.js."slash-state.tsonce do not reproduce it. A boundary test that crosses the same jiti / ESM seam is the minimum.slash-state.test.tsis single-import only.User-visible / Behavior Changes
/oc_status,/oc_model,/oc_models,/oc_new,/oc_help,/oc_think,/oc_reasoning,/oc_verbose(and any user-defined slash commands registered through the same path) start working again instead of responding503 Service Unavailable / "Slash commands are not yet initialized.".Diagram
Security Impact
NoNoNoNo— slash-command dispatch semantics, token validation, and multi-account routing are unchanged. The fix only restores the previously-intended flow.NoRepro + Verification
Environment
npm i -g openclaw@2026.4.15, gateway started as the launchdai.openclaw.gatewayLaunchAgent.channels.mattermost.accounts.*entries (default / forge / kline / probe / quill / scout) but only thedefaultaccount has team-admin permission to register commands, soaccountStates.size === 1— the most common production shape.Steps
/oc_modelfrom the Mattermost UI. Server also reproducible via:Expected
size === 1branch should dispatch straight tostate.handler, which validates the token internally and returns the normal slash-command response (200 with the ephemeral reply, or a 401 only for a genuinely wrong token).Actual (before the fix)
HTTP 503with body{"response_type":"ephemeral","text":"Slash commands are not yet initialized. Please try again in a moment."}, even though the activation log fired seconds earlier. The Mattermost UI surfaces this as触发器 "oc_model" 返回了响应状态 503 Service Unavailable./ "Trigger "oc_model" returned response status 503 Service Unavailable."Evidence
Failing-before / passing-after against a locally patched installed dist (equivalent rewrite of this TS change onto the bundled output,
/opt/homebrew/lib/node_modules/openclaw/dist/extensions/mattermost/index.js):Before (immediately after
launchctl kickstart -k gui/$UID/ai.openclaw.gateway, well past activation):After applying the equivalent patch and restarting the gateway:
Interpretation: the
503 "not yet initialized"path is now unreachable — incoming requests flow into the real handler, which correctly rejects malformed payloads (400) and bad tokens (401). A real Mattermost-signed request with the registered token returns the normal slash-command response.Relevant gateway log lines from the fixed run (unchanged shape, included for completeness):
Human Verification
accountStates.size === 1) dispatch path, which was the production failure.ai.openclaw.gatewayLaunchAgent and real Mattermost (/oc_modelfrom the Mattermost UI).launchctl kickstart -kbetween before/after so the Node process and jiti registry are both replaced.callbackPath/callbackUrloverrides (route registration log is unchanged).accountStates.size > 1) token-routing path — code path is untouched, but I did not have a second team-admin account to activate commands on. The unit tests inslash-state.test.tscover that branch.pnpm check/pnpm build/pnpm test— I did not run these locally before opening the PR (this is my first contribution to the repo and I wanted to avoid a large-deps install cycle before you tell me which gates you want covered). Please run CI; happy to iterate on any failures or add the jiti-boundary regression test if that is what you would like to see.Review Conversations
Compatibility / Migration
YesNoNoRisks and Mitigations
slash-route-api.tsand transitively./src/mattermost/slash-state.tsnow load eagerly as part of the plugin entry, instead of only whenregisterFullfires through jiti.slash-state.tsis a small module (≈300 LOC, top-level state is just oneMap), but this does add it to the mattermost plugin's cold-start import cost.new Map()and function definitions), matches howmonitor.tsalready imports it on the hot monitor path, and aligns with the SDK guidance that says the eager/light contract belongs next to the plugin entry and only the heavy runtime should live behind a lazy*.runtime.tsboundary. I am happy to instead introduce a dedicatedslash-route.runtime.tsand keepregisterSlashCommandRoutelazy-but-single-instance if maintainers prefer that shape.loadBundledEntryExportSyncpattern and may have similar dual-instance bugs depending on whether their route registrars close over module state.