[Bug]: @openclaw/diagnostics-prometheus loads via npm but internalDiagnostics capability is bundled-only — empty metrics + ERROR spam (v2026.5.2) #76628

@RayWoo

Bug type

Behavior bug (incorrect output/state without crash)

Summary

@openclaw/diagnostics-prometheus is published on npm at 2026.5.2 and openclaw plugins install @openclaw/diagnostics-prometheus succeeds end-to-end (config registered, plugin loads, HTTP route bound). However, the /api/diagnostics/prometheus endpoint always returns HTTP 200 with an empty body because the plugin's start() hook fails the internalDiagnostics capability check at runtime. The plugin then logs diagnostics-prometheus: internal diagnostics capability unavailable at ERROR level (logLevelId 5) and gives up subscribing to events.

The capability is gated to plugins with service.origin === "bundled". npm-installed plugins land at ~/.openclaw/npm/node_modules/@openclaw/... and get origin: "global", so the gate denies them — even when they are otherwise correctly configured. There is no documented or supported route to lift this restriction for an npm-published diagnostics plugin.

The end result for any user who follows the Prometheus Quick Start on a real running gateway is that the plugin appears installed and enabled but emits zero metrics, with recurring ERROR-level log noise. The ClawHub install path (clawhub:@openclaw/diagnostics-prometheus) is also currently broken: it fails with ClawHub package "@openclaw/diagnostics-prometheus" has no installable version. This is consistent with the missing version tag in the output of openclaw plugins search prometheus (compare @openclaw/diagnostics-otel ... | v2026.3.22 with @openclaw/diagnostics-prometheus ... | (no version tag)).

Environment

  • OpenClaw: 2026.5.2 (8b2a6e5)
  • Plugin: @openclaw/diagnostics-prometheus@2026.5.2 (verified via npm view)
  • Node: v24.13.0
  • OS: Debian 11 (linux/amd64), single-user gateway, systemd-managed (user-level)
  • Gateway: 127.0.0.1:18789 loopback, token auth

Steps to reproduce

  1. Confirm the plugin is published on npm but not installable from ClawHub:

    npm view @openclaw/diagnostics-prometheus version       # → 2026.5.2
    openclaw plugins install clawhub:@openclaw/diagnostics-prometheus
    # → "ClawHub package ... has no installable version."
  2. Install via npm spec (the documented fallback per openclaw plugins install --help):

    openclaw plugins install @openclaw/diagnostics-prometheus
    # → "Installed plugin: diagnostics-prometheus"
  3. Enable, including the required diagnostics.enabled: true:

    echo '{ plugins: { entries: { "diagnostics-prometheus": { enabled: true } } }, diagnostics: { enabled: true } }' \
      | openclaw config patch --stdin
  4. Restart gateway. Verify the plugin is loaded:

    openclaw plugins inspect diagnostics-prometheus
    # → Status: loaded, Origin: global, Version: 2026.5.2
  5. Curl the metrics endpoint with the gateway operator token:

    curl -s -o body.txt -w "HTTP %{http_code}, %{size_download}B\n" \
      -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
      http://127.0.0.1:18789/api/diagnostics/prometheus
    # Actual:   HTTP 200, 0B
    # Expected: HTTP 200, >0B with text/plain prometheus exposition format
  6. Observe the gateway log:

    [plugins] diagnostics-prometheus: internal diagnostics capability unavailable     # ← ERROR (logLevelId 5), repeated
    

Expected behavior

Either:

  • (a) the npm-installed plugin receives the internalDiagnostics capability so it can subscribe to the gateway's diagnostic event bus and emit metrics, OR
  • (b) openclaw plugins install rejects the install path with a clear error message such as "diagnostics-prometheus must be installed as a bundled plugin; npm-spec install will not function", OR
  • (c) the docs at docs/gateway/prometheus.md#quick-start document that only ClawHub installs are supported and explicitly forbid the npm route until the plugin is bundled.
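
Option (b) could be sketched roughly as follows. This is a minimal illustration, not the gateway's installer: `checkInstall`, `InstallSource`, and `REQUIRES_BUNDLED_ORIGIN` are hypothetical names, and the two ids are simply the ones named in the gate quoted under Root cause analysis.

```typescript
// Hypothetical install-time guard (none of these names exist in OpenClaw).
type InstallSource = "bundled" | "npm";

// Assumption: these are the plugin ids whose required capability is
// only granted to bundled copies, per the gate quoted in this report.
const REQUIRES_BUNDLED_ORIGIN = ["diagnostics-otel", "diagnostics-prometheus"];

// Returns an error message when the install can never work, else null.
function checkInstall(pluginId: string, source: InstallSource): string | null {
  const needsBundled = REQUIRES_BUNDLED_ORIGIN.indexOf(pluginId) !== -1;
  if (needsBundled && source !== "bundled") {
    return (
      pluginId +
      " must be installed as a bundled plugin; " +
      source +
      "-spec install will not function"
    );
  }
  return null;
}

console.log(checkInstall("diagnostics-prometheus", "npm"));
console.log(checkInstall("diagnostics-prometheus", "bundled")); // null
```

Rejecting up front would turn a silent runtime failure into an immediate, actionable install error.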

Actual behavior

  • Plugin install: succeeds.
  • openclaw plugins inspect diagnostics-prometheus: status loaded, origin global, version 2026.5.2.
  • openclaw plugins list | grep -c "enabled ": increases by 1 (plugin counted as enabled).
  • HTTP route registered: http server listening (... diagnostics-prometheus ...) appears in the startup banner.
  • Capability check at service.start(): fails with no user-facing signal; the only trace is an ERROR log line.
  • /api/diagnostics/prometheus: HTTP 200 with empty body indefinitely.
  • ERROR log line internal diagnostics capability unavailable repeats on each plugin (re)start.
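
Because the endpoint still answers HTTP 200, a status-code check alone will not catch this failure. A monitoring script would need to inspect the body. The sketch below counts sample lines in a Prometheus text exposition body (non-empty lines that do not start with `#`). The function name and the metric name in the example are illustrative and not taken from the plugin.

```typescript
// Count metric samples in a Prometheus text-format body.
// Lines starting with "#" are HELP/TYPE metadata or comments.
function countPrometheusSamples(body: string): number {
  return body
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#").length;
}

// The empty body returned by the npm-installed plugin:
console.log(countPrometheusSamples("")); // 0

// A healthy body (hypothetical metric name, for illustration only):
const healthyBody = [
  "# HELP example_requests_total Illustrative counter.",
  "# TYPE example_requests_total counter",
  "example_requests_total 3",
  "",
].join("\n");
console.log(countPrometheusSamples(healthyBody)); // 1
```

A result of 0 on a running gateway would match the symptom described here.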

Root cause analysis

The capability injection is gated at dist/services-CTQW_M_S.js:15:

const grantsInternalDiagnostics =
    params.service?.origin === "bundled"
    && params.service.pluginId === params.service.service.id
    && (params.service.service.id === "diagnostics-otel"
       || params.service.service.id === "diagnostics-prometheus");

return {
    config: params.config,
    workspaceDir: params.workspaceDir,
    stateDir: STATE_DIR,
    logger: createPluginLogger(),
    ...grantsInternalDiagnostics ? { internalDiagnostics: {
        emit: emitTrustedDiagnosticEvent,
        onEvent: onInternalDiagnosticEvent
    } } : {}
};

The plugin's start(ctx) hook checks ctx.internalDiagnostics?.onEvent at ~/.openclaw/npm/node_modules/@openclaw/diagnostics-prometheus/src/service.ts:647-650:

start(ctx) {
    const subscribe = ctx.internalDiagnostics?.onEvent;
    if (!subscribe) {
        ctx.logger.error("diagnostics-prometheus: internal diagnostics capability unavailable");
        return;
    }
    // ... subscribe and start recording metrics
}

When the plugin is npm-installed, params.service.origin === "global" (verified via openclaw plugins inspect), so the first conjunct of grantsInternalDiagnostics evaluates to false. The capability object is omitted from the context. The plugin's start() returns early. The metrics endpoint stays empty.
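
The evaluation can be reproduced in isolation. The following is a simplified model of the quoted gate, not the gateway's code: the `ServiceEntry` type and the two sample objects are assumptions made for illustration, and only the boolean condition mirrors the excerpt above.

```typescript
// Simplified model of the gate; the type and sample values are illustrative.
interface ServiceEntry {
  origin: "bundled" | "global";
  pluginId: string;
  service: { id: string };
}

function grantsInternalDiagnostics(entry: ServiceEntry | undefined): boolean {
  return (
    entry?.origin === "bundled" &&
    entry.pluginId === entry.service.id &&
    (entry.service.id === "diagnostics-otel" ||
      entry.service.id === "diagnostics-prometheus")
  );
}

// npm-installed copy: origin is "global", as reported by `plugins inspect`.
const npmInstalled: ServiceEntry = {
  origin: "global",
  pluginId: "diagnostics-prometheus",
  service: { id: "diagnostics-prometheus" },
};

// Same plugin, but shipped inside the gateway distribution.
const bundled: ServiceEntry = { ...npmInstalled, origin: "bundled" };

console.log(grantsInternalDiagnostics(npmInstalled)); // false
console.log(grantsInternalDiagnostics(bundled)); // true
```

The two entries differ only in `origin`, which is why no amount of plugin-side configuration changes the outcome.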

Context-injected capabilities are the architecturally correct successor to the globalThis.__openclawDiagnosticEventsState listener pattern behind the older module-isolation issues #5190 and #39156, both of which were closed when the gateway made that move. However, the new injection's allowlist of grant recipients is keyed on origin === "bundled", which excludes every npm-published copy of the same plugin, including the one openclaw itself publishes.

plugins.allow is not the missing piece (verified empirically)

A reasonable hypothesis would be that plugins.allow: ["diagnostics-prometheus"] is what unlocks the capability. The Prometheus Quick Start at docs/gateway/prometheus.md:39 includes that key in the example config. We tested it: it does not affect service.origin and therefore does not lift the gate. It does, however, immediately flip the entire gateway into strict-allowlist mode and disable every other bundled plugin not listed, which causes its own regressions (it dropped a running 71-plugin gateway to 9 enabled plugins until rolled back). Issue #75575 covers a related but distinct facet of the same "plugins.allow does not mean what users expect" confusion.

Suggested directions (not prescriptive — maintainer discretion)

  1. Bundle the plugin at dist/extensions/diagnostics-prometheus/ so npm publishing becomes redundant or transitional, and the existing origin: "bundled" gate naturally passes. The ClawHub registry entry is already reserved at the right id; the missing version tag is a symptom that bundling has not yet shipped. (@openclaw/diagnostics-otel shows v2026.3.22 in the same registry — it appears to have been through this transition already, though we have not independently verified it works end-to-end via npm install on v2026.5.2.)
  2. Broaden the gate at services-CTQW_M_S.js:15 to additionally accept origin === "global" for plugins that match a separate trust mechanism (e.g. plugins explicitly listed in plugins.allow AND signed by @openclaw, or a new plugins.trust.<id>: { internalDiagnostics: true } opt-in).
  3. Fail closed at install time: if openclaw plugins install <npm-spec> is invoked for a plugin id that the gateway will never grant the required capability to, reject the install with a clear error rather than appearing to succeed.
  4. Document the limitation: until (1)/(2)/(3) ship, update docs/gateway/prometheus.md Quick Start to remove the npm install hint from Distribution, document that only the ClawHub path is currently supported, and warn that the ClawHub path may be empty until the plugin is bundled.
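
To make direction 2 concrete, here is one possible shape for a broadened gate. It is a sketch under stated assumptions: the `plugins.trust.<id>.internalDiagnostics` opt-in does not exist today, and `TrustConfig` and `grantsWithTrustOptIn` are hypothetical names. Signature verification is deliberately left out.

```typescript
interface ServiceEntry {
  origin: "bundled" | "global";
  pluginId: string;
  service: { id: string };
}

// Hypothetical config shape for plugins.trust.<id>.internalDiagnostics.
interface TrustConfig {
  [pluginId: string]: { internalDiagnostics?: boolean } | undefined;
}

function grantsWithTrustOptIn(entry: ServiceEntry, trust: TrustConfig): boolean {
  const id = entry.service.id;
  const isDiagnosticsPlugin =
    id === "diagnostics-otel" || id === "diagnostics-prometheus";
  if (!isDiagnosticsPlugin || entry.pluginId !== id) return false;
  // Existing behaviour: bundled copies are always granted.
  if (entry.origin === "bundled") return true;
  // Proposed behaviour: global copies need an explicit operator opt-in.
  return trust[id]?.internalDiagnostics === true;
}

const globalCopy: ServiceEntry = {
  origin: "global",
  pluginId: "diagnostics-prometheus",
  service: { id: "diagnostics-prometheus" },
};

console.log(grantsWithTrustOptIn(globalCopy, {})); // false
console.log(
  grantsWithTrustOptIn(globalCopy, {
    "diagnostics-prometheus": { internalDiagnostics: true },
  })
); // true
```

Keeping the id allowlist in place means the opt-in cannot be used to hand the capability to arbitrary third-party plugins.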

Cross-references

Note on diagnostics-otel

@openclaw/diagnostics-otel may fail in the same way when installed via npm, because the same grantsInternalDiagnostics gate at services-CTQW_M_S.js:15 covers both ids. The ClawHub v2026.3.22 tag on @openclaw/diagnostics-otel suggests it ships properly bundled, but this has not been independently verified on v2026.5.2 in this environment. If a maintainer wants to test, the same reproduction steps apply with diagnostics-otel substituted.
