Bug type
Behavior bug (incorrect output/state without crash)
Summary
The diagnostics-otel plugin's log export feature (diagnostics.otel.logs: true) doesn't work. The plugin loads successfully and reports "logs exporter enabled (OTLP/Protobuf)" but no log records are ever sent to the OTLP endpoint. Metrics and traces work fine.
Root Cause
registerLogTransport in the plugin-sdk (plugin-sdk/subsystem-QV9R1a2-.js) maintains its own externalTransports Set and loggingState singleton. The gateway's actual logger lives in a separate bundle (daemon-cli.js) with its own independent copies of both.
When the plugin calls registerLogTransport(callback):
- The callback is added to plugin-sdk's externalTransports ✅
- It tries to attach to loggingState.cachedLogger — but that's the plugin-sdk's logger instance, which is null (the plugin-sdk never calls getLogger()) ❌
- The gateway's logger in daemon-cli.js has its own externalTransports Set that the plugin never touches ❌
Result: the transport is registered in the wrong module instance and never receives log events.
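A minimal sketch of the failure mode in TypeScript (the identifiers externalTransports, loggingState, and registerLogTransport come from the bundles; the bodies are reconstructed for illustration, not copied from the source):
// This module-level state is duplicated: one copy in daemon-cli.js,
// another in plugin-sdk/subsystem-QV9R1a2-.js.
type LogTransport = (record: { level: string; msg: string }) => void;

const externalTransports = new Set<LogTransport>();
const loggingState: { cachedLogger: { transports: Set<LogTransport> } | null } = {
  cachedLogger: null, // stays null in the plugin-sdk copy: it never calls getLogger()
};

export function registerLogTransport(cb: LogTransport): void {
  externalTransports.add(cb);                    // lands in THIS copy's Set only
  loggingState.cachedLogger?.transports.add(cb); // no-op here: cachedLogger is null
}
// Meanwhile the gateway's logger, in the daemon-cli.js copy, fans out to its
// OWN externalTransports Set, which the plugin's callback never enters.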
Evidence
Gateway logger — daemon-cli.js
grep -c "cachedLogger" dist/daemon-cli.js
→ 6 (has its own loggingState)
Plugin SDK — subsystem-QV9R1a2-.js
grep -c "cachedLogger" dist/plugin-sdk/subsystem-QV9R1a2-.js
→ 6 (has its own separate loggingState)
Gateway main bundle does NOT reference plugin-sdk logging
grep -c "plugin-sdk" dist/subsystem-kl-vrkYi.js
→ 0
Meanwhile, onDiagnosticEvent (used for metrics/traces) works because it's event-based — the gateway emits diagnostic events that the plugin subscribes to. The log transport uses a different mechanism (attachTransport on the logger instance) which requires a shared singleton.
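A sketch of why one path crosses the bundle boundary and the other doesn't (API shapes assumed from this report, not verified against the SDK source):
// Assumed shapes, for illustration only.
type DiagnosticEvent = { name: string; attributes: Record<string, unknown> };
type LogRecord = { level: string; msg: string };

declare const ctx: {
  onDiagnosticEvent(cb: (e: DiagnosticEvent) => void): void; // gateway-owned emitter
};
declare function registerLogTransport(cb: (r: LogRecord) => void): void;

// Works: the gateway holds the subscriber list and invokes the callback itself.
ctx.onDiagnosticEvent((e) => { /* map to OTLP metric/span and export */ });

// Silently fails: registration lands in the plugin-sdk's private Set,
// which the gateway logger in daemon-cli.js never iterates.
registerLogTransport((r) => { /* map to OTLP log record; never invoked */ });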
Steps to reproduce
- Enable diagnostics-otel with logs: true (a config sketch follows this list)
- Confirm OTLP endpoint accepts logs (send a test JSON log record — it works)
- Observe: metrics and traces flow to collector, but zero log records arrive
- Check Loki/collector: no openclaw-gateway log streams
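For step 1, a hypothetical config shape (the key path diagnostics.otel.logs is from this report; the surrounding structure, file format, and sibling keys are assumptions and may differ in your install):
{
  "diagnostics": {
    "otel": {
      "endpoint": "http://localhost:4318",
      "metrics": true,
      "traces": true,
      "logs": true
    }
  }
}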
Environment
• OpenClaw: 2026.2.19-2 (gateway)
• Plugin: @openclaw/diagnostics-otel 2026.3.2
• Node: v24.13.1
• Stack: grafana/otel-lgtm (OTel Collector + Loki + Tempo + Mimir)
Expected behavior
Log records are exported to the OTLP endpoint when diagnostics.otel.logs: true is set.
Actual behavior
No OTLP log export. Metrics and traces flow, but zero log records reach the collector.
OpenClaw version
2026.2.19-2 (gateway)
Operating system
Ubuntu 24.04.1
Install method
npm global
Logs, screenshots, and evidence
# 1. Plugin loads and reports logs enabled
[2026-03-07T10:21:16.320-08:00] INFO [plugins] diagnostics-otel: logs exporter enabled (OTLP/Protobuf)
# 2. Metrics confirmed flowing — openclaw_* metrics present in Mimir
$ curl -s 'http://localhost:3000/api/datasources/proxy/uid/prometheus/api/v1/query?query=openclaw_session_state_total' -u admin:***
{"status":"success","data":{"resultType":"vector","result":[
{"metric":{"openclaw_reason":"message_start","openclaw_state":"processing"},"value":[...,"12"]},
{"metric":{"openclaw_reason":"run_started","openclaw_state":"processing"},"value":[...,"12"]}
]}}
# 3. Traces confirmed flowing — spans present in Tempo
$ curl -s 'http://localhost:3000/api/datasources/proxy/uid/tempo/api/search?limit=5' -u admin:***
{"traces":[
{"rootServiceName":"openclaw-gateway","rootTraceName":"openclaw.message.processed","durationMs":84},
{"rootServiceName":"openclaw-gateway","rootTraceName":"openclaw.model.usage","durationMs":22306},
{"rootServiceName":"openclaw-gateway","rootTraceName":"openclaw.message.processed","durationMs":23238}
]}
# 4. Loki: zero log streams from openclaw-gateway (empty after 40+ minutes of activity)
$ curl -s 'http://localhost:3000/api/datasources/proxy/uid/loki/loki/api/v1/query_range' -u admin:*** \
--data-urlencode 'query={service_name="openclaw-gateway"}' ...
{"status":"success","data":{"result":[]}} # 0 streams
# 5. Proof the pipeline works — manual test log lands in Loki immediately
$ curl -X POST http://localhost:4318/v1/logs -H "Content-Type: application/json" \
-d '{"resourceLogs":[{"resource":{"attributes":[{"key":"service.name","value":{"stringValue":"test-service"}}]},"scopeLogs":[{"logRecords":[{"timeUnixNano":"...","severityText":"INFO","body":{"stringValue":"test log from manual"}}]}]}]}'
# → HTTP 200, appears in Loki within seconds:
{"status":"success","data":{"result":[{"stream":{"service_name":"test-service"},"values":[["...","test log from manual"]]}]}}
# 6. OTel Collector config confirms logs pipeline is wired
# /otel-lgtm/otelcol-config.yaml:
# service.pipelines.logs:
# receivers: [otlp]
# processors: [batch]
# exporters: [otlphttp/logs] → http://127.0.0.1:3100/otlp (Loki)
# 7. Root cause — two isolated module instances
$ grep -c "cachedLogger" dist/daemon-cli.js
6 # gateway has its own loggingState
$ grep -c "cachedLogger" dist/plugin-sdk/subsystem-QV9R1a2-.js
6 # plugin-sdk has a SEPARATE loggingState
$ grep -c "registerLogTransport\|externalTransports" dist/daemon-cli.js
5 # gateway has its own externalTransports Set
$ grep -c "registerLogTransport\|externalTransports" dist/plugin-sdk/subsystem-QV9R1a2-.js
5 # plugin-sdk has a SEPARATE externalTransports Set
# The plugin calls registerLogTransport from plugin-sdk → registers into plugin-sdk's Set
# The gateway logger in daemon-cli.js iterates its OWN Set → never sees the plugin's transport
Impact and severity
• Affected users: All users enabling diagnostics.otel.logs: true — the feature silently does nothing. Metrics and traces users are unaffected.
• Severity: Moderate — doesn't block workflows or cause data loss, but the feature is advertised as working and silently fails with no error. Users will spend time debugging their collector/Loki config before realizing it's an upstream issue.
• Frequency: Always. 100% reproducible on any install. The module duplication is baked into the bundle output.
• Consequence:
• OTLP log export is non-functional despite config and plugin reporting success
• Users must fall back to tailing JSONL file logs (/tmp/openclaw/*.log), losing centralized log aggregation
• No error or warning is emitted — the plugin says "logs exporter enabled" even though no logs will ever be sent
• Time wasted diagnosing (we spent ~20 minutes tracing this from Loki → collector config → OTLP endpoint → plugin code → bundler output before finding the root cause)
Additional information
Suggested Fix
One of:
- Share the logger singleton — export the gateway's loggingState and import it in the plugin-sdk (or use a shared module)
- Bridge via the plugin context — pass the gateway's logger instance to the plugin via ctx so the plugin can call ctx.logger.attachTransport() directly (sketch below)
- Use diagnostic events for logs — emit log records as diagnostic events (like model.usage) instead of relying on attachTransport
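A sketch of the second option, bridging via the plugin context (ctx.logger and attachTransport appear above; exportLogRecordOverOtlp is a hypothetical placeholder for the plugin's OTLP export path):
// Gateway side: resolve the logger in daemon-cli.js, where loggingState is
// actually initialized, and expose it on the plugin context.
const logger = getLogger();
const ctx = { logger /* , ...existing plugin context fields */ };

// Plugin side: attach to the instance the gateway actually logs through,
// bypassing the plugin-sdk's duplicated module state entirely.
ctx.logger.attachTransport((record) => exportLogRecordOverOtlp(record));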