
[Feature]: Support OpenTelemetry GenAI Auto-Instrumentation (OpenLLMetry / IITM) #7312

Description

Summary

When building an OpenTelemetry observability plugin for OpenClaw, it is currently impossible to use standard GenAI auto-instrumentation libraries (OpenLLMetry / @traceloop/instrumentation-x) to produce spans such as anthropic.chat with full GenAI semantic-convention attributes. This is due to ESM module isolation and import-in-the-middle (IITM) conflicts with @mariozechner/pi-ai.

What We Tried to Achieve

We built an OpenClaw plugin (openclaw-observability-plugin) that exports traces, metrics, and logs via OTLP to an OpenTelemetry Collector. The plugin successfully produces:

  • Connected traces: openclaw.request → openclaw.agent.turn → tool.* (using api.on() hooks)
  • Token metrics: openclaw.llm.tokens.{prompt,completion,total} (extracted from the agent_end event; see the sketch after this list)
  • Tool spans, message counts, session events
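
A sketch of that token-metric extraction, using the hook and metric names above (the event and usage field shapes are assumptions):

// Sketch: token metrics from the agent_end hook.
// Metric names match the plugin's; the event shape is assumed for illustration.
const { metrics } = require("@opentelemetry/api");

const meter = metrics.getMeter("openclaw-observability-plugin");
const promptTokens = meter.createCounter("openclaw.llm.tokens.prompt");
const completionTokens = meter.createCounter("openclaw.llm.tokens.completion");
const totalTokens = meter.createCounter("openclaw.llm.tokens.total");

api.on("agent_end", (event) => {
  const usage = event?.usage ?? {};      // shape assumed for illustration
  const input = usage.input_tokens ?? 0;
  const output = usage.output_tokens ?? 0;
  promptTokens.add(input);
  completionTokens.add(output);
  totalTokens.add(input + output);
});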

However, we wanted automatic GenAI spans on the actual LLM SDK calls (anthropic.chat) — the standard approach in the OpenTelemetry ecosystem using OpenLLMetry (@traceloop/instrumentation-anthropic). These spans capture:

  • gen_ai.request.model, gen_ai.system
  • gen_ai.usage.input_tokens, gen_ai.usage.output_tokens
  • gen_ai.request.max_tokens, gen_ai.request.temperature
  • Request/response content (when enabled)
  • Per-LLM-call latency (separate from full agent turn duration)

This is the standard OTel GenAI semantic convention and what observability backends (Dynatrace, Grafana, etc.) expect for LLM observability dashboards.

Approach 1: Plugin-Side SDK Patching (Failed)

Attempt

Patch Anthropic.Messages.prototype.create from within the plugin code to wrap LLM calls with OTel spans.
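
The patch looked roughly like this (a sketch; attribute handling is simplified, and streaming calls would need extra handling):

// Sketch of the plugin-side patch (runs in jiti's CJS-like context).
const { createRequire } = require("node:module");
const { trace } = require("@opentelemetry/api");

const requireCjs = createRequire(__filename);
const Anthropic = requireCjs("@anthropic-ai/sdk"); // resolves the CJS entry point only

const tracer = trace.getTracer("openclaw-observability-plugin");
const originalCreate = Anthropic.Messages.prototype.create;

Anthropic.Messages.prototype.create = async function (params, options) {
  const span = tracer.startSpan("anthropic.chat", {
    attributes: {
      "gen_ai.system": "anthropic",
      "gen_ai.request.model": params?.model,
      "gen_ai.request.max_tokens": params?.max_tokens,
    },
  });
  try {
    const result = await originalCreate.call(this, params, options);
    span.setAttribute("gen_ai.usage.input_tokens", result?.usage?.input_tokens ?? 0);
    span.setAttribute("gen_ai.usage.output_tokens", result?.usage?.output_tokens ?? 0);
    return result;
  } catch (err) {
    span.recordException(err);
    throw err;
  } finally {
    span.end();
  }
};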

Why It Failed — ESM/CJS Module Isolation

OpenClaw's plugin loader uses jiti, which runs plugin code in a CJS-like context. The @anthropic-ai/sdk package has dual entry points:

  • ESM: @anthropic-ai/sdk/index.mjs (loaded by @mariozechner/pi-ai via import)
  • CJS: @anthropic-ai/sdk/index.js (loaded by plugin via createRequire())

These are completely separate module instances with different prototypes:

// From diagnostic logging:
ESM Anthropic === CJS Anthropic: false

Patching the CJS Messages.prototype.create has zero effect on the ESM instance that pi-ai actually uses. The plugin cannot access the ESM module instance.

Additional Constraint — jiti Blocks Dynamic Import

We tried using import() to access the ESM instance:

const sdk = await import("@anthropic-ai/sdk");

This fails in jiti's VM context:

ERR_VM_DYNAMIC_IMPORT_CALLBACK_MISSING

jiti converts import() to require() internally, making it impossible to access the ESM entry point from plugin code.

Approach 2: NODE_OPTIONS Preload with IITM (Failed)

Attempt

Use the standard OpenTelemetry ESM instrumentation pattern:

NODE_OPTIONS="--import ./instrumentation/preload.mjs"

The preload script:

  1. Imports @opentelemetry/instrumentation/hook.mjs (registers IITM ESM loader hooks)
  2. Creates a NodeSDK with AnthropicInstrumentation from @traceloop/instrumentation-anthropic
  3. Starts the SDK before the application loads

This is the officially recommended approach for instrumenting ESM applications with OpenTelemetry.
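
A sketch of preload.mjs along those lines (the exporter and endpoint configuration are illustrative):

// preload.mjs (sketch): OTLP endpoint and service name are illustrative.
import "@opentelemetry/instrumentation/hook.mjs"; // registers IITM ESM loader hooks
import { NodeSDK } from "@opentelemetry/sdk-node";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { AnthropicInstrumentation } from "@traceloop/instrumentation-anthropic";

const sdk = new NodeSDK({
  serviceName: "openclaw-gateway",
  traceExporter: new OTLPTraceExporter({ url: "http://localhost:4318/v1/traces" }),
  instrumentations: [new AnthropicInstrumentation()],
});
sdk.start(); // must run before any application module is imported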

Why It Failed — IITM Breaks pi-ai

import-in-the-middle (IITM) registers global ESM loader hooks that intercept ALL module imports. When it intercepts @mariozechner/pi-ai, it breaks the module's named exports:

SyntaxError: The requested module '@mariozechner/pi-ai' does not provide
an export named 'getEnvApiKey'
    at ModuleJob._instantiate (node:internal/modules/esm/module_job:226:21)

This crash-loops the gateway — the process exits immediately on startup, systemd restarts it, and it crashes again.

Root Cause

IITM wraps ESM modules by re-exporting them through a proxy. Some modules with complex export patterns (barrel files, re-exports from sub-modules) can break under this proxy. @mariozechner/pi-ai is one such module — its named exports become unavailable when IITM intercepts the module load.
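
For illustration only (this is not pi-ai's actual source), the kind of barrel pattern that has been known to break under IITM's proxying:

// index.js: illustrative barrel file, not pi-ai's actual source.
export * from "./env.js";             // getEnvApiKey would be re-exported here
export * from "./providers/index.js"; // nested barrel re-exporting sub-modules
// IITM replaces the module with a proxy that re-declares each named export;
// if it fails to enumerate exports behind wildcard re-exports, consumers hit
// "does not provide an export named ..." at module instantiation.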

This is not specific to our instrumentation — any IITM-based OTel instrumentation will trigger this crash because IITM intercepts all ESM modules globally, not just the targeted ones.

Environment

  • Node.js: v22.22.0
  • @opentelemetry/instrumentation: 0.203.0
  • import-in-the-middle: 1.15.0 (transitive via OTel)
  • @mariozechner/pi-ai: (bundled with OpenClaw)
  • @anthropic-ai/sdk: 0.71.2

Approach 3: Manual register() with IITM (Failed)

Attempt

Instead of --import hook.mjs, use register() from node:module to manually register IITM loader hooks, hoping for more selective interception:

import { register } from "node:module";
register("import-in-the-middle/hook.mjs", import.meta.url);

Result

Same crash. register() installs the same global loader hooks as hook.mjs. IITM does not support filtering which modules to intercept at the loader level.

Impact

Without GenAI auto-instrumentation, the plugin cannot produce per-LLM-call spans (anthropic.chat, openai.chat). We work around this by extracting token usage from the agent_end hook event, but this gives us:

| Capability | With OpenLLMetry | Current Workaround |
|---|---|---|
| Per-LLM-call spans | ✅ Individual anthropic.chat spans | ❌ Only aggregate openclaw.agent.turn |
| Token usage | ✅ Per-call gen_ai.usage.* attributes | ⚠️ Summed across all calls in a turn |
| Request/response content | ✅ gen_ai.content.prompt/completion | ❌ Not available |
| Model per call | ✅ Per-call gen_ai.request.model | ⚠️ Last model in turn only |
| Latency per LLM call | ✅ Individual call duration | ❌ Only full turn duration |
| Streaming vs non-streaming | ✅ Distinguished | ❌ Not visible |
| Multiple LLM calls per turn | ✅ Each call is a separate span | ❌ All merged into one span |
| Standard GenAI dashboards | ✅ Compatible | ❌ Custom dashboards required |

Suggested Solutions

Option A: Built-in OTel Hook Point in pi-ai

Add a hook/callback in @mariozechner/pi-ai's provider layer (e.g., streamAnthropic()) that fires before/after the actual SDK call. This would allow plugins to create spans around individual LLM calls without needing IITM:

// Pseudocode — in pi-ai's anthropic provider (e.g., streamAnthropic())
const callCtx = hookRunner?.onLLMCallStart?.({
  provider: "anthropic",
  model: model.id,
  params: sanitizedParams,
});

const stream = client.messages.stream({ ...params, stream: true });

// After completion:
hookRunner?.onLLMCallEnd?.(callCtx, { usage, stopReason, duration });

Option B: Fix IITM Compatibility with pi-ai

Investigate why IITM breaks @mariozechner/pi-ai's named exports. This might be:

  • A barrel file pattern that IITM doesn't handle correctly
  • A need for explicit IITM exclude patterns (currently not supported at loader level)
  • A Node.js 22 regression in IITM's ESM loader hooks

Option C: Expose LLM Call Events on the Plugin API

Similar to the existing agent_end event, emit events for individual LLM API calls:

api.on("llm_call_start", (event) => {
  // event: { provider, model, sessionKey, callId }
});

api.on("llm_call_end", (event) => {
  // event: { provider, model, usage, duration, stopReason, callId }
});

This would give plugins everything needed to create proper GenAI spans without any monkey-patching or loader hooks.
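
For example, a plugin could map these events onto spans roughly as follows (a sketch against the proposed event shapes above):

// Sketch: turning the proposed llm_call_* events into GenAI spans.
const { trace } = require("@opentelemetry/api");

const tracer = trace.getTracer("openclaw-observability-plugin");
const activeSpans = new Map(); // callId -> span

api.on("llm_call_start", ({ provider, model, callId }) => {
  const span = tracer.startSpan(`${provider}.chat`, {
    attributes: {
      "gen_ai.system": provider,
      "gen_ai.request.model": model,
    },
  });
  activeSpans.set(callId, span);
});

api.on("llm_call_end", ({ callId, usage, stopReason }) => {
  const span = activeSpans.get(callId);
  if (!span) return;
  span.setAttributes({
    "gen_ai.usage.input_tokens": usage?.input_tokens ?? 0,
    "gen_ai.usage.output_tokens": usage?.output_tokens ?? 0,
    "gen_ai.response.finish_reasons": [stopReason],
  });
  span.end();
  activeSpans.delete(callId);
});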

Option D: Native OTel Support in OpenClaw

Bundle OpenTelemetry instrumentation directly in OpenClaw, configured via openclaw.json (a hypothetical example follows the list below). Since OpenClaw controls the process startup, it could:

  1. Initialize OTel SDK before any imports
  2. Register instrumentations in a controlled way
  3. Avoid IITM conflicts by managing the loader hook lifecycle
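
A hypothetical openclaw.json stanza (all field names invented for illustration):

{
  "telemetry": {
    "enabled": true,
    "otlpEndpoint": "http://localhost:4318",
    "instrumentations": ["anthropic", "openai"]
  }
}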

Reproduction

# 1. Clone the plugin
git clone https://github.com/henrikrexed/openclaw-observability-plugin

# 2. Add preload to NODE_OPTIONS in systemd unit
# ~/.config/systemd/user/openclaw-gateway.service
Environment="NODE_OPTIONS=--import /path/to/openclaw-observability-plugin/instrumentation/preload.mjs"

# 3. Restart gateway
systemctl --user daemon-reload
systemctl --user restart openclaw-gateway

# 4. Observe crash loop
journalctl --user -u openclaw-gateway -f
# => SyntaxError: The requested module '@mariozechner/pi-ai' does not provide an export named 'getEnvApiKey'
