fix(otel): complete diagnostics-otel OpenTelemetry v2 API migration #12897

Merged
vincentkoc merged 5 commits into openclaw:main from vincentkoc:vincentkoc-code/fix-otel-v2-compat-clean
Feb 19, 2026
@vincentkoc
Member

@vincentkoc vincentkoc commented Feb 9, 2026

Summary

This PR completes the OpenTelemetry v2 compatibility fix for the diagnostics-otel extension with a tightly scoped patch. It addresses the runtime API breaks without pulling in unrelated changes from other subsystems.

This is part of a broader set of telemetry improvements I'm personally working on in OpenClaw (started in an earlier umbrella PR, #9761) to bring Opik and first-class native AI observability into OpenClaw, including OTel and hook upgrades.

I have tested this on a live instance; the attached screenshot of an OTel trace shows it working.

Motivation

The current OTel plugin can fail during startup on current OTel package versions (#3201). lobster-biscuit

Changes

Related and Resolves

Greptile Overview

Greptile Summary

Completes OpenTelemetry v2 API migration for the diagnostics-otel extension, addressing runtime compatibility issues with current OTel packages. Key changes include:

  • Replaced deprecated SemanticResourceAttributes with ATTR_SERVICE_NAME and Resource with resourceFromAttributes for v2-compatible resource initialization
  • Fixed LoggerProvider instantiation to use constructor-based processors wiring instead of deprecated addLogRecordProcessor() method
  • Added recursion guard in diagnostic event dispatch to prevent infinite loops (depth limit of 100)
  • Moved diagnostic event state to globalThis to ensure consistent state across module instances
  • Added comprehensive error handling with try-catch blocks in event handlers and log transport
  • Updated unit test mocks to match current OTel v2 APIs

The changes are tightly scoped to the diagnostics-otel extension and diagnostic event infrastructure, following the official OpenTelemetry JS v2 migration guide.
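
The recursion guard, `globalThis` state, and per-listener error handling described in the summary can be sketched roughly as follows. This is a minimal illustration and not the extension's actual code: `DiagnosticEvent`, `STATE_KEY`, `getState`, `onDiagnosticEvent`, and `emitDiagnosticEvent` are hypothetical names, and only the depth limit of 100 comes from the summary.

```typescript
// Hypothetical sketch: shared diagnostic-event state with a recursion guard.
type DiagnosticEvent = { type: string; [key: string]: unknown };
type Listener = (event: DiagnosticEvent) => void;

interface DiagnosticState {
  listeners: Set<Listener>;
  depth: number; // current nesting depth of emitDiagnosticEvent calls
}

// Symbol.for() returns the same symbol in every module instance, so all
// copies of this module resolve to one shared state object on globalThis.
const STATE_KEY = Symbol.for("example.diagnosticEvents"); // assumed key name
const MAX_DEPTH = 100; // depth limit mentioned in the summary

function getState(): DiagnosticState {
  const store = globalThis as unknown as Record<symbol, DiagnosticState | undefined>;
  let state = store[STATE_KEY];
  if (!state) {
    state = { listeners: new Set<Listener>(), depth: 0 };
    store[STATE_KEY] = state;
  }
  return state;
}

// Registers a listener and returns a function that unregisters it.
function onDiagnosticEvent(listener: Listener): () => void {
  const state = getState();
  state.listeners.add(listener);
  return () => {
    state.listeners.delete(listener);
  };
}

// Dispatches an event; returns false when the event is dropped by the guard.
function emitDiagnosticEvent(event: DiagnosticEvent): boolean {
  const state = getState();
  if (state.depth >= MAX_DEPTH) {
    return false; // a listener is re-emitting too deeply; drop the event
  }
  state.depth += 1;
  try {
    // Iterate over a snapshot so listeners may unregister during dispatch.
    for (const listener of Array.from(state.listeners)) {
      try {
        listener(event);
      } catch {
        // One failing listener must not break dispatch for the others.
      }
    }
  } finally {
    state.depth -= 1;
  }
  return true;
}
```

In this sketch a listener that re-emits from inside its own handler is cut off once the nesting depth reaches the limit, and the depth counter returns to zero afterwards because it is decremented in `finally`.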

Confidence Score: 4/5

  • Safe to merge with attention to the unhandled SDK start promise issue noted in previous comments
  • The PR correctly implements the OTel v2 API migration following the official migration guide, includes proper error handling, adds recursion protection, and has corresponding test updates. The previously noted issue about the unhandled promise from sdk.start() (line 135) has been addressed with try-catch and await. The recursion guard implementation is solid, with a depth limit of 100. The global state pattern for diagnostic events correctly addresses multi-instance scenarios. All v2 API changes align with the OpenTelemetry documentation.
  • No files require special attention - the OTel migration and recursion guard are well-implemented
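
The "try-catch and await" fix for `sdk.start()` mentioned above might look something like the sketch below. It deliberately avoids importing any OTel package: `StartableSdk` and `startSdkSafely` are hypothetical names, and the interface allows `start()` to be either synchronous or promise-returning, since which one applies depends on the SDK version in use.

```typescript
// Hypothetical stand-in for whatever SDK object the service starts.
interface StartableSdk {
  start(): void | Promise<void>;
}

// Starts the SDK and reports failure instead of leaving a rejected promise
// unhandled or letting a synchronous throw escape. Returns true on success.
async function startSdkSafely(
  sdk: StartableSdk,
  logError: (message: string) => void,
): Promise<boolean> {
  try {
    // await is harmless when start() returns void and required when it
    // returns a promise, so one code path covers both cases.
    await sdk.start();
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    logError(`diagnostics-otel: failed to start SDK: ${reason}`);
    return false;
  }
}
```

A caller can then decide whether to continue without telemetry when the function resolves to `false`.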

@openclaw-barnacle openclaw-barnacle Bot added the extensions: diagnostics-otel label Feb 9, 2026
@vincentkoc vincentkoc marked this pull request as ready for review February 9, 2026 22:18
Contributor

@greptile-apps greptile-apps Bot left a comment

1 file reviewed, 1 comment

Comment thread extensions/diagnostics-otel/src/service.ts Outdated
@vincentkoc
Member Author

@greptileai the one issue you raised is outdated. This PR is ready to merge; please update the score.

@feng-95

feng-95 commented Feb 10, 2026

Hi @vincentkoc, have you tested this plugin end to end with a real backend (like Jaeger or a standard OTel collector) to ensure data is exported as expected?
I have tried to fix all these issues, but I still don't see any OTel data in my backend.

@vincentkoc
Member Author

Hi @vincentkoc, have you tested this plugin end to end with a real backend (like Jaeger or a standard OTel collector) to ensure data is exported as expected? I have tried to fix all these issues, but I still don't see any OTel data in my backend.

@feng-95 it was working in my logs, but the collector is not working (as you point out). I think I know the issue, so I will switch this PR to draft to avoid noise while I test from the branch on a live instance.

@vincentkoc vincentkoc marked this pull request as draft February 10, 2026 06:26
@vincentkoc
Member Author

vincentkoc commented Feb 10, 2026

Latest commit [c536b10] is now emitting to the OTel collector I'm running locally (`otelcol-contrib`); screenshot attached.

@openclaw-barnacle openclaw-barnacle Bot removed the extensions: llm-task and extensions: lobster labels Feb 13, 2026
@feng-95

feng-95 commented Feb 13, 2026

Hi @vincentkoc ,

I wanted to check in on the current status of this PR—do you consider it fully ready for review and merge at this stage?

Additionally, since the openclaw repository can sometimes be quite slow with reviewing and merging large PRs, have you considered submitting this to the openclaw-cn community: https://github.com/jiulingyun/openclaw-cn as well? The maintainers there tend to merge faster. Getting it live in openclaw-cn would be an excellent way to validate these new capabilities in the wild, which could provide the proof-of-stability needed to accelerate the review and merge process here in the main openclaw repo.

@vincentkoc
Member Author

@feng-95 yes, this has been 100% tested by hand and coded by hand with some AI assistance (hence the screenshots); all other PRs are broken. I'm just rebasing and fixing the formatting, which broke after some merge conflicts. I don't feel comfortable committing to a fork; you're welcome to sync my branch manually if you wish.

@vincentkoc
Member Author

Rebased branch and resolved merge conflicts, ready to merge again.

@nimarb

nimarb commented Feb 13, 2026

Hi @vincentkoc, amazing, thanks for the patch! I tried it too and it works! Unfortunately, the OTel spans are missing the input/output data.
Do you have a way in mind for adding those? Happy to collaborate!

@vincentkoc
Member Author

@nimarb that is a separate issue and will be handled in follow-up PRs. Feel free to raise an issue and mention this PR, and we can go from there. In the interest of the openclaw team, small mergeable PRs will help move things along faster.

@vitorvasc

@nimarb @vincentkoc - Hey, folks! I'm setting up an observability ecosystem for OpenClaw and have been tracking this issue for a while; I'm very interested in getting this working. Just wanted to let you know that I'm open to and interested in working on the follow-up tasks for the OpenTelemetry instrumentation.

@feng-95

feng-95 commented Feb 22, 2026

Hi @vincentkoc, congrats on getting this merged! Thanks for the hard work on this issue.

I’m currently trying to verify this on my end, but I’m running into an issue. After enabling the diagnostics-otel plugin and configuring the reporting endpoint, I’m not seeing any traces or metrics arriving at my server.

Could I be missing something in the setup, or is there a specific configuration format required?

Here is the configuration I'm using:

{
  "diagnostics": {
    "enabled": true,
    "otel": {
      "enabled": true,
      "endpoint": "my_otel_collector",
      "protocol": "http/protobuf",
      "serviceName": "openclaw",
      "traces": true,
      "metrics": true,
      "logs": true,
      "flushIntervalMs": 30000
    }
  },
  "plugins": {
    "entries": {
      "diagnostics-otel": {
        "enabled": true
      }
    }
  }
}

Is there any additional logging I can enable to debug why the export might be failing silently?

Thanks a lot!

@vincentkoc
Member Author

Is there any additional logging I can enable to debug why the export might be failing silently?

@feng-95 first, you can start the gateway in debug mode and check the OTel diagnostics; second, disable metrics and logs and try traces only; third, it could be a networking or endpoint issue.
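
As a rough illustration of the second and third suggestions, the sketch below derives a traces-only variant of the `otel` config block and checks whether the endpoint parses as an absolute HTTP(S) URL. The field names mirror the config posted above; `OtelConfig`, `tracesOnly`, and `isAbsoluteHttpUrl` are hypothetical helpers, and whether the plugin actually requires a full URL is an assumption, not something confirmed in this thread.

```typescript
// Shape of the "otel" block from the config posted above (assumed, not the plugin's own type).
interface OtelConfig {
  enabled: boolean;
  endpoint: string;
  protocol?: string;
  serviceName?: string;
  traces?: boolean;
  metrics?: boolean;
  logs?: boolean;
  flushIntervalMs?: number;
}

// Returns a copy with only traces enabled, to narrow down which signal fails.
function tracesOnly(config: OtelConfig): OtelConfig {
  return { ...config, traces: true, metrics: false, logs: false };
}

// True when the endpoint is an absolute http:// or https:// URL.
// A bare host name or "host:port" without a scheme returns false.
function isAbsoluteHttpUrl(endpoint: string): boolean {
  try {
    const url = new URL(endpoint);
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false; // new URL() throws on strings it cannot parse
  }
}
```

With the posted config, `isAbsoluteHttpUrl("my_otel_collector")` is false, which would be one thing to rule out if that value is used literally rather than as a placeholder. OTLP over HTTP conventionally uses port 4318 on the collector.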

@feng-95

feng-95 commented Feb 22, 2026

Is there any additional logging I can enable to debug why the export might be failing silently?

@feng-95 first, you can start the gateway in debug mode and check the OTel diagnostics; second, disable metrics and logs and try traces only; third, it could be a networking or endpoint issue.

It seems that the diagnostics-otel plugin currently uses http/json encoding to report data. However, my otel-collector only supports the Protocol Buffers format. I will go through with the http/json format.

I also wonder whether it would be possible to add support for Protocol Buffers, or to make the encoding format configurable, in a future update?
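
One way a configurable encoding could be modeled is sketched below. This is purely illustrative and not part of the plugin: `OtlpHttpProtocol`, `resolveProtocol`, and `contentTypeFor` are hypothetical names. The two protocol strings and their content types follow the OTLP/HTTP conventions, and the `http/protobuf` fallback is an assumption.

```typescript
// The two OTLP-over-HTTP encodings discussed in this thread.
type OtlpHttpProtocol = "http/protobuf" | "http/json";

// Picks the encoding from a config string, falling back when the value is
// missing or unrecognized. The default fallback here is an assumption.
function resolveProtocol(
  configured: string | undefined,
  fallback: OtlpHttpProtocol = "http/protobuf",
): OtlpHttpProtocol {
  if (configured === "http/protobuf" || configured === "http/json") {
    return configured;
  }
  return fallback;
}

// Maps the encoding to the Content-Type header an OTLP/HTTP request carries.
function contentTypeFor(protocol: OtlpHttpProtocol): string {
  switch (protocol) {
    case "http/protobuf":
      return "application/x-protobuf";
    case "http/json":
      return "application/json";
  }
}
```

An exporter factory could then branch on the resolved value to construct the matching exporter, so that the `protocol` field in the config above would select the wire format.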

Labels

extensions: diagnostics-otel, maintainer, size: M

4 participants