Skip to content

Add diagnostics OTEL capability contract tests#92045

Merged
vincentkoc merged 1 commit into
openclaw:mainfrom
efpiva:edpiva/diagnostics-otel-capability-repro-20260606025046
Jun 14, 2026
Merged

Add diagnostics OTEL capability contract tests#92045
vincentkoc merged 1 commit into
openclaw:mainfrom
efpiva:edpiva/diagnostics-otel-capability-repro-20260606025046

Conversation

@efpiva

@efpiva efpiva commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Add focused plugin contract coverage for the official diagnostics-otel package when it is selected from config and matches an installed official package record.
  • Verify that diagnostics-otel receives internalDiagnostics when the registry classifies the config-selected official install as trusted, while untrusted/spoofed services remain denied.

Verification

  • git diff --check
  • pnpm exec oxfmt --check --threads=1 src/plugins/services.test.ts src/plugins/manifest-registry.test.ts
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.plugins.config.ts src/plugins/services.test.ts -t "grants internal diagnostics only to trusted diagnostics exporter services"
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.plugins.config.ts src/plugins/manifest-registry.test.ts -t "marks official diagnostics-otel config paths trusted when the install record matches"

Notes:

  • Current origin/main already has part of the service-side diagnostics coverage, but not the config-path diagnostics-otel trusted official install case added here.
  • The older vitest.unit.config.ts focused command now excludes src/plugins/**; the plugin-scoped wrapper config above is the current repo-approved targeted command for these files.

Real behavior proof

Behavior addressed: Official diagnostics-otel installs selected from config should be classified as trusted when their install record matches, so the diagnostics exporter service receives the internal diagnostics hooks.

Real environment tested: Local OpenClaw source checkout on the PR branch with repository runtime modules loaded through node --import tsx; no Vitest harness or mocks in this proof command.

Exact steps or command run after this patch: Ran a source-checkout Node command that writes a temporary diagnostics-otel plugin manifest, calls loadPluginManifestRegistry with a matching official npm install record, then starts a diagnostics-otel plugin service through startPluginServices using the registry record's trustedOfficialInstall classification.

Evidence after fix: Terminal output from the local source-checkout proof command:

{
  "manifestId": "diagnostics-otel",
  "manifestOrigin": "config",
  "manifestTrustedOfficialInstall": true,
  "serviceInternalDiagnosticsOnEvent": "function",
  "serviceInternalDiagnosticsEmit": "function"
}

Observed result after fix: The config-selected diagnostics-otel manifest record was trusted (manifestTrustedOfficialInstall: true), and the started service context exposed both internal diagnostics hooks as functions.

What was not tested: Full packaged install in a user runtime and live end-to-end OTEL export were not run for this test-only PR; the branch adds focused contract tests plus the source-checkout proof above.

@openclaw-barnacle openclaw-barnacle Bot added size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. triage: test-only-no-bug Candidate: test-only change has no linked bug or behavior evidence. proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels Jun 10, 2026
@clawsweeper

clawsweeper Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed June 13, 2026, 11:25 PM ET / 03:25 UTC.

Summary
The PR adds tests for config-selected official diagnostics-otel installs being trusted and receiving internal diagnostics service hooks.

PR surface: Tests +49. Total +49 across 2 files.

Reproducibility: not applicable. This PR adds contract coverage and does not report a failing current-main user workflow.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • none.

Next step before merge

  • No ClawSweeper repair is needed because the patch is coherent, test-only, and has no concrete defect.

Security
Cleared: The patch changes only tests and adds no dependency, workflow, permission, secret, package-resolution, downloaded-code, or runtime execution surface.

Review details

Best possible solution:

Land the focused contract tests after normal maintainer review while keeping the existing runtime trust policy unchanged.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this PR adds contract coverage and does not report a failing current-main user workflow.

Is this the best way to solve the issue?

Yes. Separate registry and service-gate tests are a narrow maintainable way to protect the official-install trust classification and downstream capability grant.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against bd10e1998be0.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-patch terminal output from real source registry and service modules showing trusted config classification and internal diagnostics hooks.

Label justifications:

  • P3: This is low-risk test-only regression coverage for an existing plugin trust and capability contract with no runtime behavior change.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-patch terminal output from real source registry and service modules showing trusted config classification and internal diagnostics hooks.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-patch terminal output from real source registry and service modules showing trusted config classification and internal diagnostics hooks.
Evidence reviewed

PR surface:

Tests +49. Total +49 across 2 files.

View PR surface stats
Area Files Added Removed Net
Source 0 0 0 0
Tests 2 49 0 +49
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 2 49 0 +49

What I checked:

Likely related people:

  • steipete: Authored the merged trusted official diagnostics install fix that introduced the current runtime and registry contract this PR covers. (role: fix author; confidence: high; commits: 1cc00f8ecfff; files: src/plugins/manifest-registry.ts, src/plugins/services.ts, src/plugins/manifest-registry.test.ts)
  • vincentkoc: Merged the prior trusted diagnostics install PR and current-main blame on the reviewed paths points to the latest release repair commit. (role: merger and recent area contributor; confidence: medium; commits: 1cc00f8ecfff, c11fcbcb6acc; files: src/plugins/manifest-registry.ts, src/plugins/services.ts, src/plugins/manifest-registry.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P3 Low-priority cleanup, docs, polish, ergonomics, or speculative work. labels Jun 10, 2026
@vincentkoc vincentkoc force-pushed the edpiva/diagnostics-otel-capability-repro-20260606025046 branch from b1ffc2c to a4a2807 Compare June 14, 2026 03:21
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 14, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label Jun 14, 2026
@vincentkoc vincentkoc self-assigned this Jun 14, 2026
@vincentkoc

Copy link
Copy Markdown
Member

Maintainer verification for a4a2807c6cef65ae2d7b049d3d840ebaecd97037:

  • Reviewed the plugin trust boundary: diagnostics-otel is official in scripts/lib/official-external-plugin-catalog.json; loadPluginManifestRegistry marks only matching config/global installed records as trustedOfficialInstall; registerService carries that flag into PluginServiceRegistration; startPluginServices grants internalDiagnostics only to bundled or trusted-official diagnostics exporters.
  • Local focused proof: node scripts/run-vitest.mjs src/plugins/services.test.ts src/plugins/manifest-registry.test.ts --maxWorkers=1 (2 files, 78 tests passed after rebase).
  • Formatting/sanity: git diff --check origin/main...HEAD; ./node_modules/.bin/oxfmt --check src/plugins/services.test.ts src/plugins/manifest-registry.test.ts.
  • Autoreview: .agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main ... returned clean, no accepted/actionable findings.
  • Remote changed gate: Testbox-through-Crabbox tbx_01kv226tpj3n78d9ns25qrr42v, Actions run https://github.com/openclaw/openclaw/actions/runs/27486933346, corepack pnpm check:changed passed.
  • Fresh PR CI on a4a2807c6cef65ae2d7b049d3d840ebaecd97037 is green: 135 successful, 31 skipped, 0 pending/failing.

@vincentkoc vincentkoc merged commit 7f49f87 into openclaw:main Jun 14, 2026
173 of 174 checks passed
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

P3 Low-priority cleanup, docs, polish, ergonomics, or speculative work. proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. size: XS status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. triage: test-only-no-bug Candidate: test-only change has no linked bug or behavior evidence.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants