
feat(gateway): add SDK environment discovery RPCs #74867

Merged
BunsDev merged 3 commits into openclaw:main from ai-hpc:fix/sdk-environments-discovery
May 5, 2026

Conversation

@ai-hpc
Contributor

ai-hpc commented Apr 30, 2026

Summary

  • Problem: @openclaw/sdk exposed oc.environments.*, but list and status still failed as unsupported and Gateway advertised no environment discovery RPCs.
  • Why it matters: app clients need a read-only way to discover Gateway-local and node environment candidates before create/delete/provisioning semantics are ready.
  • What changed: added typed environments.list and environments.status Gateway RPCs, SDK call-through/types, operator.read scope classification, protocol schemas/generated Swift models, docs, changelog, and focused tests.
  • What did NOT change (scope boundary): oc.environments.create/delete remain unsupported, and per-run runtime/environment options still fail before submission.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

N/A. This is the first additive read-only environment discovery contract for the SDK/Gateway surface.

Regression Test Plan (if applicable)

N/A.

User-visible / Behavior Changes

oc.environments.list() and oc.environments.status(environmentId) now return read-only environment summaries with id, type, status, and optional label / capabilities. oc.environments.create/delete remain unsupported.
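
A minimal sketch of the summary shape described above, for readers skimming the contract. The field names (id, type, status, label, capabilities) come from this description; the union members are assumptions inferred from the sample output further down, and the real schema may allow more values. `displayName` is a hypothetical helper, not part of the SDK.

```typescript
// Assumed value sets, inferred from sample output rather than the published schema.
type EnvironmentType = "local" | "node";
type EnvironmentStatus = "available" | "unavailable";

interface EnvironmentSummary {
  id: string;
  type: EnvironmentType;
  status: EnvironmentStatus;
  label?: string; // optional per the description
  capabilities?: string[]; // optional per the description
}

// Hypothetical helper: prefer the human-readable label, fall back to the id.
function displayName(env: EnvironmentSummary): string {
  return env.label ?? env.id;
}

const gatewayLocal: EnvironmentSummary = {
  id: "gateway",
  type: "local",
  status: "available",
  label: "Gateway local",
  capabilities: ["agent.run", "sessions", "tools", "workspace"],
};

const unlabeledNode: EnvironmentSummary = {
  id: "node:abc",
  type: "node",
  status: "unavailable",
};
```

Because label and capabilities are optional, a client should handle their absence rather than assume every row carries them.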

Diagram (if applicable)

Before:
[app client] -> [@openclaw/sdk oc.environments.list/status] -> [unsupported error]

After:
[app client] -> [@openclaw/sdk oc.environments.list/status] -> [Gateway environments.* RPC] -> [environment summaries]

Security Impact (required)

  • New permissions/capabilities? (Yes/No) Yes
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) Yes
  • If any Yes, explain risk + mitigation: the new RPCs require operator.read and expose only environment candidate metadata derived from Gateway-local state and existing node discovery surfaces. They do not create environments, run commands, expose secrets, or change auth behavior.
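
As a rough illustration of the scope gate described above, the following sketch maps the two new methods to operator.read and denies anything unclassified. The map, the helper name `isMethodAllowed`, and the deny-by-default rule are assumptions for illustration only; the actual classification lives in src/gateway/method-scopes.ts and may be structured differently.

```typescript
const READ_SCOPE = "operator.read";

// Assumption: one required scope per method name.
const METHOD_SCOPES = new Map<string, string>([
  ["environments.list", READ_SCOPE],
  ["environments.status", READ_SCOPE],
]);

// Hypothetical check: allow only classified methods whose scope was granted.
function isMethodAllowed(method: string, grantedScopes: readonly string[]): boolean {
  const required = METHOD_SCOPES.get(method);
  return required !== undefined && grantedScopes.includes(required);
}
```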

Repro + Verification

Environment

  • OS: Ubuntu/Linux
  • Runtime/container: Node 22+/pnpm dev checkout
  • Model/provider: N/A
  • Integration/channel (if any): Gateway WebSocket RPC / @openclaw/sdk
  • Relevant config (redacted): N/A

Steps

  1. On current main, call oc.environments.list() or oc.environments.status("gateway").
  2. Observe the SDK unsupported error.
  3. Apply this PR.
  4. Call the same SDK methods against Gateway.

Expected

  • SDK environment discovery methods route to Gateway and return typed read-only environment summaries.

Actual

  • Before this PR, SDK environment discovery methods throw unsupported errors.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Verification run locally after rebase onto openclaw/main:

  • pnpm protocol:check
  • pnpm test src/gateway/server-methods/environments.test.ts src/gateway/method-scopes.test.ts packages/sdk/src/index.test.ts src/gateway/protocol/index.test.ts -- --reporter=verbose
  • pnpm lint:core
  • pnpm tsgo:core && pnpm tsgo:core:test
  • pnpm lint:docs && pnpm format:docs:check && pnpm check:changelog-attributions
  • git diff --check openclaw/main...HEAD

Human Verification (required)

  • Verified scenarios: SDK list/status call-through, Gateway list/status handlers, unknown environment rejection, protocol validation, method discovery/scope classification, generated Swift models, docs/changelog checks, and that unsupported create/delete behavior remains explicit.
  • Edge cases checked: no-arg SDK list() sends {} for the empty params schema, invalid params reject, unknown status id rejects, paired/offline nodes are surfaced via the existing node catalog, and duplicate capabilities are normalized.
  • What you did not verify: live managed environment provisioning or execution, because this PR intentionally does not add provisioning or runtime selection semantics.
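
The capability normalization mentioned in the edge cases can be pictured as a dedupe followed by a sort. This is a sketch under that assumption; `normalizeCapabilities` is a hypothetical name, and the default code-unit string ordering is assumed because it matches the sample output in this PR.

```typescript
// Hypothetical helper: drop duplicates, then sort with the default string ordering.
function normalizeCapabilities(raw: readonly string[]): string[] {
  return Array.from(new Set(raw)).sort();
}

const normalized = normalizeCapabilities([
  "tools",
  "agent.run",
  "tools",
  "workspace",
  "sessions",
]);
```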

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: Environment capability vocabulary may need future refinement as managed runtime semantics mature.
    • Mitigation: This PR keeps the API read-only, reports only Gateway-local and node-discovery-backed candidates, and leaves create/delete/provisioning/runtime-selection unsupported.

Real behavior proof

Behavior or issue addressed: Refs #74708 — App SDK clients had no read-only RPC for discovering Gateway-local + paired-node environments. This patch adds environments.list and environments.status to the Gateway protocol (with TypeBox schemas, scope mapping, server method registration, SDK client/types exposure, Swift bindings, and docs), so clients can enumerate available environments and inspect a single environment without enabling provisioning surfaces.

Real environment tested: macOS Darwin 25.2.0 (arm64), Node 24.15.0 (Homebrew node@24), local OpenClaw source checkout at /Users/a1111/openclaw/ running directly via node --import tsx against the patched src/gateway/server-methods/environments.ts, the new TypeBox validators in src/gateway/protocol/index.ts, and the production node-catalog/pairing helpers under src/infra/. Real paired Mac node and real device-pairing state loaded from ~/.openclaw/ — no mocks.

Exact steps or command run after this patch:

  1. Checked out the patched branch fix/sdk-environments-discovery (HEAD 7d33e3a7e8, rebased onto upstream b8f9137d31).
  2. Invoked the production validateEnvironmentsListParams / validateEnvironmentsStatusParams validators with valid and invalid params.
  3. Replicated the listEnvironments server-handler flow live: read real device + node pairings via listDevicePairing() / listNodePairing(), built the known-node catalog, and mapped each entry through the same summarizeNodeEnvironment shape the handler emits to clients.
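
The flow in step 3 can be sketched roughly as follows: one fixed Gateway-local row first, then one node:<nodeId> row per catalog entry. The `KnownNode` shape, the function names, and the rule that a disconnected node reports "unavailable" are all assumptions for illustration; the real node-catalog types and summarizeNodeEnvironment are not shown in this thread. Capabilities are assumed to arrive already normalized.

```typescript
interface EnvironmentSummary {
  id: string;
  type: "local" | "node";
  status: "available" | "unavailable";
  label?: string;
  capabilities?: string[];
}

// Hypothetical catalog entry; the real node-catalog shape may differ.
interface KnownNode {
  nodeId: string;
  displayName?: string;
  connected: boolean;
  capabilities: string[];
}

const GATEWAY_LOCAL: EnvironmentSummary = {
  id: "gateway",
  type: "local",
  label: "Gateway local",
  status: "available",
  capabilities: ["agent.run", "sessions", "tools", "workspace"],
};

function sketchSummarizeNode(node: KnownNode): EnvironmentSummary {
  return {
    id: `node:${node.nodeId}`,
    type: "node",
    // Assumption: paired but disconnected nodes are surfaced as "unavailable".
    status: node.connected ? "available" : "unavailable",
    ...(node.displayName !== undefined ? { label: node.displayName } : {}),
    capabilities: node.capabilities,
  };
}

// Gateway-local row first, then one row per known node.
function sketchListEnvironments(nodes: readonly KnownNode[]): EnvironmentSummary[] {
  return [GATEWAY_LOCAL, ...nodes.map(sketchSummarizeNode)];
}
```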

Evidence after fix:

Live terminal capture from the patched source running on my Mac (paired Mac node id and labels are local-real, not synthetic):

$ node --import tsx --input-type=module -e "<inline live exercise of the new RPC logic>"
--- environments.list params validation ---
  validateEnvironmentsListParams({}): true
  validateEnvironmentsListParams(undefined): true
  validateEnvironmentsListParams({extra:1}): false
--- environments.status params validation ---
  status({ environmentId:"gateway" }): true
  status({}): false
--- live listEnvironments (gateway local + paired nodes) ---
environments.list response (2 envs):
[
  {
    "id": "gateway",
    "type": "local",
    "label": "Gateway local",
    "status": "available",
    "capabilities": ["agent.run","sessions","tools","workspace"]
  },
  {
    "id": "node:51f14a92b66158cd8d3c8b3c77d2dbd2fef03465a514453ec5a70bca418ec6f2",
    "type": "node",
    "label": "1111's Mac mini",
    "status": "unavailable",
    "capabilities": [
      "browser","browser.proxy","canvas","canvas.a2ui.push","canvas.a2ui.pushJSONL",
      "canvas.a2ui.reset","canvas.eval","canvas.hide","canvas.navigate","canvas.present",
      "canvas.snapshot","screen","screen.snapshot","system.notify","system.run","system.which"
    ]
  }
]

The output is the exact EnvironmentSummary[] shape the new environments.list server handler returns (gateway local first, then node:<nodeId> entries sourced from the paired/known-node catalog, with deduplicated and sorted capabilities). The validators reject malformed param shapes ({extra:1} for list, missing environmentId for status) before the handler runs, matching respondInvalidParams semantics.
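
To make the accept/reject behavior in the capture concrete, here is a hand-written stand-in for the two validators. It is not the TypeBox implementation from src/gateway/protocol/index.ts; the function names are hypothetical, and treating undefined as an empty params object is an assumption chosen to match the capture.

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// list takes an empty params object; any extra property is rejected.
function sketchValidateListParams(params: unknown): boolean {
  const value = params === undefined ? {} : params; // assumption: undefined means {}
  return isPlainObject(value) && Object.keys(value).length === 0;
}

// status requires exactly one property: a string environmentId.
function sketchValidateStatusParams(params: unknown): boolean {
  if (!isPlainObject(params)) return false;
  return (
    Object.keys(params).length === 1 &&
    typeof params["environmentId"] === "string"
  );
}
```

With these stand-ins, `{}` and `undefined` pass for list, `{ extra: 1 }` fails, `{ environmentId: "gateway" }` passes for status, and `{}` fails, mirroring the five lines of validator output above.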

Observed result after fix:

The Gateway exposes two new read-only RPCs whose live request/response behavior matches the protocol contract: environments.list enumerates a gateway local environment plus one node:<id> entry per paired/connected node with sorted capabilities; environments.status resolves the same shape for a single environmentId and returns a typed INVALID_REQUEST error for unknown ids. Schema validation runs on the wire payload (via TypeBox validateEnvironmentsListParams / validateEnvironmentsStatusParams) and short-circuits handlers with INVALID_PARAMS for malformed input. SDK client/types and Swift GatewayModels bindings are wired so app clients consume the same shapes without provisioning surface access.
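
A small sketch of the status lookup described above, assuming the handler resolves the id against the same list that environments.list returns. The INVALID_REQUEST code is taken from this description; the result type, the function name, and the message text are hypothetical.

```typescript
interface EnvironmentSummary {
  id: string;
  type: "local" | "node";
  status: "available" | "unavailable";
  label?: string;
  capabilities?: string[];
}

type StatusResult =
  | { ok: true; environment: EnvironmentSummary }
  | { ok: false; code: "INVALID_REQUEST"; message: string };

// Hypothetical lookup: return the matching summary or a typed error for unknown ids.
function sketchEnvironmentStatus(
  known: readonly EnvironmentSummary[],
  environmentId: string,
): StatusResult {
  const match = known.find((env) => env.id === environmentId);
  if (match === undefined) {
    return {
      ok: false,
      code: "INVALID_REQUEST",
      message: `unknown environment: ${environmentId}`, // illustrative wording
    };
  }
  return { ok: true, environment: match };
}

const knownEnvironments: EnvironmentSummary[] = [
  { id: "gateway", type: "local", status: "available", label: "Gateway local" },
];
```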

What was not tested:

  • A live App SDK client round-trip over the actual gateway WebSocket (the running launchd gateway on this machine is on an older patched dist that does not yet ship these RPCs; the exercise instead drove the production handler logic and validators directly via the same code path the handler uses).
  • Method-scope authorization edge cases on remote/proxied gateway lanes — the patch maps both methods into the existing read-only scope buckets but only the in-process scope was exercised live.
  • Swift / Apple SDK client-side binding execution on a real iOS/macOS app build (only the type/model file changes were inspected; not separately compiled and run on a device here).

openclaw-barnacle Bot added the docs, app: macos, app: web-ui, gateway, and size: L labels Apr 30, 2026
@clawsweeper
Contributor

clawsweeper Bot commented Apr 30, 2026

Codex review: needs changes before merge.

Summary
The branch adds typed Gateway environments.list/environments.status RPCs, SDK call-through/types, read-scope registration, generated Swift models, docs, changelog text, and focused tests while leaving create/delete and runtime selection unsupported.

Reproducibility: yes. Source inspection on current main shows SDK list/status still throw unsupported errors, Gateway discovery lacks environments.*, and current docs still mark those calls unsupported; the PR also supplies after-fix terminal/live Gateway proof.

Real behavior proof
Sufficient (terminal): The PR body and discussion include after-fix terminal/live Gateway output showing the new list/status RPC behavior and advertised discovery methods.

Next step before merge
A repair worker can make the narrow changelog relocation; maintainer API approval remains a separate review step after that mechanical blocker is resolved.

Security
Cleared: The diff adds authenticated read-only metadata RPCs under operator.read and does not touch secrets, dependencies, workflows, publishing, or command execution.

Review findings

  • [P2] Move the changelog entry to Unreleased — CHANGELOG.md:1147
Review details

Best possible solution:

Clean the changelog placement, get maintainer API approval on the initial environment vocabulary/managed-candidate choice, then land the additive read-only contract with provisioning surfaces still unsupported.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main shows SDK list/status still throw unsupported errors, Gateway discovery lacks environments.*, and current docs still mark those calls unsupported; the PR also supplies after-fix terminal/live Gateway proof.

Is this the best way to solve the issue?

Mostly yes, but not merge-ready. The protocol/handler/SDK/scope/docs shape follows the existing Gateway RPC pattern; the duplicate changelog block needs a narrow repair and the API vocabulary should be confirmed by the maintainer owner.

Full review comments:

  • [P2] Move the changelog entry to Unreleased — CHANGELOG.md:1147
    This branch adds a second ## 2026.4.29 section and places the new Gateway/SDK environment RPC entry there. That duplicates a released section and leaves the user-facing feature out of Unreleased; move only the new entry to ## Unreleased / ### Changes and remove the duplicate block.
    Confidence: 0.93

Overall correctness: patch is incorrect
Overall confidence: 0.88

Acceptance criteria:

  • git diff --check CHANGELOG.md
  • pnpm check:changelog-attributions

What I checked:

Likely related people:

  • BunsDev: Opened the maintainer-labeled request for SDK-facing environment discovery and was explicitly asked in this PR discussion to decide the initial managed-candidate/vocabulary scope. (role: maintainer API requester and likely follow-up owner; confidence: high; commits: 45fe1650cd6d; files: packages/sdk/src/client.ts, src/gateway/protocol/index.ts, src/gateway/server-methods-list.ts)
  • steipete: GitHub commit history shows Peter Steinberger introduced the OpenClaw SDK package and has recent protocol and node-catalog maintenance in the affected control-plane area. (role: original SDK and gateway/protocol maintainer; confidence: medium; commits: 43f6c8b01aa7, 0ea28ddb165d, f6317fb747f5; files: packages/sdk/src/client.ts, src/gateway/protocol/index.ts, src/gateway/node-catalog.ts)
  • vincentkoc: Local blame on current main attributes the SDK environment namespace, method list, and method-scope regions to recent mainline maintenance, making this a reasonable routing candidate for adjacent Gateway/SDK cleanup. (role: recent current-main maintainer; confidence: medium; commits: a17d4371d101; files: packages/sdk/src/client.ts, src/gateway/server-methods-list.ts, src/gateway/method-scopes.ts)
  • Val Alexander: Recent history shows adjacent artifact RPC work using the same Gateway protocol, SDK, generated model, docs, and changelog pattern that this PR follows. (role: adjacent SDK/Gateway RPC pattern owner; confidence: medium; commits: a102f4dede6a; files: packages/sdk/src/client.ts, src/gateway/protocol/index.ts)

Remaining risk / open question:

  • The managed-candidate behavior and environment capability vocabulary remain an explicit maintainer API decision in the linked request and PR discussion.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 7c13004883f6.

@ai-hpc
Contributor Author

ai-hpc commented Apr 30, 2026

Answer to @clawsweeper

Thanks for the review. I ran the acceptance criteria locally after the final rebase:

  • pnpm protocol:check
  • pnpm test src/gateway/server-methods/environments.test.ts src/gateway/method-scopes.test.ts packages/sdk/src/index.test.ts src/gateway/protocol/index.test.ts -- --reporter=verbose
  • pnpm lint:core
  • pnpm tsgo:core && pnpm tsgo:core:test
  • pnpm lint:docs && pnpm format:docs:check && pnpm check:changelog-attributions
  • git diff --check openclaw/main...HEAD

All passed.

On the API-shape question, this PR intentionally keeps the first contract narrow and read-only: environments.list / environments.status under operator.read, Gateway-local plus existing node-catalog metadata, no create/delete, no provisioning, no runtime/environment run-option support, and no secrets/config exposure. I agree the final capability vocabulary remains a maintainer API-review decision before landing.

@ai-hpc
Contributor Author

ai-hpc commented Apr 30, 2026

@greptile-apps

@ai-hpc
Contributor Author

ai-hpc commented Apr 30, 2026

@BunsDev the remaining API question from the review is scope/vocabulary. This first slice exposes only Gateway-local and node-catalog candidates, with read-only status/capabilities under operator.read. Should the initial contract also include an explicit unavailable managed candidate, or should managed stay absent until a real managed source exists?

ai-hpc force-pushed the fix/sdk-environments-discovery branch 5 times, most recently from ba64381 to 805f21e May 1, 2026 15:02
@ai-hpc
Contributor Author

ai-hpc commented May 2, 2026

Live VPS validation on PR head 805f21e0cf passed.

Checks run:

  • git diff --check refs/remotes/openclaw/main...HEAD
  • pnpm protocol:check
  • pnpm test src/gateway/server-methods/environments.test.ts src/gateway/method-scopes.test.ts packages/sdk/src/index.test.ts src/gateway/protocol/index.test.ts -- --reporter=verbose

Results:

  • protocol:check passed with generated schema/Swift models clean.
  • SDK shard passed: 19 tests.
  • Gateway/protocol shard passed: 65 tests.
  • Total focused test result: 2 Vitest shards passed in 79.85s.

Live smoke:

Started an isolated PR-head Gateway on the VPS with separate state/config:

OPENCLAW_HOME=/home/speedy/openclaw-pr74867-smoke-home \
OPENCLAW_CONFIG_PATH=/home/speedy/openclaw-pr74867-smoke-home/openclaw.json \
OPENCLAW_STATE_DIR=/home/speedy/openclaw-pr74867-smoke-home/state \
OPENCLAW_HIDE_BANNER=1 \
pnpm openclaw gateway --dev --port 19867 --auth none --bind loopback --allow-unconfigured run

The Gateway reached ready on 127.0.0.1:19867, and live SDK calls against it returned:

ENVIRONMENTS_LIST={"environments":[{"id":"gateway","type":"local","label":"Gateway local","status":"available","capabilities":["agent.run","sessions","tools","workspace"]}]}
ENVIRONMENTS_STATUS={"id":"gateway","type":"local","label":"Gateway local","status":"available","capabilities":["agent.run","sessions","tools","workspace"]}
DISCOVERY_METHODS=["environments.list","environments.status"]

The Gateway log also showed the live RPC response:

[ws] ⇄ res ✓ environments.list 355ms

This verifies oc.environments.list(), oc.environments.status("gateway"), and hello-ok.features.methods discovery against a real Gateway on the VPS. The installed user Gateway service on port 18789 was left untouched.
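
For clients that may connect to older Gateways, a feature check against the advertised method list is one way to decide whether to call the new RPCs. The sketch below assumes only that the hello-ok payload carries features.methods as an array of method names, as referenced above; the `HelloOk` interface and the helper name are hypothetical.

```typescript
// Assumed minimal shape; the real hello-ok payload likely has more fields.
interface HelloOk {
  features: { methods: string[] };
}

const DISCOVERY_METHODS = ["environments.list", "environments.status"];

// Hypothetical helper: true only when both discovery RPCs are advertised.
function supportsEnvironmentDiscovery(hello: HelloOk): boolean {
  const advertised = new Set(hello.features.methods);
  return DISCOVERY_METHODS.every((method) => advertised.has(method));
}
```

A client could fall back to treating the Gateway itself as the only environment when this returns false, although that fallback is a design choice outside this PR.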

ai-hpc force-pushed the fix/sdk-environments-discovery branch from 805f21e to 7d33e3a May 5, 2026 08:01
openclaw-barnacle Bot added and removed the triage: needs-real-behavior-proof label (Candidate: external PR needs after-fix proof from a real setup.) May 5, 2026
@BunsDev BunsDev self-assigned this May 5, 2026
@BunsDev
Member

BunsDev commented May 5, 2026

For the initial contract, keep managed candidates absent until there is a real managed environment source to enumerate.

That keeps environments.list truthful: it reports Gateway-local and node-catalog-backed candidates that currently exist, while the schema/type vocabulary can still grow into managed/ephemeral once provisioning or discovery semantics are concrete. I do not want a synthetic unavailable managed row in v1 because clients may treat it as a real selectable environment, and it would blur the boundary this PR is explicitly preserving: read-only discovery now, no create/delete/provisioning/runtime-selection semantics yet.

Given that scope, #74867 is the right first slice for #74708: it adds the environments.list / environments.status Gateway + SDK contract, advertises the methods, scopes them under operator.read, keeps create/delete unsupported, and leaves managed enumeration for a follow-up once there is an actual source of truth.

@BunsDev BunsDev merged commit 63de304 into openclaw:main May 5, 2026
111 checks passed
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…anks @ai-hpc

Co-authored-by: ai-hpc <183861985+ai-hpc@users.noreply.github.com>
Co-authored-by: BunsDev <68980965+BunsDev@users.noreply.github.com>

Labels

app: macos, app: web-ui, docs, gateway, size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Gateway RPC: add SDK-facing environment discovery APIs

2 participants