github-copilot: live catalog discovery via /models + add gpt-5.5 #79566

Merged: steipete merged 4 commits into openclaw:main from efpiva:efpiva/github-copilot-dynamic-discovery on May 9, 2026

Conversation

@efpiva efpiva commented May 9, 2026

Contributor

What

Two interrelated changes to the bundled github-copilot provider plugin:

  1. Wire up live model catalog discovery via ${baseUrl}/models. The plugin's catalog.run hook already exchanged a GitHub OAuth token for a short-lived Copilot API token and resolved the per-account baseUrl, but it returned models: [] and the runtime relied entirely on the static manifest catalog. This change adds fetchCopilotModelCatalog (in extensions/github-copilot/models.ts) and calls it from the hook so the runtime picks up real per-account model availability + accurate context windows.
  2. Add gpt-5.5 to the static manifest catalog and DEFAULT_MODEL_IDS with correct values from the API (contextWindow: 400000, maxTokens: 128000, reasoning: true, multimodal) so users on discovery.enabled: false still get it without overriding models.providers.github-copilot.models in their config.

Also bumps modelCatalog.discovery."github-copilot" from "static" to "refreshable" in openclaw.plugin.json so the catalog hook is actually invoked at runtime — without this the discovery infrastructure treats the provider as static-only and never calls catalog.run.

Why

  • Static contextWindow values were a conservative 128k for every model, far below reality:
    • gpt-5.4 / 5.4-mini / 5.3-codex / 5.5: 400k actual (statically 128k)
    • claude-opus-4.6-1m / 4.7-1m-internal: 1M actual (not in static catalog at all)
    • claude-sonnet-4 base: 200k actual (statically 128k)
    • gpt-5-mini / gpt-5.1: 264k actual (statically 128k)
  • Newly published Copilot models didn't appear at all until the manifest was patched: gpt-5.5, gpt-5.1, gpt-5.1-codex / -codex-max / -codex-mini, gemini-3-pro-preview, claude-opus-*-1m internal variants. Static catalog was perpetually out of date.
  • Per-account entitlement was invisible — every user saw the same 22-model list regardless of plan / Copilot subscription tier.

How

fetchCopilotModelCatalog (new)

In extensions/github-copilot/models.ts. Calls ${baseUrl}/models with the resolved Copilot API token and the same Editor-Version / Copilot-Integration-Id headers used elsewhere in the plugin. Maps each entry to a ModelDefinitionConfig:

| API field | ModelDefinitionConfig field |
| --- | --- |
| capabilities.limits.max_context_window_tokens | contextWindow |
| capabilities.limits.max_output_tokens | maxTokens |
| supports.vision | input: ["text", "image"] if set, else ["text"] |
| Array.isArray(supports.reasoning_effort) && length > 0 | reasoning: boolean |
| vendor === "Anthropic" or id matches claude* | api: "anthropic-messages", else "openai-responses" |
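The field mapping above can be sketched roughly as follows. This is an illustration only, not the PR's actual code: the type names `CopilotApiModel` and `ModelSketch`, the function name `mapEntry`, the placement of `supports` under `capabilities`, and the `claude` prefix check are all assumptions made for the example.

```typescript
// Hypothetical input shape. Field names follow the mapping table; the
// optionality and the nesting of `supports` are assumptions.
interface CopilotApiModel {
  id: string;
  vendor?: string;
  capabilities?: {
    limits?: { max_context_window_tokens?: number; max_output_tokens?: number };
    supports?: { vision?: boolean; reasoning_effort?: unknown };
  };
}

// Hypothetical stand-in for the subset of ModelDefinitionConfig discussed here.
interface ModelSketch {
  id: string;
  contextWindow?: number;
  maxTokens?: number;
  input: ("text" | "image")[];
  reasoning: boolean;
  api: "anthropic-messages" | "openai-responses";
}

function mapEntry(m: CopilotApiModel): ModelSketch {
  const limits = m.capabilities?.limits;
  const supports = m.capabilities?.supports;
  const effort = supports?.reasoning_effort;
  // "id matches claude*" is read here as a simple prefix check (assumption).
  const isAnthropic = m.vendor === "Anthropic" || m.id.startsWith("claude");
  return {
    id: m.id,
    contextWindow: limits?.max_context_window_tokens,
    maxTokens: limits?.max_output_tokens,
    input: supports?.vision ? ["text", "image"] : ["text"],
    reasoning: Array.isArray(effort) && effort.length > 0,
    api: isAnthropic ? "anthropic-messages" : "openai-responses",
  };
}
```

An entry without a `supports.vision` flag or a non-empty `reasoning_effort` array maps to text-only input with `reasoning: false` in this sketch.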

Filters out non-chat objects (capabilities.type !== "chat", e.g. embeddings) and internal routers (ids starting with accounts/). Dedupes by id (first wins). 10s default timeout via internal AbortController. Throws on non-2xx HTTP / parse failure so the caller decides recovery shape.
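A minimal sketch of the filtering, dedupe, timeout, and throw-on-failure behavior described above, under stated assumptions: `filterChatModels` and `fetchJsonWithTimeout` are hypothetical names, the headers are passed in by the caller rather than spelled out, and the sketch uses raw `fetch`, which is not necessarily how the merged code performs the request.

```typescript
// Hypothetical minimal entry shape for the filter step.
interface RawEntry {
  id: string;
  capabilities?: { type?: string };
}

// Keep chat models only, drop `accounts/` router ids, dedupe by id (first wins).
function filterChatModels<T extends RawEntry>(entries: T[]): T[] {
  const seen = new Set<string>();
  const out: T[] = [];
  for (const entry of entries) {
    if (entry.capabilities?.type !== "chat") continue; // e.g. embeddings
    if (entry.id.startsWith("accounts/")) continue; // internal routers
    if (seen.has(entry.id)) continue; // a later duplicate loses
    seen.add(entry.id);
    out.push(entry);
  }
  return out;
}

// Fetch JSON with an internal AbortController timeout (10s default).
// Throws on non-2xx responses, parse failures, and aborts, so the caller
// decides how to recover.
async function fetchJsonWithTimeout(
  url: string,
  headers: Record<string, string>,
  timeoutMs = 10_000,
  fetchImpl: typeof fetch = fetch,
): Promise<unknown> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { headers, signal: controller.signal });
    if (!res.ok) throw new Error(`HTTP ${res.status} from ${url}`);
    return await res.json();
  } finally {
    clearTimeout(timer); // always release the timer, success or failure
  }
}
```

Accepting `fetchImpl` as a parameter keeps the function testable without network access, which is one plausible reason a fetch function would be injected.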

catalog.run hook (modified)

In extensions/github-copilot/index.ts. After the existing token-exchange block resolves both baseUrl and copilotApiToken, the hook now calls fetchCopilotModelCatalog. On any failure it returns models: [], which preserves the static manifest catalog as the visible fallback — no behavior regression for discovery.enabled: false, offline scenarios, or unauthenticated users.
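The fallback shape described here amounts to a catch-all around the discovery call. The sketch below assumes a hypothetical `discoverOrFallback` wrapper and a simplified return shape; the real hook also returns the resolved base URL and auth details, which are omitted.

```typescript
type ModelEntry = { id: string };

// Any failure (HTTP error, parse error, timeout) collapses to an empty list,
// which per the PR description leaves the static manifest catalog visible.
async function discoverOrFallback(
  fetchCatalog: () => Promise<ModelEntry[]>,
): Promise<{ models: ModelEntry[] }> {
  try {
    return { models: await fetchCatalog() };
  } catch {
    return { models: [] };
  }
}
```

Swallowing the error here is a deliberate trade-off: the catalog stays usable offline, at the cost of the failure being invisible unless it is logged separately.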

Manifest

   "discovery": {
-    "github-copilot": "static"
+    "github-copilot": "refreshable"
   }

Real behavior proof

Behavior addressed: the bundled github-copilot provider plugin advertised a static model catalog with contextWindow: 128000 baked in for every entry, regardless of the user's Copilot subscription. Newly available Copilot models were invisible until the manifest was hand-patched. The catalog discovery hook was scaffolded (token exchange already worked) but returned an empty model list, so dynamic discovery did nothing.

Real environment tested: WSL2 host running Docker Desktop. Container image built locally from this PR branch with pnpm pack followed by npm install -g openclaw-2026.5.6.tgz. Auth via openclaw models auth login-github-copilot (real device flow against a personal github.com Copilot Pro subscription on the maintainer's account). No mocks; the container makes live HTTPS calls to api.github.com/copilot_internal/v2/token and api.githubcopilot.com/models.

Exact steps or command run after this patch:

  1. From the worktree containing this PR's branch, build the plugin and pack the tarball: pnpm install --frozen-lockfile && pnpm build && pnpm pack.
  2. Drop openclaw-2026.5.6.tgz into a downstream container's build-artifacts directory and rebuild the image: docker compose build && docker compose up -d.
  3. Inside the running container, authenticate against a personal Copilot account once (persists in the volume): docker compose exec -it codeclaw-openclaw openclaw models auth login-github-copilot.
  4. Force a refresh against the live Copilot API: docker compose exec codeclaw-openclaw openclaw models list --provider github-copilot --refresh.
  5. Run a single agent turn through the resolved model: docker compose exec codeclaw-openclaw openclaw agent --agent main --message 'Reply with exactly: pong'.

Evidence after fix: live terminal output captured from the running container after the refresh:

Model                                      Input      Ctx         Local Auth  Tags
github-copilot/claude-opus-4.6             text+image 977k        no    yes
github-copilot/claude-sonnet-4.6           text+image 977k        no    yes
github-copilot/claude-sonnet-4             text+image 211k        no    yes
github-copilot/gpt-5.4                     text+image 391k        no    yes
github-copilot/gpt-5.4-mini                text+image 391k        no    yes
github-copilot/gpt-5.3-codex               text+image 391k        no    yes
github-copilot/gpt-5-mini                  text+image 258k        no    yes
github-copilot/gpt-5.1                     text+image 258k        no    yes
github-copilot/gpt-5.1-codex               text+image 391k        no    yes
github-copilot/gpt-5.1-codex-max           text+image 391k        no    yes
github-copilot/gemini-3-flash-preview      text+image 125k        no    yes
github-copilot/gemini-3-pro-preview        text+image 125k        no    yes
github-copilot/gemini-3.1-pro-preview      text+image 125k        no    yes
github-copilot/gpt-5.5                     text+image 391k        no    yes   default

A subsequent live agent turn against the Copilot API returned the expected reply, with the gateway confirming the route through the github-copilot provider:

$ docker compose exec codeclaw-openclaw openclaw agent --agent main --message "Reply with exactly: pong"
[plugins] plugins.allow is empty; discovered non-bundled plugins may auto-load: acpx (...)
pong

Gateway log captured at the same moment, confirming the dynamic-resolved model and provider routing:

[agent/embedded] embedded run start: runId=... provider=github-copilot model=gpt-5.5 thinking=medium
[agent/embedded] embedded run prompt start: runId=... provider=github-copilot api=openai-responses endpoint=github-copilot-native route=native policy=none

Observed result after fix: models list --refresh now returns 30 entries with API-accurate context windows. The previously invisible models now surface (gpt-5.1 family, gemini-3-pro-preview, claude-opus-*-1m variants). Existing static-catalog ids that overlap (e.g. gpt-5.4, claude-opus-4.6) get their context windows replaced by the live API values (391k for gpt-5.4; 977k, i.e. the 1M variant, for claude-opus-4.6). gpt-5.5 is marked default because the downstream container's config uses it as agents.defaults.model.primary; the live agent invocation reaches the Copilot API and returns the expected response. The static fallback path shows no regression: with discovery.enabled: false, the catalog reverts to the bundled 22-entry list with gpt-5.5 included.
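The Ctx figures in the listing are numerically consistent with dividing the token count by 1024 and rounding (400000 / 1024 ≈ 390.6, shown as 391k; 1000000 / 1024 ≈ 976.6, shown as 977k). That reading is inferred from the numbers alone, not confirmed against the CLI source; `formatCtx` below is a hypothetical helper that reproduces the arithmetic.

```typescript
// Hypothetical formatter: assumes the CLI divides by 1024 and rounds to the
// nearest integer. This is an inference from the listed values only.
function formatCtx(tokens: number): string {
  return `${Math.round(tokens / 1024)}k`;
}

console.log(formatCtx(400000)); // "391k"
console.log(formatCtx(1000000)); // "977k"
console.log(formatCtx(264000)); // "258k"
console.log(formatCtx(128000)); // "125k"
```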

What was not tested:

  • Behavior on GitHub Enterprise (GHE) Copilot tokens. The token-exchange endpoint (api.github.com/copilot_internal/v2/token) is hardcoded by the existing plugin, and neither this PR nor the stock plugin supports GHE. Tested only on personal github.com Copilot Pro.
  • Streaming / inference for every individual model in the new dynamic catalog. Inference was verified end-to-end for gpt-5.5 only; other entries are exposed via the unchanged prepareRuntimeAuth / wrapStreamFn paths and rely on the same auth token, so behavior is structurally equivalent to the static-catalog entries today.
  • Plugin behavior when the user's Copilot subscription has zero accessible models. The code returns an empty array, which falls back to the static catalog (same behavior as the old models: [] stub).

Tests added

5 new cases in extensions/github-copilot/models.test.ts (29/29 total pass):

  • fetchCopilotModelCatalog maps a representative /models response (chat models incl. an internal 1M-context Anthropic variant, a router, an embedding) to the right ModelDefinitionConfig shape with real context windows.
  • baseUrl trailing slash is normalized.
  • Duplicate ids in the API response are deduped (first wins).
  • Non-2xx HTTP raises so the caller can fall back to the static catalog.
  • Empty token / baseUrl reject synchronously without calling fetch.
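The trailing-slash and empty-input cases above could be covered by a small URL builder along these lines. `buildModelsUrl` is a hypothetical name, and the error messages are illustrative; the PR's own validation may be structured differently.

```typescript
// Validate inputs before any network call, then normalize the base URL so
// "https://host/" and "https://host" both yield "https://host/models".
function buildModelsUrl(baseUrl: string, token: string): string {
  if (!token.trim()) throw new Error("Copilot API token is required");
  if (!baseUrl.trim()) throw new Error("baseUrl is required");
  return `${baseUrl.trim().replace(/\/+$/, "")}/models`;
}
```

Because the checks run before the URL is built, a caller that passes an empty token or base URL fails without `fetch` ever being invoked.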

pnpm exec oxfmt --check extensions/github-copilot/ clean. pnpm tsgo:core clean. pnpm exec oxlint extensions/github-copilot/ clean against this PR's diff (one pre-existing error in index.test.ts already on origin/main, unrelated to this change).

Backwards compatibility

  • Users on discovery.enabled: false: see exactly the same static catalog as before, plus gpt-5.5.
  • Users with no GitHub auth profile: hook returns null early (existing behavior). No fetch attempted.
  • Token-exchange failures: hook returns the manifest's DEFAULT_COPILOT_API_BASE_URL and an empty model list (existing behavior).
  • /models HTTP failure: caught, hook returns empty model list. Static manifest catalog continues to be visible.

@openclaw-barnacle openclaw-barnacle Bot added size: M triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 9, 2026
clawsweeper Bot commented May 9, 2026

Contributor

Codex review: needs changes before merge.

Summary
The PR adds live GitHub Copilot /models catalog discovery, marks the provider catalog refreshable, adds gpt-5.5 static fallback metadata, and updates docs, tests, and changelog.

Reproducibility: not applicable, as this is a feature PR rather than a bug report. Source inspection does confirm the current-main baseline: the catalog hook returns an empty model list, the manifest is static, and gpt-5.5 is absent from the static Copilot catalog.

Real behavior proof
Sufficient (terminal): The PR body includes after-fix terminal output from a real Docker-installed OpenClaw build plus gateway logs showing live github-copilot/gpt-5.5 routing and response.

Next step before merge
A narrow repair can address the merge-blocking guarded-fetch issue without changing the feature scope.

Security
Needs attention: The diff adds provider-local network discovery but uses raw fetch for a token-derived URL instead of the repository's guarded fetch helper.

Review findings

  • [P2] Use guarded fetch for Copilot catalog discovery — extensions/github-copilot/models.ts:222-231
Review details

Best possible solution:

Keep the provider-owned dynamic catalog design, route the Copilot /models request through guarded fetch with focused tests, then land one canonical PR and supersede the older overlapping #71924.

Do we have a high-confidence way to reproduce the issue?

Not applicable as a feature PR rather than a bug report. Source inspection does confirm the current-main baseline: the catalog hook returns an empty model list, the manifest is static, and gpt-5.5 is absent from the static Copilot catalog.

Is this the best way to solve the issue?

No, not as written: the provider plugin is the right boundary and the fallback behavior is maintainable, but the new network discovery must use the existing guarded fetch pattern before merge.

Full review comments:

  • [P2] Use guarded fetch for Copilot catalog discovery — extensions/github-copilot/models.ts:222-231
    The new /models request builds its URL from the Copilot token-derived baseUrl and then calls raw fetch, bypassing the SSRF/private-network guard used by the existing Copilot embedding discovery and other provider discovery paths. If the cached token or token-derived proxy endpoint ever points at a local/private host, normal catalog refresh can probe it; route this through fetchWithSsrFGuard with a base-URL policy before merge.
    Confidence: 0.87

Overall correctness: patch is incorrect
Overall confidence: 0.87

Security concerns:

  • [medium] Guard token-derived Copilot catalog fetches — extensions/github-copilot/models.ts:222
    fetchCopilotModelCatalog builds ${baseUrl}/models from the Copilot token exchange result and fetches it directly. Existing Copilot embedding discovery uses fetchWithSsrFGuard for the same endpoint class, so this new path should do the same to preserve private-network and SSRF protections.
    Confidence: 0.87
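The guard the review asks for is the repository's fetchWithSsrFGuard, whose signature is not shown in this thread, so the sketch below is not that helper. It only illustrates the kind of pre-flight check an SSRF guard typically performs on a token-derived base URL. The names `assertSafeBaseUrl` and `isPrivateIPv4` are hypothetical, and a real guard would also need to handle hostnames that resolve to private addresses via DNS.

```typescript
// Returns true for IPv4 literals in loopback, private, or link-local ranges.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Illustrative check only: require https and reject obvious local targets.
function assertSafeBaseUrl(raw: string): URL {
  const url = new URL(raw);
  if (url.protocol !== "https:") throw new Error("base URL must use https");
  const host = url.hostname.toLowerCase();
  const isLocalName = host === "localhost" || host.endsWith(".localhost");
  // The WHATWG URL parser keeps brackets on IPv6 literals; this sketch
  // conservatively rejects all of them.
  const isIPv6Literal = host.startsWith("[");
  if (isLocalName || isIPv6Literal || isPrivateIPv4(host)) {
    throw new Error(`refusing to fetch from non-public host: ${host}`);
  }
  return url;
}
```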

Acceptance criteria:

  • pnpm test extensions/github-copilot/models.test.ts extensions/github-copilot/index.test.ts
  • pnpm exec oxfmt --check --threads=1 extensions/github-copilot/
  • pnpm tsgo:core

What I checked:

  • Current main catalog stub: Current main still exchanges a GitHub token for a Copilot base URL but returns models: [] from the GitHub Copilot catalog hook, so live model rows are not exposed today. (extensions/github-copilot/index.ts:387, 612e72ebbd43)
  • Current static manifest gap: Current main has modelCatalog.discovery.github-copilot set to static and the static rows include gpt-5.4* but not gpt-5.5. (extensions/github-copilot/openclaw.plugin.json:213, 612e72ebbd43)
  • PR implementation surface: The PR adds fetchCopilotModelCatalog, calls it from the catalog hook, changes the manifest discovery mode to refreshable, and adds gpt-5.5 to the static manifest and generated defaults. (extensions/github-copilot/models.ts:205, a6c50deec445)
  • Security blocker: The new catalog fetch calls raw fetchImpl(url, ...) where url is built from the Copilot token-derived baseUrl, bypassing the SSRF/private-network guard used by other provider discovery paths. (extensions/github-copilot/models.ts:222, a6c50deec445)
  • Existing guarded pattern: The existing GitHub Copilot embedding /models discovery uses fetchWithSsrFGuard with a policy derived from the same base URL class, giving a local pattern for the PR to reuse. (extensions/github-copilot/embeddings.ts:81, 612e72ebbd43)
  • Real behavior proof: The PR body includes copied terminal output from a Docker-installed OpenClaw tarball showing refreshed GitHub Copilot rows, corrected context windows, and a live openclaw agent run routed through github-copilot/gpt-5.5. (a6c50deec445)

Likely related people:

  • obviyus: Commit 42b352c published the current GitHub Copilot manifest catalog and models-defaults surface that this PR extends. (role: static catalog introducer; confidence: high; commits: 42b352c57eb8; files: extensions/github-copilot/openclaw.plugin.json, extensions/github-copilot/models-defaults.ts, extensions/github-copilot/models.test.ts)
  • vincentkoc: Commit 6dba5cc recently changed the GitHub Copilot live discovery config and catalog hook tests around the same provider-owned discovery path. (role: catalog hook/config maintainer; confidence: medium; commits: 6dba5cc2a038; files: extensions/github-copilot/index.ts, extensions/github-copilot/index.test.ts)
  • feiskyer: Commit 88d3620 added the GitHub Copilot embedding provider, including an existing guarded /models discovery implementation relevant to this PR's network-fetch safety issue. (role: adjacent guarded /models owner; confidence: medium; commits: 88d3620a85bf; files: extensions/github-copilot/embeddings.ts, extensions/github-copilot/embeddings.test.ts, extensions/github-copilot/auth.ts)
  • steipete: Recent history shows repeated Copilot/provider-discovery maintenance, including provider discovery config refactors and Copilot live-test upkeep near this surface. (role: recent adjacent maintainer; confidence: medium; commits: 19de5d1b569d, 7e733bedabf5, d8b9ace39cc1; files: extensions/github-copilot/index.ts, extensions/github-copilot/openclaw.plugin.json, extensions/github-copilot/connection-bound-ids.live.test.ts)

Remaining risk / open question:

  • The new raw catalog fetch can reach a local or private host if the Copilot token-derived base URL is unexpected or a cached token is tampered with; this should use the repository's guarded fetch path before merge.
  • The PR head has failing build-artifacts, build-smoke, check-additional, and checks-node-agentic-cli checks plus in-progress checks, so merge readiness is not proven yet.
  • The open "fix: add GPT-5.5 capabilities to GitHub Copilot provider catalog" #71924 overlaps this work and needs to be superseded or reconciled once one implementation lands.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 612e72ebbd43.

@efpiva efpiva force-pushed the efpiva/github-copilot-dynamic-discovery branch 2 times, most recently from 5475683 to 8eca29d Compare May 9, 2026 00:16
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 9, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. docs Improvements or additions to documentation size: L and removed proof: sufficient ClawSweeper judged the real behavior proof convincing. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. size: M labels May 9, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 9, 2026
@galiniliev galiniliev self-assigned this May 9, 2026
@steipete steipete force-pushed the efpiva/github-copilot-dynamic-discovery branch from a6c50de to 2e4ed4b Compare May 9, 2026 01:38
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 9, 2026
@steipete steipete force-pushed the efpiva/github-copilot-dynamic-discovery branch from 2e4ed4b to fe5eae0 Compare May 9, 2026 01:43
@openclaw-barnacle openclaw-barnacle Bot added cli CLI command changes commands Command implementations labels May 9, 2026
efpiva and others added 4 commits May 9, 2026 02:48
The plugin's `catalog.run` hook already exchanged a GitHub OAuth token
for a short-lived Copilot API token and resolved the per-account baseUrl,
but it returned `models: []` and the bundled openclaw runtime relied
entirely on the static manifest catalog. That meant:

- Static `contextWindow` values were a conservative 128k for every
  model, far below reality (gpt-5.4/5.5 are 400k, claude-opus-4.6/4.7
  internal variants are 1M, claude-sonnet-4 is 200k, etc.).
- Newly published Copilot models (gpt-5.5, gpt-5.1*, gemini-3-pro-preview,
  the claude-opus-*-1m internal variants, etc.) didn't appear at all
  until the manifest was patched.
- Per-account entitlement was invisible — every user saw the same
  hardcoded 22-model list regardless of plan.

Wire it up:

- Add `fetchCopilotModelCatalog` in `extensions/github-copilot/models.ts`.
  Calls `${baseUrl}/models` with the resolved Copilot API token and the
  same Editor-Version / Copilot-Integration-Id headers used elsewhere in
  the plugin. Maps each entry to a `ModelDefinitionConfig`:
  - `contextWindow` ← `capabilities.limits.max_context_window_tokens`
  - `maxTokens`     ← `capabilities.limits.max_output_tokens`
  - `input`         ← `["text", "image"]` if `supports.vision`, else `["text"]`
  - `reasoning`     ← `Array.isArray(supports.reasoning_effort) && supports.reasoning_effort.length > 0`
  - `api`           ← `anthropic-messages` for Anthropic vendor or claude*
                      ids; otherwise `openai-responses`
  Filters out non-chat objects (embeddings) and internal routers
  (`accounts/...` ids). Dedupes by id. 10s default timeout.

- Update the `catalog.run` hook in `extensions/github-copilot/index.ts`
  to call the new function after token-exchange and return the live
  results. On any HTTP/parse failure it falls back to `models: []`,
  which preserves the static manifest catalog as the visible fallback —
  no behavior regression for users with `discovery.enabled: false` or
  in offline scenarios.

- Bump `modelCatalog.discovery."github-copilot"` from `"static"` to
  `"refreshable"` in `openclaw.plugin.json` so the catalog hook is
  actually invoked at runtime. Without this the discovery infrastructure
  treats the provider as static-only and never calls `catalog.run`.

- Add `gpt-5.5` to the static manifest catalog and `DEFAULT_MODEL_IDS`
  with the correct values from the API (`contextWindow: 400000`,
  `maxTokens: 128000`, `reasoning: true`, multimodal). This means users
  on `discovery.enabled: false` still get gpt-5.5 visible without
  needing to override `models.providers.github-copilot.models` in their
  config.

Tests added (5, all passing alongside the existing 24):

- `fetchCopilotModelCatalog` maps a representative `/models` response
  (chat models incl. an internal 1M-context Anthropic variant, a router,
  an embedding) to the right `ModelDefinitionConfig` shape with real
  context windows.
- baseUrl trailing slash is normalized.
- Duplicate ids in the API response are deduped (first wins).
- Non-2xx HTTP raises so the caller can fall back to the static catalog.
- Empty token / baseUrl reject synchronously without calling fetch.

Targeted run: `pnpm test extensions/github-copilot/models.test.ts` →
29/29 pass. `pnpm exec oxfmt --check extensions/github-copilot/` clean.
`pnpm tsgo:core` clean.

Real-world proof:

Built locally and dropped the resulting tarball into a downstream
container with `gh auth login --hostname github.com` (Copilot
subscription on the linked account). Before this change,
`openclaw models list --provider github-copilot` returned the 22-entry
static catalog with every entry showing 128k context. After this change,
the same command (with `--refresh`) returns 30 entries with API-accurate
context windows including the new gpt-5.1 family, the claude-opus-*-1m
variants, and the corrected `gemini-3*-preview` ids.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add an accordion under the Built-in provider tab describing the runtime
catalog refresh from the Copilot `/models` endpoint and the
`plugins.entries.github-copilot.config.discovery.enabled = false` opt-out
for offline / air-gapped scenarios. Pairs with the
`fetchCopilotModelCatalog` change so users know what the new behavior
is and how to disable it if needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Galin asked for shorter changelog entries; collapse the 2 long
github-copilot bullets (one per code change) into a single one-line
entry that points at the runtime behavior. The PR body retains the
full mapping/fallback/header detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@steipete steipete force-pushed the efpiva/github-copilot-dynamic-discovery branch from fe5eae0 to 058e1b4 Compare May 9, 2026 01:48
@openclaw-barnacle openclaw-barnacle Bot removed cli CLI command changes commands Command implementations labels May 9, 2026
@steipete steipete merged commit 880c094 into openclaw:main May 9, 2026
108 of 110 checks passed
steipete commented May 9, 2026

Contributor

Landed via rebase onto main.

Thanks @efpiva!

Labels

docs Improvements or additions to documentation proof: supplied External PR includes structured after-fix real behavior proof. size: L

3 participants