Skip to content

feat(codex-runtime): skip unavailable plugins during migration#25437

Merged
teknium1 merged 1 commit into
mainfrom
feat/codex-runtime-availability-filter
May 14, 2026
Merged

feat(codex-runtime): skip unavailable plugins during migration#25437
teknium1 merged 1 commit into
mainfrom
feat/codex-runtime-availability-filter

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Followup to merged PR #24182. Audited recent OpenClaw codex PRs for fixes we should consider; landed one real gap as a new commit.

What this PR changes

The actual fix: when migrating native Codex plugins into ~/.codex/config.toml, skip plugins where codex itself reports availability != AVAILABLE. These are plugins installed but broken — expired OAuth, removed from marketplace, install corrupted. Writing them into config causes the user's first codex turn after enabling the runtime to fail.

Cf. openclaw/openclaw#80815 — they hit the same footgun.

Why this matters

Our /codex-runtime codex_app_server enable flow queries codex's plugin/list and trusts installed=true. But codex also exposes an availability field on each plugin entry. If a user has google-calendar@openai-curated installed but its OAuth has expired, codex reports installed: true, availability: REQUIRES_AUTH. We were migrating it anyway, and the user's next codex turn would fail with an activation error.

The fix is one filter in _query_codex_plugins():

availability = str(plugin.get("availability") or "").upper()
if availability and availability != "AVAILABLE":
    continue

Backward compat: empty/missing field falls through (older codex versions or marketplaces that don't set it still work).

OpenClaw codex PRs I reviewed and chose NOT to apply

Documented in the commit message for the record:

OpenClaw PR Title Decision
#81591 Load Codex for selectable OpenAI agent models No — we resolve runtime per-call, no startup-time gating to fix
#81510 restore Codex cron automation compatibility No — their fix is for OpenClaw-specific cron orchestration; we documented cron as untested
#81223 rotate incompatible context-engine threads No — we don't have a Lossless context engine equivalent
#80688 constrain Codex app-server sandbox No — we don't have an outer-sandbox concept
#80616 release interrupted app-server turns on turn_aborted No — we already handle status=interrupted in turn/completed correctly
#80278 expose activeModel plugin context No — not our surface
#80792 default destructive_actions on No — we don't expose that config knob

Tests

test_plugin_discovery_skips_unavailable_plugins — 4 cases:
  good-plugin   (installed=True, availability=AVAILABLE)      → migrated
  broken-plugin (installed=True, availability=UNAVAILABLE)    → skipped
  auth-pending  (installed=True, availability=REQUIRES_AUTH)  → skipped
  legacy-plugin (installed=True, no availability field)        → migrated
                  (older codex versions; preserve backward compat)

56 codex-runtime migration tests, all green.

Docs

Added a bullet to the docs page's "What's NOT migrated" section explaining the availability filter and why.

Docs URL (live after next deploy):
https://hermes-agent.nousresearch.com/docs/user-guide/features/codex-app-server-runtime

Followup to PR #24182 — caught when scanning OpenClaw for recent codex
fixes we hadn't considered. OpenClaw learned the hard way (#80815) that
migrating plugins which codex itself reports as unavailable produces
config that fails at activation time.

Our /codex-runtime codex_app_server enable path queries codex's
plugin/list and migrates everything where installed=true. We were
trusting codex's installation state and ignoring its availability
field. So a plugin that's installed=true but availability=UNAVAILABLE
(broken local install) or REQUIRES_AUTH (OAuth expired or never
completed) would get an [plugins."<n>@openai-curated"] entry in
~/.codex/config.toml — and the user's first codex turn after enabling
the runtime would fail because codex refuses to activate it.

Fix: filter on availability in _query_codex_plugins(). Only emit
plugins where availability is empty (older codex versions without the
field — preserve backward compat) or explicitly AVAILABLE.

Tests:
  test_plugin_discovery_skips_unavailable_plugins — verifies 4 cases:
    - good-plugin (installed=True, availability=AVAILABLE) → migrated
    - broken-plugin (installed=True, availability=UNAVAILABLE) → skipped
    - auth-pending (installed=True, availability=REQUIRES_AUTH) → skipped
    - legacy-plugin (installed=True, no availability field) → migrated
      (older codex versions; preserve backward compat)

Docs:
  Added bullet to 'What's NOT migrated' list in the docs page calling
  out the availability filter and why.

Other OpenClaw codex PRs I reviewed but did NOT apply (with reasoning):
  - #81591 (load Codex for selectable models): we resolve runtime
    per-call already, no startup-time gating to fix
  - #81510 (cron compatibility): we documented cron as untested; their
    fix is for OpenClaw-specific cron orchestration shape
  - #81223 (rotate incompatible context-engine threads): we don't
    have a Lossless context engine equivalent
  - #80688 (constrain sandbox): we don't have an outer-sandbox concept
  - #80616 (release on turn_aborted): we already handle status=
    interrupted in turn/completed correctly
  - #80278 (expose activeModel in plugin SDK): not our surface
  - #80792 (default destructive_actions on): we don't expose that knob

56 codex-runtime migration tests still green (+1 new).
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: feat/codex-runtime-availability-filter vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8308 on HEAD, 8308 on base (➖ 0)

🆕 New issues (3):

Rule Count
invalid-argument-type 3
First entries
run_agent.py:13663: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:13660: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:7449: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`

✅ Fixed issues (3):

Rule Count
invalid-argument-type 3
First entries
run_agent.py:7449: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13660: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:13663: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`

Unchanged: 4373 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 teknium1 merged commit d5775fe into main May 14, 2026
16 of 17 checks passed
@teknium1 teknium1 deleted the feat/codex-runtime-availability-filter branch May 14, 2026 05:20
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/plugins Plugin system and bundled plugins labels May 14, 2026
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…esearch#25437)

Followup to PR NousResearch#24182 — caught when scanning OpenClaw for recent codex
fixes we hadn't considered. OpenClaw learned the hard way (#80815) that
migrating plugins which codex itself reports as unavailable produces
config that fails at activation time.

Our /codex-runtime codex_app_server enable path queries codex's
plugin/list and migrates everything where installed=true. We were
trusting codex's installation state and ignoring its availability
field. So a plugin that's installed=true but availability=UNAVAILABLE
(broken local install) or REQUIRES_AUTH (OAuth expired or never
completed) would get an [plugins."<n>@openai-curated"] entry in
~/.codex/config.toml — and the user's first codex turn after enabling
the runtime would fail because codex refuses to activate it.

Fix: filter on availability in _query_codex_plugins(). Only emit
plugins where availability is empty (older codex versions without the
field — preserve backward compat) or explicitly AVAILABLE.

Tests:
  test_plugin_discovery_skips_unavailable_plugins — verifies 4 cases:
    - good-plugin (installed=True, availability=AVAILABLE) → migrated
    - broken-plugin (installed=True, availability=UNAVAILABLE) → skipped
    - auth-pending (installed=True, availability=REQUIRES_AUTH) → skipped
    - legacy-plugin (installed=True, no availability field) → migrated
      (older codex versions; preserve backward compat)

Docs:
  Added bullet to 'What's NOT migrated' list in the docs page calling
  out the availability filter and why.

Other OpenClaw codex PRs I reviewed but did NOT apply (with reasoning):
  - #81591 (load Codex for selectable models): we resolve runtime
    per-call already, no startup-time gating to fix
  - #81510 (cron compatibility): we documented cron as untested; their
    fix is for OpenClaw-specific cron orchestration shape
  - #81223 (rotate incompatible context-engine threads): we don't
    have a Lossless context engine equivalent
  - #80688 (constrain sandbox): we don't have an outer-sandbox concept
  - #80616 (release on turn_aborted): we already handle status=
    interrupted in turn/completed correctly
  - #80278 (expose activeModel in plugin SDK): not our surface
  - #80792 (default destructive_actions on): we don't expose that knob

56 codex-runtime migration tests still green (+1 new).
AlexFoxD pushed a commit to AlexFoxD/hermes-agent that referenced this pull request May 21, 2026
…esearch#25437)

Followup to PR NousResearch#24182 — caught when scanning OpenClaw for recent codex
fixes we hadn't considered. OpenClaw learned the hard way (#80815) that
migrating plugins which codex itself reports as unavailable produces
config that fails at activation time.

Our /codex-runtime codex_app_server enable path queries codex's
plugin/list and migrates everything where installed=true. We were
trusting codex's installation state and ignoring its availability
field. So a plugin that's installed=true but availability=UNAVAILABLE
(broken local install) or REQUIRES_AUTH (OAuth expired or never
completed) would get an [plugins."<n>@openai-curated"] entry in
~/.codex/config.toml — and the user's first codex turn after enabling
the runtime would fail because codex refuses to activate it.

Fix: filter on availability in _query_codex_plugins(). Only emit
plugins where availability is empty (older codex versions without the
field — preserve backward compat) or explicitly AVAILABLE.

Tests:
  test_plugin_discovery_skips_unavailable_plugins — verifies 4 cases:
    - good-plugin (installed=True, availability=AVAILABLE) → migrated
    - broken-plugin (installed=True, availability=UNAVAILABLE) → skipped
    - auth-pending (installed=True, availability=REQUIRES_AUTH) → skipped
    - legacy-plugin (installed=True, no availability field) → migrated
      (older codex versions; preserve backward compat)

Docs:
  Added bullet to 'What's NOT migrated' list in the docs page calling
  out the availability filter and why.

Other OpenClaw codex PRs I reviewed but did NOT apply (with reasoning):
  - #81591 (load Codex for selectable models): we resolve runtime
    per-call already, no startup-time gating to fix
  - #81510 (cron compatibility): we documented cron as untested; their
    fix is for OpenClaw-specific cron orchestration shape
  - #81223 (rotate incompatible context-engine threads): we don't
    have a Lossless context engine equivalent
  - #80688 (constrain sandbox): we don't have an outer-sandbox concept
  - #80616 (release on turn_aborted): we already handle status=
    interrupted in turn/completed correctly
  - #80278 (expose activeModel in plugin SDK): not our surface
  - #80792 (default destructive_actions on): we don't expose that knob

56 codex-runtime migration tests still green (+1 new).
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…esearch#25437)

Followup to PR NousResearch#24182 — caught when scanning OpenClaw for recent codex
fixes we hadn't considered. OpenClaw learned the hard way (#80815) that
migrating plugins which codex itself reports as unavailable produces
config that fails at activation time.

Our /codex-runtime codex_app_server enable path queries codex's
plugin/list and migrates everything where installed=true. We were
trusting codex's installation state and ignoring its availability
field. So a plugin that's installed=true but availability=UNAVAILABLE
(broken local install) or REQUIRES_AUTH (OAuth expired or never
completed) would get an [plugins."<n>@openai-curated"] entry in
~/.codex/config.toml — and the user's first codex turn after enabling
the runtime would fail because codex refuses to activate it.

Fix: filter on availability in _query_codex_plugins(). Only emit
plugins where availability is empty (older codex versions without the
field — preserve backward compat) or explicitly AVAILABLE.

Tests:
  test_plugin_discovery_skips_unavailable_plugins — verifies 4 cases:
    - good-plugin (installed=True, availability=AVAILABLE) → migrated
    - broken-plugin (installed=True, availability=UNAVAILABLE) → skipped
    - auth-pending (installed=True, availability=REQUIRES_AUTH) → skipped
    - legacy-plugin (installed=True, no availability field) → migrated
      (older codex versions; preserve backward compat)

Docs:
  Added bullet to 'What's NOT migrated' list in the docs page calling
  out the availability filter and why.

Other OpenClaw codex PRs I reviewed but did NOT apply (with reasoning):
  - #81591 (load Codex for selectable models): we resolve runtime
    per-call already, no startup-time gating to fix
  - #81510 (cron compatibility): we documented cron as untested; their
    fix is for OpenClaw-specific cron orchestration shape
  - #81223 (rotate incompatible context-engine threads): we don't
    have a Lossless context engine equivalent
  - #80688 (constrain sandbox): we don't have an outer-sandbox concept
  - #80616 (release on turn_aborted): we already handle status=
    interrupted in turn/completed correctly
  - #80278 (expose activeModel in plugin SDK): not our surface
  - #80792 (default destructive_actions on): we don't expose that knob

56 codex-runtime migration tests still green (+1 new).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/plugins Plugin system and bundled plugins P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants