feat(compression): raise compaction trigger to 85% for gpt-5.5 on Codex OAuth #40957

Merged: teknium1 merged 1 commit into main from feat/codex-gpt55-compaction-autoraise on Jun 7, 2026

teknium1 (Contributor) commented Jun 7, 2026

Infographic

[Image: Codex gpt-5.5 smarter compaction infographic]

Summary

The ChatGPT Codex OAuth backend hard-caps gpt-5.5 at a 272K context window. At the default 50% compaction trigger, Hermes starts summarizing at ~136K tokens — half the window the model can actually use. This PR raises the trigger to 85% (~231K) for that one route, gated by a new opt-out config flag, and notifies the user once with the exact revert command.

The same gpt-5.5 slug exposes a much larger window on other routes (1.05M on the direct OpenAI API and OpenRouter, 400K on GitHub Copilot), so the autoraise is scoped to the Codex OAuth route only — every other provider keeps the user's global compression.threshold.

Why 272K is real (not a metadata bug)

Verified live against chatgpt.com/backend-api/codex:

| Request size | Result |
| --- | --- |
| /models probe | context_window: 272000, max_context_window: 272000 |
| ~250,022 tokens (under 272K) | completed (server reported input_tokens=250022) |
| ~330,000 tokens (over 272K, under 400K) | rejected: context_length_exceeded |

A request the server itself counted at 250K input tokens went through; bumping to ~330K — which would fit a 400K window — was hard-rejected. The cap is genuine and enforced by the Codex backend.
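
The reasoning above can be checked mechanically. The following is a minimal sketch, not code from this PR: it assumes a request is accepted exactly when its input token count fits the window (ignoring any output-token reservation), and the dictionary keys are illustrative labels rather than real provider identifiers.

```python
# Candidate gpt-5.5 context windows mentioned in this PR, in tokens.
# The key names are illustrative labels, not identifiers used by Hermes.
CANDIDATE_WINDOWS = {
    "codex_oauth_probe": 272_000,
    "github_copilot": 400_000,
    "openai_api_or_openrouter": 1_050_000,
}

# Observations from the live test: (approximate input tokens, accepted?)
OBSERVATIONS = [
    (250_022, True),
    (330_000, False),
]


def consistent_with(window: int, observations) -> bool:
    """True if every accepted request fits the window and every rejected one exceeds it."""
    return all((tokens <= window) == accepted for tokens, accepted in observations)


# Keep only the candidate windows that explain both observations.
surviving = {
    name: window
    for name, window in CANDIDATE_WINDOWS.items()
    if consistent_with(window, OBSERVATIONS)
}
print(surviving)
```

Under that assumption only the 272K value from the /models probe explains both results; a 400K or 1.05M window would have accepted the ~330K request.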

Changes

  • agent/auxiliary_client.py
    • New _is_codex_gpt55(model, provider) — matches the gpt-5.5 family (incl. -pro, dated snapshots, aggregator-prefixed) only when provider == "openai-codex".
    • _compression_threshold_for_model() is now provider-aware and takes an allow_codex_gpt55_autoraise flag; returns 0.85 for Codex gpt-5.5, None (use global) otherwise. The existing Arcee Trinity 0.75 override is unchanged and unaffected by the new flag.
  • hermes_cli/config.py — new compression.codex_gpt55_autoraise key (default true); _config_version bumped 27 → 28 so existing configs get the key backfilled on migration.
  • agent/agent_init.py — reads the flag, passes provider into the threshold resolver, and emits a one-time notice (inline print for CLI; _compression_warning replay through status_callback for gateway) with the opt-out command. New _build_codex_gpt55_autoraise_notice() helper builds the shared text.
  • Docs: developer-guide/context-compression-and-caching.md documents the new key and route-scoped behavior.
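
To make the gating described in the list above concrete, here is a rough sketch of how such a matcher and resolver could look. It is not the code from agent/auxiliary_client.py: the regex, the assumed slug shapes (an optional `vendor/` prefix, an optional `-pro`, an optional `-YYYY-MM-DD` snapshot suffix), the function names without the leading underscore, and the substring test standing in for the Arcee Trinity rule are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed slug shape (hypothetical): optional "vendor/" prefix, "gpt-5.5",
# then an optional "-pro" and/or a dated-snapshot suffix like "-2026-05-01".
_GPT55_RE = re.compile(
    r"^(?:[\w.-]+/)?gpt-5\.5(?:-pro)?(?:-\d{4}-\d{2}-\d{2})?$",
    re.IGNORECASE,
)


def is_codex_gpt55(model: str, provider: str) -> bool:
    """Match the gpt-5.5 family, but only on the Codex OAuth provider."""
    return provider == "openai-codex" and bool(_GPT55_RE.match(model.strip()))


def threshold_for_model(
    model: str,
    provider: str,
    allow_codex_gpt55_autoraise: bool = True,
) -> Optional[float]:
    """Return a per-model threshold override, or None to use the global one."""
    # Stand-in for the existing Arcee Trinity override. The real match rule
    # is not shown in the PR, so this substring test is an assumption.
    if "trinity" in model.lower():
        return 0.75
    # The opt-out flag only gates the Codex gpt-5.5 branch.
    if allow_codex_gpt55_autoraise and is_codex_gpt55(model, provider):
        return 0.85
    return None  # caller falls back to the global compression.threshold
```

With this shape, `threshold_for_model("gpt-5.5", "openrouter")` returns `None`, so the global threshold applies, and passing `allow_codex_gpt55_autoraise=False` leaves the 0.75 branch untouched because that check runs before the flag is consulted.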

Opt-out

hermes config set compression.codex_gpt55_autoraise false
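
For readers wondering how an explicit opt-out survives the 27 → 28 backfill, the sketch below shows one plausible shape. It is an assumption-laden illustration, not the migration in hermes_cli/config.py: it treats the config as a plain nested dict with `_config_version` stored as a top-level key, and `migrate` and `autoraise_enabled` are hypothetical helper names.

```python
import copy

NEW_VERSION = 28  # the PR bumps _config_version from 27 to 28


def migrate(config: dict) -> dict:
    """Return a copy with the new key backfilled; user-set values are preserved."""
    migrated = copy.deepcopy(config)
    if migrated.get("_config_version", 0) < NEW_VERSION:
        compression = migrated.setdefault("compression", {})
        # setdefault only writes the key when it is missing, so an explicit
        # "false" written by the user is never overwritten.
        compression.setdefault("codex_gpt55_autoraise", True)
        migrated["_config_version"] = NEW_VERSION
    return migrated


def autoraise_enabled(config: dict) -> bool:
    """Read the flag, treating a missing key as the default (true)."""
    return bool(config.get("compression", {}).get("codex_gpt55_autoraise", True))


# A pre-migration config and a config where the user has already opted out.
old = {"_config_version": 27, "compression": {"threshold": 0.5}}
opted_out = {"_config_version": 28, "compression": {"codex_gpt55_autoraise": False}}

print(autoraise_enabled(migrate(old)))        # flag backfilled to the default
print(autoraise_enabled(migrate(opted_out)))  # explicit opt-out is kept
```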

Tests

tests/agent/test_arcee_trinity_overrides.py extended with Codex gpt-5.5 cases (provider gating, family vs sibling-slug matching, opt-out, and the guarantee that opt-out does not disable the Arcee Trinity override). 40/40 pass.

End-to-end verified on a real AIAgent against the live 272K context length:

  • default → threshold_tokens = 231,200 (85%), notice + _compression_warning set
  • opt-out → threshold_tokens = 136,000 (50%), no notice
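
The two figures above follow directly from the 272K window. The sketch below reproduces the arithmetic; the `resolve` helper and the notice wording are illustrative assumptions, not the text that Hermes prints or the code path in agent/agent_init.py.

```python
CODEX_GPT55_WINDOW = 272_000   # context window reported by the /models probe
GLOBAL_THRESHOLD = 0.50        # default compression.threshold
AUTORAISE_THRESHOLD = 0.85     # override for gpt-5.5 on Codex OAuth

OPT_OUT_COMMAND = "hermes config set compression.codex_gpt55_autoraise false"


def resolve(autoraise: bool):
    """Return (threshold_tokens, notice) for gpt-5.5 on the Codex OAuth route."""
    ratio = AUTORAISE_THRESHOLD if autoraise else GLOBAL_THRESHOLD
    # round() guards against floating-point noise in the multiplication.
    threshold_tokens = round(CODEX_GPT55_WINDOW * ratio)
    # Hypothetical notice wording; only the opt-out command comes from the PR.
    notice = (
        f"Compaction trigger raised to {ratio:.0%}; revert with: {OPT_OUT_COMMAND}"
        if autoraise
        else None
    )
    return threshold_tokens, notice


print(resolve(True)[0])   # tokens at the 85% trigger
print(resolve(False)[0])  # tokens at the 50% trigger
```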

…ex OAuth

The ChatGPT Codex OAuth backend hard-caps gpt-5.5 at a 272K context window
(verified live: a ~330K-token request to chatgpt.com/backend-api/codex/responses
is rejected with context_length_exceeded while ~250K succeeds; the same slug
exposes 1.05M on the direct OpenAI API / OpenRouter and 400K on Copilot). At the
default 50% trigger, auto-compaction fires at ~136K — half the usable window.

Raise the trigger to 85% (~231K) on this exact route only, gated by a new
compression.codex_gpt55_autoraise config flag (default true). When it fires,
emit a one-time notice (CLI inline print + gateway status_callback replay) with
the exact opt-back-out command. gpt-5.5 on any other provider keeps the user's
global threshold.

- _is_codex_gpt55() matches the 5.5 family only on provider=openai-codex
- _compression_threshold_for_model() now provider-aware + opt-out param
- config key + _config_version bump (27->28) for backfill
- docs + tests (40 cases in test_arcee_trinity_overrides.py)

github-actions (bot, Contributor) commented Jun 7, 2026

🔎 Lint report: feat/codex-gpt55-compaction-autoraise vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9970 on HEAD, 9970 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5172 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

alt-glitch added the labels type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder), and provider/openai (OpenAI / Codex Responses API) on Jun 7, 2026
teknium1 merged commit 0524c9b into main on Jun 7, 2026 (24 checks passed)
teknium1 deleted the feat/codex-gpt55-compaction-autoraise branch on June 7, 2026 08:40
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…ex OAuth (NousResearch#40957)
