feat(compression): raise compaction trigger to 85% for gpt-5.5 on Codex OAuth by teknium1 · Pull Request #40957 · NousResearch/hermes-agent

teknium1 · 2026-06-07T03:10:58Z


Summary

The ChatGPT Codex OAuth backend hard-caps gpt-5.5 at a 272K context window. At the default 50% compaction trigger, Hermes starts summarizing at ~136K tokens — half the window the model can actually use. This PR raises the trigger to 85% (~231K) for that one route, gated by a new opt-out config flag, and notifies the user once with the exact revert command.

The same gpt-5.5 slug exposes a much larger window on other routes (1.05M on the direct OpenAI API and OpenRouter, 400K on GitHub Copilot), so the autoraise is scoped to the Codex OAuth route only — every other provider keeps the user's global compression.threshold.
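The trigger points quoted above follow directly from the window size and the threshold fraction. A minimal sketch of that arithmetic, where the helper name `trigger_tokens` is illustrative and not part of Hermes:

```python
def trigger_tokens(context_window: int, threshold: float) -> int:
    """Token count at which compaction starts for a window and trigger fraction."""
    # round() guards against float error such as 231199.99999999997.
    return int(round(context_window * threshold))


CODEX_WINDOW = 272_000  # Codex OAuth cap for gpt-5.5, per the description above

default_trigger = trigger_tokens(CODEX_WINDOW, 0.50)
raised_trigger = trigger_tokens(CODEX_WINDOW, 0.85)

print(default_trigger)                   # 136000
print(raised_trigger)                    # 231200
print(raised_trigger - default_trigger)  # 95200 extra tokens before compaction
```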

Why 272K is real (not a metadata bug)

Verified live against chatgpt.com/backend-api/codex:

| Request | Result |
| --- | --- |
| `/models` probe | `context_window: 272000`, `max_context_window: 272000` |
| ~250,022 tokens (under 272K) | completed (server reported `input_tokens=250022`) |
| ~330,000 tokens (over 272K, under 400K) | rejected with `context_length_exceeded` |

A request the server itself counted at 250K input tokens went through; bumping to ~330K — which would fit a 400K window — was hard-rejected. The cap is genuine and enforced by the Codex backend.

Changes

- agent/auxiliary_client.py
  - New _is_codex_gpt55(model, provider): matches the gpt-5.5 family (incl. -pro, dated snapshots, aggregator-prefixed) only when provider == "openai-codex".
  - _compression_threshold_for_model() is now provider-aware and takes an allow_codex_gpt55_autoraise flag; returns 0.85 for Codex gpt-5.5, None (use global) otherwise. The existing Arcee Trinity 0.75 override is unchanged and unaffected by the new flag.
- hermes_cli/config.py: new compression.codex_gpt55_autoraise key (default true); _config_version bumped 27 → 28 so existing configs get the key backfilled on migration.
- agent/agent_init.py: reads the flag, passes provider into the threshold resolver, and emits a one-time notice (inline print for CLI; _compression_warning replay through status_callback for gateway) with the opt-out command. New _build_codex_gpt55_autoraise_notice() helper builds the shared text.
- Docs: developer-guide/context-compression-and-caching.md documents the new key and route-scoped behavior.
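The provider gating described in the list above can be sketched roughly as follows. This is not the code from the PR: the regex, the treatment of suffixes, and the substring test standing in for the Arcee Trinity match are all assumptions, and the function names drop the leading underscore to mark them as illustrative.

```python
import re
from typing import Optional

# Assumed pattern: an optional aggregator prefix such as "openai/", then
# "gpt-5.5", then either the end of the slug or a "-suffix" (for example
# "-pro" or a dated snapshot). The real matcher in the PR may differ.
_GPT55_RE = re.compile(r"^(?:[\w.-]+/)?gpt-5\.5(?:-[\w.-]+)?$")


def is_codex_gpt55(model: str, provider: str) -> bool:
    """True only for the gpt-5.5 family on the Codex OAuth provider."""
    if provider != "openai-codex":
        return False
    return bool(_GPT55_RE.match(model.strip().lower()))


def compression_threshold_for_model(
    model: str,
    provider: str,
    allow_codex_gpt55_autoraise: bool = True,
) -> Optional[float]:
    """Return a per-model override, or None to fall back to the global threshold."""
    # Placeholder for the existing Arcee Trinity override. How the PR
    # identifies that model is not shown, so this substring test is a guess.
    if "trinity" in model.lower():
        return 0.75
    # The opt-out flag only affects the Codex gpt-5.5 branch.
    if allow_codex_gpt55_autoraise and is_codex_gpt55(model, provider):
        return 0.85
    return None


print(compression_threshold_for_model("gpt-5.5", "openai-codex"))         # 0.85
print(compression_threshold_for_model("gpt-5.5", "openrouter"))           # None
print(compression_threshold_for_model("gpt-5.5", "openai-codex", False))  # None
```

Checking the Trinity branch before the flag keeps the two overrides independent, which is the property the PR's tests are said to guarantee. Which slugs count as siblings to exclude is not stated in the description; this sketch only rejects slugs that do not start with exactly `gpt-5.5`.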

Opt-out

hermes config set compression.codex_gpt55_autoraise false
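For existing installs, the key set by this command is backfilled when the config version moves from 27 to 28. A hedged sketch of what such a backfill might look like, assuming a plain nested dict with `_config_version` at the top level (the actual migration code in hermes_cli/config.py is not shown in this PR description):

```python
import copy

CURRENT_CONFIG_VERSION = 28  # bumped from 27 in this PR


def migrate_config(config: dict) -> dict:
    """Backfill compression.codex_gpt55_autoraise for configs older than v28."""
    migrated = copy.deepcopy(config)  # leave the caller's dict untouched
    if migrated.get("_config_version", 0) < CURRENT_CONFIG_VERSION:
        compression = migrated.setdefault("compression", {})
        # setdefault only fills a missing key, so an explicit user choice survives.
        compression.setdefault("codex_gpt55_autoraise", True)
        migrated["_config_version"] = CURRENT_CONFIG_VERSION
    return migrated


old = {"_config_version": 27, "compression": {"threshold": 0.5}}
new = migrate_config(old)
print(new["compression"]["codex_gpt55_autoraise"])  # True
print(new["_config_version"])                       # 28
print(new["compression"]["threshold"])              # 0.5
```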

Tests

tests/agent/test_arcee_trinity_overrides.py extended with Codex gpt-5.5 cases (provider gating, family vs sibling-slug matching, opt-out, and the guarantee that opt-out does not disable the Arcee Trinity override). 40/40 pass.

End-to-end verified on a real AIAgent against the live 272K context length:

default → threshold_tokens = 231,200 (85%), notice + _compression_warning set
opt-out → threshold_tokens = 136,000 (50%), no notice
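The one-time notice behavior verified above could be modeled as below. The notice wording and the `NoticeEmitter` class are hypothetical; only the opt-out command and the CLI-print versus status_callback split come from the PR description.

```python
from typing import Callable, Optional

OPT_OUT_COMMAND = "hermes config set compression.codex_gpt55_autoraise false"


def build_notice(threshold: float, context_length: int) -> str:
    """Compose the notice text; the wording here is illustrative."""
    tokens = int(round(context_length * threshold))
    return (
        f"Compaction trigger raised to {threshold:.0%} (~{tokens:,} tokens) "
        f"for gpt-5.5 on Codex OAuth. To revert: {OPT_OUT_COMMAND}"
    )


class NoticeEmitter:
    """Deliver a notice at most once, via a callback if given, else print."""

    def __init__(self, status_callback: Optional[Callable[[str], None]] = None):
        self._status_callback = status_callback
        self._emitted = False

    def emit(self, text: str) -> bool:
        if self._emitted:
            return False  # already shown; stay quiet
        self._emitted = True
        if self._status_callback is not None:
            self._status_callback(text)  # gateway-style delivery
        else:
            print(text)  # CLI-style inline print
        return True


received = []
emitter = NoticeEmitter(status_callback=received.append)
emitter.emit(build_notice(0.85, 272_000))
emitter.emit(build_notice(0.85, 272_000))  # second call is a no-op
print(len(received))  # 1
```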

…ex OAuth

The ChatGPT Codex OAuth backend hard-caps gpt-5.5 at a 272K context window (verified live: a ~330K-token request to chatgpt.com/backend-api/codex/responses is rejected with context_length_exceeded while ~250K succeeds; the same slug exposes 1.05M on the direct OpenAI API / OpenRouter and 400K on Copilot). At the default 50% trigger, auto-compaction fires at ~136K, half the usable window.

Raise the trigger to 85% (~231K) on this exact route only, gated by a new compression.codex_gpt55_autoraise config flag (default true). When it fires, emit a one-time notice (CLI inline print + gateway status_callback replay) with the exact opt-back-out command. gpt-5.5 on any other provider keeps the user's global threshold.

- _is_codex_gpt55() matches the 5.5 family only on provider=openai-codex
- _compression_threshold_for_model() now provider-aware + opt-out param
- config key + _config_version bump (27->28) for backfill
- docs + tests (40 cases in test_arcee_trinity_overrides.py)

github-actions · 2026-06-07T03:11:53Z

🔎 Lint report: `feat/codex-gpt55-compaction-autoraise` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9970 on HEAD, 9970 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 5172 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.


alt-glitch added labels on Jun 7, 2026: type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder), provider/openai (OpenAI / Codex Responses API)

teknium1 merged commit 0524c9b into main Jun 7, 2026
24 checks passed

teknium1 deleted the feat/codex-gpt55-compaction-autoraise branch June 7, 2026 08:40

agentswe mentioned this pull request Jun 7, 2026

fix(compression): make Codex gpt-5.5 autoraise a floor, never a reduction #41134

Open

13 tasks

sasquatch9818 mentioned this pull request Jun 7, 2026

fix(compression): keep Codex gpt-5.5 autoraise from lowering a higher threshold #41503

Open

19 tasks

JimStenstrom mentioned this pull request Jun 9, 2026

fix(compaction): add mid-turn in-flight compression safety valve #42898

Draft

13 tasks
