feat: add Codex fast mode toggle (/fast command) #6875

Merged: teknium1 merged 1 commit into main from hermes/hermes-d0607f0a on Apr 10, 2026

@teknium1 teknium1 commented Apr 10, 2026

Summary

Salvage of PR #6817 by @g-guthrie — cherry-picked onto current main with cleanups.

Adds a /fast slash command to toggle OpenAI Codex service_tier between normal and priority ("fast") inference. Only exposed for models registered in _FAST_MODE_BACKEND_CONFIG (currently gpt-5.4 on openai-codex).
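
A minimal sketch of how a registry-gated lookup like this could work. The dictionary shape, the helper names `supports_fast_mode` and `resolve_fast_mode_overrides`, and the stored tier value `"fast"` are assumptions for illustration; only the name `_FAST_MODE_BACKEND_CONFIG`, the `openai-codex` / `gpt-5.4` pairing, and the `priority` tier come from this PR.

```python
from typing import Optional

# Hypothetical registry shape: (provider, model) -> request overrides for fast mode.
# The real _FAST_MODE_BACKEND_CONFIG in hermes_cli/models.py may be structured differently.
_FAST_MODE_BACKEND_CONFIG = {
    ("openai-codex", "gpt-5.4"): {"service_tier": "priority"},
}


def supports_fast_mode(provider: str, model: str) -> bool:
    """Return True when the (provider, model) pair is registered for fast mode."""
    return (provider, model) in _FAST_MODE_BACKEND_CONFIG


def resolve_fast_mode_overrides(
    provider: str, model: str, service_tier: Optional[str]
) -> dict:
    """Return request overrides for the route, or {} when fast mode is off or unsupported."""
    if service_tier != "fast":  # assumed config value for the priority tier
        return {}
    # Copy so callers cannot mutate the registry entry.
    return dict(_FAST_MODE_BACKEND_CONFIG.get((provider, model), {}))
```

Keying the registry on the provider and model pair means adding another fast-capable backend is a one-line change, which matches the extensibility goal stated in the commit message.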

What it does

  • /fast — shows current tier status
  • /fast fast — enables priority service tier
  • /fast normal — disables priority tier
  • /fast status — shows current status
  • Hidden from help/autocomplete when the active model doesn't support fast mode
  • Persists to config.yaml under agent.service_tier
  • Also fixes: max_output_tokens is now omitted for the Codex backend, which rejects that parameter
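
The subcommand behavior listed above could be sketched roughly as follows. The function name `handle_fast_command`, its signature, the message strings, and the stored values `"fast"` / `"normal"` are hypothetical; the actual handler in cli.py may differ, and writing the dict back to config.yaml is left out.

```python
def handle_fast_command(arg: str, config: dict, fast_supported: bool) -> str:
    """Apply a /fast subcommand to config['agent'] and return a status message.

    Hypothetical sketch: the real handler also persists the change to config.yaml.
    """
    if not fast_supported:
        # Mirrors the "correctly blocked with message" behavior for unsupported models.
        return "Fast mode is not available for the active model."

    agent_cfg = config.setdefault("agent", {})
    current = agent_cfg.get("service_tier") or "normal"
    choice = arg.strip().lower()

    if choice in ("", "status"):
        # Bare /fast and /fast status both report the current tier.
        return f"Fast mode: {current}"
    if choice in ("fast", "normal"):
        agent_cfg["service_tier"] = choice
        return f"Fast mode set to {choice}"
    return "Usage: /fast [fast|normal|status]"
```

For example, `handle_fast_command("fast", cfg, True)` updates `cfg["agent"]["service_tier"]`, and a following bare `/fast` reports the new tier without changing anything.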

Salvage cleanups from original PR

  • Removed simple_term_menu menu (banned — rendering bugs in tmux/iTerm2)
  • Removed input() fallback (hangs in prompt_toolkit event loop)
  • Bare /fast now shows status (like /reasoning) instead of opening a menu
  • Removed redundant override resolution in _build_api_kwargs — overrides flow solely through request_overrides from route resolution (single source of truth)
  • Updated tests to match
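
To illustrate the single-source-of-truth cleanup, here is a rough sketch of a kwargs builder that takes `request_overrides` as given and applies the `is_codex_backend` guard. The function name `build_api_kwargs`, its parameters, and the `input` key are assumptions; the real `_build_api_kwargs` in run_agent.py is a method on AIAgent and does more than this.

```python
from typing import Optional


def build_api_kwargs(
    model: str,
    messages: list,
    max_tokens: Optional[int],
    is_codex_backend: bool,
    request_overrides: Optional[dict] = None,
) -> dict:
    """Assemble request kwargs; overrides come only from route resolution (sketch)."""
    kwargs = {"model": model, "input": messages}

    # The Codex backend rejects max_output_tokens, so only send it elsewhere.
    if max_tokens is not None and not is_codex_backend:
        kwargs["max_output_tokens"] = max_tokens

    # No second lookup of service_tier here: whatever the route resolved is
    # applied as-is, so there is one place where overrides are decided.
    kwargs.update(request_overrides or {})
    return kwargs
```

Because the builder never re-derives the tier itself, a stale or conflicting value cannot sneak in from a second code path.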

Files changed

  • cli.py: /fast handler, service_tier config parsing, command visibility filter
  • run_agent.py: service_tier + request_overrides on AIAgent, is_codex_backend guard for max_output_tokens
  • hermes_cli/commands.py: CommandDef + command_filter on SlashCommandCompleter
  • hermes_cli/config.py: service_tier in DEFAULT_CONFIG
  • hermes_cli/models.py: _FAST_MODE_BACKEND_CONFIG registry, resolve functions
  • Tests: 224 passed
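
The command visibility filter can be pictured as a predicate applied before commands are offered for help or autocomplete. This is a dependency-free sketch: `visible_commands` and `make_fast_filter` are hypothetical names, and the real `command_filter` hook on SlashCommandCompleter may have a different signature.

```python
from typing import Callable, Iterable, List, Optional

CommandFilter = Callable[[str], bool]


def make_fast_filter(fast_supported: bool) -> CommandFilter:
    """Build a predicate that hides /fast unless the active model supports it."""
    return lambda name: name != "/fast" or fast_supported


def visible_commands(
    commands: Iterable[str], command_filter: Optional[CommandFilter] = None
) -> List[str]:
    """Return the commands to show in help/autocomplete, preserving order."""
    if command_filter is None:
        return list(commands)
    return [name for name in commands if command_filter(name)]
```

Passing the filter in as a callable keeps the completer unaware of model details; it only asks whether each command should be shown.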

Live-verified

  • PTY session with openai-codex OAuth + gpt-5.4
  • /fast → shows normal, /fast fast → enables, verified response works
  • /fast normal → disables, verified response works
  • /fast on claude-sonnet-4 → correctly blocked with message

Salvage of #6817 — will close the original PR after merge with credit.

Add /fast slash command to toggle OpenAI Codex service_tier between
normal and priority ('fast') inference. Only exposed for models
registered in _FAST_MODE_BACKEND_CONFIG (currently gpt-5.4).

- Registry-based backend config for extensibility
- Dynamic command visibility (hidden from help/autocomplete for
  non-supported models) via command_filter on SlashCommandCompleter
- service_tier flows through request_overrides from route resolution
- Omit max_output_tokens for Codex backend (rejects it)
- Persists to config.yaml under agent.service_tier

Salvage cleanup: removed simple_term_menu/input() menu (banned),
bare /fast now shows status like /reasoning. Removed redundant
override resolution in _build_api_kwargs — single source of truth
via request_overrides from route.

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
@teknium1 teknium1 merged commit d416a69 into main Apr 10, 2026
5 of 6 checks passed

@ahmdbom518-cyber ahmdbom518-cyber left a comment


3 participants