feat: /goal — persistent cross-turn goals (Ralph loop) by teknium1 · Pull Request #18262 · NousResearch/hermes-agent

teknium1 · 2026-05-01T05:48:11Z

Summary

Adds /goal <text>, a standing-goal slash command that keeps Hermes working toward a stated objective across turns until it is achieved, paused, or the turn budget runs out.

Inspiration / prior art: our take on the Ralph loop, directly inspired by Codex CLI 0.128.0's /goal — built by Eric Traut (Pyright) on the Codex team. Same core idea (keep the goal alive across turns, don't stop until it's achieved); implementation is independent and adapted to Hermes' architecture (central CommandDef registry, SessionDB.state_meta persistence, auxiliary-client judge, adapter-FIFO continuation on gateway).

After each turn, an auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, Hermes feeds a continuation prompt back into the same session as a normal user turn. Any real user message preempts the loop automatically. Judge failures fail OPEN (continue) so a flaky judge can never wedge progress — the turn budget (default 20) is the real backstop.
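
The post-turn decision described above can be sketched roughly as follows. This is an illustration only, not the code in hermes_cli/goals.py: `SketchGoal`, its fields, and the `judge` callable signature are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SketchGoal:
    """Hypothetical goal record; field names are assumptions, not the real GoalState."""
    text: str
    status: str = "active"  # "active", "paused", or "done"
    turns_used: int = 0
    max_turns: int = 20     # mirrors the documented default budget


def evaluate_after_turn(goal: SketchGoal, last_response: str,
                        judge: Callable[[str, str], bool]) -> str:
    """Return "done", "paused", or "continue" for the turn that just finished."""
    if goal.status != "active":
        return goal.status
    goal.turns_used += 1
    try:
        satisfied = judge(goal.text, last_response)
    except Exception:
        # Fail open: a broken judge counts as "not satisfied yet".
        satisfied = False
    if satisfied:
        goal.status = "done"
        return "done"
    if goal.turns_used >= goal.max_turns:
        # The turn budget is the backstop when the judge never says "done".
        goal.status = "paused"
        return "paused"
    return "continue"
```

In this sketch a judge that always raises still terminates: the loop pauses once `turns_used` reaches `max_turns`.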

Commands


| Command | Effect |
| --- | --- |
| `/goal <text>` | Set standing goal (kicks off first turn immediately) |
| `/goal` or `/goal status` | Show current state |
| `/goal pause` | Pause the continuation loop |
| `/goal resume` | Resume (resets turn counter) |
| `/goal clear` | Drop the goal |

Works identically on CLI and gateway via the central CommandDef registry.
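
As a rough sketch of how the subcommands listed above could be told apart from free-form goal text, assuming exact lowercase keyword matching (the real handler's parsing rules are not shown in this PR description, and `parse_goal_command` is a hypothetical name):

```python
from typing import Tuple


def parse_goal_command(raw: str) -> Tuple[str, str]:
    """Split a "/goal ..." line into (action, argument).

    Hypothetical helper: anything that is not a known keyword is
    treated as the text of a new standing goal.
    """
    parts = raw.strip().split(maxsplit=1)
    if not parts or parts[0] != "/goal":
        raise ValueError("not a /goal command")
    rest = parts[1].strip() if len(parts) > 1 else ""
    if rest in ("", "status"):
        return ("status", "")
    if rest in ("pause", "resume", "clear"):
        return (rest, "")
    return ("set", rest)
```

One consequence of this shape is that a goal whose entire text is a reserved word such as "pause" cannot be set; whether Hermes behaves the same way is not stated here.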

Design invariants preserved

Prompt cache — continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap.
Role alternation — continuation is a user turn, never injected mid-tool-loop.
Session persistence — goal state lives in SessionDB.state_meta keyed by goal:<session_id>, so /resume picks it up.
Mid-run safety — on the gateway, /goal status|pause|clear are allowed mid-run (control-plane only); setting a new goal requires /stop first so we don't race a second continuation prompt against the current turn.
Any-message preemption — on CLI, goal continuations go through _pending_input so a real user message queued during the judge call runs first. On gateway, they go through the adapter FIFO with the same effect.
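
The any-message preemption invariant can be illustrated with a plain FIFO. This is a sketch under the assumption that user messages and goal continuations share one queue; `PendingInput` and its methods are hypothetical names, not the `_pending_input` or adapter FIFO implementations.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class PendingInput:
    """Hypothetical single FIFO shared by user messages and goal continuations."""

    def __init__(self) -> None:
        self._queue: Deque[Tuple[str, str]] = deque()

    def push_user(self, text: str) -> None:
        self._queue.append(("user", text))

    def push_continuation(self, prompt: str) -> None:
        # Continuations are appended at the back, so anything the user
        # queued while the judge call was in flight is dequeued first.
        self._queue.append(("goal", prompt))

    def pop(self) -> Optional[Tuple[str, str]]:
        return self._queue.popleft() if self._queue else None
```

Because the continuation is only enqueued after the judge returns, a message typed during the judge call is already ahead of it in the queue.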

Files

hermes_cli/goals.py (new) — GoalManager + judge_goal + GoalState
hermes_cli/commands.py — CommandDef entry (one line)
hermes_cli/config.py — goals.max_turns: 20 default
hermes_cli/web_server.py — dashboard category merge (goals → agent)
cli.py — _handle_goal_command + _maybe_continue_goal_after_turn hook in process_loop
gateway/run.py — _handle_goal_command + _post_turn_goal_continuation wrapping _handle_message_with_agent
tests/hermes_cli/test_goals.py (new, 26 tests)
website/docs/reference/slash-commands.md

Validation


| Check | Result |
| --- | --- |
| Targeted tests | 26/26 in `tests/hermes_cli/test_goals.py` |
| Broader `tests/hermes_cli/` | 3519 passed (1 pre-existing copilot-auth flake, unrelated) |
| `tests/gateway/` | 4407 passed (1 pre-existing `test_teams.py` plugin flake, unrelated) |
| `tests/hermes_cli/test_web_server.py` | Fixed `test_no_single_field_categories` (merged `goals → agent` in dashboard) |
| Live test 1: judge round-trip | `done` / `continue` / `continue` (empty) / `continue` (partial), all correct, 2-7s latency |
| Live test 2: real CLI loop | Goal "print hello, Ralph loop": turn 1 judge caught ambiguity ('didn't confirm terminal output'), turn 2 agent re-ran with confirmation, judge said done, loop halted cleanly |
| Live test 3: budget exhaustion | 4-file goal with `max_turns=2` → paused at 2/2 turns with correct `/goal resume` message, zero runaway |

Notes

goals.max_turns is the only config knob for now. Kept minimal — adding more keys without concrete need would be speculative.
Judge uses get_text_auxiliary_client("goal_judge") so users can route it to a cheap model via the auxiliary.goal_judge config override if they want.
No _config_version bump needed — adding a new key to DEFAULT_CONFIG is handled by the existing _deep_merge in load_config.
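
To show why a new default key needs no version bump, here is a minimal recursive merge of user config over defaults. It is a sketch of the general technique, not the body of `_deep_merge`; `deep_merge_sketch` and the `display.theme` key are made up for the example.

```python
import copy
from typing import Any, Dict


def deep_merge_sketch(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return defaults with overrides applied recursively; inputs are not mutated."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge_sketch(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Defaults gain a new "goals" section; the user's older file has never heard of it.
DEFAULTS = {"goals": {"max_turns": 20}, "display": {"theme": "dark"}}  # "display" is illustrative
old_user_config = {"display": {"theme": "light"}}

config = deep_merge_sketch(DEFAULTS, old_user_config)
```

A config file written before this PR still ends up with `goals.max_turns` set to 20, while the user's existing overrides are kept.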

Add a standing-goal slash command that keeps Hermes working toward a user-stated objective across turns until it is achieved, paused, or the turn budget runs out. Our take on the Ralph loop; cf. Codex CLI 0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, and we're under the turn budget (default 20), Hermes feeds a continuation prompt back into the same session as a normal user message. Any real user message preempts the continuation loop automatically. Judge failures fail OPEN (continue) so a flaky judge never wedges progress; the turn budget is the real backstop.

### Commands

- `/goal <text>`: set a standing goal (kicks off the first turn)
- `/goal` or `/goal status`: show current state
- `/goal pause`: pause the continuation loop
- `/goal resume`: resume (resets turn counter)
- `/goal clear`: drop the goal

Works on both CLI and gateway platforms via the central CommandDef registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap.
- **Role alternation**: continuation is a user turn, never injected mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are allowed mid-run (control-plane only); setting a new goal requires `/stop` first so we don't race a second continuation prompt against the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines): GoalManager + judge + state
- `hermes_cli/commands.py`: CommandDef entry
- `hermes_cli/config.py`: `goals.max_turns` default
- `hermes_cli/web_server.py`: dashboard category merge
- `cli.py`: /goal handler + post-turn continuation hook in process_loop
- `gateway/run.py`: /goal handler + post-turn continuation hook wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests): judge parsing, fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md`: docs entry

Adds a proper feature page at user-guide/features/goals.md covering the /goal slash command — Hermes' take on the Ralph loop shipped in PR #18262. The slash-commands reference table had two table rows but no narrative doc walking through the judge model, fail-open semantics, turn budget, persistence, user-message preemption, or the aux-model config override. Adds a walkthrough example showing a multi-turn goal running to completion, covers the two judge failure modes with how to recover, and credits Codex CLI 0.128.0 / Eric Traut as prior art. Also cross-links both slash-commands.md rows to the new page so readers discovering /goal from the command reference can dive in.

…ousResearch#18262 + NousResearch#18275

heung0323 · 2026-05-06T21:46:31Z

Post-merge review notes for /goal PR. I found two correctness/reliability issues worth addressing in a follow-up:

P1 — Gateway goal judging blocks the async event loop.
In gateway/run.py, _handle_message() calls _post_turn_goal_continuation(...) synchronously after _handle_message_with_agent(...) returns. That path calls GoalManager.evaluate_after_turn(), which calls judge_goal(), which performs a blocking auxiliary-model client.chat.completions.create(..., timeout=30). Because this runs inside the gateway async handler, every active /goal turn can stall the gateway event loop for the judge latency/timeout, delaying unrelated Telegram/Discord/etc. messages and even control commands. Please move the judge/evaluate work off the event loop (for example asyncio.to_thread(...) / executor, or an async auxiliary client) and then schedule the status send + FIFO enqueue back on the loop.
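
A minimal sketch of the suggested fix, using `asyncio.to_thread` to keep a blocking judge call off the event loop. The function names and the `time.sleep` stand-in are assumptions for illustration; the real judge call and gateway handler are not reproduced here.

```python
import asyncio
import time


def judge_goal_blocking(goal: str, last_response: str) -> str:
    """Stand-in for the blocking auxiliary-model call (hypothetical)."""
    time.sleep(0.2)  # simulates network latency; blocks whichever thread runs it
    return "continue"


async def post_turn_continuation_sketch(goal: str, last_response: str) -> str:
    # Run the blocking judge in a worker thread so the loop keeps serving others.
    verdict = await asyncio.to_thread(judge_goal_blocking, goal, last_response)
    # Back on the event loop: a real handler would send status and enqueue here.
    return verdict


async def demo() -> tuple:
    ticks = 0

    async def heartbeat() -> None:
        # Represents unrelated gateway traffic that must not stall.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    verdict = await post_turn_continuation_sketch("goal text", "last reply")
    hb.cancel()
    try:
        await hb
    except asyncio.CancelledError:
        pass
    return verdict, ticks


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If `judge_goal_blocking` were called directly inside the coroutine, the heartbeat would not advance at all while the judge was running.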
P2 — /goal clear can be resurrected by /goal resume on a fresh manager.
GoalManager.clear() persists the state as status="cleared", but GoalManager.resume() reactivates any stored state without checking that it is currently paused. In gateway, where _get_goal_manager_for_event() creates a fresh manager per command, this means /goal clear followed by /goal resume revives the supposedly cleared goal from state_meta. The same applies to done goals if a fresh manager loads them. resume() should probably only operate on status == "paused", and/or load_goal() should treat cleared as no goal (or actually delete the meta key).
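
One possible shape for the guard, sketched against a plain dict standing in for state_meta. The record layout and function names are assumptions; this combines both suggestions (resume only from paused, and delete the key on clear) and is not the actual GoalManager code.

```python
from typing import Any, Dict

StateMeta = Dict[str, Dict[str, Any]]  # hypothetical stand-in for SessionDB.state_meta


def goal_key(session_id: str) -> str:
    return f"goal:{session_id}"


def clear_goal(meta: StateMeta, session_id: str) -> None:
    """Delete the record outright so a fresh manager finds no goal to load."""
    meta.pop(goal_key(session_id), None)


def resume_goal(meta: StateMeta, session_id: str) -> bool:
    """Reactivate only a paused goal; cleared, done, or missing goals are refused."""
    state = meta.get(goal_key(session_id))
    if state is None or state.get("status") != "paused":
        return False
    state["status"] = "active"
    state["turns_used"] = 0  # matches the documented "resets turn counter" behavior
    return True
```

With the status check in place, a record left over as `cleared` or `done` is rejected even if the key was never deleted.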

Validation I ran locally:

python -m pytest tests/hermes_cli/test_goals.py -q -o 'addopts=' → 26 passed
Diff security/supply-chain scan: no package manager, workflow, install script, or dependency-file changes; no obvious credential/lifecycle-command additions in the PR diff.
