Skip to content

feat: /goal — persistent cross-turn goals (Ralph loop)#18262

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-93a0c9ac
May 1, 2026
Merged

feat: /goal — persistent cross-turn goals (Ralph loop)#18262
teknium1 merged 1 commit into
mainfrom
hermes/hermes-93a0c9ac

Conversation

@teknium1

@teknium1 teknium1 commented May 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds /goal <text>, a standing-goal slash command that keeps Hermes working toward a stated objective across turns until it is achieved, paused, or the turn budget runs out.

Inspiration / prior art: our take on the Ralph loop, directly inspired by Codex CLI 0.128.0's /goal — built by Eric Traut (Pyright) on the Codex team. Same core idea (keep the goal alive across turns, don't stop until it's achieved); implementation is independent and adapted to Hermes' architecture (central CommandDef registry, SessionDB.state_meta persistence, auxiliary-client judge, adapter-FIFO continuation on gateway).

After each turn, an auxiliary-model judge call asks 'is this goal satisfied by the assistant's last response?'. If not, Hermes feeds a continuation prompt back into the same session as a normal user turn. Any real user message preempts the loop automatically. Judge failures fail OPEN (continue) so a flaky judge can never wedge progress — the turn budget (default 20) is the real backstop.

Commands

/goal <text> Set standing goal (kicks off first turn immediately)
/goal or /goal status Show current state
/goal pause Pause the continuation loop
/goal resume Resume (resets turn counter)
/goal clear Drop the goal

Works identically on CLI and gateway via the central CommandDef registry.

Design invariants preserved

  • Prompt cache — continuation prompts are regular user-role messages appended to history. No system-prompt mutation, no toolset swap.
  • Role alternation — continuation is a user turn, never injected mid-tool-loop.
  • Session persistence — goal state lives in SessionDB.state_meta keyed by goal:<session_id>, so /resume picks it up.
  • Mid-run safety — on the gateway, /goal status|pause|clear are allowed mid-run (control-plane only); setting a new goal requires /stop first so we don't race a second continuation prompt against the current turn.
  • Any-message preemption — on CLI, goal continuations go through _pending_input so a real user message queued during the judge call runs first. On gateway, they go through the adapter FIFO with the same effect.

Files

  • hermes_cli/goals.py (new) — GoalManager + judge_goal + GoalState
  • hermes_cli/commands.py — CommandDef entry (one line)
  • hermes_cli/config.pygoals.max_turns: 20 default
  • hermes_cli/web_server.py — dashboard category merge (goals → agent)
  • cli.py_handle_goal_command + _maybe_continue_goal_after_turn hook in process_loop
  • gateway/run.py_handle_goal_command + _post_turn_goal_continuation wrapping _handle_message_with_agent
  • tests/hermes_cli/test_goals.py (new, 26 tests)
  • website/docs/reference/slash-commands.md

Validation

Targeted tests 26/26 in tests/hermes_cli/test_goals.py
Broader tests/hermes_cli/ 3519 passed (1 pre-existing copilot-auth flake, unrelated)
tests/gateway/ 4407 passed (1 pre-existing test_teams.py plugin flake, unrelated)
tests/hermes_cli/test_web_server.py Fixed test_no_single_field_categories (merged goals → agent in dashboard)
Live test 1 — judge round-trip done / continue / continue (empty) / continue (partial) — all correct, 2-7s latency
Live test 2 — real CLI loop Goal "print hello, Ralph loop" — turn 1 judge caught ambiguity ('didn't confirm terminal output'), turn 2 agent re-ran with confirmation, judge said done, loop halted cleanly
Live test 3 — budget exhaustion 4-file goal with max_turns=2 → paused at 2/2 turns with correct /goal resume message, zero runaway

Notes

  • goals.max_turns is the only config knob for now. Kept minimal — adding more keys without concrete need would be speculative.
  • Judge uses get_text_auxiliary_client("goal_judge") so users can route it to a cheap model via the auxiliary.goal_judge config override if they want.
  • No _config_version bump needed — adding a new key to DEFAULT_CONFIG is handled by the existing _deep_merge in load_config.

Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
@alt-glitch alt-glitch added type/feature New feature or request P3 Low — cosmetic, nice to have comp/cli CLI entry point, hermes_cli/, setup wizard comp/gateway Gateway runner, session dispatch, delivery labels May 1, 2026
@teknium1 teknium1 merged commit 265bd59 into main May 1, 2026
10 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-93a0c9ac branch May 1, 2026 06:10
teknium1 added a commit that referenced this pull request May 1, 2026
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR #18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
@heung0323

Copy link
Copy Markdown

Post-merge review notes for /goal PR. I found two correctness/reliability issues worth addressing in a follow-up:

  1. P1 — Gateway goal judging blocks the async event loop.
    In gateway/run.py, _handle_message() calls _post_turn_goal_continuation(...) synchronously after _handle_message_with_agent(...) returns. That path calls GoalManager.evaluate_after_turn(), which calls judge_goal(), which performs a blocking auxiliary-model client.chat.completions.create(..., timeout=30). Because this runs inside the gateway async handler, every active /goal turn can stall the gateway event loop for the judge latency/timeout, delaying unrelated Telegram/Discord/etc. messages and even control commands. Please move the judge/evaluate work off the event loop (for example asyncio.to_thread(...) / executor, or an async auxiliary client) and then schedule the status send + FIFO enqueue back on the loop.

  2. P2 — /goal clear can be resurrected by /goal resume on a fresh manager.
    GoalManager.clear() persists the state as status="cleared", but GoalManager.resume() reactivates any stored state without checking that it is currently paused. In gateway, where _get_goal_manager_for_event() creates a fresh manager per command, this means /goal clear followed by /goal resume revives the supposedly cleared goal from state_meta. The same applies to done goals if a fresh manager loads them. resume() should probably only operate on status == "paused", and/or load_goal() should treat cleared as no goal (or actually delete the meta key).

Validation I ran locally:

  • python -m pytest tests/hermes_cli/test_goals.py -q -o 'addopts='26 passed
  • Diff security/supply-chain scan: no package manager, workflow, install script, or dependency-file changes; no obvious credential/lifecycle-command additions in the PR diff.

nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
…18262)

Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR NousResearch#18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…18262)

Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR NousResearch#18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…18262)

Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR NousResearch#18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…18262)

Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR NousResearch#18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…18262)

Add a standing-goal slash command that keeps Hermes working toward a
user-stated objective across turns until it is achieved, paused, or
the turn budget runs out. Our take on the Ralph loop — cf. Codex CLI
0.128.0's /goal.

After each turn, a lightweight auxiliary-model judge call asks 'is
this goal satisfied by the assistant's last response?'. If not, and
we're under the turn budget (default 20), Hermes feeds a continuation
prompt back into the same session as a normal user message. Any real
user message preempts the continuation loop automatically.

Judge failures fail OPEN (continue) so a flaky judge never wedges
progress — the turn budget is the real backstop.

### Commands

- `/goal <text>`    — set a standing goal (kicks off the first turn)
- `/goal` or `/goal status` — show current state
- `/goal pause`    — pause the continuation loop
- `/goal resume`   — resume (resets turn counter)
- `/goal clear`    — drop the goal

Works on both CLI and gateway platforms via the central CommandDef
registry.

### Design invariants preserved

- **Prompt cache**: continuation prompts are regular user-role
  messages appended to history. No system-prompt mutation, no toolset
  swap.
- **Role alternation**: continuation is a user turn, never injected
  mid-tool-loop.
- **Session persistence**: goal state lives in SessionDB.state_meta
  keyed by `goal:<session_id>`, so `/resume` picks it up.
- **Mid-run safety**: on the gateway, `/goal status|pause|clear` are
  allowed mid-run (control-plane only); setting a new goal requires
  `/stop` first so we don't race a second continuation prompt against
  the current turn.

### Files

- `hermes_cli/goals.py` (new, 380 lines) — GoalManager + judge + state
- `hermes_cli/commands.py` — CommandDef entry
- `hermes_cli/config.py` — `goals.max_turns` default
- `hermes_cli/web_server.py` — dashboard category merge
- `cli.py` — /goal handler + post-turn continuation hook in
  process_loop
- `gateway/run.py` — /goal handler + post-turn continuation hook
  wrapping _handle_message_with_agent
- `tests/hermes_cli/test_goals.py` (new, 26 tests) — judge parsing,
  fail-open semantics, lifecycle, persistence, budget exhaustion
- `website/docs/reference/slash-commands.md` — docs entry
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
Adds a proper feature page at user-guide/features/goals.md covering
the /goal slash command — Hermes' take on the Ralph loop shipped in
PR NousResearch#18262. The slash-commands reference table had two table rows but
no narrative doc walking through the judge model, fail-open semantics,
turn budget, persistence, user-message preemption, or the aux-model
config override.

Adds a walkthrough example showing a multi-turn goal running to
completion, covers the two judge failure modes with how to recover,
and credits Codex CLI 0.128.0 / Eric Traut as prior art.

Also cross-links both slash-commands.md rows to the new page so
readers discovering /goal from the command reference can dive in.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cli CLI entry point, hermes_cli/, setup wizard comp/gateway Gateway runner, session dispatch, delivery P3 Low — cosmetic, nice to have type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants