Bug Description
hermes config set delegation.model <X> writes config.yaml on disk and reports success, but the change does not take effect for delegate_task in any already-running process (CLI/gateway/cron). The runtime keeps using the previous delegation.model until the process is restarted.
This is silent — there's no warning that a restart is required, and the next delegate_task call happily runs on the old model with no indication anything is stale.
Steps to Reproduce
- Start a CLI/gateway session with delegation.model: google/gemini-2.5-flash in config.yaml.
- From a separate shell (or mid-session), run hermes config set delegation.model google/gemini-3-flash-preview. Confirm config.yaml on disk now shows google/gemini-3-flash-preview.
- From the original session, call delegate_task(...). The subagent runs on google/gemini-2.5-flash, not the new model.
- Restart the process. Now delegate_task correctly uses google/gemini-3-flash-preview.
Minimal reproducer
This 30-line script reproduces the bug deterministically without spinning up a real subagent:
import os, sys, yaml, tempfile, pathlib

# Isolated HERMES_HOME with a known starting config
tmp = pathlib.Path(tempfile.mkdtemp(prefix="hermes-repro-"))
cfg_path = tmp / "config.yaml"
cfg_path.write_text(yaml.safe_dump({
    "model": {"default": "claude-opus-4-7", "provider": "anthropic"},
    "delegation": {"model": "google/gemini-2.5-flash", "provider": "nous"},
}))
os.environ["HERMES_HOME"] = str(tmp)
sys.path.insert(0, "/path/to/hermes-agent")

# Simulate gateway startup: CLI_CONFIG loaded once at import time
from hermes_cli.config import load_config
CLI_CONFIG_v1 = load_config()

# Simulate `hermes config set delegation.model google/gemini-3-flash-preview`
data = yaml.safe_load(cfg_path.read_text())
data["delegation"]["model"] = "google/gemini-3-flash-preview"
cfg_path.write_text(yaml.safe_dump(data))

# Inject the cached CLI_CONFIG to simulate running gateway state
import cli as cli_mod
cli_mod.CLI_CONFIG = CLI_CONFIG_v1

# Call the delegate_task code path
from tools.delegate_tool import _load_config
result = _load_config()

# Compare against what is actually on disk rather than a hard-coded string
on_disk = yaml.safe_load(cfg_path.read_text())["delegation"]["model"]
print(f"Returns: {result['model']}")
print(f"On disk: {on_disk}")
assert result["model"] == on_disk, "stale cache wins"
Output:
Returns: google/gemini-2.5-flash
On disk: google/gemini-3-flash-preview
AssertionError: stale cache wins
Root cause
tools/delegate_tool.py line 2322 — _load_config() checks the in-memory cli.CLI_CONFIG first and only falls through to a fresh disk read when CLI_CONFIG is empty:
def _load_config() -> dict:
    """Load delegation config from CLI_CONFIG or persistent config."""
    try:
        from cli import CLI_CONFIG
        cfg = CLI_CONFIG.get("delegation", {})
        if cfg:
            return cfg  # <-- runtime cache wins; disk ignored
    except Exception:
        pass
    try:
        from hermes_cli.config import load_config
        full = load_config()
        return full.get("delegation", {})
    except Exception:
        return {}
CLI_CONFIG is initialized once at cli.py:600 (CLI_CONFIG = load_cli_config()) and never refreshed. hermes config set mutates the file on disk but has no IPC channel to notify the running process. Result: stale cache silently wins for the lifetime of the process.
Why this is more than cosmetic
This makes A/B testing delegation models on a live session impossible without a full gateway restart, and the failure is silent — hermes config set reports success, the file on disk is correct, but the running agent reads stale values. Users will wrongly attribute behavior changes (or non-changes) to the wrong model.
It also affects any other delegation.* knob (provider, max_iterations, max_concurrent_children, reasoning_effort, etc.) since they all flow through the same _load_config() path.
Proposed fix
Smallest change with no protocol/IPC work: have _load_config() always reload from disk. The read is cheap: config.yaml is small, and reading it dozens of times per session is fine. The CLI_CONFIG path was a micro-optimization, and the staleness footgun it introduces costs more than the disk reads it saves.
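A minimal sketch of the always-reload shape, written so it runs on its own. The `loader` parameter, `load_delegation_config`, and `_read_json_config` are hypothetical names introduced for illustration; in Hermes the loader would presumably be `hermes_cli.config.load_config`, and the file is YAML rather than the JSON stand-in used here.

```python
import json
import pathlib
import tempfile
from typing import Callable


def load_delegation_config(loader: Callable[[], dict]) -> dict:
    """Return the delegation section, re-reading persistent config on every call.

    `loader` is a hypothetical injection point; no in-memory cache is consulted.
    """
    try:
        full = loader()
    except Exception:
        # Mirror the original function: an unreadable config yields {}.
        return {}
    return (full or {}).get("delegation") or {}


# Stand-in for config.yaml (JSON only to keep the sketch stdlib-only).
cfg_path = pathlib.Path(tempfile.mkdtemp(prefix="reload-sketch-")) / "config.json"


def _read_json_config() -> dict:
    return json.loads(cfg_path.read_text())


cfg_path.write_text(json.dumps({"delegation": {"model": "google/gemini-2.5-flash"}}))
before = load_delegation_config(_read_json_config)["model"]

# Simulate `hermes config set delegation.model ...` run from another shell.
cfg_path.write_text(json.dumps({"delegation": {"model": "google/gemini-3-flash-preview"}}))
after = load_delegation_config(_read_json_config)["model"]

print(before, "->", after)
```

Because nothing is cached, the second call sees the edit without any restart or notification channel, which is the property the regression test would pin down.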
Alternative: stat config.yaml's mtime in _load_config() and only reload when it has advanced since the last read. Slightly more code, fewer disk reads.
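The mtime-gated alternative could look roughly like the sketch below. `MtimeCachedConfig` and its `parse_count` counter are hypothetical and exist only for illustration, and JSON again stands in for YAML so the snippet needs nothing outside the standard library. One deliberate deviation from the wording above: the sketch reloads whenever the (mtime, size) pair differs, not only when mtime has advanced, as a guard against coarse filesystem timestamps.

```python
import json
import os
import pathlib
import tempfile
from typing import Optional, Tuple


class MtimeCachedConfig:
    """Re-parse a config file only when its stat signature changes (hypothetical helper)."""

    def __init__(self, path: pathlib.Path) -> None:
        self._path = path
        self._signature: Optional[Tuple[int, int]] = None
        self._data: dict = {}
        self.parse_count = 0  # exposed only so the sketch can show cache hits

    def get(self) -> dict:
        try:
            st = os.stat(self._path)
        except OSError:
            # Missing or unreadable file: drop the cache and report no config.
            self._signature, self._data = None, {}
            return {}
        signature = (st.st_mtime_ns, st.st_size)
        if signature != self._signature:
            # Parse first, then record the signature, so a failed parse retries.
            self._data = json.loads(self._path.read_text())
            self._signature = signature
            self.parse_count += 1
        return self._data.get("delegation", {})


cfg_path = pathlib.Path(tempfile.mkdtemp(prefix="mtime-sketch-")) / "config.json"
cfg_path.write_text(json.dumps({"delegation": {"model": "google/gemini-2.5-flash"}}))

cache = MtimeCachedConfig(cfg_path)
first = cache.get()["model"]
cache.get()  # unchanged file: served from memory, no re-parse
parses_before_edit = cache.parse_count

# Simulate an external edit to the config file.
cfg_path.write_text(json.dumps({"delegation": {"model": "google/gemini-3-flash-preview"}}))
second = cache.get()["model"]
```

A same-size edit that lands within a single timestamp tick could still be missed by this check, which is one more argument for the simpler always-reload version.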
Happy to submit a PR with the simpler "always reload" version plus a regression test in tests/tools/test_delegate_tool.py.
Environment
- Hermes Agent v0.12.0 (2026.4.30)
- Reproduces on current main (HEAD as of issue filing)
- OS: Linux
Related
- /config display reads stale source; symptom-adjacent (display side) but distinct from this (write-side staleness)
- /config is read-only in TUI mid-session: implicitly acknowledges runtime mutation isn't well-supported, but doesn't cover the silent-failure shape of this bug