fix(delegate): always reload delegation config from disk, drop stale CLI_CONFIG cache (#18946) #18947

Open
yonefive71 wants to merge 1 commit into NousResearch:main from yonefive71:fix/delegation-config-cache-staleness

@yonefive71

Summary

Fixes #18946. hermes config set delegation.<key> <value> wrote config.yaml on disk and reported success, but the change had no effect on the running process: delegate_task continued using the previously cached values until the CLI/gateway was restarted. The failure was silent, with no warning.

The bug

In tools/delegate_tool.py:2322, _load_config() consulted cli.CLI_CONFIG first and only fell through to a fresh disk read when CLI_CONFIG was empty:

def _load_config() -> dict:
    try:
        from cli import CLI_CONFIG
        cfg = CLI_CONFIG.get("delegation", {})
        if cfg:
            return cfg          # <-- runtime cache wins; disk ignored
    except Exception:
        pass
    try:
        from hermes_cli.config import load_config
        full = load_config()
        return full.get("delegation", {})
    except Exception:
        return {}

CLI_CONFIG is initialized once at cli.py:600 (CLI_CONFIG = load_cli_config()) and never refreshed. hermes config set mutates the file on disk but has no IPC channel to notify the running process. Result: stale cache silently wins for the lifetime of the process.
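
The staleness described above can be reproduced in isolation. This is a minimal sketch under stated assumptions, not Hermes code: a JSON file stands in for config.yaml, and load_from_disk, write_to_disk, and cached_config are hypothetical names playing the roles of load_cli_config(), hermes config set, and CLI_CONFIG.

```python
import json
import os
import tempfile


def load_from_disk(path):
    """Hypothetical stand-in for a config loader: parse the file on every call."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def write_to_disk(path, cfg):
    """Hypothetical stand-in for `hermes config set`: rewrite the file on disk."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(cfg, fh)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    write_to_disk(path, {"delegation": {"model": "old-model"}})

    # Loaded once at startup and never refreshed, like CLI_CONFIG.
    cached_config = load_from_disk(path)

    # A later write changes the file but cannot reach the in-memory copy.
    write_to_disk(path, {"delegation": {"model": "new-model"}})

    stale = cached_config["delegation"]["model"]          # value the cache still holds
    fresh = load_from_disk(path)["delegation"]["model"]   # value a fresh read sees
```

Here stale keeps the startup value while fresh reflects the write, which is the mismatch the old _load_config() resolved in favour of the cache.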

This affected every delegation.* knob (model, provider, max_iterations, max_concurrent_children, reasoning_effort, base_url, api_key, inherit_mcp_toolsets, etc.) since they all flow through the same _load_config() path.

Fix

Always read fresh from hermes_cli.config.load_config(). The disk read is cheap — config.yaml is small and only consulted at delegation boundaries (i.e. when a subagent is spawned), not on every API call.

cli.CLI_CONFIG is preserved as a fallback only for test contexts where hermes_cli.config isn't importable but CLI_CONFIG has been mocked in directly. The order is inverted from before: disk first, in-memory cache second.

Tests

tests/tools/test_delegate.py::TestLoadConfigDiskFreshness — three cases:

  • test_disk_changes_visible_after_initial_load — the headline regression: disk holds new value, CLI_CONFIG holds old value, _load_config() returns the new value
  • test_falls_back_to_cli_config_when_disk_read_fails — preserves test-context behaviour where CLI_CONFIG is mocked but hermes_cli.config.load_config raises
  • test_returns_empty_dict_when_both_sources_fail — defensive: both sources fail → returns {}, doesn't propagate exception

The existing 121 tests in TestDelegationCredentialResolution / TestDelegationProviderIntegration / etc. all mock _load_config directly, so they are unaffected by the implementation change: they continue to pass, but they do not themselves exercise the new read order.

Full test run:

$ pytest tests/tools/test_delegate.py -q
124 passed in 2.44s

Notes for reviewers

  • The disk read on each _load_config() call is well within budget — delegation is invoked at subagent-spawn time (a heavy operation involving network roundtrips, agent construction, and tool schema enumeration). One additional YAML load adds microseconds to an operation that already takes hundreds of milliseconds minimum.
  • An mtime-based reload would be slightly more efficient, but adds complexity (cache invalidation logic, atomicity concerns on writes, etc.) for a savings that's not measurable. If profiling later shows this as a hot path, an mtime check is a backwards-compatible refinement.
  • Behavior in test contexts that mock cli.CLI_CONFIG directly (not hermes_cli.config.load_config) is preserved by the fallback — see test_falls_back_to_cli_config_when_disk_read_fails. The 121 existing tests in this file mock _load_config itself, so they're indifferent to the change.
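
For reference, the mtime-based alternative mentioned in the notes above could be sketched as follows. This is not part of the PR; MtimeCachedConfig and its parse parameter are hypothetical names, and the parser is injected so the sketch stays independent of any YAML library.

```python
import os
from typing import Callable, Optional


class MtimeCachedConfig:
    """Re-parse a config file only when its modification time changes."""

    def __init__(self, path: str, parse: Callable[[str], dict]) -> None:
        self._path = path
        self._parse = parse                      # e.g. json.loads or yaml.safe_load
        self._mtime_ns: Optional[int] = None     # mtime of the last parsed version
        self._value: dict = {}
        self.parse_count = 0                     # exposed so re-parses can be observed

    def get(self) -> dict:
        try:
            mtime_ns = os.stat(self._path).st_mtime_ns
        except OSError:
            # File missing or unreadable: behave like an empty config.
            return {}
        if mtime_ns != self._mtime_ns:
            with open(self._path, encoding="utf-8") as fh:
                self._value = self._parse(fh.read())
            self._mtime_ns = mtime_ns
            self.parse_count += 1
        return self._value
```

The trade-off is visible in the sketch: two writes that land within the filesystem's timestamp granularity leave the mtime unchanged, so the second write would be missed. An unconditional reload has no such case.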

Fixes #18946

…CLI_CONFIG cache

`tools.delegate_tool._load_config()` consulted `cli.CLI_CONFIG` first
and only fell through to a fresh disk read when CLI_CONFIG was empty.
CLI_CONFIG is initialized once at `cli.py:600` and never refreshed, so
`hermes config set delegation.<key> <value>` wrote config.yaml on disk
correctly but had no effect on the running process — `delegate_task`
kept using the previously-cached values until the gateway/CLI was
restarted. The failure was silent: `hermes config set` reported success,
config.yaml on disk was correct, but the running agent quietly kept
running on stale config.

This affected every `delegation.*` knob (model, provider, max_iterations,
max_concurrent_children, reasoning_effort, base_url, api_key, etc.)
since they all flow through the same _load_config() path.

Fix: always read fresh from `hermes_cli.config.load_config()`. The
disk read is cheap (config.yaml is small and only consulted at
delegation boundaries — not on every API call). `cli.CLI_CONFIG` is
preserved as a fallback only for test contexts where `hermes_cli.config`
isn't importable but CLI_CONFIG has been mocked in directly.

Tests: TestLoadConfigDiskFreshness covers the three branches —
disk-fresh-wins-over-cached-CLI_CONFIG, fallback-to-CLI_CONFIG-when-disk-fails,
both-sources-fail-returns-empty-dict.

`pytest tests/tools/test_delegate.py` → 124 passed.

Fixes NousResearch#18946
alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and tool/delegate (Subagent delegation) labels on May 2, 2026
@alt-glitch

Collaborator

Duplicate of #12053 and #15540 — same stale CLI_CONFIG cache in delegate_tool._load_config(). This PR fixes #18946.


Development

Successfully merging this pull request may close these issues.

[Bug]: hermes config set delegation.* silently has no effect on running process (CLI_CONFIG cache stale)
