
fix(agent): resolve agent cwd from TERMINAL_CWD via one reader (closes #24882, #24969, #27383) #35028

Merged: teknium1 merged 9 commits into NousResearch:main from banditburai:worktree-fix+gateway-cwd-override on Jun 1, 2026

Conversation

banditburai commented May 29, 2026

Contributor

Problem

The agent reports — and cds to — the daemon launch directory (the hermes-agent install dir) instead of the configured terminal.cwd / TERMINAL_CWD, so the model ends up working in the wrong place. This bit gateway, cron, and Telegram sessions.

  • Root cause: build_environment_hints() emitted Current working directory: {os.getcwd()} (agent/prompt_builder.py:805), ignoring TERMINAL_CWD entirely.
  • Secondary site: context-file discovery read TERMINAL_CWD ad-hoc (os.getenv("TERMINAL_CWD") or None) in agent/system_prompt.py, duplicating the cwd-resolution decision so the two prompt tiers could disagree.
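To make the disagreement concrete, here is a minimal sketch of the two pre-fix read paths as described in the bullets above. The function names `old_hint_cwd` and `old_context_cwd` and the path `/srv/project` are hypothetical and used only for illustration; they are not the names used in `agent/prompt_builder.py` or `agent/system_prompt.py`.

```python
import os
from typing import Optional


def old_hint_cwd() -> str:
    # Pre-fix display path as described above: the process cwd only,
    # with TERMINAL_CWD never consulted.
    return os.getcwd()


def old_context_cwd() -> Optional[str]:
    # Pre-fix discovery path: an ad-hoc env read with no strip or expanduser.
    return os.getenv("TERMINAL_CWD") or None


os.environ["TERMINAL_CWD"] = "/srv/project"  # hypothetical configured workspace
print(old_hint_cwd())     # the daemon launch directory, not the workspace
print(old_context_cwd())  # /srv/project
```

With `TERMINAL_CWD` set, the hint line and context discovery point at different directories, which is the two-tier disagreement this PR removes.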

This is a reader-only correctness bug — not a config, transport, or service-manager bug.

Fix

One reader resolver as the single source of truth, with the two prompt-tier read sites routed through it.

Two intentionally asymmetric resolvers:

| Resolver | Returns | Logic | Why |
| --- | --- | --- | --- |
| resolve_agent_cwd() | Path | strip + expanduser + is_dir() guard + os.getcwd() fallback | A human-facing display line must degrade to the real launch dir, never show a bogus path. |
| resolve_context_cwd() | Path / None | strip + expanduser, no is_dir() guard, no getcwd arm | None = unset → the caller (build_context_files_prompt) calls getcwd, so discovery still runs (never skipped). A set-but-missing dir is returned as-is → discovery simply finds nothing. For this resolver the getcwd fallback is owned by the caller; for resolve_agent_cwd() it is owned by the resolver itself. |
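The asymmetry in the table can be sketched as follows. This is an illustrative reconstruction from the Logic column, not the merged code: the `env` parameter and the `_raw_terminal_cwd` helper are assumptions added here so the behavior can be exercised without touching the real environment.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def _raw_terminal_cwd(env: Mapping[str, str]) -> str:
    # Hypothetical shared first step: read the variable, drop whitespace.
    return env.get("TERMINAL_CWD", "").strip()


def resolve_agent_cwd(env: Mapping[str, str] = os.environ) -> Path:
    """Display-facing resolver: never returns a path that is not a directory."""
    raw = _raw_terminal_cwd(env)
    if raw:
        candidate = Path(raw).expanduser()
        if candidate.is_dir():
            return candidate
    # Unset, whitespace-only, or missing dir: fall back to the launch dir.
    # os.getcwd() can raise OSError (e.g. a deleted cwd); that propagates.
    return Path(os.getcwd())


def resolve_context_cwd(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Discovery-facing resolver: None means unset, and the caller decides."""
    raw = _raw_terminal_cwd(env)
    if not raw:
        return None
    # No is_dir() guard: a set-but-missing dir is returned as-is.
    return Path(raw).expanduser()
```

For a stale value such as `/no/such/dir`, the first function falls back to the launch directory while the second returns the missing path unchanged, so discovery runs and finds nothing.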

Remote backends unchanged: both fixes sit inside the existing is_remote_backend gate (prompt_builder.py:791). docker / modal / ssh still suppress the host cwd line and use the live in-backend probe; the probe's cache key (:694) intentionally keeps reading os.getenv.
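As a rough sketch of how the gate and the resolver interact at the call site: the helper `cwd_hint_line`, the `REMOTE_BACKENDS` set, and this standalone `is_remote_backend` function are stand-ins written for illustration, and returning an empty string on `OSError` is an assumption about what the existing try/except does. The real gate lives in `prompt_builder.py` and also drives the in-backend probe, which is omitted here.

```python
from pathlib import Path
from typing import Callable

# Backends named in this PR description; the real list may differ.
REMOTE_BACKENDS = {"docker", "modal", "ssh"}


def is_remote_backend(backend: str) -> bool:
    return backend in REMOTE_BACKENDS


def cwd_hint_line(backend: str, resolve_cwd: Callable[[], Path]) -> str:
    """Return the host cwd hint line, or "" when it should be suppressed."""
    if is_remote_backend(backend):
        # Remote: the host path is meaningless inside the backend.
        return ""
    try:
        return f"Current working directory: {resolve_cwd()}"
    except OSError:
        # The resolver propagates a dead-cwd error; the call site absorbs it.
        return ""
```

Passing the resolver in as a callable keeps the gate independent of how the path is resolved, which mirrors the point that the fix sits inside the existing gate rather than changing it.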

Behavior-preserving: local-CLI (TERMINAL_CWD unset) and remote backends produce output identical to before. The only new behavior is .strip() + expanduser() — strict improvements (whitespace-only no longer yields a " " path; ~ now expands).

Design decision: why no gateway re-bridge

The superseded PRs #27488 / #29365 also added a per-turn re-bridge (_reload_runtime_env_preserving_config_authority in gateway/run.py) that re-reads terminal.cwd and re-sets TERMINAL_CWD every session. We deliberately do not adopt it:

Linkage

| Ref | Type | Disposition | Why |
| --- | --- | --- | --- |
| #24882 | Issue | Closes | terminal.cwd not injected into the system prompt; fixed at prompt_builder.py:805. |
| #24969 | Issue | Closes | Cron --workdir not reflected in prompt cwd; same read site. |
| #27383 | Issue | Closes | Telegram agent wrong working dir; gateway sets TERMINAL_CWD, now read correctly. |
| #29265 | Issue | Provenance only (already closed) | The QQBot/Weixin request behind PR #29365; closed, so no auto-close keyword. |
| #24888 | PR | Supersedes | Same env-hint fix only; ours is the SSOT superset. |
| #24985 | PR | Supersedes | Cron-path variant; subsumed by the single resolver. |
| #27488 | PR | Supersedes | Env-hint fix plus the session-reset re-bridge we drop. |
| #29365 | PR | Supersedes | Gateway-cwd fix plus the re-bridge we drop. |
| #11312 | Issue | Related, not closed | Service-manager WorkingDirectory / hermes update env loss; a different root cause. |
| #34805 | PR | Related (landed) | Service-unit fix anchoring WorkingDirectory; resolves #11312's actual cause, complementary to this PR. |

Closes #24882, closes #24969, closes #27383.

Related but out of scope: #11312 (service-manager environment loss on update) is a distinct root cause and is not addressed here — it is handled separately by the landed service-unit fix #34805.

Behavior changes & regression analysis

| Aspect | Old | New | Safe? | Test-pinned? |
| --- | --- | --- | --- | --- |
| Type into build_context_files_prompt(cwd=) | str / None | Path / None | ✅ callee does Path(cwd).resolve(); one prod caller | test_system_prompt.py |
| Whitespace handling | none | .strip() | ✅ | whitespace-only tests (both resolvers) |
| Tilde handling | none | .expanduser() | | test_expands_leading_tilde ×2 |
| .resolve() in resolver | n/a | not called (display-only; callee resolves for context) | ✅ intentional | ✅ contract test resolves at call site |
| Local-CLI unset path | getcwd via prompt | identical (agent→getcwd, context→None→callee getcwd) | #19242 preserved | ✅ fallback tests |
| Remote-backend suppression | TERMINAL_CWD-first | identical | | |
| OSError on dead cwd | caught at :805 | still caught at call site; resolver propagates | | test_propagates_oserror_from_getcwd |
| import os in system_prompt.py | present | removed; no orphaned refs | | grep-verified |

Scope / out of scope

Deliberately limited to the 2 prompt read sites. These readers were already TERMINAL_CWD-first and are untouched: tools/terminal_tool.py:1039, tools/file_tools.py:124, agent/tool_executor.py:198/667, tools/code_execution_tool.py:1681, tools/delegate_tool.py:653, agent/agent_init.py:1539, gateway/run.py (the bridge writer), and prompt_builder.py:694 (remote-probe cache key).

tests/tools/test_gateway_cwd_contract.py is a characterization fence over three of those already-correct tool sites (terminal / file / execute_code) — it pins existing behavior to make the #29365 supersession airtight. It is not coverage of code changed here.

Edge cases

  • Whitespace-only TERMINAL_CWD → strip → falsy → agent uses getcwd, context returns None.
  • Relative / trailing-slash TERMINAL_CWD → resolver does not .resolve(); acceptable because the value is bridged from config and expected absolute.
  • Stale / typo'd TERMINAL_CWD → agent: is_dir() fails → getcwd fallback; context: returns the missing dir (no guard) → discovery finds nothing.

Testing

Touched-area suites pass — 146 passed, 1 skipped; ruff clean on all 7 changed files. Pinned behaviors: whitespace strip · OSError propagation · isdir asymmetry · None→getcwd · ~ expansion (both resolvers) · end-to-end hint (set → value, unset → launch dir) · tool-surface contract.

Infographic

gateway-reliability-three-fixes

alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on May 29, 2026
Cover the two new hardening behaviors that were unpinned: whitespace-only
TERMINAL_CWD falling through to getcwd/None, and OSError from the getcwd
fallback arm propagating to the build_environment_hints try/except guard.
…earch#29265)

Port PR NousResearch#29365's tool-surface contract test: terminal/file/execute_code
already honor TERMINAL_CWD (out of scope for the resolver cluster). Pinning
the behavior makes the supersession of NousResearch#29365 airtight and guards against a
future refactor silently regressing the workspace contract.
banditburai force-pushed the worktree-fix+gateway-cwd-override branch from 2129623 to 1dbfa92 on May 30, 2026 15:15
teknium1 merged commit 128da68 into NousResearch:main on Jun 1, 2026
23 checks passed
