Skip to content

Fix/tirith missing binary cache#23855

Closed
xxxigm wants to merge 4 commits into
NousResearch:mainfrom
xxxigm:fix/tirith-missing-binary-cache
Closed

Fix/tirith missing binary cache#23855
xxxigm wants to merge 4 commits into
NousResearch:mainfrom
xxxigm:fix/tirith-missing-binary-cache

Conversation

@xxxigm

@xxxigm xxxigm commented May 11, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Stops tirith from spamming an identical WARNING line on every single terminal command when the binary is missing. On a Windows install without the Rust toolchain (or any host that removed the binary), subprocess.run([tirith_path, ...]) raises OSError on every scan, and check_command_security calls logger.warning("tirith spawn failed: ...") each time. A chat session with 20 terminal calls leaves 20 copies of:

WARNING tools.tirith_security: tirith spawn failed: [WinError 2] The system cannot find the file specified.

drowning the log and hiding real errors. The same pattern applies to the "tirith path resolved to None" warning when _resolve_tirith_path caches a failed auto-install.

The fix routes both warnings through a small _log_unavailable_once(path_key, message) helper that dedupes by path. The first miss for a given path logs WARNING; subsequent misses are silent. When subprocess.run later succeeds for that path, the cache entry is cleared via _clear_unavailable_log so a true regression (e.g. binary uninstalled mid-session) still surfaces one fresh warning instead of being silenced forever.

No verdict semantics change — fail_open still allows, fail_closed still blocks, every spawn is still attempted (we only deduplicate the log message). Timeouts and unexpected exit codes are intentionally NOT deduped: those are potentially transient and operators want to see them every time.

Related Issue

Fixes #23845

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests
  • ♻️ Refactor
  • 🎯 New skill

Changes Made

Four logically separated commits:

  1. fix(tools/tirith): add one-shot logging helpers for unavailable-binary warnings — adds _log_unavailable_once, _clear_unavailable_log, _reset_unavailable_log plus the _unavailable_paths_logged set and dedicated _unavailable_log_lock. Pure-additive; no call sites changed.
  2. fix(tools/tirith): dedup spawn-failure warnings via one-shot helper (#23845) — rewires the two repeating WARNING sites in check_command_security:
    • OSError from subprocess.run (line ~648) now calls _log_unavailable_once(tirith_path, ...).
    • tirith_path is None (line ~635) now calls _log_unavailable_once(f"<none>:{cfg['tirith_path']}", ...).
    • The successful subprocess return path calls _clear_unavailable_log(tirith_path) so a later regression re-arms one fresh warning.
  3. test(tools/tirith): cover the unavailable-log helpers (#23845) — 7 unit tests for the helpers: first-call logs, repeat is silent, distinct keys each log once, clear re-arms, clear of unknown key is a no-op, full reset wipes all keys, thread-safety under 20 racing threads → exactly one WARNING.
  4. test(tools/tirith): end-to-end regression for #23845 log spam — 7 integration tests driving check_command_security:
    • 20 repeated FileNotFoundError fail-open scans → exactly one WARNING (the original repro).
    • 10 repeated PermissionError fail-open scans → exactly one WARNING.
    • 5 repeated FileNotFoundError fail-closed scans → one WARNING (verdict still block on every call).
    • Success between two failure bursts → exactly two WARNINGs (re-arm path).
    • 15 repeated _resolve_tirith_path → None calls → one "resolved to None" WARNING.
    • Two distinct missing-binary paths in one session → one WARNING per path (two total).
    • Timeouts are not deduplicated (still one per call — they're potentially transient).

Files touched:

  • tools/tirith_security.py (+77, −3)
  • tests/tools/test_tirith_security.py (+380, −0)

How to Test

  1. Check out the branch and ensure .venv is set up:

    python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[all,dev]"
  2. Run the new tests on their own:

    scripts/run_tests.sh \
      tests/tools/test_tirith_security.py::TestUnavailableLogHelpers \
      tests/tools/test_tirith_security.py::TestSpawnFailureLogSpamSuppression -v

    Expected: 14 passed.

  3. Run the full tirith suite to confirm no regression in the existing 63 tests:

    scripts/run_tests.sh tests/tools/test_tirith_security.py

    Expected: 77 passed (63 pre-existing + 14 new).

  4. Reproduce the bug on main to confirm the regression test catches it:

    git checkout upstream/main -- tools/tirith_security.py
    scripts/run_tests.sh \
      tests/tools/test_tirith_security.py::TestUnavailableLogHelpers \
      tests/tools/test_tirith_security.py::TestSpawnFailureLogSpamSuppression

    Expected: 14 errors (AttributeError on missing helpers — proves the tests would catch the spam regression).

    Then restore the fix:

    git checkout HEAD -- tools/tirith_security.py

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(tools/tirith): …, test(tools/tirith): …)
  • I searched for existing PRs to make sure this isn't a duplicate — none open against tools/tirith_security.py for this issue
  • My PR contains only changes related to this fix (no unrelated commits)
  • I've run scripts/run_tests.sh tests/tools/test_tirith_security.py and all tests pass (77 passed)
  • I've added tests for my changes (14 new test cases across two test classes)
  • I've tested on my platform: macOS 15.6 (Darwin 24.6.0), Python 3.12.7

Documentation & Housekeeping

  • I've updated relevant documentation — N/A (operator-visible behavior change is "one warning instead of N", no doc/example claims either way)
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — fix is platform-agnostic (the helper keys by string path); Windows is the dominant reproducer because the binary is hardest to install there, but the dedup applies identically on Linux/macOS hosts that uninstall the binary mid-session
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Screenshots / Logs

$ scripts/run_tests.sh tests/tools/test_tirith_security.py::TestUnavailableLogHelpers tests/tools/test_tirith_security.py::TestSpawnFailureLogSpamSuppression -v
============================== 14 passed in 1.24s ==============================

$ scripts/run_tests.sh tests/tools/test_tirith_security.py
============================== 77 passed in 1.10s ==============================

Bug reproduction proof — with the production code reverted to upstream/main, the new regression tests fail:

$ git checkout upstream/main -- tools/tirith_security.py
$ scripts/run_tests.sh tests/tools/test_tirith_security.py::TestUnavailableLogHelpers tests/tools/test_tirith_security.py::TestSpawnFailureLogSpamSuppression
============================== 14 errors in 1.15s ==============================

xxxigm added 4 commits May 11, 2026 22:07
…y warnings

Issue NousResearch#23845 reports that on Windows installs without the tirith binary
on PATH, every single terminal command emits an identical "tirith spawn
failed" WARNING.  A session with 20+ terminal calls leaves 20 copies of
the same line in the log, drowning out anything actionable.

Introduce two thread-safe helpers plus a small reset hook:
- `_log_unavailable_once(path_key, message)`: emits the warning the
  first time a given path key is seen, silent on every later call.
- `_clear_unavailable_log(path_key)`: drops a key from the cache so a
  later regression (e.g. binary uninstalled mid-session) still surfaces
  one fresh warning instead of being silenced forever.
- `_reset_unavailable_log()`: maintenance/test helper.

Pure-additive commit — no existing call site updated yet (next commit).
…ousResearch#23845)

When `tirith` is missing from PATH and no install path is configured,
`subprocess.run` raises OSError on every terminal command and
`check_command_security` logs an identical WARNING each time.  On a
Windows install without the Rust toolchain (or any host that removed
the binary), a chat session with 20 terminal calls produces 20 copies
of:

    WARNING tools.tirith_security: tirith spawn failed: [WinError 2]
    The system cannot find the file specified.

The same anti-pattern applies to the `tirith path resolved to None`
warning emitted when `_resolve_tirith_path` returns None (install
attempt cached as failed).

Route both warnings through `_log_unavailable_once` so each distinct
path key logs once per process.  On the success path, call
`_clear_unavailable_log(tirith_path)` so a later regression
(binary uninstalled mid-session) still surfaces one fresh warning
rather than being silenced by an earlier suppression.

No behavior change to verdicts:
- fail_open still allows on spawn failure
- fail_closed still blocks on spawn failure
- timeouts / unknown exit codes still log every time (they're
  potentially transient, not the "binary missing" repeating pattern
  this issue is about)

Refs NousResearch#23845.
…3845)

Adds `TestUnavailableLogHelpers` with seven cases exercising the
`_log_unavailable_once` / `_clear_unavailable_log` / `_reset_unavailable_log`
contract introduced in the previous commit:

- first call logs exactly one WARNING with the suppression-hint suffix
- second/third calls with the same key are silent
- distinct keys each log once
- `_clear_unavailable_log` re-arms a key so it logs again on the next miss
- `_clear_unavailable_log` of an unknown key is a no-op (does not raise)
- `_reset_unavailable_log` wipes every key
- thread-safety: 20 concurrent threads racing on the same key emit
  exactly one WARNING

Setup/teardown calls `_reset_unavailable_log` so cases are order-
independent regardless of how the existing 63-test suite runs first.
…spam

Adds `TestSpawnFailureLogSpamSuppression` with seven cases that drive
the original reproducer through `check_command_security` instead of the
internal helpers:

- 20 repeated FileNotFoundError fail-open scans → exactly ONE WARNING
- 10 repeated PermissionError fail-open scans → exactly ONE WARNING
- 5 repeated FileNotFoundError fail-closed scans → exactly ONE WARNING
  (verdict still `block` on every call)
- success between two failure bursts → exactly TWO WARNINGs (the
  intermediate success re-arms the cache so a true regression still
  surfaces)
- 15 repeated `_resolve_tirith_path` → None calls → exactly ONE
  "resolved to None" WARNING
- two distinct missing-binary paths in the same session → ONE
  WARNING per path (two total)
- timeouts are NOT deduplicated (potentially transient — not the
  persistent binary-missing pattern this issue is about)

Every test still asserts the verdict (`allow` under fail_open, `block`
under fail_closed) so we catch any regression that trades log spam for
incorrect security behaviour.

I verified the suite reproduces the bug by reverting the production
commits on this branch: all 14 new cases ERROR out because the
helpers don't exist on `upstream/main`, and the dedup invariants fail
once the helpers are reintroduced without wiring them into
`check_command_security`.

Fixes NousResearch#23845.
@teknium1

Copy link
Copy Markdown
Contributor

Automated hermes-sweeper review: this tirith missing-binary log-spam fix is already implemented on current main.

Evidence:

  • tools/tirith_security.py:104 defines the warn-once state/helper used for hot-path tirith warnings.
  • tools/tirith_security.py:722 routes the tirith path resolved to None; scanning disabled warning through _warn_once(...).
  • tools/tirith_security.py:740 routes OSError spawn failures such as FileNotFoundError / PermissionError through _warn_once(...), while preserving fail-open/fail-closed verdict behavior.
  • tools/tirith_security.py:197 resets the dedupe state when install failure state is cleared, so later regressions can surface again.
  • tests/tools/test_tirith_security.py:1207 covers repeated spawn failures logging exactly once, and tests/tools/test_tirith_security.py:1285 covers repeated path-None warnings logging once.
  • The implementation landed in 4aec25bc4 (fix(windows): stop spamming cwd-missing + tirith-spawn warnings on every terminal call) and is contained in v2026.5.16.

Thanks for the detailed repro and tests in the original PR; the bug class it targeted is now covered on main.

@teknium1 teknium1 closed this Jun 11, 2026
@teknium1 teknium1 added the sweeper:implemented-on-main Sweeper: behavior already present on current main label Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tools Tool registry, model_tools, toolsets P3 Low — cosmetic, nice to have sweeper:implemented-on-main Sweeper: behavior already present on current main tool/terminal Terminal execution and process management type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tirith: repeated WARNING spam on every terminal command when binary is not installed (Windows)

3 participants