Bug Description
Hermes gateway terminal/runtime repeatedly failed to enumerate a large external macOS volume directory that another local coding agent (Pi) could enumerate in milliseconds using the same commands. Hermes could read metadata (stat, df, mount info) but find, ls -1f, and Python os.scandir all timed out with no stdout/stderr. This led the agent to misdiagnose the problem as external SSD/folder responsiveness instead of recognizing a Hermes runtime/tool-capture/session issue sooner.
Target path in observed session:
/Volumes/SSD/1. Sample Library - NEW
Environment observed:
- macOS/Darwin host
- Hermes gateway via Telegram DM
- External volume: /Volumes/SSD, Journaled HFS+, USB, writable, SMART verified
- Directory metadata was readable: drwxr-xr-x justyn:staff
- diskutil verifyVolume was accidentally started during debugging, then stopped; after confirming no verification/fsck jobs were running, failures persisted
Steps to Reproduce
From a Hermes gateway/Telegram session with terminal tools enabled, run:
ps aux | grep -Ei '[d]iskutil|[f]sck|[h]fs|[v]erifyVolume' || true
TIMEFORMAT='find_elapsed=%3R'; time /usr/bin/find '/Volumes/SSD/1. Sample Library - NEW' -maxdepth 1 -mindepth 1 -print | wc -l
TIMEFORMAT='ls1f_elapsed=%3R'; time /bin/ls -1f '/Volumes/SSD/1. Sample Library - NEW' | wc -l
python3 - <<'PY'
from pathlib import Path
import os, time
p=Path('/Volumes/SSD/1. Sample Library - NEW')
start=time.time(); n=0
with os.scandir(p) as it:
    for e in it:
        n+=1
print(f'os_scandir_count={n} elapsed={time.time()-start:.4f}s')
PY
Observed in Hermes:
- no diskutil / fsck / verifyVolume jobs running
- find timed out after 60-90s with no stdout/stderr
- ls -1f timed out after 60-90s with no stdout/stderr
- Python os.scandir timed out after 60-90s with no stdout/stderr
- post-timeout process checks showed no leftover commands
- metadata commands such as stat continued to work
Control result from another local coding agent (Pi) on the same machine/path:
- find -maxdepth 1 completed in ~0.004s
- ls -1f completed in ~0.004s
- recursive metadata scan completed in under 1s
Expected Behavior
Hermes should be able to enumerate the directory as quickly as the local shell/other agent, or at minimum should classify the failure as a Hermes terminal/runtime/session/capture problem once multiple simple enumeration methods time out while metadata works.
The agent should not keep spinning for minutes or blame the external drive/folder without stronger evidence.
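The classification rule described here can be sketched as a small pure function. This is an illustration only, not Hermes code: the Probe type, the classify function, the label strings, and the threshold of two silent timeouts are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    """One bounded command attempt (hypothetical record, not a Hermes type)."""
    name: str              # e.g. "stat", "find", "ls -1f", "os.scandir"
    kind: str              # "metadata" or "enumerate"
    timed_out: bool
    produced_output: bool  # True if any stdout/stderr was captured


def classify(probes, min_silent_timeouts=2):
    """Return a coarse diagnosis label from a list of probe results.

    The threshold of 2 distinct silent timeouts is an assumption, not a
    measured value.
    """
    metadata = [p for p in probes if p.kind == "metadata"]
    enumerations = [p for p in probes if p.kind == "enumerate"]

    metadata_ok = bool(metadata) and all(not p.timed_out for p in metadata)
    # Distinct enumeration methods that timed out without any output.
    silent = {p.name for p in enumerations
              if p.timed_out and not p.produced_output}

    if metadata_ok and len(silent) >= min_silent_timeouts:
        # Metadata works but several simple listings hang silently:
        # stop retrying and suspect the runtime/session/capture path.
        return "suspect-runtime-or-capture"
    if metadata and not metadata_ok:
        # Even metadata calls stall: the volume itself is a better suspect.
        return "suspect-volume-or-filesystem"
    return "inconclusive"


observed = [
    Probe("stat", "metadata", timed_out=False, produced_output=True),
    Probe("find", "enumerate", timed_out=True, produced_output=False),
    Probe("ls -1f", "enumerate", timed_out=True, produced_output=False),
    Probe("os.scandir", "enumerate", timed_out=True, produced_output=False),
]
print(classify(observed))
```

With the probe results observed in this session, the sketch returns the runtime/capture label after the second silent timeout, which is the point at which the agent should have stopped retrying equivalent commands.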
Actual Behavior
Hermes repeatedly retried equivalent enumeration methods, spent many minutes, started an unnecessary diskutil verifyVolume, and produced low-confidence/incorrect conclusions about external volume I/O.
Requested Fixes
- Add a regression test or diagnostic for terminal commands that hang only under the Hermes gateway/runtime path while succeeding in normal local execution.
- Improve terminal tool timeout handling so it captures partial output and distinguishes:
  - child command timeout
  - shell/session timeout
  - command-capture deadlock
  - filesystem-level stall
- Add a runtime/session refresh path that can be invoked when repeated terminal commands time out while equivalent commands work outside Hermes.
- Consider a built-in diagnostic command for macOS removable volume access from gateway-launched Hermes processes, including TCC/removable-volume permission checks.
- Update agent guidance/default skills to avoid repeated equivalent filesystem probes after multiple bounded timeouts.
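As a rough illustration of the timeout-handling request above, the following sketch uses only the Python standard library to run a command with a bound, keep whatever output arrived before the timeout, and separate a child that dies when killed from one that does not. It is not Hermes's terminal tool: run_bounded, ProbeResult, the status strings, and both timeout defaults are hypothetical. Treating a child that survives kill() as a possible filesystem-level stall is a heuristic assumption, since a process blocked in uninterruptible I/O cannot be reaped promptly.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    status: str                # "ok", "timeout", or "unkillable"
    stdout: str
    stderr: str
    returncode: Optional[int]  # None if the child could not be reaped


def run_bounded(argv, timeout_s=5.0, reap_s=2.0):
    """Run argv with a time bound and keep partial output on timeout."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        out, err = proc.communicate(timeout=timeout_s)
        return ProbeResult("ok", out, err, proc.returncode)
    except subprocess.TimeoutExpired:
        # Child command timeout: kill it, then collect what it wrote so far.
        proc.kill()
        try:
            out, err = proc.communicate(timeout=reap_s)
            return ProbeResult("timeout", out, err, proc.returncode)
        except subprocess.TimeoutExpired:
            # The child did not exit after SIGKILL within reap_s. This is
            # consistent with a process stuck in uninterruptible I/O, so we
            # report it separately instead of blocking the caller.
            return ProbeResult("unkillable", "", "", None)


if __name__ == "__main__":
    result = run_bounded(
        ["/bin/ls", "-1f", "/Volumes/SSD/1. Sample Library - NEW"],
        timeout_s=5.0,
    )
    print(result.status, len(result.stdout.splitlines()), file=sys.stderr)
```

A "timeout" result with empty stdout and stderr, while the same command succeeds in a normal local shell, points toward the gateway/capture path rather than the drive. Distinguishing a shell/session timeout from a capture deadlock would need instrumentation inside the Hermes runtime itself, which this sketch does not attempt.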
Impact
This blocks Hermes from reliably helping with large external audio/video/sample-library workflows, where external SSDs are common and directory enumeration is a basic operation.
Labels Suggested
- bug
- terminal
- gateway
- macos
- reliability