fix(skills): translate skill_view skill_dir for sandbox backends#27514
Closed
briandevans wants to merge 1 commit into
Closed
fix(skills): translate skill_view skill_dir for sandbox backends#27514briandevans wants to merge 1 commit into
briandevans wants to merge 1 commit into
Conversation
Contributor
Author
|
CI audit — all 21 test failures are pre-existing baselines on clean Reproduced the exact same 21 failures with identical error text on main's push CI run 25989069834:
Happy to address any failure that's actually in scope here. |
5 tasks
0015883 to
ea1ae33
Compare
c5e9147 to
59f78df
Compare
When the terminal backend bind-mounts the skill tree (docker, singularity, modal), the existing skill_view response returns the gateway-side host path for ``skill_dir`` (e.g. ``/opt/data/skills/nextcloud``). The agent then references that path inside its execute_code sandbox, which sees the same files mounted at ``/root/.hermes/skills/nextcloud`` — so any bundled-script invocation against the host path fails to find the file (NousResearch#27491). ``${HERMES_SKILL_DIR}`` substitution in SKILL.md content has the same problem. This adds a ``runtime_skill_dir`` field to the skill_view response, computed by mapping the host ``skill_dir`` through the active ``get_skills_directory_mount()`` entries. The same path is now used for ``${HERMES_SKILL_DIR}`` template substitution so SKILL.md bodies render with the sandbox path the agent will actually hit. ``skill_dir`` is left unchanged (host path) so Python callers like ``agent.skill_commands._load_skill_payload`` keep working — they read the skill off the gateway-side filesystem. For the ``local`` backend, ``runtime_skill_dir`` mirrors ``skill_dir``; the fallthrough also covers backends that use file_sync rather than bind mounts (ssh, daytona, vercel_sandbox), and the case where no mount maps the skill_dir at all (e.g. a transient test path). Fixes NousResearch#27491. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
59f78df to
19d4edc
Compare
Contributor
Author
|
Closing to focus the queue on security/file-safety work where civilian merges are landing. Happy to reopen if maintainers want this picked up. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
When the terminal backend bind-mounts the skill tree into a sandbox (Docker, Singularity, Modal),
skill_view's response and the${HERMES_SKILL_DIR}template substitution return the gateway-side host path (e.g./opt/data/skills/nextcloud). The agent then references that path inside itsexecute_codesandbox, which mounts the same skill at/root/.hermes/skills/nextcloud— so any bundled-script invocation fails. This PR adds aruntime_skill_dirfield that translates the host path through the active mount mapping, and uses the translated path for${HERMES_SKILL_DIR}substitution.Fixes #27491.
The bug
tools/skills_tool.py::skill_view()returns:where
skill_diris the gateway-side absolute path (the gateway runs in Docker withHERMES_HOME=/opt/data, so this is/opt/data/skills/nextcloud). The reporter (#27491) describes the symptom precisely:The same path is fed into
agent.skill_preprocessing.preprocess_skill_content, which substitutes it for every${HERMES_SKILL_DIR}token in the rendered SKILL.md body — so a skill that documentsbash ${HERMES_SKILL_DIR}/scripts/run.shrenders the wrong path in the message the agent sees.The skill tree is actually mounted at
/root/.hermes/skills/<name>inside the Podman-in-Docker sandbox (the canonicalcontainer_baseused bytools/credential_files.get_skills_directory_mount()and consumed bytools/environments/docker.py,tools/environments/singularity.py,tools/environments/modal.py).The fix
Add a small helper
_to_runtime_skill_dir(skill_dir, backend)intools/skills_tool.pythat:skill_dirunchanged for thelocalbackend.docker/singularity/modalbackends, walks theget_skills_directory_mount()results, finds the mount whosehost_pathcontainsskill_dir, and rewrites the prefix to that mount'scontainer_path(/root/.hermes/skills/...or/root/.hermes/external_skills/<idx>/...forskills.external_dirsentries).skill_dirwhen no mount matches — e.g. backends that usefile_syncinstead of bind mounts (ssh,daytona,vercel_sandbox), or when the skills mount registration hasn't run yet.The skill_view result then exposes both fields:
…and the runtime path is the one passed into
preprocess_skill_content, so${HERMES_SKILL_DIR}substitution emits the path the agent actually needs at execute-code time.skill_diris intentionally left as the host path becauseagent.skill_commands._load_skill_payloadreads it back as aPathto access the SKILL.md on the gateway-side filesystem (see PR #10587) — rewriting it to a sandbox path would break that caller.Test plan
tests/tools/test_skills_tool.py::TestSkillView:test_skill_view_returns_host_skill_dir_for_local_backend—runtime_skill_dir == skill_dir == <host path>whenTERMINAL_ENV=local.test_skill_view_returns_sandbox_skill_dir_for_docker_backend— whenTERMINAL_ENV=dockerandget_skills_directory_mount()returns the canonicalhost_path → /root/.hermes/skillsmapping, the response hasskill_dir == <host path>andruntime_skill_dir == /root/.hermes/skills/nextcloud.test_skill_view_template_substitution_uses_sandbox_path—${HERMES_SKILL_DIR}inside SKILL.md substitutes the sandbox path under the same backend; the host path no longer appears in the renderedcontent.test_skill_view_categorized_skill_keeps_subdirs_in_sandbox_path— categorized skillintegration/nextcloudends up at/root/.hermes/skills/integration/nextcloud(relative path preserved).test_skill_view_runtime_path_falls_back_when_no_mount_matches— empty mount list ⇒runtime_skill_dir == skill_dir; no confidently-wrong sandbox path is emitted.runtime_skill_diris absent from the response and${HERMES_SKILL_DIR}substitutes the host path. With the fix, all five pass.uv run --with pytest --with pytest-xdist --with pytest-asyncio python3 -m pytest tests/tools/test_skills_tool.py::TestSkillView -v— 19 passed.tests/tools/test_skills_tool.py tests/agent/test_skill_commands.py tests/tools/test_credential_files.py— 160 passed.${HERMES_SKILL_DIR}substitution under the defaultlocalbackend is unchanged (existingtest_skill_view_applies_template_varsstill passes —runtime_skill_dir == skill_dirso the rendered string is byte-identical to before).Related
skill_dirfield; this one adds a sandbox-aware sibling for the agent's runtime view.tools/credential_files.get_skills_directory_mount) thattools/environments/docker.pyalready feeds to--volumeflags — so the runtime path matches the actual bind mount.