Skip to content

fix(agent): detect symlink cycles in iter_skill_index_files#20658

Open
haosenwang1018 wants to merge 1 commit into
NousResearch:mainfrom
haosenwang1018:fix/18809-skill-utils-symlink-cycle
Open

fix(agent): detect symlink cycles in iter_skill_index_files#20658
haosenwang1018 wants to merge 1 commit into
NousResearch:mainfrom
haosenwang1018:fix/18809-skill-utils-symlink-cycle

Conversation

@haosenwang1018

Copy link
Copy Markdown

Issue

Closes #18809

Root cause

os.walk(skills_dir, followlinks=True) performs no cycle detection on its own — it relies on the caller to filter cyclic entries. A stray symlink under `~/.hermes/skills` that pointed back to an ancestor (the bug report's reproduction does this with test-cycle/circular -> ~/.hermes/skills) caused iter_skill_index_files to recurse indefinitely until the OS rejected the path with ENAMETOOLONG / ELOOP. Because skill discovery runs during agent init, this blocked startup entirely, not just the /skill command.

The pre-existing EXCLUDED_SKILL_DIRS filter only covers .git/.github/.hub/.archive — it provides no defense against a generic symlink loop.

Fix

Track each visited subdirectory's resolved canonical path via os.path.realpath and drop any candidate whose realpath has already been entered. This is the standard cycle-safe idiom for os.walk(followlinks=True). The EXCLUDED_SKILL_DIRS filter still runs first so behavior on plain non-cyclic trees is identical to before.

OSError during realpath resolution is treated as "skip" rather than propagated, so a single broken or unreadable symlink can't take down skill discovery.

Tests

Added in tests/agent/test_skill_utils.py:

  • test_self_referencing_subdir_does_not_loop — reproduces the bug report's scenario; without the fix this test never returns.
  • test_cycle_via_symlinked_sibling_does_not_loop — covers the harder case of a two-cycle formed by sibling cross-links.
  • test_normal_tree_unchanged — regression guard for plain trees.

The symlink tests are skipped on Windows because POSIX ELOOP semantics don't apply.

Closes NousResearch#18809

``os.walk(skills_dir, followlinks=True)`` performs no cycle detection,
so a self-referencing skills tree (e.g. a stray symlink such as
``~/.hermes/skills/test-cycle/circular -> ~/.hermes/skills``) caused
infinite recursion until the OS rejected the path with
``ENAMETOOLONG`` / ``ELOOP``. This blocked agent startup and any
``/skill`` discovery path because skill listing happens during init.

Track each visited subdirectory's resolved (canonical) path via
``os.path.realpath`` and skip any directory we've already entered.
This is the standard cycle-safe idiom for ``os.walk(followlinks=True)``
and matches how ``shutil`` and similar stdlib walkers handle the case.
``EXCLUDED_SKILL_DIRS`` filtering is preserved as the first pass so
behavior on non-cyclic trees is identical.

Tests cover:

- A self-referencing subdir cycle terminates and still returns the
  legitimate skill once.
- A two-cycle through cross-linked siblings terminates and returns
  both real ``SKILL.md`` files exactly once.
- A plain non-symlinked tree is unaffected (regression guard).

Symlink-specific tests are skipped on Windows where the semantics
differ.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder tool/skills Skills system (list, view, manage) labels May 6, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #18815 — both add cycle detection to iter_skill_index_files() via visited-set tracking. Please coordinate with that PR or close one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists tool/skills Skills system (list, view, manage) type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: os.walk(followlinks=True) in iter_skill_index_files may infinite-loop on cyclic symlinks

2 participants