fix: skip dependency dirs in skill rglob fallback to prevent context injection #18724

Open
liuhao1024 wants to merge 1 commit into NousResearch:main from liuhao1024:fix/skill-rglob-skip-deps

liuhao1024 (Contributor) commented May 2, 2026

What does this PR do?

_build_skill_message() and skill_view() use rglob('*') to discover supporting files when linked_files is empty. This traverses into node_modules/, .venv/, __pycache__/ and similar directories, injecting thousands of file paths into the agent context — silently inflating token usage and potentially causing context overflow.

Changes

agent/skill_commands.py, _build_skill_message() fallback scan:

  • Added _SKIP_DIRS frozenset: .git, .hg, .svn, node_modules, .venv, venv, .tox, __pycache__, .pytest_cache, .mypy_cache, .ruff_cache
  • Added _MAX_SUPPORTING_FILES = 200 hard cap with truncation message
  • Skip empty files (keep only files where f.stat().st_size > 0)
  • Skip any path whose parts contain a skip-dir name
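
For reviewers, the four bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in the diff: `_SKIP_DIRS` and `_MAX_SUPPORTING_FILES` are the names from this PR, while `list_supporting_files`, its return shape, and matching on paths relative to the skill directory are assumptions made for the example.

```python
from pathlib import Path

# Constant names and values as listed in the PR description.
_SKIP_DIRS = frozenset({
    ".git", ".hg", ".svn", "node_modules", ".venv", "venv", ".tox",
    "__pycache__", ".pytest_cache", ".mypy_cache", ".ruff_cache",
})
_MAX_SUPPORTING_FILES = 200


def list_supporting_files(
    skill_dir: Path, limit: int = _MAX_SUPPORTING_FILES
) -> tuple[list[str], bool]:
    """Hypothetical helper: return (relative paths, truncated flag)."""
    found: list[str] = []
    truncated = False
    for f in sorted(skill_dir.rglob("*")):
        rel = f.relative_to(skill_dir)
        # Skip anything that passes through a dependency or VCS directory.
        if any(part in _SKIP_DIRS for part in rel.parts):
            continue
        # Keep only regular, non-empty files.
        if not f.is_file() or f.stat().st_size == 0:
            continue
        # Stop at the cap; the caller can append a truncation message.
        if len(found) >= limit:
            truncated = True
            break
        found.append(rel.as_posix())
    return found, truncated
```

The sketch matches on relative parts so that a skill stored under a parent directory that happens to be named `venv` is not excluded wholesale; whether the diff does the same is not stated in this description.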

tools/skills_tool.py, skill_view() file scan:

  • Same _SKIP_DIRS filter on skill_dir.rglob("*") (line 1111)
  • Same filter + empty-file check on assets_dir.rglob("*") (line 1210)
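
One caveat with filtering `rglob("*")` output: `Path.rglob` still walks into the skipped directories and the filter only discards the results afterwards. If traversal cost ever matters, a pruning walk is one possible alternative. The sketch below is not part of this PR; `walk_pruned` is a hypothetical name and the skip set is a shortened copy of `_SKIP_DIRS`.

```python
import os
from pathlib import Path

# Shortened copy of the skip set from this PR, for illustration only.
_SKIP_DIRS = frozenset({".git", "node_modules", ".venv", "venv", "__pycache__"})


def walk_pruned(root: Path) -> list[str]:
    """Hypothetical alternative: never descend into skipped directories."""
    results: list[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place prunes the walk (topdown=True is the default).
        dirnames[:] = sorted(d for d in dirnames if d not in _SKIP_DIRS)
        for name in sorted(filenames):
            f = Path(dirpath) / name
            if f.stat().st_size > 0:  # same empty-file check as in the PR
                results.append(f.relative_to(root).as_posix())
    return results
```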

Root Cause

skill_view uses non-recursive glob(ext) for scripts/ and references/, but the fallback in _build_skill_message uses recursive rglob("*") with no exclusions. When skill_view returns linked_files: null (e.g. scripts/ only contains node_modules/ with no top-level .py/.sh/.js), the unfiltered fallback kicks in and scans everything.
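
The mismatch can be reproduced outside the codebase. The snippet below is a self-contained sketch using only `pathlib`; `compare_scans`, `demo`, and the `left-pad` package name are made up for the example and do not come from the repository.

```python
import tempfile
from pathlib import Path


def compare_scans(skill_dir: Path) -> tuple[list[str], list[str]]:
    """Contrast a non-recursive glob on scripts/ with an unfiltered rglob."""
    scripts = skill_dir / "scripts"
    # Non-recursive: only files directly inside scripts/ can match.
    top_level = sorted(p.name for p in scripts.glob("*.py"))
    # Recursive and unfiltered: everything below the skill directory.
    recursive = sorted(
        p.relative_to(skill_dir).as_posix()
        for p in skill_dir.rglob("*")
        if p.is_file()
    )
    return top_level, recursive


def demo() -> tuple[list[str], list[str]]:
    """Build a skill whose scripts/ holds only node_modules/ and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp)
        dep = skill_dir / "scripts" / "node_modules" / "left-pad"
        dep.mkdir(parents=True)
        (dep / "index.js").write_text("module.exports = {};\n")
        (dep / "package.json").write_text("{}\n")
        return compare_scans(skill_dir)
```

Here the non-recursive scan finds nothing, which corresponds to the empty linked_files case, while the unfiltered recursive scan returns every file under node_modules/.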

Related Issue

Fixes #18675

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

How to Test

  1. Run pytest tests/ -q — all tests should pass
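  2. Verify the scenario from Root Cause: for a skill whose scripts/ contains only node_modules/, the supporting-file list no longer includes paths under dependency directories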
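
For step 2, a fixture along these lines may help. It is a sketch under assumptions: the directory layout, `make_skill_with_deps`, and `leaked_dependency_paths` are hypothetical. The call into `_build_skill_message()` or `skill_view()` is left out because their signatures are not shown in this PR description.

```python
from pathlib import Path
from typing import Iterable

# Subset of the dependency directory names listed in this PR.
DEP_DIR_NAMES = ("node_modules", ".venv", "__pycache__")


def make_skill_with_deps(root: Path, files_per_dep: int = 25) -> Path:
    """Create a hypothetical skill whose scripts/ holds only dependency trees."""
    skill_dir = root / "demo-skill"
    (skill_dir / "references").mkdir(parents=True)
    (skill_dir / "references" / "guide.md").write_text("# guide\n")
    for dep in DEP_DIR_NAMES:
        dep_dir = skill_dir / "scripts" / dep / "pkg"
        dep_dir.mkdir(parents=True)
        for i in range(files_per_dep):
            (dep_dir / f"file_{i}.js").write_text("x\n")
    return skill_dir


def leaked_dependency_paths(paths: Iterable[str]) -> list[str]:
    """Return the paths that pass through a dependency directory."""
    return [
        p for p in paths
        if any(part in DEP_DIR_NAMES for part in Path(p).parts)
    ]
```

A regression test could build the fixture under pytest's `tmp_path`, collect the supporting-file list from the code under test, and assert that `leaked_dependency_paths(...)` returns an empty list.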

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 26.4.1

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture and workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A

alt-glitch added labels on May 2, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/agent (Core agent loop, run_agent.py, prompt builder), tool/skills (Skills system: list, view, manage)
alt-glitch (Collaborator) commented

Related to #18677 and #18717 which also fix #18675 (unbounded rglob in _build_skill_message). Maintainer: pick one of these three.

…injection

_build_skill_message() and skill_view() use rglob('*') to discover
supporting files when linked_files is empty. This traverses into
node_modules/, .venv/, __pycache__/ and similar directories, injecting
thousands of file paths into the agent context.

- Add _SKIP_DIRS set (node_modules, .venv, venv, __pycache__, .tox,
  .pytest_cache, .mypy_cache, .ruff_cache, .git, .hg, .svn)
- Add _MAX_SUPPORTING_FILES = 200 hard cap
- Skip empty files
- Apply same filters to skills_tool.py assets/ and skill_dir rglob

Fixes NousResearch#18675
liuhao1024 force-pushed the fix/skill-rglob-skip-deps branch from 2518a5d to 7b2a496 on May 2, 2026 14:41
Development

Successfully merging this pull request may close these issues.

[Bug]: _build_skill_message fallback rglob can inject thousands of files into context from nested dependency dirs
