Skip to content

fix(agent): preserve _skill_commands cache on scan failure; cap rglob fallback (#18659, #18675)#18717

Closed
Bartok9 wants to merge 0 commit into
NousResearch:mainfrom
Bartok9:fix/18659-scan-skill-commands-preserve-on-failure
Closed

fix(agent): preserve _skill_commands cache on scan failure; cap rglob fallback (#18659, #18675)#18717
Bartok9 wants to merge 0 commit into
NousResearch:mainfrom
Bartok9:fix/18659-scan-skill-commands-preserve-on-failure

Conversation

@Bartok9

@Bartok9 Bartok9 commented May 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes two related bugs in agent/skill_commands.py:

Fix 1 — #18659: Cache cleared on scan failure (P2)

Root cause: scan_skill_commands() set _skill_commands = {} unconditionally before the try block that populated it. Any exception (broken import, unreadable skills dir, etc.) left the global permanently empty with no user-visible error.

Fix: Build results into a local fresh dict inside the try block. On success, atomically replace the global. On failure, log a WARNING and leave the previous cache intact so all previously-registered slash commands keep working.

# Before
_skill_commands = {}    # ← clears BEFORE try — data loss on exception
try:
    ...populate...
except Exception:
    pass                # ← silently swallowed, global stays empty

# After
fresh = {}
try:
    ...populate fresh...
except Exception as exc:
    logger.warning("scan_skill_commands: scan failed, preserving %d cached command(s). Error: %s", ...)
    return _skill_commands   # ← previous cache preserved
_skill_commands = fresh      # ← atomic replace only on success

Fix 2 — #18675: rglob fallback injects node_modules (P2)

Root cause: The _build_skill_message fallback path used subdir.rglob("*") with no directory exclusions or file count cap. Skills with scripts/node_modules/ could inject thousands of file paths into the model context.

Fix: Mirror skill_view()'s intent by:

  1. Skipping any path whose components include common package-manager dirs: node_modules, .venv, venv, __pycache__, .git, .tox, dist, build, .mypy_cache
  2. Capping at _MAX_SUPPORTING_FILES = 50

Tests Added

  • test_preserves_cache_on_total_scan_failure — RuntimeError inside scan leaves cache intact
  • test_preserves_cache_on_import_failure — ImportError inside scan leaves cache intact
  • TestBuildSkillMessageRglobFallback.test_node_modules_excluded_from_fallback_scan — node_modules paths don't appear in skill message
  • TestBuildSkillMessageRglobFallback.test_fallback_caps_at_max_files — 200 reference files → ≤50 listed

Checklist

  • Fixes root cause, not symptom
  • Backward compatible — successful scans behave identically
  • Tests cover both failure modes
  • Logging added for scan failure (warning, not silent)

@alt-glitch alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder tool/skills Skills system (list, view, manage) P2 Medium — degraded but workaround exists labels May 2, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Overlaps with #18668 (cache preservation) and #18677 (rglob bounding) which fix #18659 and #18675 separately.

1 similar comment
@alt-glitch

Copy link
Copy Markdown
Collaborator

Overlaps with #18668 (cache preservation) and #18677 (rglob bounding) which fix #18659 and #18675 separately.

@Bartok9 Bartok9 closed this May 5, 2026
@Bartok9 Bartok9 force-pushed the fix/18659-scan-skill-commands-preserve-on-failure branch from b9e9a59 to af312cc Compare May 5, 2026 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists tool/skills Skills system (list, view, manage) type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants