Skip to content

fix(skill_commands): preserve cache when scan fails#22514

Closed
wesleysimplicio wants to merge 1 commit into
NousResearch:mainfrom
wesleysimplicio:fix/agente-b-skill-commands-preserve-on-error
Closed

fix(skill_commands): preserve cache when scan fails#22514
wesleysimplicio wants to merge 1 commit into
NousResearch:mainfrom
wesleysimplicio:fix/agente-b-skill-commands-preserve-on-error

Conversation

@wesleysimplicio

Copy link
Copy Markdown
Contributor

Problem

scan_skill_commands() clears _skill_commands and _skill_lower_to_canonical at the start of the function. If a later import or scan step raises, the function returns the wiped dicts and leaves module state empty — every previously cached skill command is silently lost until the next successful rescan.

Root cause

In-place clear of module-level dicts before the work that could fail.

Fix

Build a local new_commands dict and only commit to module state once the full scan succeeds. On exception, return the existing _skill_commands (cache preserved).

Tests

TestScanPreservesCacheOnFailure:

  • test_scan_failure_keeps_previous_cache — patches a downstream import to raise mid-scan, asserts the previously populated cache is preserved.
  • test_successful_scan_replaces_cache — confirms a successful rescan still replaces the cache normally.

Stash-verify confirmed the new test flips red without the fix.

Closes #18659

Problem: scan_skill_commands() cleared `_skill_commands` and
`_skill_lower_to_canonical` at the start of the function. If a later
import or scan step raised, the function returned the wiped dicts and
left module state empty — every previously cached skill command was
silently lost until the next successful rescan.

Root cause: in-place clear of module-level dicts before the work
that could fail.

Fix: build a local `new_commands` dict and only commit to module state
once the full scan succeeds. On exception, return the existing
`_skill_commands` (cache preserved).

Tests: TestScanPreservesCacheOnFailure covers (1) cache preserved when
the scan raises mid-flight and (2) successful rescan still replaces
the cache normally.

Closes NousResearch#18659
Copilot AI review requested due to automatic review settings May 9, 2026 11:29

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a failure mode in agent.skill_commands.scan_skill_commands() where an exception during scanning could previously wipe the process-global skill command cache, leaving no slash commands available until a later successful rescan.

Changes:

  • Refactor scan_skill_commands() to build results into a local dict and only commit to module state after a successful scan; on failure it returns the existing cache.
  • Add regression tests to ensure scan failures preserve the previously populated cache and successful scans still replace it.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
agent/skill_commands.py Builds skill command mapping locally and commits only on success; preserves existing cache on scan/import failure.
tests/agent/test_skill_commands.py Adds regression coverage for preserving cache on scan failure and replacing cache on successful rescan.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread agent/skill_commands.py
Comment on lines 305 to +308
except Exception:
pass
# Top-level scan failed (e.g. ImportError from skills_tool /
# skill_utils). Preserve the existing cache rather than wiping it.
return _skill_commands
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder tool/skills Skills system (list, view, manage) labels May 11, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Duplicate of #18668 — same fix (build local map, swap on success) for scan_skill_commands() unconditionally clearing cache before try block. Also see #18720 (already marked dupe of #18668).

@wesleysimplicio

Copy link
Copy Markdown
Contributor Author

Closing in favor of #18668 per @alt-glitch's note — same scan_skill_commands() cache preservation fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists tool/skills Skills system (list, view, manage) type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: scan_skill_commands unconditionally clears _skill_commands before try block, silently loses all skills on scan failure

3 participants