Description
agent/skill_commands.py:222 unconditionally sets _skill_commands = {} before entering the try block that populates it. If any exception occurs between the clear and the first successful skill addition, all skill slash commands are silently lost.
# agent/skill_commands.py:221-227
global _skill_commands
_skill_commands = {} # ← cleared BEFORE try
try:
from tools.skills_tool import SKILLS_DIR, _parse_frontmatter, ...
from agent.skill_utils import get_external_skills_dirs, iter_skill_index_files
disabled = _get_disabled_skill_names()
...
except Exception:
pass # ← exception swallowed, stays empty
return _skill_commands
Trigger conditions
In current production code, tools.skills_tool is already cached in sys.modules before scan_skill_commands() runs (imported during tool discovery at startup), so the specific from tools.skills_tool import ... line cannot fail. get_disabled_skill_names() and iter_skill_index_files() are also defensively written.
However, the bug can trigger when SKILLS_DIR exists but os.walk fails (permission error, broken filesystem). More importantly, it is a latent risk for any future code added between line 222 and the try block, or any refactor that changes the import order or scanning logic.
Impact
Scenario 1: startup permanently broken
scan_skill_commands() called at startup -> exception -> _skill_commands = {}
- User runs
/reload-skills -> reload_skills() snapshots before={}, scans fail again -> after={}
- User sees:
No new skills detected. 0 skill(s) available
- All 90+ skill slash commands are gone with zero indication
Scenario 2: hot-reload after startup
- Startup succeeds -> 90 skills loaded
- Something causes
scan_skill_commands() to fail on re-invocation
/reload-skills -> before=90 skills snapshotted, after=0 -> "Removed 90 skills"
- User sees 90 skills removed,
0 skill(s) available -- looks intentional but cause is opaque
Fix
Build the new dict in a local variable, assign to the global only on success:
def scan_skill_commands() -> Dict[str, Dict[str, Any]]:
global _skill_commands
new_commands: Dict[str, Dict[str, Any]] = {}
try:
...
# build new_commands instead of _skill_commands
except Exception:
pass # keep old _skill_commands intact
else:
_skill_commands = new_commands
return _skill_commands
Description
agent/skill_commands.py:222unconditionally sets_skill_commands = {}before entering thetryblock that populates it. If any exception occurs between the clear and the first successful skill addition, all skill slash commands are silently lost.Trigger conditions
In current production code,
tools.skills_toolis already cached insys.modulesbeforescan_skill_commands()runs (imported during tool discovery at startup), so the specificfrom tools.skills_tool import ...line cannot fail.get_disabled_skill_names()anditer_skill_index_files()are also defensively written.However, the bug can trigger when
SKILLS_DIRexists butos.walkfails (permission error, broken filesystem). More importantly, it is a latent risk for any future code added between line 222 and the try block, or any refactor that changes the import order or scanning logic.Impact
Scenario 1: startup permanently broken
scan_skill_commands()called at startup -> exception ->_skill_commands = {}/reload-skills->reload_skills()snapshotsbefore={}, scans fail again ->after={}No new skills detected. 0 skill(s) availableScenario 2: hot-reload after startup
scan_skill_commands()to fail on re-invocation/reload-skills->before=90 skillssnapshotted,after=0-> "Removed 90 skills"0 skill(s) available-- looks intentional but cause is opaqueFix
Build the new dict in a local variable, assign to the global only on success: