Skip to content

[Bug]: Skill load failure is treated as successful slash-command invocation #11200

@NewTurn2017

Description

@NewTurn2017

Bug Description

build_skill_invocation_message() returns a non-empty placeholder string ([Failed to load skill: ...]) when the skill exists in the command cache but loading the actual SKILL.md payload fails. CLI/gateway callers treat any truthy return value as success, so the failure is silently routed into the model as if it were a valid skill prompt.

Affected files / lines

  • agent/skill_commands.py:311-313
  • cli.py:5571-5580
  • gateway/run.py:2877-2882
  • gateway/platforms/webhook.py:383-388

Why this is a bug

The callers do if msg: to distinguish success from load failure. Returning a truthy error string bypasses the user-facing error path and injects a bracketed failure note into agent input.

So a missing/deleted/corrupt skill can look like a successful slash-command invocation, while the model receives [Failed to load skill: ...] as conversation content.

Minimal reproduction / evidence

I reproduced this on main by scanning a temporary skill, deleting SKILL.md, then invoking the cached command:

import agent.skill_commands as sc
from pathlib import Path
from tempfile import TemporaryDirectory
from unittest.mock import patch

with TemporaryDirectory() as td:
    root = Path(td)
    skill_dir = root / "demo"
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text("---\nname: demo\ndescription: demo\n---\n\n# Demo\n", encoding="utf-8")
    with patch("agent.skill_utils.get_external_skills_dirs", return_value=[root]), \
         patch("tools.skills_tool.SKILLS_DIR", Path(td) / "missing_local_skills"):
        sc.scan_skill_commands()
        (skill_dir / "SKILL.md").unlink()
        print(sc.build_skill_invocation_message("/demo", "hello"))

Observed output:

[Failed to load skill: demo]

Relevant implementation:

loaded = _load_skill_payload(skill_info["skill_dir"], task_id=task_id)
if not loaded:
    return f"[Failed to load skill: {skill_info['name']}]"

Expected Behavior

Load failure should be surfaced as a real failure signal to the caller (for example None or an exception/result object), so the UI/gateway can show Failed to load skill ... to the user instead of enqueuing bogus prompt text.

Actual Behavior

Load failure returns a truthy string, and callers proceed as though skill loading succeeded.

Suggested investigation direction

  • Change build_skill_invocation_message() to return a falsey/error-typed result on load failure.
  • Update CLI/gateway callers to distinguish "unknown command" vs "known command but failed to load".
  • Add a regression test for a post-scan load failure (file deleted/corrupted between scan and invocation).

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/agentCore agent loop, run_agent.py, prompt buildercomp/cliCLI entry point, hermes_cli/, setup wizardcomp/gatewayGateway runner, session dispatch, deliverytool/skillsSkills system (list, view, manage)type/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions