fix(skills): surface skill-load failures instead of returning success by zerone0x · Pull Request #11408 · NousResearch/hermes-agent

zerone0x · 2026-04-17T04:54:23Z

Problem

build_skill_invocation_message() returned a truthy placeholder string ([Failed to load skill: <name>]) when a registered skill's SKILL.md could not be loaded (e.g. file deleted or corrupted between scan and invocation). Every caller used if msg: to distinguish success from failure, so the placeholder was treated as a successful invocation and silently fed into the agent as user prompt content. A missing/deleted/corrupt skill therefore looked like a working slash command from the user's perspective, while the model received [Failed to load skill: ...] as conversation input.

Affected call sites flagged in the bug report:

agent/skill_commands.py:311-313
cli.py:5571-5580
gateway/run.py:2877-2882
gateway/platforms/webhook.py:383-388

Fix

build_skill_invocation_message() now returns None on load failure (matching the existing "unknown skill" path) and logs a warning with the cmd_key + skill_dir for diagnosis. The docstring is updated to document the new contract.
gateway/run.py: when the slash command resolves to a registered skill but loading fails, return a real error message to the user (instead of silently falling through and sending the bare /cmd to the LLM as free text).
gateway/platforms/webhook.py: when a configured route skill is registered but fails to load, log loudly at ERROR and skip that skill rather than injecting the placeholder string into the prompt.
cli.py already had if not msg: branches for both the skill slash handler and _handle_plan_command; those branches were effectively dead code under the old return contract and now fire correctly on real load failures.

Reproduction

The reporter's reproduction now returns None instead of '[Failed to load skill: demo]':

import agent.skill_commands as sc
from pathlib import Path
from tempfile import TemporaryDirectory
from unittest.mock import patch

with TemporaryDirectory() as td:
    root = Path(td)
    skill_dir = root / "demo"
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text("---
name: demo
description: demo
---

# Demo
", encoding="utf-8")
    with patch("agent.skill_utils.get_external_skills_dirs", return_value=[root]), \
         patch("tools.skills_tool.SKILLS_DIR", Path(td) / "missing_local_skills"):
        sc.scan_skill_commands()
        (skill_dir / "SKILL.md").unlink()
        print(sc.build_skill_invocation_message("/demo", "hello"))  # -> None

Tests

New TestBuildSkillInvocationMessage::test_returns_none_when_payload_load_fails_after_scan covers the regression: scan a skill, delete its SKILL.md, assert build_skill_invocation_message returns None.
pytest tests/agent/test_skill_commands.py — 27 passed.
pytest tests/gateway/test_webhook_integration.py — 4 passed.

`build_skill_invocation_message()` returned a truthy placeholder string (`[Failed to load skill: ...]`) when a registered skill's `SKILL.md` could not be loaded. Every caller checked the result with `if msg:` to distinguish success from failure, so the placeholder string was treated as a successful invocation and silently fed into the agent as user prompt content. A missing/deleted/corrupt skill therefore looked like a working slash command from the user's perspective. Make the function return `None` on load failure (matching the existing "unknown skill" path) and update the gateway and webhook callers to emit a real user-facing error / log instead of forwarding the bare slash command. The CLI handler and `_handle_plan_command` already had dead `if not msg:` branches that now actually fire on load failure. Adds a regression test that scans a skill, deletes its SKILL.md before invocation, and asserts `None` is returned. Fixes #11200 Co-Authored-By: Claude <noreply@anthropic.com>

alt-glitch · 2026-04-25T00:21:19Z

Likely duplicate of #14752 — both fix build_skill_invocation_message() returning truthy error string instead of None on load failure. Same root cause as #11200 and #14713.

zerone0x · 2026-04-30T13:03:37Z

Confirmed this overlaps with #14752 on the same root fix: make build_skill_invocation_message() return None on skill payload load failure and have callers surface the failure instead of queueing the sentinel string. Since #14752 is open, I’m closing this one to reduce duplicate PR noise.

cola-runner mentioned this pull request Apr 17, 2026

Treat failed skill loads as actual command failures #11545

Closed

malaiwah mentioned this pull request Apr 17, 2026

[Bug]: Skill load failure is treated as successful slash-command invocation #11200

Open

This was referenced Apr 23, 2026

Slash-skill load failures are queued as input instead of surfacing an error #14713

Closed

fix: skill load failures return None instead of queued sentinel (#14713) #14752

Closed

zerone0x closed this Apr 30, 2026

alt-glitch mentioned this pull request May 16, 2026

fix(skills): return None instead of truthy stub when skill load fails #27147

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(skills): surface skill-load failures instead of returning success#11408

fix(skills): surface skill-load failures instead of returning success#11408
zerone0x wants to merge 1 commit into
NousResearch:mainfrom
zerone0x:fix/skill-load-failure-11200

zerone0x commented Apr 17, 2026 •

edited

Loading

Uh oh!

alt-glitch commented Apr 25, 2026

Uh oh!

zerone0x commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

zerone0x commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Reproduction

Tests

Uh oh!

alt-glitch commented Apr 25, 2026

Uh oh!

zerone0x commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

zerone0x commented Apr 17, 2026 •

edited

Loading