skill_commands cache can leak one platform's disabled-skill view into another #14536

@NewTurn2017

Description

Bug Description

agent.skill_commands caches one global _skill_commands mapping, but the scan itself depends on platform-specific disabled-skill config via _get_disabled_skill_names(). In a long-lived multi-platform process, the first platform to scan commands wins; later platforms can inherit the wrong enabled/disabled skill set.

Affected files / lines

  • agent/skill_commands.py:206-212: the scan reads platform-aware disabled skill names
  • agent/skill_commands.py:265-269: get_skill_commands() reuses the global cache whenever it is non-empty

Why this is a bug

The cache is process-global, but disabled-skill resolution is platform-specific. If telegram disables one set of skills and discord disables another, whichever platform scans first seeds the cache for the other.

This can cause wrong slash-command availability and stale skill menus across gateway platforms.
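
The failure mode can be modelled without the agent code at all. The sketch below is a toy stand-in, not the real implementation: DISABLED, ALL_SKILLS, scan() and get() are hypothetical names, and it only assumes what is described above (one process-global mapping, refilled only when empty, filtered by the platform that triggered the scan).

```python
# Toy model of the failure mode. None of these names come from the Hermes codebase.
DISABLED = {"telegram": {"alpha"}, "discord": {"beta"}}  # hypothetical per-platform config
ALL_SKILLS = ["alpha", "beta"]

_cache = {}  # process-global mapping with no platform dimension


def scan(platform):
    """Rebuild the global cache using the given platform's disabled set."""
    _cache.clear()
    _cache.update(
        {f"/{name}": name for name in ALL_SKILLS if name not in DISABLED[platform]}
    )
    return _cache


def get(platform):
    """Return the cached mapping if it is non-empty; otherwise scan first."""
    if not _cache:
        scan(platform)
    return _cache


telegram_view = sorted(get("telegram"))  # seeds the cache with telegram's view
discord_view = sorted(get("discord"))    # served from the telegram-seeded cache
```

Here discord_view ends up identical to telegram_view, which is the same shape as the stale result reported for get_skill_commands().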

Minimal reproduction

source venv/bin/activate
python - <<'PY'
import os, tempfile, textwrap
from pathlib import Path
from agent.skill_commands import scan_skill_commands, get_skill_commands

with tempfile.TemporaryDirectory() as td:
    home = Path(td)
    os.environ['HERMES_HOME'] = str(home)
    for name in ['alpha', 'beta']:
        d = home / 'skills' / name
        d.mkdir(parents=True)
        (d / 'SKILL.md').write_text(textwrap.dedent(f'''\
---
name: {name}
description: {name} skill
---
body
'''))

    (home / 'config.yaml').write_text(textwrap.dedent('''\
skills:
  platform_disabled:
    telegram: [alpha]
    discord: [beta]
'''))

    os.environ['HERMES_PLATFORM'] = 'telegram'
    print('telegram:', sorted(get_skill_commands().keys()))

    os.environ['HERMES_PLATFORM'] = 'discord'
    print('discord without rescan:', sorted(get_skill_commands().keys()))
    print('discord with explicit rescan:', sorted(scan_skill_commands().keys()))
PY

Current output:

telegram: ['/beta']
discord without rescan: ['/beta']
discord with explicit rescan: ['/alpha']

Expected Behavior

Each platform should see its own disabled-skill view, even inside the same process.

Actual Behavior

get_skill_commands() can return a stale command set computed for a different platform.

Related context

This looks adjacent to profile/platform command-registration issues such as #8108, but the root cause here is the missing platform dimension in the in-process skill_commands cache itself.

Suggested investigation direction

  • Key _skill_commands by platform/home/config state, or
  • always rescan when platform context changes, or
  • move platform filtering out of the global cache and apply it per call.
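
As a rough illustration of the first and third directions, here is a minimal sketch of a cache keyed by platform that applies the disabled-skill filter when an entry is first built. PlatformKeyedSkillCache, scan_all and disabled_for are hypothetical names for this example only; the real scan and config lookup in agent/skill_commands.py may look quite different, and HERMES_HOME or config state would also need to be part of the key if they can change at runtime.

```python
from typing import Callable, Dict, Optional, Set


class PlatformKeyedSkillCache:
    """Cache one command mapping per platform instead of one per process."""

    def __init__(
        self,
        scan_all: Callable[[], Dict[str, str]],
        disabled_for: Callable[[str], Set[str]],
    ) -> None:
        self._scan_all = scan_all          # returns {"/command": "skill-name"}, unfiltered
        self._disabled_for = disabled_for  # returns disabled skill names for a platform
        self._by_platform: Dict[str, Dict[str, str]] = {}

    def get(self, platform: str) -> Dict[str, str]:
        """Return the platform's own view, building it on first use."""
        if platform not in self._by_platform:
            disabled = self._disabled_for(platform)
            self._by_platform[platform] = {
                cmd: skill
                for cmd, skill in self._scan_all().items()
                if skill not in disabled
            }
        return self._by_platform[platform]

    def invalidate(self, platform: Optional[str] = None) -> None:
        """Drop one platform's entry, or every entry when no platform is given."""
        if platform is None:
            self._by_platform.clear()
        else:
            self._by_platform.pop(platform, None)


# Stand-in data mirroring the reproduction above.
config = {"telegram": {"alpha"}, "discord": {"beta"}}
cache = PlatformKeyedSkillCache(
    scan_all=lambda: {"/alpha": "alpha", "/beta": "beta"},
    disabled_for=lambda platform: config.get(platform, set()),
)
telegram_cmds = sorted(cache.get("telegram"))
discord_cmds = sorted(cache.get("discord"))
```

With this shape, the order in which platforms first ask for commands no longer matters, at the cost of one scan per platform until the unfiltered scan result is itself cached.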

Metadata

Assignees

No one assigned

Labels

  • P2: Medium (degraded but workaround exists)
  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • tool/skills: Skills system (list, view, manage)
  • type/bug: Something isn't working
