Skill auto-creation learns from transient failures, causing persistent tool avoidance (learned helplessness) #6051

@ldk00315-jpg

Description

Hi Hermes team — first, thank you for building such a strong automatic skill generation and learning system. It is genuinely one of Hermes's biggest strengths.

I found a failure mode where this strength backfires: Hermes can persist transient environment failures as durable skill guidance, then continue avoiding a tool even after the environment is fixed. In behavioral terms, this resembles learned helplessness.

What Happened (Real Example)

After migrating from Linux Mint to WSL2, browser launch failed because Playwright was not yet installed. Hermes auto-created a skill called browser-tool-launch-issue under ~/.hermes/skills/devops/.

Even after Playwright was installed and verified working (chromium.launch(headless=True) succeeded), the agent continued to avoid browser tools entirely.

In addition, negative statements accumulated in other skills (e.g., today-market-summary), such as:

  • "browser tools do not work"
  • "default_api cannot be called from execute_code"
  • "even reinstalling the browser does not resolve the issue"

As a result, the agent kept falling back to unstable curl + HTML parsing instead of retrying now-available tools.

After manually deleting the negative skill and cleaning the modified skill, the agent immediately resumed normal browser tool usage — confirming the issue was purely in the learned skill content, not in actual tool availability.

Reproduction Steps

  1. Migrate a Hermes installation to a new environment (e.g., Linux Mint → WSL2)
  2. Before installing all dependencies, ask the agent to use the browser tool
  3. Observe: auto-created skill records the failure as a permanent constraint
  4. Install the missing dependency (npx playwright install --with-deps chromium)
  5. Ask the agent to use the browser tool again
  6. Expected: agent uses the browser tool successfully
  7. Actual: agent refuses, citing the auto-created skill

Root Cause Hypothesis

The auto-skill pipeline does not sufficiently distinguish transient failures (missing packages, path mismatches after migration, temporary setup errors) from permanent constraints (API specification limitations, architectural boundaries).
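The distinction drawn above could be approximated with a simple pre-persistence check. The following is a minimal sketch, not Hermes's actual pipeline: `FailureKind`, `TRANSIENT_PATTERNS`, `classify_failure`, and `should_persist_as_constraint` are hypothetical names, and the pattern list is an illustrative guess that would need tuning against real error output.

```python
import re
from enum import Enum


class FailureKind(Enum):
    TRANSIENT = "transient"  # likely fixed by installing or reconfiguring something
    UNKNOWN = "unknown"      # no evidence that the environment is to blame


# Hypothetical patterns for environment-dependent errors (assumption:
# these substrings are typical of missing binaries, packages, or paths).
TRANSIENT_PATTERNS = [
    r"executable doesn't exist",
    r"command not found",
    r"no such file or directory",
    r"no module named",
    r"connection refused",
    r"timed? ?out",
    r"permission denied",
]


def classify_failure(error_text: str) -> FailureKind:
    """Label an error as TRANSIENT if it looks environment-dependent."""
    lowered = error_text.lower()
    if any(re.search(pattern, lowered) for pattern in TRANSIENT_PATTERNS):
        return FailureKind.TRANSIENT
    return FailureKind.UNKNOWN


def should_persist_as_constraint(error_text: str) -> bool:
    """Only errors that do not look transient may become durable guidance."""
    return classify_failure(error_text) is not FailureKind.TRANSIENT
```

With a check like this, an error such as `bash: npx: command not found` would be kept out of durable skill text, while a message describing an API limitation would still be eligible for recording.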

Once a failure is learned, it is treated as a long-term rule and not revisited when conditions change.

From an adaptive-behavior perspective, what is missing is sampling behavior, a concept from optimal foraging theory: even when one strategy appears best, an agent should occasionally test alternatives at low frequency so that it can detect environmental change.
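The low-frequency re-testing described above resembles an epsilon-greedy policy. Here is a minimal sketch under that assumption; `ToolChooser`, `probe_rate`, and the tool names are hypothetical and do not correspond to any existing Hermes API.

```python
import random
from typing import Optional


class ToolChooser:
    """Prefer a working tool, but occasionally re-probe one that failed before."""

    def __init__(self, probe_rate: float = 0.05, rng: Optional[random.Random] = None):
        self.probe_rate = probe_rate       # chance of retrying an avoided tool
        self.rng = rng or random.Random()
        self.avoided = set()               # tools whose last attempt failed

    def record(self, tool: str, succeeded: bool) -> None:
        """Update the avoid set from the outcome of the latest attempt."""
        if succeeded:
            self.avoided.discard(tool)     # one success clears the avoidance
        else:
            self.avoided.add(tool)

    def choose(self, preferred: str, fallback: str) -> str:
        """Return the preferred tool unless it is avoided and no probe fires."""
        if preferred not in self.avoided:
            return preferred
        if self.rng.random() < self.probe_rate:
            return preferred               # low-frequency probe of the avoided tool
        return fallback
```

The key property is that avoidance is never absolute: with a nonzero `probe_rate`, a repaired tool is eventually retried, and a single recorded success restores normal use.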

Suggestions

  1. Classify before persisting: During auto-skill creation, avoid recording environment-dependent errors (missing binaries, path errors, migration artifacts) as permanent constraints.
  2. TTL / confidence scoring: Add expiration dates or confidence scores to negative operational claims, with periodic re-validation.
  3. Sampling Behavior principle: Include a Sampling Behavior directive in the default AGENTS.md template, encouraging periodic re-attempts of previously failed approaches.
  4. Built-in negative learning detection: A mechanism (or built-in scheduled task) that scans skills for defeatist/avoidance patterns and flags them for review or automatic removal.
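Suggestion 2 could be sketched as a small record type attached to each negative claim. This is an illustration only: `NegativeClaim`, its fields, the seven-day default TTL, and the confidence step of 0.2 are all assumptions chosen for the example, not values taken from Hermes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class NegativeClaim:
    """A 'tool X does not work' statement that expires unless re-confirmed."""

    text: str
    recorded_at: datetime
    ttl: timedelta = timedelta(days=7)   # assumed default lifetime
    confidence: float = 0.5              # assumed starting confidence

    def is_expired(self, now: datetime) -> bool:
        return now >= self.recorded_at + self.ttl

    def revalidate(self, tool_still_fails: bool, now: datetime) -> None:
        """Re-test result: reinforce the claim or retract it."""
        if tool_still_fails:
            self.confidence = min(1.0, self.confidence + 0.2)
            self.recorded_at = now       # restart the TTL window
        else:
            self.confidence = 0.0        # the tool works again; drop the claim


def active_claims(claims, now):
    """Claims that should still influence the agent's tool choice."""
    return [c for c in claims if not c.is_expired(now) and c.confidence > 0.0]
```

Under this scheme a claim such as "browser tools do not work" stops influencing the agent once its TTL lapses or a re-validation succeeds, rather than persisting indefinitely.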

Current Workaround

  • Manually added Sampling Behavior principles to AGENTS.md
  • Created a custom negative-skill-cleaner skill to periodically scan for and remove stale negative constraints
  • After environment changes, manually audit ~/.hermes/skills/ for auto-created avoidance skills
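The manual audit in the last item could be partly automated. The sketch below is not the `negative-skill-cleaner` skill mentioned above, whose contents are not shown here. It assumes skills are stored as plain-text files under the skills directory, and the phrase patterns are hypothetical, modeled on the three negative statements quoted earlier. It only reports matches and deletes nothing.

```python
import re
from pathlib import Path

# Hypothetical avoidance phrases; extend to match what actually accumulates.
AVOIDANCE_PATTERNS = [
    r"\bdo(es)? not work\b",
    r"\bcannot be (called|used)\b",
    r"\bdoes not resolve\b",
    r"\bnever use\b",
    r"\balways fails?\b",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in AVOIDANCE_PATTERNS]


def find_avoidance_lines(text: str):
    """Return (line_number, line) pairs that contain an avoidance phrase."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in _COMPILED):
            hits.append((number, line.strip()))
    return hits


def scan_skills(root: Path):
    """Map each file under root to its flagged lines (files without hits are omitted)."""
    report = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = find_avoidance_lines(text)
        if hits:
            report[path] = hits
    return report
```

Calling `scan_skills(Path.home() / ".hermes" / "skills")` after an environment change would list candidate lines for human review. Automatic removal is deliberately left out, since a flagged line may describe a genuine permanent constraint.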

Environment

  • Hermes Agent (updated to latest at time of report)
  • Migrated from Linux Mint (bare metal) to WSL2 (Ubuntu 24.04)
  • Model: gemini-3-flash-preview via custom provider

Metadata

    Labels

      • P2: Medium, degraded but workaround exists
      • comp/agent: Core agent loop, run_agent.py, prompt builder
      • tool/skills: Skills system (list, view, manage)
      • type/bug: Something isn't working
