[Bug]: bundled skill sync can leave stale .bak directories #34860

@jerome-benoit

Bug Description

Bundled skill sync can leave persistent *.bak directories after a successful update when the backup tree contains read-only files or directories. This can happen with skills copied from immutable/read-only package sources such as a Nix store install.

The cleanup failure is currently silent because sync_skills() calls shutil.rmtree(backup, ignore_errors=True). Leftover backups then remain under ~/.hermes/skills, where they can interfere with later sync runs and make skill discovery and skill listing inconsistent.

Steps to Reproduce

  1. Install/run Hermes from a package source where bundled skill files are read-only after copy, e.g. Nix store packaging.
  2. Have an already-synced bundled skill whose on-disk hash still matches the manifest origin hash.
  3. Update Hermes so the bundled version of that skill changes.
  4. Run bundled skill sync, e.g. through Hermes startup/update or tools.skills_sync.sync_skills().
  5. Inspect ~/.hermes/skills for leftover *.bak directories.
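
Step 5 can be scripted. The following is a minimal sketch, not part of Hermes: find_stale_backups is a hypothetical helper name, and the recursive search assumes backups may sit at any depth below the skills root.

```python
from pathlib import Path


def find_stale_backups(skills_root: Path) -> list[Path]:
    """Return every directory below skills_root whose name ends in .bak."""
    if not skills_root.is_dir():
        return []
    # rglob walks the whole tree; plain files named *.bak are filtered out.
    return sorted(p for p in skills_root.rglob("*.bak") if p.is_dir())


if __name__ == "__main__":
    # Default location taken from the report; adjust if hermes_home differs.
    for path in find_stale_backups(Path.home() / ".hermes" / "skills"):
        print(path)
```

An empty result after a sync that updated at least one skill is the expected outcome.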

Observed locally after a v0.15.1 Nix-based update: multiple stale *.bak directories were left behind and had to be made writable before removal.

Expected Behavior

After a successful skill update:

  • the temporary backup directory is removed reliably; or
  • if cleanup fails, the failure is reported/logged and the sync result exposes it.

No persistent *.bak directories should remain after a successful update.
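
One way the second option could look, as a sketch only: the result is assumed to be a plain dict with keys like those in the verification output further down (copied, updated, skipped), and both record_cleanup_failure and the cleanup_failures key are hypothetical names, not existing Hermes API.

```python
import logging

logger = logging.getLogger(__name__)


def record_cleanup_failure(result: dict, backup: str, error: Exception) -> dict:
    """Log a warning and append the failure to result["cleanup_failures"]."""
    logger.warning("Skill backup %s was not removed: %s", backup, error)
    # "cleanup_failures" is an assumed key; it is created on first use.
    result.setdefault("cleanup_failures", []).append(
        {"path": backup, "error": str(error)}
    )
    return result


if __name__ == "__main__":
    result = {"copied": [], "updated": ["example-skill"], "skipped": 89}
    record_cleanup_failure(
        result,
        "~/.hermes/skills/example-skill.bak",
        PermissionError(13, "Permission denied"),
    )
    print(result["cleanup_failures"])
```

Callers that ignore the extra key keep working, while the CLI or tests can check it to detect leftover backups.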

Actual Behavior

tools/skills_sync.py moves the old skill to a backup, copies the new bundled skill, updates the manifest, then removes the backup with silent error suppression:

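# Excerpt from tools/skills_sync.py; "..." marks code omitted in this report.
backup = dest.with_suffix(".bak")
shutil.move(str(dest), str(backup))        # move the old skill aside
...
shutil.copytree(skill_src, dest)           # copy the new bundled skill into place
manifest[skill_name] = bundled_hash        # record the new origin hash
...
shutil.rmtree(backup, ignore_errors=True)  # cleanup errors are silently discarded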

If rmtree() cannot remove read-only files/directories, the failure is ignored and the *.bak directory remains.
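
The effect can be reproduced outside Hermes. This is a minimal sketch assuming a POSIX system and a non-root user (root bypasses the permission checks); make_read_only_tree and survives_silent_rmtree are hypothetical helper names, and the modes 0o555/0o444 stand in for what a copy from a Nix store typically carries.

```python
import os
import shutil
import tempfile
from pathlib import Path


def make_read_only_tree(root: Path) -> None:
    """Build root/nested/file.txt, then drop all write bits from the tree."""
    nested = root / "nested"
    nested.mkdir(parents=True)
    (nested / "file.txt").write_text("old skill content\n")
    os.chmod(nested / "file.txt", 0o444)
    # Entries cannot be unlinked from a directory that lacks write permission.
    os.chmod(nested, 0o555)
    os.chmod(root, 0o555)


def survives_silent_rmtree(backup: Path) -> bool:
    """Run the same call as sync_skills() and report whether the tree remains."""
    shutil.rmtree(backup, ignore_errors=True)  # raises nothing, logs nothing
    return backup.exists()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp) / "example-skill.bak"
        make_read_only_tree(backup)
        print("backup still present:", survives_silent_rmtree(backup))
        # Restore write bits so the temporary directory can be cleaned up.
        for dirpath, _dirs, _files in os.walk(backup):
            os.chmod(dirpath, 0o755)
```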

Affected Component

Skills (skill loading, skill hub, skill guard)

Messaging Platform

N/A (CLI only)

Debug Report

Redacted local debug report generated with hermes debug share --local --lines 50.

Relevant excerpt:

version:          0.15.1 (2026.5.29)
os:               Linux 7.0.10-101.fc43.x86_64 x86_64
python:           3.12.13
openai_sdk:       2.24.0
profile:          default
hermes_home:      ~/.hermes
terminal:         local
features:
  toolsets:           hermes-cli
  skills:             104

Full log paste/upload intentionally omitted because this is a deterministic code-level bug in tools/skills_sync.py; recent agent/gateway logs are unrelated and contain no additional reproduction signal.

Operating System

Fedora Linux, kernel 7.0.10-101.fc43.x86_64

Python Version

Hermes runtime: Python 3.12.13

Hermes Version

Hermes Agent v0.15.1 (2026.5.29)

Additional Logs / Traceback

Local post-cleanup verification after manually removing stale backups:

bak_dirs_or_files_count 0
copied 0 []
updated 0 []
skipped 89
user_modified ['google-workspace']
cleaned []
total_bundled 90

google-workspace was intentionally user-modified locally for an unrelated Gmail header fix.

Root Cause Analysis

Root cause is in tools/skills_sync.py, around the bundled skill update path:

  • old skill is moved to dest.with_suffix(".bak")
  • new bundled skill is copied to dest
  • manifest is updated
  • backup cleanup uses shutil.rmtree(backup, ignore_errors=True)

Because ignore_errors=True suppresses cleanup failures, stale backups can accumulate when copied files/directories are not user-writable.

Proposed Fix

Replace the silent cleanup with a robust helper, for example:

  • recursively make backup files/directories writable before removal when needed;
  • call shutil.rmtree() without silent ignore_errors=True;
  • log a warning and/or include cleanup failures in the returned sync result;
  • preferably test with a read-only backup tree.

This preserves the intended temporary backup/restore behavior while preventing silent persistent .bak directories.
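
A possible shape for such a helper, as a sketch rather than the final patch: remove_backup_tree and _make_tree_writable are hypothetical names, and returning a list of failure strings is just one way to let sync_skills() surface the problem.

```python
import logging
import os
import shutil
import stat
from pathlib import Path

logger = logging.getLogger(__name__)


def _make_tree_writable(root: Path) -> None:
    """Give the owner rwx on directories and write on files below root."""
    os.chmod(root, os.stat(root).st_mode | stat.S_IRWXU)
    # Top-down walk: each subdirectory is fixed before os.walk descends into it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, os.stat(path).st_mode | stat.S_IRWXU)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, os.stat(path).st_mode | stat.S_IWUSR)


def remove_backup_tree(backup: Path) -> list[str]:
    """Remove backup; return failure messages (an empty list means success)."""
    if not backup.exists():
        return []
    try:
        shutil.rmtree(backup)
        return []
    except OSError:
        pass  # retry below after restoring write permission
    try:
        _make_tree_writable(backup)
        shutil.rmtree(backup)
    except OSError as exc:
        logger.warning("Could not remove skill backup %s: %s", backup, exc)
        return [f"{backup}: {exc}"]
    return []
```

The permission fix-up only runs after a first plain rmtree() fails, so the common writable case costs nothing extra. A regression test can build a read-only backup tree, call the helper, and assert both that the directory is gone and that the returned list is empty.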

PR Readiness

I can submit a PR if maintainers agree with the approach.

Labels

  • P3 (Low: cosmetic, nice to have)
  • area/nix (Nix flake, NixOS module, container packaging)
  • tool/skills (Skills system: list, view, manage)
  • type/bug (Something isn't working)
