Skip to content

fix(skills): fix transaction ordering in reset_bundled_skill and handle read-only files in rmtree#35131

Open
annguyenNous wants to merge 1 commit into
NousResearch:mainfrom
annguyenNous:fix/skills-sync-nix-readonly
Open

fix(skills): fix transaction ordering in reset_bundled_skill and handle read-only files in rmtree#35131
annguyenNous wants to merge 1 commit into
NousResearch:mainfrom
annguyenNous:fix/skills-sync-nix-readonly

Conversation

@annguyenNous

Copy link
Copy Markdown
Contributor

Problem

Two related bugs in tools/skills_sync.py affecting installs with read-only files (Nix store, deb/rpm packaging).

#34972reset_bundled_skill corrupts manifest on rmtree failure

hermes skills reset <name> --restore --yes deletes the manifest entry before attempting shutil.rmtree(). If rmtree fails (e.g. read-only files from Nix store), the function returns early — but the manifest entry is already gone. The skill enters a permanent limbo state:

  • sync_skills() sees skill_name not in manifest && dest exists && dir_hash != bundled_hash → silently skips
  • Skill is permanently stuck without a manifest entry to baseline it
  • No error reported to user (rc=0)

#34860 — Stale .bak directories after sync

sync_skills() calls shutil.rmtree(backup, ignore_errors=True) which silently fails on read-only files, leaving persistent .bak directories under ~/.hermes/skills/.

Fix

Transaction reordering (#34972)

Reorder reset_bundled_skill() — attempt rmtree first, only delete manifest entry after rmtree succeeds. If rmtree fails, nothing is changed:

Step Before After
1 Delete manifest entry Try rmtree (with writable fallback)
2 Try rmtree → fail → return Delete manifest entry
3 sync_skills (never reached) sync_skills

_rmtree_writable helper (#34860)

Add _rmtree_writable(path) that uses shutil.rmtree(onerror=...) to make read-only files writable before retrying removal. Used in both:

  • sync_skills() backup cleanup (replaces ignore_errors=True)
  • reset_bundled_skill() user copy deletion

Tests

  • Added regression test: test_reset_restore_preserves_manifest_on_rmtree_failure — verifies manifest is preserved when rmtree fails
  • All 50 existing tests pass

Fixes #34972
Fixes #34860

…le read-only files in rmtree

Two related bugs in tools/skills_sync.py affecting Nix-store and
immutable-package installs:

**NousResearch#34972 — reset_bundled_skill corrupts manifest on rmtree failure:**
The function deleted the manifest entry BEFORE attempting rmtree. If
rmtree failed (read-only files from Nix store), the function returned
early — leaving the skill in a manifest-less limbo state where future
syncs silently skip it forever.

Fix: reorder steps — attempt rmtree FIRST, only delete manifest entry
after rmtree succeeds. If rmtree fails, nothing is changed.

**NousResearch#34860 — stale .bak directories after sync:**
sync_skills() called shutil.rmtree(backup, ignore_errors=True) which
silently failed on read-only files, leaving persistent .bak dirs.

Fix: add _rmtree_writable() helper that makes files writable via an
onerror callback before retrying removal. Used in both sync_skills()
backup cleanup and reset_bundled_skill().

Fixes NousResearch#34972
Fixes NousResearch#34860
@rodriguez46p-ui

Copy link
Copy Markdown

Hermes Agent Review Note

I checked this PR on Paul’s Windows host because the change touches read-only/removal behavior.

The new regression test currently fails on native Windows/MSYS:

python -m pytest tests/tools/test_skills_sync.py::TestResetBundledSkill::test_reset_restore_preserves_manifest_on_rmtree_failure -q -o addopts=
FAILED ... assert result["ok"] is False
E       assert True is False

Why: os.chmod(dest, stat.S_IREAD | stat.S_IRGRP | stat.S_IROTH) does not force _rmtree_writable(dest) to fail on this host; the restore succeeds and returns ok=True, so the test does not actually cover the intended failure path cross-platform.

Suggested fix: simulate the deletion failure deterministically by monkeypatching _rmtree_writable / shutil.rmtree to raise OSError, then assert the manifest is preserved. That should make the transaction-ordering regression independent of POSIX vs Windows chmod semantics.

Evidence worktree: C:\Users\yolop\AppData\Local\Temp\hermes-pr35131 at 84369a418.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/nix Nix flake, NixOS module, container packaging P3 Low — cosmetic, nice to have tool/skills Skills system (list, view, manage) type/bug Something isn't working

Projects

None yet

3 participants