fix(skills): transactional skills reset --restore + read-only Nix tree cleanup (#34972)#35251
Merged
Conversation
…le read-only files in rmtree Two related bugs in tools/skills_sync.py affecting Nix-store and immutable-package installs: **#34972 — reset_bundled_skill corrupts manifest on rmtree failure:** The function deleted the manifest entry BEFORE attempting rmtree. If rmtree failed (read-only files from Nix store), the function returned early — leaving the skill in a manifest-less limbo state where future syncs silently skip it forever. Fix: reorder steps — attempt rmtree FIRST, only delete manifest entry after rmtree succeeds. If rmtree fails, nothing is changed. **#34860 — stale .bak directories after sync:** sync_skills() called shutil.rmtree(backup, ignore_errors=True) which silently failed on read-only files, leaving persistent .bak dirs. Fix: add _rmtree_writable() helper that makes files writable via an onerror callback before retrying removal. Used in both sync_skills() backup cleanup and reset_bundled_skill(). Fixes #34972 Fixes #34860
…just files The cherry-picked fix's onerror handler chmod'd only the failing path, but unlinking a child requires write permission on its PARENT directory. On a true Nix-store copy (r-xr-xr-x dirs + files) rmtree still failed. Now chmod the parent dir as well before retrying. Also rewrites the regression test: the original asserted the helper FAILS on a read-only dir (documenting the limitation), which is the wrong success criterion. Split into two tests — restore succeeds on a full read-only tree (real Nix case), and manifest is preserved when removal genuinely cannot proceed (monkeypatched).
Contributor
🔎 Lint report:
|
Collaborator
The hermetic CI env (slice 4/6) redirects HERMES_HOME, so a post-restore
_read_manifest() can resolve to an empty/redirected manifest path and return
{}. Assert on sync_skills's in-memory return value (synced["copied"]) instead,
which is the resilient signal that the skill was re-copied and is no longer in
limbo.
This was referenced May 30, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
hermes skills reset <name> --restoreis now transactional and survives read-only Nix-store skill copies — it no longer corrupts the manifest into a permanent limbo state.Root cause:
reset_bundled_skilldeleted the manifest entry beforeshutil.rmtree(dest). On Nix-store installs (read-onlyr-xr-xr-xdirs), rmtree failed and the function returned early — but the manifest entry was already gone, so futuresync_skillsruns silently skipped the skill forever (user-modified-UNTRACKED).Closes #34972.
Changes
tools/skills_sync.py:action: "not_reset", message "Manifest entry preserved")._rmtree_writable()chmods read-only entries and retries viaonerror. The retry handler now chmods the parent directory as well as the failing path — unlinking a child requires write permission on its parent, so the original (file-only chmod) still failed on a true read-only tree.sync_skillsuses the same writable rmtree.tests/tools/test_skills_sync.py: two regression tests —r-xr-xr-xdirs +r--files), andValidation
--restoreon read-only files only--restoreon read-only dirs+files (Nix store)E2E verified against a real read-only tree (isolated
HERMES_HOME, realreset_bundled_skill): restore succeeds, nested read-only file removed, bundled re-copied, manifest re-tracked. Targeted suite:tests/tools/test_skills_sync.py51/51 pass.Credit
Salvaged from #35131 by @annguyenNous (transaction reordering + initial
_rmtree_writable), cherry-picked onto currentmainwith authorship preserved. Follow-up commit fixes the parent-dir chmod gap and corrects the regression test.Supersedes #35182 (@TKwave — read-only backup cleanup, a strict subset of this fix).
Infographic