Skip to content

Prune removed skills from server feeds (#1403)#1408

Merged
Aaronontheweb merged 3 commits into
netclaw-dev:devfrom
Aaronontheweb:claude-wt-skill-feed-prune
Jun 15, 2026
Merged

Prune removed skills from server feeds (#1403)#1408
Aaronontheweb merged 3 commits into
netclaw-dev:devfrom
Aaronontheweb:claude-wt-skill-feed-prune

Conversation

@Aaronontheweb

Copy link
Copy Markdown
Collaborator

Closes #1403.

Problem

ServerFeedSkillSyncService only iterated forward through a server feed's RFC index — it downloaded/updated skills that exist, but never pruned skills that were deleted or renamed on the server. The stale skill directory persisted on disk, the SkillSyncState entry stayed, a rename left both old and new versions registered, and skill_load kept working on deleted skills.

Fix

Add a reverse pass after the forward sync loop (SkillSyncHelpers.PruneRemovedSkills) that removes sync-state entries and on-disk skill directories no longer present in the index. The registry self-heals via the existing disk-rebuild (RescanAndUpdateIndex).

Pruning runs per-feed and only after a confirmed, non-empty index fetch, so a transient outage or empty response never wipes locally synced skills.

Safety hardening (from an adversarial self-review)

  • Empty-index guard inside the helper, not just the caller — the destructive op is safe by construction.
  • Only directories that own a SKILL.md are deleted (the same rule SkillScanner uses), so a stray non-skill directory under the feed root is never recursively deleted. This also neutralizes a scan-vs-prune race: prune no longer deletes the kind of directory SkillScanner's unguarded enumeration would choke on.
  • RescanAndUpdateIndex is wrapped in the same defensive try/catch the per-feed loop uses, so a directory vanishing mid-scan can't tear down the background service.

Deliberate non-goals

  • No grace period / circuit breaker for partial indexes. Transient/partial indexes self-heal: prune clears the sync-state entry, so the next full index re-downloads the skill. The only cost is a one-interval gap, automatically recovered.
  • No skill-server changes. GetRfcIndexAsync returns a complete snapshot (no pagination), so "absent from the index" reliably means removed. Netclaw.SkillClient stays a pure HTTP client.

Tests

SkillSyncHelpersTests covers: stale dirs + state pruned while present skills and .staging/.sync-state.json bookkeeping are preserved; non-skill directories survive prune (state still cleared); and an empty index is a no-op.

Validation

  • dotnet test — 31/31 skill tests pass (incl. 4 new prune tests)
  • dotnet slopwatch analyze — 0 issues
  • Add-FileHeaders.ps1 -Verify — all headers present

ServerFeedSkillSyncService only iterated forward through the server's RFC
index, so skills deleted or renamed on a private skill server were never
cleaned up: the on-disk directory persisted, the sync-state entry remained
stale, and a rename left both old and new versions registered.

Add a reverse pass after the forward sync loop. SkillSyncHelpers.PruneRemovedSkills
removes sync-state entries and on-disk skill directories absent from the
server index, skipping dot-prefixed bookkeeping (.staging, .sync-state.json)
and tolerating per-directory delete failures. The registry self-heals via the
existing RescanAndUpdateIndex disk rebuild.

Pruning runs per-feed and only after a confirmed, non-empty index fetch, so a
transient outage or empty response never wipes locally synced skills. No
skill-server changes are needed: GetRfcIndexAsync returns a complete snapshot,
so absence from the index reliably means removed.
Apply the surviving findings from the high-effort review of the prune pass:

- Guard against an empty index in PruneRemovedSkills itself, not just the
  caller, so the destructive op is safe by construction.
- Only delete directories that own a SKILL.md (SkillScanner's own skill-dir
  rule), so a stray non-skill directory under the feed root is never
  recursively deleted. This also neutralizes the scan-vs-prune race: prune no
  longer deletes the dot/non-SKILL.md dirs that SkillScanner's unguarded Pass-2
  enumeration would choke on.
- Wrap the post-sync RescanAndUpdateIndex in the same defensive try/catch the
  per-feed loop uses, so a directory vanishing mid-scan cannot tear down the
  background service.

Transient or partial-index conditions are intentionally left to self-heal:
prune clears the sync-state entry, so the next full index re-downloads the
skill. That makes a deletion grace period / circuit breaker unnecessary, and
the delete-then-forget ordering is dropped since the registry is rebuilt from
disk and the forward loop never reads a removed skill's stale state.

Tests: non-skill directories survive prune, stale state is still cleared, and
an empty index is a no-op.
@Aaronontheweb Aaronontheweb added the skills Skill loading, handling, authoring, indexing, and evals. label Jun 15, 2026
@Aaronontheweb Aaronontheweb marked this pull request as ready for review June 15, 2026 20:04

@Aaronontheweb Aaronontheweb left a comment

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Aaronontheweb Aaronontheweb enabled auto-merge (squash) June 15, 2026 20:05
@Aaronontheweb Aaronontheweb merged commit 0c75c55 into netclaw-dev:dev Jun 15, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skills Skill loading, handling, authoring, indexing, and evals.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Skill server feed: no prune/delete handling for removed skills

1 participant