feat(skills): add batch-migration skill #13693
Open
MorAlekss wants to merge 2 commits into
@alt-glitch This is not a duplicate, it's a working and adapted version of the skill for Hermes. I explained why in the comment to #380. Happy to adjust based on feedback.
batch-migration skill — Parallel Code Migration via Git Worktree Isolation
Closes #380
1. What this skill does
`batch-migration` is a skill for orchestrating large-scale, parallelizable code migrations across a codebase. It decomposes a migration into independent work units and spawns one Hermes agent per unit in an isolated git worktree; each agent independently implements, tests, and creates a PR. The coordinator tracks progress, verifies results, merges PRs in the correct order, and cleans up.
Trigger phrases:
"batch migrate X to Y", "migrate all files from X to Y", "bulk refactor across the codebase", "parallel migration"
2. How it's implemented — phases
The skill implements the three-phase workflow described in issue #380, with two additional phases added based on real-world testing.
Phases from issue #380
Phase 1 — Research and Plan
- `clarify()`: mandatory user approval gate before any work starts
Phase 2 — Execute
- `.gitignore`, `requirements.txt`, commit and push
- `gh` auth, save `ROOT_DIR`, create git worktrees with `printf "%02d"` zero-padding
- `terminal(background=True)`
Phase 3 — Monitor and Verify
Additional phases (not in original issue)
Phase 4 — Merge
- `clarify()` gate presents the final summary table with PR links and merge order, and waits for explicit user approval before any merge begins.
Phase 5 — Cleanup
3. Phase alignment with issue #380
`process(poll)` tracking, workers create PRs, `clarify()` approval gate
4. Additions beyond the issue description
The skill implements everything in Phase 1 MVP and goes further with several additions that emerged from real-world testing:
Full migration cycle — beyond PR creation
The issue describes the workflow ending at PR creation. This skill goes further and implements a complete end-to-end cycle:
Automated merge with approval gate: after all workers finish, the coordinator presents a final summary table with PR links and the recommended merge order. A mandatory `clarify()` tool call stops execution and waits for explicit user approval. Only after the user confirms does the coordinator proceed with merging. This is a hard gate; unlike text-based STOP instructions, `clarify()` cannot be skipped by the agent.
Post-merge verification: after all PRs are merged, the coordinator pulls main and runs the full test suite, comparing results against the baseline recorded in Phase 1. This catches regressions that only appear after all changes are combined, which individual worktree tests cannot detect.
Self-healing cleanup worker: if post-merge tests fail (e.g. stale mocks, missing `await`, orphaned test files), the coordinator automatically spawns a cleanup worker with a fully self-contained prompt describing exactly what to fix. The cleanup worker creates a fix branch, opens a PR, and merges it. The coordinator then re-runs the full test suite to confirm all regressions are resolved before declaring the migration complete.
This means the skill delivers a migration that is verified working on main, not just a set of open PRs that may or may not pass when combined.
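The baseline comparison behind post-merge verification can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: `RunSummary` and `describe_regression` are hypothetical names, and the sketch assumes the coordinator has already reduced each test run to pass/fail counts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunSummary:
    """Hypothetical container for the pass/fail counts of one test run."""
    passed: int
    failed: int


def describe_regression(baseline: RunSummary, post_merge: RunSummary) -> Optional[str]:
    """Return a description of the regression, or None if the merged result is clean."""
    # More failures than the Phase 1 baseline means the combined changes broke something.
    if post_merge.failed > baseline.failed:
        return f"{post_merge.failed - baseline.failed} new failing test(s) after merge"
    # Fewer passing tests without new failures usually means tests went missing.
    if post_merge.passed < baseline.passed:
        return f"{baseline.passed - post_merge.passed} test(s) no longer passing after merge"
    return None
```

A non-`None` result is the point at which a coordinator following this workflow would spawn the cleanup worker; `None` means the migration can be declared complete.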
Merge order logic: source-only PRs must merge before test-file PRs. This prevents stale mock failures after merge, because test mocks reference the new httpx pattern, which doesn't exist on main until the source PR lands.
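The ordering rule can be expressed as a stable sort. The sketch below is illustrative only: `PullRequest`, `touches_tests`, and `merge_order` are hypothetical names, the PR numbers in the usage are made up, and the test-file heuristic (a `tests/` directory or a `test_` filename prefix) is an assumption about the repository layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PullRequest:
    """Hypothetical record of one worker's PR and the files it changes."""
    number: int
    files: List[str]


def touches_tests(pr: PullRequest) -> bool:
    # Assumption: test files live under tests/ or are named test_*.py.
    return any(
        path.startswith("tests/") or path.rsplit("/", 1)[-1].startswith("test_")
        for path in pr.files
    )


def merge_order(prs: List[PullRequest]) -> List[PullRequest]:
    # False sorts before True, so source-only PRs come first;
    # the PR number breaks ties to keep the order deterministic.
    return sorted(prs, key=lambda pr: (touches_tests(pr), pr.number))
```

For example, given PRs 12 (`tests/test_client.py`), 14 (`src/api.py`), and 11 (`src/client.py`), `merge_order` yields 11, 14, 12.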
Cross-unit async dependency tracking: for async migrations, the coordinator reads each test file's imports, identifies functions from other units that will change signature (sync→async), and explicitly tells the test-file worker to add `await` and `@pytest.mark.asyncio` to affected test functions. This prevents the most common post-merge failure pattern.
Mock ownership manifest: every test-file worker receives an explicit table of every mock it must update, with the current value and the required new value, along with a warning that mocks may pass in the worktree (because `patch()` creates attributes) but will fail after merge.
Pre-commit scope check with `exit 1`: each worker prompt contains a bash gate that blocks `git commit` if any file outside the worker's assigned list appears in the diff. This catches scope violations before they become merge conflicts.
Mock verification grep with `exit 1`: workers grep for stale old-pattern references before committing. If any are found, the commit is blocked.
Orphaned test file check: the pre-spawn verification checks that every test file containing the old pattern is assigned to exactly one unit before spawning. This prevents the case where test files retain stale mocks because no worker owned them.
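The logic behind the scope check and the orphaned test file check reduces to two set operations. The skill itself implements the scope check as a bash gate; the Python below is only a sketch of the same idea, with hypothetical function names. It assumes the caller supplies the changed file list (for instance from `git diff --cached --name-only`) and the unit-to-files assignment.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def out_of_scope(changed: Iterable[str], assigned: Set[str]) -> List[str]:
    """Files in the diff that the worker was not assigned; non-empty means block the commit."""
    return sorted(set(changed) - assigned)


def misassigned_test_files(
    files_with_old_pattern: Iterable[str], units: Dict[str, List[str]]
) -> Dict[str, int]:
    """Map each affected test file to its owner count when that count is not exactly 1."""
    # Count how many units list each path; a Counter returns 0 for unlisted paths.
    owners = Counter(path for files in units.values() for path in files)
    return {path: owners[path] for path in files_with_old_pattern if owners[path] != 1}
```

An owner count of 0 marks an orphaned file, and a count above 1 marks a file two workers would both edit; either case should be resolved before spawning.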
Port uniqueness check: for API/UI patterns, the pre-spawn verification checks that no two worker prompts use the same localhost port, preventing server startup conflicts during parallel verification.
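One way to implement such a check is to scan every worker prompt for `localhost:<port>` references and report ports claimed by more than one unit. This is a sketch under assumptions: `port_conflicts` is a hypothetical helper, and it only recognizes ports written literally as `localhost:NNNN`.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Assumption: ports appear in prompts in the literal form localhost:<digits>.
PORT_RE = re.compile(r"localhost:(\d+)")


def port_conflicts(prompts: Dict[str, str]) -> Dict[int, List[str]]:
    """Return each port used by more than one unit, with the units that use it."""
    users: Dict[int, Set[str]] = defaultdict(set)
    for unit, prompt in prompts.items():
        for port in PORT_RE.findall(prompt):
            users[int(port)].add(unit)
    return {port: sorted(units) for port, units in users.items() if len(units) > 1}
```

An empty result means every unit has its own port; a unit mentioning the same port several times in its own prompt does not count as a conflict.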
E2E verification templates: the E2E recipe section provides four concrete templates (source-only/pytest, dev server + curl, browser tool calls, CLI output grep) so coordinators know exactly what to put in each worker's verification section.
`printf "%02d"` worktree naming: replaces `seq -w`, which produced inconsistent zero-padding across platforms. Guarantees `unit-01` ... `unit-30` naming.
Per-file slicing instead of per-directory: the issue describes slicing per-directory or per-module (following Claude Code's pattern for large frontend codebases). The skill uses per-file slicing instead. This is an intentional decision: per-file units are smaller, more independently mergeable, and produce cleaner PRs with minimal conflict surface. For codebases where individual files are trivially small, the coordinator can group several files into one unit, but the default is per-file for maximum isolation and reviewability.
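Per-file slicing and zero-padded naming combine naturally. The sketch below is an illustration, not the skill's code: `plan_units` and `worktree_command` are hypothetical helpers, `{i:02d}` is the Python counterpart of `printf "%02d"`, and the worktree path and branch name follow the `git worktree add .hermes/worktrees/unit-NN -b batch/unit-NN main` form used by the skill.

```python
from typing import Dict, Iterable, List


def plan_units(files: Iterable[str]) -> Dict[str, List[str]]:
    """One unit per file, named unit-01, unit-02, ... in sorted file order."""
    return {f"unit-{i:02d}": [path] for i, path in enumerate(sorted(files), start=1)}


def worktree_command(unit: str, base: str = "main") -> List[str]:
    """Argument list that creates the unit's worktree and branch from the base branch."""
    return ["git", "worktree", "add", f".hermes/worktrees/{unit}", "-b", f"batch/{unit}", base]
```

Because names are padded to two digits, `unit-02` sorts before `unit-10` in listings and in cleanup loops. The argument list could be passed to `subprocess.run(..., check=True)` from the repository root.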
5. How aspects missing from Hermes infrastructure are solved
`delegate_task` as fallback spawning method
The issue describes two spawning approaches. The skill implements both. The primary method is `terminal(background=True)`, which bypasses the 3-child limit. The skill also documents `delegate_task` as an explicit fallback for environments where `hermes chat -w` is unavailable, with a maximum of 3 workers at once in that mode.
Manual-only invocation (`disableModelInvocation` equivalent)
The issue specifies `disableModelInvocation: true`: the skill must never auto-trigger. This is enforced through COORDINATOR RULES at the top of the skill (read before anything else) and explicit trigger phrases that must be present in the user's message. The skill description also states "Manual invocation only — never auto-trigger this skill."
Worker isolation (no native worktree isolation)
Hermes has no native `isolation: "worktree"` parameter. The skill works around this by having the coordinator manually run `git worktree add .hermes/worktrees/unit-NN -b batch/unit-NN main` for each unit. Workers `cd` into their worktree as the first step and verify with `pwd`. The `NEVER edit files outside your worktree` rule is enforced at the prompt level.
Parallel spawning beyond 3-child limit
`delegate_task` is limited to 3 concurrent children. The skill uses `terminal(background=True)` + `process(poll/wait/log)` instead, which has no such limit.
Code review (no `simplify` skill in Hermes)
The issue references Claude Code's `simplify` skill for worker self-review. Hermes doesn't have `simplify`. The skill uses `/requesting-code-review`, Hermes's built-in code review slash command, as a direct equivalent. Every worker prompt requires `/requesting-code-review` before committing, with a hard STOP gate.
Progress dashboard (no native dashboard)
The coordinator maintains a live status table, rendered as markdown and updated after every `process(poll)` call, combined with the `todo` tool for tracking completion state.
6. Tests conducted
The skill was tested across multiple projects and migration types. In every run, the post-migration test count matched the baseline test count (zero regressions). Each run used 6 to 12 concurrent workers, depending on codebase size.
Existing integration test suite pattern (pytest)
CLI verification pattern
Dev server + curl pattern (API)
Browser automation pattern (UI)
The skill includes dedicated support for UI migrations. The E2E recipe section provides a browser verification template, and each worker prompt contains a `## Browser verification` section separate from the shell-based E2E recipe. This separation is intentional: browser tool calls (`browser_navigate`, `browser_snapshot`, `browser_vision`) are agent tool calls, not shell commands, and cannot be placed in the same bash block as the server start command. The coordinator fills in the shell part (start dev server, assign a unique port per unit) in the E2E recipe, and the browser steps in the dedicated section. The skill also includes a pre-spawn check that verifies UI units have the browser verification section filled in and not left as N/A.
7. Answers to Open Questions from issue #380
"What's the right default concurrency limit?"
Testing showed stable operation with up to 12 concurrent workers across multiple test runs. The skill recommends 5-20 units as the default range, with a maximum of 30 for large codebases. No rate limit issues were observed at these levels.
"Should workers share any state or be fully isolated?"
Fully isolated. Each worker gets everything it needs in its self-contained prompt: overall goal, specific task, codebase conventions, mock patterns, cross-unit async dependencies, and E2E test recipe. Workers have no dependency on the coordinator's context or on each other.
"How should the user approve the plan: via `clarify` tool or file-based review?"
Via the `clarify()` tool. In every test run it stopped the coordinator and waited for explicit user input before any work began. Text-based STOP instructions are unreliable (agents skip them); `clarify()` cannot be skipped.
"How to handle partially failed migrations?"
The coordinator tracks all workers via the `todo` tool and status table. If some workers fail, the coordinator notes them in the final summary and the user decides whether to fix manually or spawn targeted cleanup workers. If post-merge tests fail, the coordinator automatically spawns a cleanup worker.
"Should we increase MAX_CONCURRENT_CHILDREN or rely on subprocess spawning?"
Subprocess spawning via `terminal(background=True)` is the right approach: it bypasses the 3-child `delegate_task` limit entirely with no code changes needed. `delegate_task` is kept as a documented fallback for environments where `hermes chat -w` is unavailable.
"How to handle migrations where units aren't truly independent?"
The `## Cross-unit async dependencies` section in worker prompts handles the most common case: when one unit's source changes affect another unit's test expectations. The coordinator detects these dependencies in Phase 1 and explicitly instructs each test-file worker about functions from other units that will change signature.
"Should there be a `--dry-run` mode?"
Phase 1 effectively serves as a dry run. The coordinator does a full research pass, decomposes the codebase into work units, builds the complete plan with file lists and E2E recipes, and presents it via `clarify()`. The user sees the entire migration plan before a single file is touched. Only after explicit approval does execution begin.
8. How the Cons/Risks from issue #380 are addressed
"Orphaned worktrees or branches after failed runs"
Phase 5 (Cleanup) is explicitly marked as non-skippable in the skill. It removes all worktrees with `git worktree remove --force`, runs `git worktree prune`, and deletes both local and remote branches. Cleanup uses `printf "%02d"` naming, which guarantees consistent branch names across all platforms.
"Worker divergence: inconsistent stylistic choices"
Each worker prompt contains the exact migration pattern with before/after code examples, codebase conventions discovered in Phase 1, and explicit instructions for every file it must modify. Workers follow the same pattern because they all receive the same concrete template rather than general instructions.
"PR sprawl: 30 PRs overwhelming code review"
The merge phase handles this automatically. The coordinator merges all PRs in the correct order (source-only first, test-file PRs last) without requiring manual review of each PR. A spot-check of all PR diffs verifies scope before merge. Individual PR review is optional: the coordinator handles it.
"Git worktree edge cases with submodules or unusual configurations"
The skill uses standard git worktree commands and has been tested on macOS and Linux. Submodule support is not explicitly tested and may require additional handling.
Skill file
`skills/software-development/batch-migration/SKILL.md`
Version: `3.37.1-E2E`
Platforms: `linux`, `macos`
Requires: `terminal` toolset, `gh` CLI authenticated