fix: clean stale MCP servers on install/update/uninstall and prevent .claude folder creation#201
Conversation
….claude folder creation - Track managed MCP servers in lockfile (mcp_servers field) to detect stale entries - Remove stale MCP servers from .vscode/mcp.json and Copilot CLI config on update - Clear apm_yml parse cache after --update to avoid stale MCP dependency names - Read old MCP server set before lockfile regeneration to preserve diff baseline - Guard Claude-specific re-integration in uninstall with target detection - Recompute MCP deps from remaining packages on uninstall and clean stale servers Fixes: .claude folder appearing in VS Code-only projects after uninstall, stale MCP servers persisting after package updates or removals.
There was a problem hiding this comment.
Pull request overview
This PR addresses install/update/uninstall lifecycle bugs by tracking APM-managed MCP servers in apm.lock, removing stale MCP server entries after dependency changes, and preventing Claude-specific re-integration from creating .claude/ artifacts in VS Code-only projects.
Changes:
- Extend
apm.lockschema withmcp_serversand wire it into YAML serialization/deserialization. - Update
installto capture prior MCP state, clearapm.ymlparse cache on--update, compute stale MCP server diffs, remove stale entries, and persist the new set. - Update
uninstallre-integration to guard Claude-specific integration via target detection, and add best-effort stale MCP cleanup after uninstall.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/apm_cli/deps/lockfile.py |
Adds mcp_servers tracking to the lockfile format to support stale MCP cleanup. |
src/apm_cli/cli.py |
Implements MCP stale removal + lockfile updates during install/uninstall, and guards Claude re-integration based on detected target. |
You can also share your feedback on Copilot code review. Take the survey.
- Clarify stale_names docstring: short server names, not full refs - Make _remove_stale_mcp_servers runtime-aware (respect --runtime/--exclude) - Add Codex CLI (~/.codex/config.toml) stale server cleanup - Add lockfile mcp_servers round-trip unit tests (3 new tests) - Document mcp_servers field in manifest-schema.md and dependencies.md - Preserve mcp_servers in lockfile when --only=apm is used
…ckages When running 'apm install <pkg>', the lockfile was regenerated from only the filtered package set, wiping all other dependencies and mcp_servers. Now merges new entries into the existing lockfile via LockFile.read() + add_dependency() when only_packages is set.
|
Additional fix: lockfile wipe when installing individual packages ( During end-to-end testing I discovered a pre-existing bug: running Root cause: Fix: When Validated: Ran |
…ame collision The bare set() call was invoking Click's 'config set' command instead of creating a Python set, causing 'Got unexpected extra argument' errors after MCP server installation.
|
Fix: When implementing review comment 2 (make Every other |
Thorough integration testing uncovered that `apm install <package>` failed to collect MCP servers (and other primitives) from packages deeper than one level in the dependency tree. Root cause: the only_packages filter in _install_apm_dependencies() kept only the explicitly named package, discarding all its transitive dependencies from deps_to_install. Because the lockfile is built from installed_packages, those transitive deps were never recorded and _collect_transitive_mcp_deps() only scans packages present in the lockfile, so their MCP server definitions were silently skipped. Fix: after building the identity set from user-supplied specs, walk the dependency tree to expand the set with all transitive descendants of the requested packages before filtering deps_to_install. Validation: - 436 related unit tests pass (install, dependency, transitive, mcp, lockfile keywords) - Manual integration test: installing a squad package that depends on an infrastructure package with MCP servers now correctly collects and configures those servers (7 vs 4 previously) - Uninstall correctly removes the transitive MCP servers when the parent package is removed
|
@microsoft-github-policy-service agree |
Additional fix: transitive deps excluded by
|
Cover 7 scenarios with 15 tests exercising the full MCP chain: - Selective install with transitive MCP dep collection - Orphan package exclusion via lockfile scoping - Stale server removal from .vscode/mcp.json after uninstall - Lockfile mcp_servers list update and clearance - MCP server rename detection (stale old + new present) - MCP removal detection and cleanup - Root vs transitive deduplication (root wins) - Virtual-path package MCP collection - Self-defined server trust_private gating All use synthetic package names (no private identifiers).
…ckage install Coverage hardening for the transitive MCP fix: - Unit: deep chain (A→B→C→D), diamond dependency, MCP at every level - Integration: deep chain, diamond, multi-package selective install through CLI CliRunner All 27 MCP lifecycle tests pass (19 unit + 8 integration).
Test coverage hardening (938a388)Added edge-case tests to close the gaps that motivated this PR: Unit tests (
Integration tests (
These specifically exercise the Full suite: 27 MCP lifecycle tests pass (19 unit + 8 integration), 1707 total pass with no regressions. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
You can also share your feedback on Copilot code review. Take the survey.
Address Copilot review: use single quotes inside f-strings for getattr(result, 'stderr', '') to improve readability and compatibility with Python < 3.12.
…nd uninstall race
- docs: update mcp_servers field description from 'short names' to
'MCP dependency references' in manifest-schema.md and dependencies.md
- fix: _remove_stale_mcp_servers now expands stale names to include
last-segment variants so Copilot CLI and Codex config keys (derived
from server_url.split('/')[-1]) are matched correctly
- fix: uninstall captures old_mcp_servers before lockfile mutation so
stale-MCP cleanup works even when all deps are removed and the
lockfile is deleted
- test: add TestStaleCleanupKeyNormalization with 3 tests covering
last-segment matching, full-ref matching, and short-name fallback
danielmeppiel
left a comment
There was a problem hiding this comment.
Solid work, Sergio. The bugs are real, the root-cause analysis is thorough, the fixes are correct, and the test coverage is impressive (27 MCP lifecycle tests covering deep chains, diamonds, dedup, stale cleanup, --only=apm preservation, and key normalization).
A few non-blocking observations for future hardening — I've opened #209 to track them:
- Silent
except Exception: pass— The best-effort pattern makes sense for cleanup, butlogger.debug()in those blocks would help when debugging issues in the field. _collect_descendantscycle guard — Avisitedset would be cheap insurance against stack overflow if a cycle ever slips through the resolver.- Lockfile double read-modify-write —
_update_lockfile_mcp_serversre-reads the lockfile that_install_apm_dependenciesjust wrote. Ordering is correct today but fragile; worth a comment or passing the object through. cli.pysize — The three new helpers are good candidates to extract intomcp_lifecycle.pyas MCP features grow.
None of these block the merge. CI is green, all review comments addressed. Thanks for the diligent E2E testing and the clear commit-by-commit narrative.
Summary
Fixes two bugs in the install/update/uninstall lifecycle in complex package scenarios:
.vscode/mcp.jsonand Copilot CLI config..claude/folder created in VS Code-only projects — Duringuninstall, the re-integration phase called Claude-specific integrators unconditionally, creating.claude/commands/and.claude/agents/directories even when the project only targets VS Code.Root Causes & Fixes
Stale MCP servers (3 sub-issues)
mcp_serversfield to lockfile to track managed server names--update_apm_yml_cacheinapm_package.pyreturned old parsed data after--updateoverwrote module filesclear_apm_yml_cache()after_install_apm_dependencieswhenupdate=Trueold_mcp_serverswas read from lockfile after_install_apm_dependenciesregenerated it (losing themcp_serversfield)_install_apm_dependencies.claude folder in VS Code projects
uninstallre-integration calledcommand_integrator.integrate_package_commands(),hook_integrator.integrate_package_hooks_claude(), andagent_integrator.integrate_package_agents_claude()without checking the integration targetdetect_target/should_integrate_claudeguard before the re-integration loop; Claude-specific calls now only run whenintegrate_claude=TrueChanges
src/apm_cli/cli.py(+162 lines)_get_mcp_dep_names(),_remove_stale_mcp_servers(),_update_lockfile_mcp_servers()install(): reads old MCP set before dep install, clears parse cache on update, computes diff, removes stale serversuninstall(): target detection before re-integration, MCP recomputation and stale cleanupsrc/apm_cli/deps/lockfile.py(+4 lines)mcp_servers: List[str]field toLockFiledataclass with serialization supportTesting
Full end-to-end test cycle verified:
.claude/, correct lockfilegithub→github-http)githubremovedgithub-http→github)github-httpremovedxxxxxxxxxx, +17 deps, +3 MCP)xxxxx).claude/at any step