fix(profiles,banner): exclude infrastructure from --clone-all + fix stale update-check repo resolution (salvage of #5025, #5026, #21728)#22475

Merged: kshitijk4poor merged 2 commits into main from salvage/cloneall-banner-update-check on May 9, 2026

@kshitijk4poor (Collaborator)

Summary

Hybrid salvage combining the best parts of three open PRs (#5025, #5026, #21728) that all targeted bug #5022: --clone-all was copying ~2.3 GB of infrastructure alongside ~40 MB of actual profile data.

What this PR does

fix(profiles): exclude infrastructure artifacts when cloning with --clone-all

When the source profile is the default (~/.hermes), shutil.copytree() was copying multi-GB infrastructure: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/, profiles/ (sibling profiles — recursive!), bin/ (installed binaries, ~10 MB), node_modules/ (hundreds of MB).

Fix: Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries and pass an ignore callback to shutil.copytree(). Exclusions are gated on the source actually being the default profile (is_default_source) so a named-profile source is never affected.

Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.

Profile data (config.yaml, .env, auth.json, state.db, sessions/, skills/, logs/) is preserved intact — clone-all means "complete snapshot minus infrastructure", which matches what the docs promise.

Mirrors the approach already used by _default_export_ignore() and _DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set is broader because it produces a portable archive, not a live clone).
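
The gating described above can be sketched as an ignore callback for shutil.copytree(). This is a minimal illustration, not the merged code: the five root-level names come from this description, while make_clone_all_ignore and _ANY_DEPTH_PATTERNS are hypothetical names, and applying the any-depth patterns to named-profile sources as well is an assumption the text does not confirm.

```python
import fnmatch
import shutil
from pathlib import Path

# Root-level names skipped only when the source is the default profile.
# The five entries are taken from the PR description.
_CLONE_ALL_DEFAULT_EXCLUDE_ROOT = frozenset(
    {"hermes-agent", ".worktrees", "profiles", "bin", "node_modules"}
)
# Patterns skipped at any depth (hypothetical constant name).
_ANY_DEPTH_PATTERNS = ("__pycache__", "*.pyc", "*.pyo", "*.sock", "*.tmp")


def make_clone_all_ignore(source: Path, is_default_source: bool):
    """Return an ignore callback suitable for shutil.copytree()."""
    try:
        root = source.resolve()
    except OSError:
        root = source

    def _ignore(directory, names):
        # Any-depth patterns are matched in every directory copytree visits.
        ignored = {
            name
            for name in names
            if any(fnmatch.fnmatch(name, pat) for pat in _ANY_DEPTH_PATTERNS)
        }
        # Root-only exclusions apply just at the top level of a default source,
        # so a nested directory that happens to be called "bin" is kept.
        if is_default_source and Path(directory).resolve() == root:
            ignored.update(
                name for name in names if name in _CLONE_ALL_DEFAULT_EXCLUDE_ROOT
            )
        return ignored

    return _ignore
```

A caller would pass it as `shutil.copytree(src, dst, ignore=make_clone_all_ignore(src, is_default_source))`. Comparing the visited directory against the resolved source root is what keeps the five names root-only rather than stripping same-named directories deeper in the profile.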

fix(banner): resolve update-check repo from running code, not profile-scoped path

check_for_updates() and _resolve_repo_dir() were preferring $HERMES_HOME/hermes-agent/ over Path(__file__).parent.parent.resolve() when looking for a .git checkout. For profiles created with --clone-all, $HERMES_HOME/hermes-agent/ points to a stale copy with a frozen HEAD, causing persistent "N commits behind" banners that never resolved.

Fix: Flip the resolution order — prefer the running code's location first, fall back to $HERMES_HOME/hermes-agent/ only when the live checkout doesn't have a .git (system-wide pip installs, distro packages). The embedded-rev branch (HERMES_REVISION env var, set by nix builds) is unaffected — it uses git ls-remote against upstream, never reads the local checkout's HEAD.
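
The flipped resolution order can be sketched roughly as follows. This is an illustration under assumptions, not the body of _resolve_repo_dir(): the function name and parameters are hypothetical, code_root stands in for Path(__file__).parent.parent.resolve(), and defaulting HERMES_HOME to ~/.hermes is assumed from the default-profile path mentioned above.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_repo_dir(
    code_root: Path, hermes_home: Optional[Path] = None
) -> Optional[Path]:
    """Return the checkout to inspect for update checks, or None."""
    # 1. Prefer the checkout the running code was loaded from.
    #    exists() is used because .git may be a file (worktrees) or a directory.
    if (code_root / ".git").exists():
        return code_root

    # 2. Fall back to the profile-scoped checkout only when the running code
    #    has no .git (system-wide pip installs, distro packages).
    if hermes_home is None:
        default_home = Path.home() / ".hermes"  # assumed default location
        hermes_home = Path(os.environ.get("HERMES_HOME", str(default_home)))
    fallback = hermes_home / "hermes-agent"
    if (fallback / ".git").exists():
        return fallback

    return None
```

With this order, a stale $HERMES_HOME/hermes-agent/ copy left behind by an older --clone-all is only consulted when the running code is not itself a git checkout.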

How this salvage was assembled

| Source PR | Author | What we took | What we dropped |
|---|---|---|---|
| #5025 | donrhmexe / rahimsais | is_default gating, bin in exclusion set, test_clone_all_excludes_default_infrastructure pattern | Narrow profiles-only exclusion, no .pyc/.pyo coverage |
| #5026 | MustafaKara7 | Idea of aligning with export-side exclude pattern | _DEFAULT_EXPORT_EXCLUDE_ROOT reuse (strips state.db/logs/caches, which is wrong for clone-all); package.json exclusion |
| #21728 | fahdad | Frozenset typing, two-tier docstring style, Path.resolve() with try/except, *.pyc/*.pyo at any depth, banner.py fix as separate commit | Unconditional exclusion (no is_default gate), missing bin from exclusion set, anonymized commits |

What was deliberately excluded (and why)

| Item | Reason |
|---|---|
| state.db, sessions/, logs/, caches, checkpoints/ | These are profile data: clone-all is a "complete snapshot", not an export archive. #5026's approach (reusing _DEFAULT_EXPORT_EXCLUDE_ROOT) incorrectly stripped them. |
| Ownership gate on clone-all exclusions | #5025's is_default gating ensures named-profile sources are never silently stripped, which is the conservative choice. #21728's unconditional approach is defensible (those names don't exist in named profiles) but the gated version is more explicit about intent. |

Author attribution (for contributor audit)

  • Commit 1 (fix(profiles): ...): donrhmexe, with Co-authored-by trailers for MustafaKara7 and fahdad
  • Commit 2 (fix(banner): ...): fahdad (re-authored with 30740087+fahdad@users.noreply.github.com; their commits on #21728 were anonymized as hermes@agent.local)
  • donrhmexe (don.rhm@gmail.com) and MustafaKara7 (karamusti912@gmail.com) are already in AUTHOR_MAP
  • fahdad's noreply email resolves automatically via resolve_author() pattern matching — no AUTHOR_MAP edit needed
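
For readers unfamiliar with the noreply format, the kind of pattern matching mentioned in the last bullet can be sketched as below. The real resolve_author() is not shown in this PR, so login_from_noreply and its regex are hypothetical; the only thing relied on is GitHub's documented ID+username@users.noreply.github.com address shape (older accounts omit the numeric prefix).

```python
import re
from typing import Optional

# Matches "12345+login@users.noreply.github.com" and the older
# "login@users.noreply.github.com" form.
_NOREPLY_RE = re.compile(
    r"^(?:\d+\+)?(?P<login>[A-Za-z0-9-]+)@users\.noreply\.github\.com$",
    re.IGNORECASE,
)


def login_from_noreply(email: str) -> Optional[str]:
    """Return the GitHub login embedded in a noreply address, else None."""
    match = _NOREPLY_RE.match(email.strip())
    return match.group("login") if match else None
```

Addresses that do not follow the noreply shape, such as hermes@agent.local, return None here and would presumably still need an explicit AUTHOR_MAP entry.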

Test plan

  • bash scripts/run_tests.sh tests/hermes_cli/test_profiles.py tests/hermes_cli/test_update_check.py tests/hermes_cli/test_banner.py → 125 passed
  • New test test_clone_all_excludes_default_infrastructure asserts: hermes-agent, .worktrees, profiles, bin, node_modules excluded; __pycache__/.pyc/.pyo/.sock/.tmp excluded at any depth; profile data (skills, config, .env, state.db, sessions, logs) preserved
  • Ruff diff vs origin/main: All checks passed

Closes

Closes #5022, #5025, #5026, #21728.


Credit to @donrhmexe, @rahimsais, @MustafaKara7, and @fahdad — this PR combines their independent findings into a unified fix.

donrhmexe and others added 2 commits May 9, 2026 15:39
…lone-all

When the source profile is the default (~/.hermes), shutil.copytree()
was copying multi-GB infrastructure alongside the ~40 MB of actual
profile data: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/,
profiles/ (sibling profiles — recursive!), bin/ (installed binaries),
node_modules/ (hundreds of MB).

Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries
and pass an ignore callback to copytree().  Exclusions are gated on
the source actually being the default profile (is_default_source) so
named-profile sources are never affected.

Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.
Profile data (config.yaml, .env, auth.json, state.db, sessions/,
skills/, logs/) is preserved intact — clone-all means 'complete
snapshot minus infrastructure'.

Mirrors the approach already used by _default_export_ignore() and
_DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set is
broader because it produces a portable archive, not a live clone).

Co-authored-by: MustafaKara7 <karamusti912@gmail.com>
Co-authored-by: fahdad <30740087+fahdad@users.noreply.github.com>
Fixes #5022
Based on PRs #5025, #5026, and #21728
…-scoped path

check_for_updates() and _resolve_repo_dir() were preferring
$HERMES_HOME/hermes-agent/ over Path(__file__).parent.parent.resolve()
when looking for a .git checkout.  For profiles created with
--clone-all, $HERMES_HOME/hermes-agent/ points to a stale copy
with a frozen HEAD, causing persistent "N commits behind" banners
that never resolved.

Flip the resolution order: prefer the running code's location first,
fall back to $HERMES_HOME/hermes-agent/ only when the live checkout
doesn't have a .git (system-wide pip installs, distro packages).

The embedded-rev branch (HERMES_REVISION env var, set by nix builds)
is unaffected — it uses git ls-remote against upstream, never reads
the local checkout's HEAD.

Based on PR #21728 by @fahdad

github-actions Bot (Contributor) commented May 9, 2026

🔎 Lint report: salvage/cloneall-banner-update-check vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 7887 on HEAD, 7887 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4174 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Development

Successfully merging this pull request may close these issues.

bug: profile create --clone-all copies entire infrastructure (~2.3GB) instead of just profile data

3 participants