Skip to content

fix(update): harden venv rebuild + verify core deps after install#38887

Merged
teknium1 merged 1 commit into
mainfrom
fix/update-venv-rebuild-and-verify-deps
Jun 4, 2026
Merged

fix(update): harden venv rebuild + verify core deps after install#38887
teknium1 merged 1 commit into
mainfrom
fix/update-venv-rebuild-and-verify-deps

Conversation

@teknium1

@teknium1 teknium1 commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes a silent partial-install failure in hermes update that broke desktop GUI rebuilds and per-profile skill sync in the wild.

What broke

A real-world hermes update run on Windows (logs: this morning, my local box):

  1. Pulled 145 commits cleanly.
  2. managed_uv.rebuild_venv() tried to recreate the venv. shutil.rmtree(ignore_errors=True) couldn't delete files held open by the running hermes.exe shim (and AV scanner). Some files were left behind.
  3. uv venv then errored with Caused by: A directory already exists at: venv (and uv's own hint said to use --clear). Return value was ignored.
  4. The update fell back to installing on top of the stale venv. uv's incremental resolver produced a partial install that missed exactly one newly-added base dep — pathspec==1.1.1.
  5. hermes desktop --build-only (called next) imports pathspec at the top of its content-hash skip check. It died with ModuleNotFoundError. The parent update only saw a non-zero exit code and printed ⚠ Desktop build failed (non-fatal; run hermes desktop to retry).
  6. Same missing module made the next stage's "default: sync failed" line, because per-profile skill-sync shells out via subprocess and hit the same import.

Net result: a "successful" update with a broken desktop app and broken profile sync.

Two complementary fixes

1. rebuild_venv retries with --clear

If uv venv fails with already exists in stderr, retry once with --clear. Only matches that specific failure pattern — disk-full / interpreter-download errors still surface unchanged so we don't mask real problems.

2. Post-install dep verification (_verify_core_dependencies_installed)

Belt-and-suspenders so future uv resolver quirks (or any other cause of partial installs) surface immediately instead of hours later in a downstream subprocess:

  1. Reads [project.dependencies] straight from pyproject.toml (doesn't trust the venv's stale metadata).
  2. Filters by environment markers via packaging.requirements.Requirement so cross-platform exclusions (ptyprocess ; sys_platform != 'win32') don't false-positive on Windows.
  3. Runs importlib.metadata.version() for each remaining dep inside the target venv interpreter (resolved from VIRTUAL_ENV, not sys.executable — they're different during update).
  4. If anything is missing: reinstall base with --reinstall to force re-resolution. If still missing, force-install each one with its pinned spec from pyproject.
  5. Final failure is a warning, not a hard error — a single broken-on-PyPI dep shouldn't block an otherwise-successful update — but the message names the missing packages and points at hermes update --force so the user knows what to do.

Test Plan

  • Reproduced the original failure locally (pathspec missing → desktop --build-only crashes).
  • Verified the fix path: installed pathspec manually, re-ran hermes desktop --build-only, desktop rebuild now succeeds.
  • New test: TestRebuildVenv::test_retries_with_clear_when_dir_already_exists simulates the rmtree-couldn't-delete-it failure mode and asserts the --clear retry path is taken and succeeds.
  • New test: TestRebuildVenv::test_does_not_retry_when_first_failure_is_not_dir_exists guards against masking real failures (disk full, etc.).
  • New file: test_verify_core_dependencies.py — 7 tests covering happy path, missing-dep triggers --reinstall, per-package fallback, platform-marker filter (Windows doesn't try ptyprocess), missing-pyproject is a no-op, and the VIRTUAL_ENV resolver.
  • Existing TestRebuildVenv tests still pass.
  • ruff check clean on all modified files.

Known pre-existing test failures (unrelated)

Confirmed by running tests against unmodified origin/main:

  • 6 tests in test_managed_uv.py (TestResolveUv, TestEnsureUv, TestUpdateManagedUv, TestInstallUvInternals) fail on Windows due to fake-executable subprocess permission issues — pre-existing, not introduced here.
  • 4 tests in test_cmd_update.py and test_update_autostash.py fail because cmd_update now wraps git calls with -c config args that the tests' assertions don't account for — pre-existing, not introduced here.

These deserve their own PR; flagging so reviewers don't think they're regressions.

@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

🔎 Lint report: fix/update-venv-rebuild-and-verify-deps vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9796 on HEAD, 9793 on base (🆕 +3)

🆕 New issues (3):

Rule Count
unused-type-ignore-comment 1
unresolved-import 1
unresolved-attribute 1
First entries
hermes_cli/main.py:9122: [unused-type-ignore-comment] unused-type-ignore-comment: Unused blanket `type: ignore` directive
tests/hermes_cli/test_verify_core_dependencies.py:24: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
hermes_cli/main.py:9150: [unresolved-attribute] unresolved-attribute: Object of type `~None` has no attribute `evaluate`

✅ Fixed issues: none

Unchanged: 5086 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch alt-glitch added type/bug Something isn't working comp/cli CLI entry point, hermes_cli/, setup wizard P1 High — major feature broken, no workaround labels Jun 4, 2026
Two complementary fixes for a silent partial-install failure that bit
``hermes update`` in the wild: a fresh checkout pulled 145 commits,
``rebuild_venv`` failed to recreate the venv on Windows because
``shutil.rmtree(ignore_errors=True)`` couldn't delete files held open by
the running ``hermes.exe`` shim. ``uv venv`` then refused with
"A directory already exists at: venv" and the update fell back to
installing on top of the stale venv. The resulting partial install
missed exactly one newly-added base dep — ``pathspec==1.1.1`` — which
``hermes desktop --build-only`` imports at the top of its content-hash
check. The desktop rebuild died with ModuleNotFoundError and the parent
update only logged "⚠ Desktop build failed (non-fatal)". Same root cause
made the "default: sync failed" line in the skill-sync stage, because
that sync subprocess hit the same missing import.

Fix 1: ``rebuild_venv`` retries with ``--clear``
------------------------------------------------
If ``uv venv`` fails with "already exists" in stderr (which is what uv
prints, and what uv's own hint tells you to fix with --clear), retry
once with ``--clear``. Only this specific failure pattern triggers the
retry — disk-full / interpreter-download failures still surface as
before so we don't mask real problems.

Fix 2: post-install dep verification
------------------------------------
Belt-and-suspenders so future uv resolver quirks (or any other cause of
partial installs) surface immediately instead of hours later in a
downstream subprocess. After ``_install_python_dependencies_with_optional_fallback``
runs, ``_verify_core_dependencies_installed``:

  1. Reads ``[project.dependencies]`` straight from pyproject.toml
     (so we don't trust the venv's stale metadata).
  2. Filters by environment markers via ``packaging.requirements.Requirement``
     so cross-platform exclusions (``ptyprocess ; sys_platform != 'win32'``)
     don't false-positive on Windows.
  3. Runs ``importlib.metadata.version()`` for each remaining dep inside
     the *target* venv interpreter (resolved from ``VIRTUAL_ENV``, not
     ``sys.executable``).
  4. If anything is missing, reinstalls the base group with
     ``--reinstall`` to force re-resolution. If a second probe still
     reports missing deps, force-installs each one with its pinned spec.
  5. Treats final failure as a warning rather than a hard error — a
     single broken-on-PyPI dep shouldn't block an otherwise-successful
     update — but the message points at ``hermes update --force`` and
     names the missing packages so the user knows what's wrong.

Tests
-----
- ``TestRebuildVenv::test_retries_with_clear_when_dir_already_exists`` —
  simulates the rmtree-couldn't-delete-it failure mode and asserts the
  ``--clear`` retry path is taken and succeeds.
- ``TestRebuildVenv::test_does_not_retry_when_first_failure_is_not_dir_exists``
  — guards against masking real failures (disk full, etc.).
- ``test_verify_core_dependencies.py`` — 7 tests covering the happy
  path, the regression (missing pathspec triggers --reinstall), the
  per-package fallback when --reinstall doesn't help, the platform-
  marker filter so Windows doesn't try to install ptyprocess, the
  missing-pyproject noop, and the VIRTUAL_ENV resolver.

Co-authored-by: Kyssta <218078013+kyssta-exe@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cli CLI entry point, hermes_cli/, setup wizard P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants