fix(backend): stop fuzzy requests installing literal dirs#9276

Merged
jdx merged 8 commits into jdx:main from AsgardMuninn:fix/upgrade-force-reinstall
Apr 22, 2026

Conversation

@AsgardMuninn
Contributor

@AsgardMuninn AsgardMuninn commented Apr 21, 2026

Summary

Fix the root cause behind stale real install directories like installs/<tool>/latest/ and explicit prefix requests resolving to literal on-disk directories when mise cannot determine a concrete version.

This PR also keeps the one-time migration for existing broken installs and teaches runtime symlink rebuild to replace stale real dirs with the correct symlink once a concrete install exists.

Importantly, this does not blanket-treat every short version like 1.0, 4.13, or 2026.1 as fuzzy. ~/src/mise-versions contains many tools with legitimate 1-part or 2-part stable versions (lua, ffmpeg, staticcheck, ccache, hlint, upx, etc.), so exact short versions must continue to work.

Root Cause

There were two separate problems:

  1. Resolver fallback: truly fuzzy requests could collapse to the literal request string instead of a concrete version.
  • resolve_prefix in src/toolset/tool_version.rs used None => prefix, so an explicit prefix request could resolve to the literal prefix when remote matching returned nothing.
  • latest requests also had a bad fallback loop through latest_installed_version when installs/<tool>/latest existed as a real directory.
  2. Runtime selection regression: after dc840866 (fix(env): use runtime symlink paths for fuzzy versions), PATH/bin resolution began preferring request paths like latest, 24, and 24.0. That made previously latent bad directories actively win at runtime.

There was also a reinforcing loop in latest_installed_version: if installs/<tool>/latest existed as a real dir instead of a symlink, mise treated that as installed latest and kept reusing the alias literally.
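
The resolver change can be illustrated with a minimal sketch. It is not the code in src/toolset/tool_version.rs: the `Request` enum, `resolve`, and `matches_prefix` are hypothetical names, and the remote version list is assumed to be sorted oldest to newest. The point is only that fuzzy requests now produce an error when nothing matches, while exact requests (including short ones like 4.13) pass through unchanged.

```rust
/// Hypothetical model of a version request (not mise's real `ToolRequest`).
#[derive(Debug, Clone, PartialEq)]
pub enum Request {
    Latest,
    Prefix(String),
    Exact(String),
}

/// Resolve a request against the versions a backend reports.
/// Assumption: `remote` is sorted oldest to newest.
pub fn resolve(request: &Request, remote: &[&str]) -> Result<String, String> {
    match request {
        // `latest` needs at least one remote version; there is no literal fallback.
        Request::Latest => remote
            .last()
            .copied()
            .map(str::to_string)
            .ok_or_else(|| "no versions found".to_string()),
        // The old behaviour corresponded to falling back to the prefix itself here.
        Request::Prefix(prefix) => remote
            .iter()
            .rev()
            .copied()
            .find(|v| matches_prefix(v, prefix))
            .map(str::to_string)
            .ok_or_else(|| format!("no versions found matching prefix {prefix}")),
        // Exact versions are never treated as fuzzy, however short they are.
        Request::Exact(version) => Ok(version.clone()),
    }
}

/// True when `version` equals `prefix` or continues it at a dot boundary,
/// so "24" matches "24.3.1" but not "240.0.0".
fn matches_prefix(version: &str, prefix: &str) -> bool {
    version == prefix
        || version
            .strip_prefix(prefix)
            .is_some_and(|rest| rest.starts_with('.'))
}
```

With this shape, `resolve(&Request::Prefix("24".into()), &[])` is an error instead of `"24"`, so no `installs/<tool>/24/` directory can be derived from it.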

Changes

  • src/toolset/tool_version.rs

    • Stop resolving latest and explicit prefix: requests to literal install dirs when no concrete version can be found.
    • Preserve exact/direct version requests, including legitimate short versions like 1.0, 4.13, or 2026.1.
  • src/backend/mod.rs

    • Change latest_installed_version(None) so a real latest/ directory is no longer treated as installed latest.
    • If latest/ is not a runtime symlink, fall back to the highest concrete installed version instead.
  • src/runtime_symlinks.rs

    • Extend runtime symlink rebuild to replace stale real dirs whose basename differs from the concrete target (for example latest -> ./2.0.0, 24 -> ./24.3.1).
    • This lets normal install/upgrade flows self-heal stale dirs once a concrete install exists.
  • src/migrate.rs

    • Keep a one-time migration that rewrites existing stale fuzzy-version dirs into runtime symlinks and writes a marker file so it only runs once.
    • Removal note left in place so the migration can be dropped after the grace period.
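
As a rough sketch of the rebuild heuristic described for src/runtime_symlinks.rs: a real directory is only replaced when its name differs from the concrete target and it is not itself a known concrete install. The function and enum names below are hypothetical, and how the real code builds its set of concrete installs is not shown here.

```rust
use std::collections::HashSet;

/// What to do with a real (non-symlink) directory found where a runtime
/// symlink is expected. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
pub enum Action {
    Keep,
    Replace,
}

/// `from_name` is the directory's basename (e.g. "latest"), `to_name` is the
/// concrete version the symlink should point at (e.g. "2.0.0").
/// Assumption: `concrete_installs` holds the names known to be real installs.
pub fn plan_for_real_dir(
    from_name: &str,
    to_name: &str,
    concrete_installs: &HashSet<String>,
) -> Action {
    if from_name != to_name && !concrete_installs.contains(from_name) {
        // Stale fuzzy dir such as latest/ or 24/: swap it for a symlink.
        Action::Replace
    } else {
        // Either already the concrete dir, or a legitimate short version.
        Action::Keep
    }
}
```

The second condition is what keeps a tool with a genuine `20` release from having its `20/` directory turned into a symlink when `20.0.0/` also exists.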

Tests

  • Added e2e/backend/test_fuzzy_versions_do_not_install_literal_dirs
    • Verifies mise install tool@latest, tool@prefix:24, and tool@prefix:24.0 fail instead of creating literal install dirs when no concrete version can be resolved.
  • Kept e2e/cli/test_upgrade_latest_stale
    • Verifies a stale real latest/ dir is repaired to ./2.0.0 during upgrade and runtime exec uses the new binary.
  • Kept e2e/tools/test_runtime_symlink_migration
    • Verifies the one-time migration rewrites stale 1.0/ and latest/ dirs and writes the marker file.
  • Kept e2e/tools/test_runtime_symlinks
    • Verifies normal runtime symlink behavior still works.
  • Added unit coverage for latest_installed_version ignoring a real latest/ dir.

Validation

  • cargo check --all-features
  • cargo test --all-features test_latest_installed_version_ignores_real_latest_dir -- --nocapture
  • cargo test --all-features runtime_path_does_not_return_file_based_runtime_symlink -- --nocapture
  • mise run test:e2e e2e/backend/test_fuzzy_versions_do_not_install_literal_dirs e2e/cli/test_upgrade_latest_stale e2e/tools/test_runtime_symlink_migration e2e/tools/test_runtime_symlinks

Notes

This PR fixes the definite broken cases without regressing tools that genuinely use short exact versions:

  • existing broken installs are migrated once
  • stale dirs are self-healed during normal runtime symlink rebuilds
  • new latest and explicit prefix: requests no longer create literal install directories when resolution fails
  • exact short versions remain installable for tools whose real release scheme uses them

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request modifies the upgrade logic in src/cli/upgrade.rs to force the reinstallation of outdated tools, ensuring that channel-based versions like @latest or nightly are correctly updated even if their installation directories already exist. The review feedback identifies a potential failure when global locked settings are enabled and suggests explicitly setting locked: false in the InstallOptions to allow the installation of new versions before the lockfile is updated.

Comment thread src/cli/upgrade.rs Outdated
Comment on lines 265 to 268
force: true,
jobs: self.jobs,
raw: self.raw,
resolve_options: ResolveOptions {
Contributor

medium

When Settings::get().locked is enabled (e.g., via global configuration), InstallOptions will inherit locked: true by default. This causes mise upgrade to fail during the installation of new versions because they are not yet present in the lockfile with their associated metadata (like URLs and checksums). Since the purpose of upgrade is to resolve and install new versions and subsequently update the lockfile, it should explicitly set locked: false to bypass this verification during the installation phase.

Additionally, note that setting force: true globally in InstallOptions will cause any already-installed dependencies of the outdated tools to be reinstalled as well. While this ensures a clean state, it might lead to unnecessary work if a tool has many dependencies that are already up-to-date.

Suggested change
- force: true,
- jobs: self.jobs,
- raw: self.raw,
- resolve_options: ResolveOptions {
+ force: true,
+ jobs: self.jobs,
+ raw: self.raw,
+ locked: false,
+ resolve_options: ResolveOptions {

@greptile-apps
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR fixes two root-cause bugs behind stale literal install directories: resolve_prefix no longer falls back to the literal prefix string when remote returns no matches, and is_version_installed now detects when a request path (e.g., installs/<id>/latest/) differs from the resolved concrete version, forcing a real reinstall instead of a silent no-op. Supporting changes add a one-time migration to rewrite existing stale dirs into runtime symlinks and extend rebuild_symlinks_in_dir to self-heal stale real dirs during normal runtime rebuilds.

Confidence Score: 4/5

Safe to merge with two minor P2 items; no blocking correctness issues remain.

All remaining findings are P2: a narrower channel-dir filter in the latest_installed_version fallback (functionally benign due to sort order but worth hardening) and a migration test that only checks marker creation without exercising an actual stale-dir rewrite. Core logic in resolve_prefix, is_version_installed, rebuild_symlinks_in_dir, and migrate_real_dirs_in_dir is correct and the new e2e tests cover the primary failure modes.

e2e/tools/test_runtime_symlink_migration — test does not exercise stale-dir migration; src/backend/mod.rs — latest_installed_version fallback filter.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/toolset/tool_version.rs | Fixes two root-cause bugs: resolve_prefix no longer falls back to the literal prefix string when remote returns empty; resolve_version adds matches.last() shortcut before the previous resolve_prefix tail-call, making the final fallback build(v) for unresolved exact versions instead of the now-error-raising prefix path. |
| src/backend/mod.rs | Extends is_version_installed guard to ToolRequest::Version (covering @latest), and adds a concrete-version fallback to latest_installed_version(None) that skips real latest/ dirs; minor: fallback only excludes "latest" but not other bare channel dir names. |
| src/runtime_symlinks.rs | Adds rebuild_symlinks_in_dir heuristic to replace stale real dirs (guarded by concrete_installs set) and a new migrate_real_dirs / migrate_real_dirs_in_dir pair; guard is correct for the common cases. |
| src/migrate.rs | Adds idempotent one-time migration migrate_runtime_symlink_dirs gated on a marker file; wires into run() correctly with marker written only on success. |
| e2e/backend/test_fuzzy_versions_do_not_install_literal_dirs | New e2e test that asserts @latest, prefix:24, and prefix:24.0 all fail (not silently create literal dirs) when no versions are listed; directly validates the primary bug fix. |
| e2e/cli/test_upgrade_latest_stale | Regression test that simulates a stale real latest/ dir and verifies mise upgrade re-installs and runtime resolves to the concrete 2.0.0 path. |
| e2e/tools/test_runtime_symlink_migration | Only creates a concrete 20.0.0/bin dir; migration does nothing observable and the test only checks marker creation; the actual rewrite-stale-dir path is never exercised. |
| e2e/tools/test_runtime_symlink_migration_preserves_concrete_short_version | Correctly guards against regressions for tools with legitimate two-part concrete versions: verifies 20/ stays a real dir (not converted to a symlink) when 20.0.0/ also exists. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[mise install / upgrade] --> B{ToolRequest type?}
    B -->|Version / Prefix| C[request.install_path]
    C --> D{path exists?}
    D -->|No| G[check tv.install_path\nresolved concrete path]
    D -->|Yes| E{request basename\n== tv.version?}
    E -->|Yes| F[✓ already installed]
    E -->|No — stale dir| G
    G --> H{concrete path exists?}
    H -->|Yes| F
    H -->|No| I[run backend install hook\nwrites installs/id/2.0.0/]

    J[startup / mise run] --> K[migrate_runtime_symlink_dirs\none-time, marker gated]
    K --> L[migrate_real_dirs_in_dir\nfor each backend]
    L --> M{from in\nconcrete_installs?}
    M -->|Yes| N[skip — real concrete install]
    M -->|No| O[remove_all stale dir\nmake_symlink_or_file]

    P[rebuild runtime symlinks] --> Q[rebuild_symlinks_in_dir]
    Q --> R{is_runtime_symlink?}
    R -->|Yes, wrong target| S[remove symlink → recreate]
    R -->|No — real dir| T{from_name != to_name\nAND not concrete?}
    T -->|Yes| U[remove_all stale dir → recreate]
    T -->|No| V[continue]

Reviews (8): Last reviewed commit: "fix(backend): preserve concrete installs..." | Re-trigger Greptile

@risu729
Contributor

risu729 commented Apr 21, 2026

You should add an e2e test that fails without this change but passes with it. I use latest for many tools, and it upgrades fine. I don't think this is the correct way to fix it.
Also, doesn't it force-reinstall even if the actual latest version is the same as the current one?

@jdx
Owner

jdx commented Apr 21, 2026

I haven't looked deep but I suspect this will break shit. I'm surprised the e2e tests passed.

… pins

`is_version_installed` short-circuits on `tv.request.install_path()`, which
is derived from the REQUEST version. For a channel pin like `@latest`, the
request path is `installs/<id>/latest/`. If that dir exists as a real
directory (not a runtime symlink) — e.g. a prior install left `tv.version`
set to the literal string "latest" because resolve fell back to the alias
(offline, transient remote 404, etc.) — the early return fires and the
backend's install hook never runs. `mise upgrade` prints
`✓ installed <new-version>` while the on-disk binary is stale.

An existing guard handled this for `ToolRequest::Prefix` by comparing
`install_path.file_name()` against `tv.version` and falling through to the
resolved path when they differ. Extend the same guard to
`ToolRequest::Version`, which covers both `@latest` and any other alias
that resolves to a concrete version.

Adds `e2e/cli/test_upgrade_latest_stale` reproducing the scenario: a real
`installs/dummy/latest/` dir with a sentinel file in place before
`mise up`. Before the fix the install silently skipped; with the fix,
`installs/dummy/2.0.0/` is created and `mise x dummy -- dummy` runs the
new binary.

Refs: jdx#9275
@AsgardMuninn AsgardMuninn force-pushed the fix/upgrade-force-reinstall branch from 9ffdfe1 to f1e70bc Compare April 21, 2026 15:49
@AsgardMuninn AsgardMuninn changed the title fix(upgrade): force reinstall to avoid silent skip for channel-pinned tools fix(backend): check resolved path in is_version_installed for channel pins Apr 21, 2026
@AsgardMuninn
Contributor Author

@risu729 @jdx — you were both right, the force: true approach was too broad. I've force-pushed a replacement commit that addresses the root cause in is_version_installed instead.

TL;DR: The early return compares tv.request.install_path() (which uses the request version like "latest") against the filesystem, but never falls through to tv.install_path() (the resolved version). An existing guard already handled this for ToolRequest::Prefix; I've extended it to ToolRequest::Version so that channel pins like @latest check the resolved path when the request dir name doesn't match the resolved version.

This answers @risu729's "doesn't it force-reinstall even when the current version is the same as the latest?" — it no longer does. force stays false on the upgrade path; the narrow guard only triggers when the request basename (e.g. "latest") genuinely differs from the resolved tv.version (e.g. "1.21.0").

It also answers why you haven't hit this with most @latest tools — the bad state (a real installs/<id>/latest/ directory instead of a symlink) only arises when the first install left tv.version set to the literal "latest", which happens when resolve falls back to the alias in resolve_prefix. Backends like aqua / vfox / core / npm rarely trip that path; the Go backend is the most exposed because sub-module paths + goproxy behavior make transient resolve failures more likely.

E2e regression test included per @risu729's request: e2e/cli/test_upgrade_latest_stale sets up the bad state with a sentinel file and asserts that mise up actually installs the new version. Full root-cause analysis and tradeoffs in the updated PR description.
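
The narrow guard can be sketched as two pure functions, one modelling the old short-circuit and one the extended check. This is an illustration under assumptions, not the code in src/backend/mod.rs: the function names are hypothetical, and a set of directory names stands in for the contents of installs/<id>/.

```rust
use std::collections::HashSet;

/// Old behaviour (modelled): any directory at the request path counted as
/// installed, even when it was a stale real dir named after the request.
pub fn is_installed_before(
    request_name: &str,
    resolved_version: &str,
    existing: &HashSet<&str>,
) -> bool {
    existing.contains(request_name) || existing.contains(resolved_version)
}

/// New behaviour (modelled): the request path only counts when its basename
/// equals the resolved version; otherwise fall through to the resolved path.
pub fn is_installed_after(
    request_name: &str,
    resolved_version: &str,
    existing: &HashSet<&str>,
) -> bool {
    let request_hit = existing.contains(request_name) && request_name == resolved_version;
    request_hit || existing.contains(resolved_version)
}
```

For a stale `latest/` with resolved version `2.0.0`, the first function reports "installed" and the install hook is skipped; the second reports "not installed" until `2.0.0/` exists. When the request and resolved names are equal (a direct `@4.13`), both agree, which is why `force` can stay false on the upgrade path.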

@jdx
Owner

jdx commented Apr 21, 2026

oh is this why the "latest" as directory thing started happening? we definitely need to fix this

@AsgardMuninn
Contributor Author

AsgardMuninn commented Apr 21, 2026

@jdx Confirmed — that's the mechanism. Three links in the chain:

  1. is_version_installed short-circuit → fixed by this PR.
  2. runtime_symlinks::rebuild (src/runtime_symlinks.rs:58) deliberately skips real dirs.
  3. latest_installed_version (src/backend/mod.rs:855-878) returns the literal "latest" when the path isn't a symlink — that's the self-reinforcing loop: stale real dir → "latest" as a version → resolve_version re-pins tv.version = "latest" → runtime resolves back to the stale dir.

So the current PR makes install_version_ run but the stale installs/<id>/latest/ stays and wins at runtime — exactly what my e2e just caught.

Pushing a follow-up commit on this branch: after install_version_ succeeds, when the request basename ≠ resolved tv.version, replace the stale real dir at the request path with a runtime symlink to the resolved dir — using the existing make_symlink_or_file helper, ~15 LOC, Windows-safe, tight guards (only Version/Prefix request, basename mismatch, non-symlink, inside our own installs dir). The existing e2e/cli/test_upgrade_latest_stale stays as regression proof (fails today, passes after).

Two calls for you:

  • Local cleanup in install_version (surgical) or extend runtime_symlinks::rebuild to self-heal existing stuck states across all tools (broader, higher blast radius)?
  • Want me to also harden latest_installed_version so it refuses to return the literal alias when the path isn't a symlink? That's the longer-lived root cause, but I'd do it as a separate PR unless you'd rather bundle.

AsgardMuninn added a commit to AsgardMuninn/mise that referenced this pull request Apr 21, 2026
Extends the `is_version_installed` fix (f1e70bc) to repair the on-disk
state so the stale alias dir can never silently shadow the resolved
install again.

Chain of bugs:
1. `is_version_installed` short-circuited on `tv.request.install_path()`
   — FIXED by f1e70bc (check resolved path for channel pins).
2. `runtime_symlinks::rebuild` deliberately refuses to overwrite real
   directories with a symlink — intentional safety rail.
3. Consequence of (2): once `installs/<id>/latest/` exists as a real dir
   (offline first-install, transient remote 404, etc.), `latest/` stays
   a real dir forever. `latest_installed_version` (src/backend/mod.rs
   ~line 855) even returns the literal `"latest"` string for it,
   self-reinforcing the stale state.

The fix runs inside `install_version` after `install_version_` succeeds
and writes backend metadata. It is gated by six guards that together
limit the action to the exact bug surface:
- `ToolRequest::Version | Prefix` only (channel pins; skips Ref, Sub,
  Path, System)
- request basename != resolved basename (it is actually a channel pin,
  not a direct `@2.0.0`)
- request path and resolved path differ (not the trivial case)
- request path lives under `tv.ba().installs_path` (NEVER touch
  `--system` / `--shared` dirs; @jdx flagged this explicitly)
- request path exists
- request path is NOT already a runtime symlink (idempotent; re-runs
  of `mise install` do nothing)

When all guards pass, `remove_all_with_warning` nukes the stale real
dir and `make_symlink_or_file` creates `./<resolved-version>` in its
place — a plain relative runtime symlink, identical to what
`runtime_symlinks::rebuild` would have written on a clean install.
Windows falls back to a text file, handled by the helper.

Addresses @jdx's "'latest' as a directory" concern from PR jdx#9276
review: after this fix, the directory literally cannot remain as a
real dir past a successful install.

e2e test updated to lock the invariant with `test -L
$MISE_DATA_DIR/installs/dummy/latest` alongside the existing
binary-version assertion.

Refs: jdx#9275, jdx#9276
@AsgardMuninn AsgardMuninn force-pushed the fix/upgrade-force-reinstall branch from 9d050e6 to f1e70bc Compare April 21, 2026 21:54
@jdx
Owner

jdx commented Apr 21, 2026

I want some time to look this over properly but should get back to you soon. Nice work identifying this.

@jdx
Owner

jdx commented Apr 22, 2026

I dug into the regression path a bit more and found the behavior split across two changes:

  1. The stale real directories themselves come from an old fallback in resolve_prefix / latest resolution where we fall back to the literal request when remote matching returns no result (latest, 24, 24.0, etc.). That can leave real install dirs named after the fuzzy request instead of the resolved version.
  2. The reason it started breaking users recently is dc840866 (fix(env): use runtime symlink paths for fuzzy versions). After that change, runtime PATH/bin resolution started preferring request paths like latest, 24, and 24.0, so once one of those existed as a real directory, it began winning at runtime instead of the concrete install dir.

To address both the new and existing bad states, I pushed a follow-up commit on this branch (2f01b8051):

  • an install-time repair that rewrites a stale request-path dir into the correct runtime symlink after a successful resolved install
  • a one-time startup migration in migrate.rs that rewrites existing stale fuzzy-version dirs into runtime symlinks and drops a marker file so it only runs once

That handles existing installs with low ongoing blast radius, which seems to match the concern here better than a perpetual broad cleanup. I also left a removal note on the migration so it can be deleted after the grace period.
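
The marker-gated pattern for the one-time migration can be sketched as follows. This is a generic illustration using only the standard library; `run_once`, the marker name, and the closure-based shape are assumptions and do not mirror the actual code in migrate.rs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Run `migrate` at most once per data dir. A marker file records completion.
/// Returns Ok(true) when the migration ran and Ok(false) when it was skipped.
pub fn run_once(
    data_dir: &Path,
    marker_name: &str,
    migrate: impl FnOnce() -> io::Result<()>,
) -> io::Result<bool> {
    let marker = data_dir.join(marker_name);
    if marker.exists() {
        // Already migrated on an earlier start.
        return Ok(false);
    }
    migrate()?;
    // Written only after success, so a failed migration is retried next time.
    fs::write(&marker, b"")?;
    Ok(true)
}
```

Because the marker is the only persistent state, dropping the migration after the grace period amounts to deleting the call site; leftover marker files are harmless.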

Locally verified with:

  • cargo check --all-features
  • mise run test:e2e e2e/cli/test_upgrade_latest_stale e2e/tools/test_runtime_symlink_migration e2e/tools/test_runtime_symlinks

This comment was generated by an AI coding assistant.

@jdx jdx changed the title fix(backend): check resolved path in is_version_installed for channel pins fix(backend): stop fuzzy requests installing literal dirs Apr 22, 2026
Comment thread src/runtime_symlinks.rs
@jdx jdx enabled auto-merge (squash) April 22, 2026 13:03
@jdx jdx disabled auto-merge April 22, 2026 13:05
Comment thread src/runtime_symlinks.rs
@jdx jdx merged commit 70c3f7b into jdx:main Apr 22, 2026
32 of 34 checks passed
jdx added a commit that referenced this pull request Apr 22, 2026
## Summary

Fixes the `test_go_install_slow` failure on the 2026.4.19 release PR
where `mise x go:github.com/go-kratos/kratos/cmd/kratos/v2@latest --
bash -c 'kratos --help 2>&1'` exits 127 (`kratos: command not found`).

## Root Cause

`parse_goproxy()` in [src/backend/go.rs](src/backend/go.rs:441) used
`env::var("GOPROXY").unwrap_or_else(...)`. `unwrap_or_else` only fires
when the variable is *unset*; it does not fire when the variable is set
to the empty string.

The e2e harness forwards `GOPROXY="${GOPROXY:-}"` in
[e2e/run_test:84](e2e/run_test:84), which evaluates to an empty string
inside the docker container. That empty string feeds into
`parse_goproxy_value("")`, which returns no proxies, so:

1. `fetch_proxy_versions` returns `None`.
2. The `go list -m -versions` fallback returns an empty `Versions` list
for submodules without tagged releases (e.g.
`github.com/go-kratos/kratos/cmd/kratos/v2`).
3. `_list_remote_versions` returns `[]` and mise prints `WARN No
versions found for go:…`.

This bug was latent. Before
[#9276](#9276), `resolve_version` fell
through to `resolve_prefix("latest")`, which resolved to the literal
string `"latest"`, and `go install …@latest` handled it natively.
After #9276, resolution errors with `no versions found` and `mise x`
never installs the tool, so `kratos` is not on PATH → `bash -c 'kratos
--help'` exits 127.

Go's own tooling treats `GOPROXY=` identically to unset:

```
$ env GOPROXY= go env GOPROXY
https://proxy.golang.org,direct
```

## Fix

Treat both unset and empty `GOPROXY` as the default.

```rust
let goproxy = std::env::var("GOPROXY")
    .ok()
    .filter(|s| !s.is_empty())
    .unwrap_or_else(|| DEFAULT_GOPROXY.to_string());
```

Added unit test `parse_goproxy_empty_uses_default`.
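
For illustration, the same idiom can be isolated into a pure function that is testable without touching the process environment. This is a sketch, not the code in src/backend/go.rs: `effective_goproxy` is a hypothetical name, and the constant's value is taken from the `go env` output quoted above.

```rust
/// Default taken from the `go env GOPROXY` output shown above.
const DEFAULT_GOPROXY: &str = "https://proxy.golang.org,direct";

/// `raw` models the result of reading GOPROXY: None when unset,
/// Some("") when set but empty. Both cases fall back to the default.
pub fn effective_goproxy(raw: Option<String>) -> String {
    raw.filter(|s| !s.is_empty())
        .unwrap_or_else(|| DEFAULT_GOPROXY.to_string())
}
```

A caller would pass `std::env::var("GOPROXY").ok()`; keeping the environment read outside the function avoids tests that mutate global process state.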

## Validation

- `cargo test --bin mise backend::go::tests`: all 12 tests pass,
including the new one.
- Reproduced the failure locally with `env -i GOPROXY= … mise ls-remote
go:github.com/go-kratos/kratos/cmd/kratos/v2` (returned nothing before
the fix, now returns `2.0.0-20260404020628-f149714c1d54`).

Failing run this fixes:
https://github.com/jdx/mise/actions/runs/24803479152/job/72592666111

*This PR was generated by an AI coding assistant.*

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: small, localized change to Go backend proxy parsing plus a
unit test; behavior only changes when `GOPROXY` is set to an empty
string.
> 
> **Overview**
> Fixes Go backend proxy resolution to treat `GOPROXY=` (empty) the same
as an unset `GOPROXY`, defaulting to `https://proxy.golang.org,direct`
to match `go env` behavior.
> 
> Adds a unit test (`parse_goproxy_empty_uses_default`) that temporarily
sets `GOPROXY` to empty and asserts the default proxy is used.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
61d7c9f. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

3 participants