Skip to content

Conversation

@charliermarsh
Copy link
Member

@charliermarsh charliermarsh commented Jan 27, 2025

Summary

On Windows, we have a lot of issues with atomic replacement and such. There are a bunch of different failure modes, but they generally involve: trying to persist a fail to a path at which the file already exists, trying to replace or remove a file while someone else is reading it, etc.

This PR adds locks to all of the relevant database paths. We already use these advisory locks when building source distributions; now we use them when unzipping wheels, storing metadata, etc.

Closes #11002.

Test Plan

I ran the following script:

# Define the cache directory path
$cacheDir = "C:\Users\crmar\workspace\uv\cache"

# Clear the cache directory if it exists
if (Test-Path $cacheDir) {
    Remove-Item -Recurse -Force $cacheDir
}

# Create the cache directory again
New-Item -ItemType Directory -Force -Path $cacheDir

# Define the command to run with --cache-dir flag
$command = {
    param ($venvPath)

    # Create a virtual environment in the specified path with --python
    uv venv $venvPath

    # Run the pip install command with --cache-dir flag
    C:\Users\crmar\workspace\uv\target\profiling\uv.exe pip install flask==1.0.4 --no-binary flask --cache-dir C:\Users\crmar\workspace\uv\cache -v --python $venvPath
}

# Define the paths for the different virtual environments
$venv1 = "C:\Users\crmar\workspace\uv\venv1"
$venv2 = "C:\Users\crmar\workspace\uv\venv2"
$venv3 = "C:\Users\crmar\workspace\uv\venv3"
$venv4 = "C:\Users\crmar\workspace\uv\venv4"
$venv5 = "C:\Users\crmar\workspace\uv\venv5"

# Start the command in parallel five times using Start-Job, each with a different venv
$job1 = Start-Job -ScriptBlock $command -ArgumentList $venv1
$job2 = Start-Job -ScriptBlock $command -ArgumentList $venv2
$job3 = Start-Job -ScriptBlock $command -ArgumentList $venv3
$job4 = Start-Job -ScriptBlock $command -ArgumentList $venv4
$job5 = Start-Job -ScriptBlock $command -ArgumentList $venv5

# Wait for all jobs to complete
$jobs = @($job1, $job2, $job3, $job4, $job5)
$jobs | ForEach-Object { Wait-Job $_ }

# Retrieve the results (optional)
$jobs | ForEach-Object { Receive-Job -Job $_ }

# Clean up the jobs
$jobs | ForEach-Object { Remove-Job -Job $_ }

And ensured it succeeded in five straight invocations (whereas on main, it consistently fails with a variety of different traces).

@charliermarsh charliermarsh added bug Something isn't working windows Specific to the Windows platform labels Jan 27, 2025
@charliermarsh
Copy link
Member Author

I'll fix up the tests on approval -- they're just count offsets due to writing more files to the cache (sadly).

@konstin
Copy link
Member

konstin commented Jan 28, 2025

Can you add the script you used to provoke the error and check that it's test? Otherwise we don't have a regression test should we ever have to revisit this code.

Connectivity::Offline => CacheControl::AllowStale,
};

// Acquire an advisory lock, to guard against concurrent writes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you extend the comment explaining the [cfg(windows)]?

Comment on lines +48 to +52
/// On Windows, this uses the `junction` crate to create a junction point. The
/// operation is _not_ atomic, as we first delete the junction, then create a
/// junction at the same path.
///
/// Note that because junctions are used, the source must be a directory.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do i read this comment correctly that there is no way for us to make this atomic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try, but making it atomic doesn't solve the problem. We could still run into issues whereby we attempt to overwrite the file while it's open, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good to know, i didn't think of that case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's somewhat tragic. I think we should still try to make this atomic though.

@charliermarsh
Copy link
Member Author

charliermarsh commented Jan 28, 2025

Can you add the script you used to provoke the error and check that it's test? Otherwise we don't have a regression test should we ever have to revisit this code.

Sorry, do you mean, add a test to the codebase itself? The script I used is in the PR summary.

@konstin
Copy link
Member

konstin commented Jan 28, 2025

Sorry, totally overlooked that. Is it possible to port this to an integration test, or alternatively could we add that script to scripts/?

@charliermarsh charliermarsh merged commit f1840c7 into main Jan 28, 2025
73 checks passed
@charliermarsh charliermarsh deleted the charlie/windows branch January 28, 2025 20:33
@charliermarsh
Copy link
Member Author

@konstin -- I'm just gonna leave it in here for now. I think this is ~as good as putting it in scripts given that we'll never run it and it will likely bitrot there.

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Jan 29, 2025
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.5.24` -> `0.5.25` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.5.25`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0525)

[Compare Source](astral-sh/uv@0.5.24...0.5.25)

##### Enhancements

-   Allow installation of manylinux wheels on loongarch64 ([#&#8203;10927](astral-sh/uv#10927))
-   Allow optional `=` for editables in `requirements.txt` ([#&#8203;10954](astral-sh/uv#10954))
-   Add Windows aarch64 to the release binaries ([#&#8203;10885](astral-sh/uv#10885))

##### Bug fixes

-   Use spec-compliant (`128+n`) exit codes for `uv run` and `uv tool run` on Unix ([#&#8203;10781](astral-sh/uv#10781))
-   Fix best-interpreter lookups when there is an invalid interpreter in the `PATH` ([#&#8203;11030](astral-sh/uv#11030))
-   Guard against concurrent cache writes on Windows ([#&#8203;11007](astral-sh/uv#11007))
-   Prioritize package preferences with greater package versions ([#&#8203;10963](astral-sh/uv#10963))
-   Reject `--editable` flag on non-directory requirements ([#&#8203;10994](astral-sh/uv#10994))
-   Respect `--no-sources` for `uv pip install` workspace discovery ([#&#8203;11003](astral-sh/uv#11003))
-   Set `JEMALLOC_SYS_WITH_LG_PAGE=16` in ARM Docker builds ([#&#8203;10943](astral-sh/uv#10943))
-   Update `riscv64` Python downloads to allow install on `riscv64gc` ([#&#8203;10937](astral-sh/uv#10937))
-   Fix file persist retries on Windows ([#&#8203;11008](astral-sh/uv#11008))
-   Fix incorrect error message when specifying `tool.uv.sources.(package).workspace` with other options ([#&#8203;11013](astral-sh/uv#11013))
-   Improve SIGINT handling in `uv run` ([#&#8203;11009](astral-sh/uv#11009))

##### Documentation

-   Add `SECURITY` policy ([#&#8203;11035](astral-sh/uv#11035))
-   Add `Requires-Python` upper bound behavior to the docs ([#&#8203;10964](astral-sh/uv#10964))
-   Add a troubleshooting section and reproducible example guide ([#&#8203;10947](astral-sh/uv#10947))
-   Add documentation for `uv add -r` ([#&#8203;10926](astral-sh/uv#10926))
-   Amend `requires-python` rules in resolver documentation ([#&#8203;10993](astral-sh/uv#10993))
-   Reference workspaces in `--no-sources` documentation ([#&#8203;10995](astral-sh/uv#10995))
-   Update documentation for activating virtual environments in different shell ([#&#8203;11000](astral-sh/uv#11000))
-   Add Docker SHA pinning tip ([#&#8203;10955](astral-sh/uv#10955))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOS4xMzcuMiIsInVwZGF0ZWRJblZlciI6IjM5LjEzNy4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
issokuos added a commit to issokuos/uv that referenced this pull request Jan 29, 2025
* main: (53 commits)
  Shorten "Using existing Python versions" nav item so it fits on one line (astral-sh#11077)
  docs: suggest copy linking for GitLab integration guide (astral-sh#11067)
  Refactor `uv tool run` hint into separate function (astral-sh#11069)
  Fix typo in no-deps docs/comments/cli description (astral-sh#11073)
  Allow `--no-dev --invert` in `uv tree` (astral-sh#11068)
  Add docs for signal handling (astral-sh#11041)
  Add a bit more context about SIGTERM and PID 1 (astral-sh#11036)
  Reflow CLI documentation comments (astral-sh#11040)
  doc typo: unnecessary backslashes to represent brackets in markdown (astral-sh#11059)
  Update Dependabot links (astral-sh#11054)
  Document `gather_credentials` (astral-sh#11024)
  Link to our MRE documentation in the issue template (astral-sh#11045)
  Avoid sharing state between universal and non-universal resolves (astral-sh#11051)
  Mark metadata as dynamic when reading from built wheel cache (astral-sh#11046)
  Fix formatting of `RUST_LOG` documentation (astral-sh#10053)
  Bump version to 0.5.25 (astral-sh#11042)
  Add CVE disclosure to security policy (astral-sh#11037)
  Guard against concurrent cache writes on Windows (astral-sh#11007)
  Add SECURITY policy (astral-sh#11035)
  Improve SIGINT handling in `uv run` (astral-sh#11009)
  ...
devin-ai-integration bot pushed a commit to AutoIDM/uv that referenced this pull request Nov 19, 2025
This commit addresses the remaining concurrency issues on Windows when
multiple processes try to install the same package concurrently.

The previous fix in PR astral-sh#11007 added locks to cache operations, but the
issue persisted because the synchronized_copy function only locked the
destination directory. When multiple processes try to install the same
package, one process may be reading from the archive cache while another
is still writing to it, causing 'file is being used by another process'
errors on Windows.

This fix adds source directory locking on Windows in the synchronized_copy
function to prevent concurrent reads while another process is writing.

Fixes astral-sh#11002
devin-ai-integration bot pushed a commit to AutoIDM/uv that referenced this pull request Nov 19, 2025
astral-sh#11007)

This branch is based on commit 321f8cc,
which is the parent commit of PR astral-sh#11007 that added cache locking improvements.

Testing this historical baseline will prove that the concurrent cache access
issue existed BEFORE the fix was implemented.

Test configuration:
- 40 concurrent jobs
- 3 iterations with cache clearing between each
- Fail-fast on first failure

Expected result: Test should FAIL with concurrent cache access errors,
proving the issue was real and needed the fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working windows Specific to the Windows platform

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Concurrent cache access causes errors on Windows

4 participants