Debug sync_dry_run flake by panicking with verbose output on failure
#13817
Merged
b32d18b to 4529f68
4529f68 to f11d11d
165bf0a to 94571ee
Title changed: sync_dry_run on infinite loop → Debug sync_dry_run flake by panicking with verbose output on failure
f00f018 to 27aa4a3
This reverts commit f11d11d.
27aa4a3 to 74864ed
Gankra approved these changes Jun 12, 2025
This was referenced Jun 26, 2025
zanieb added a commit that referenced this pull request Jun 26, 2025
In addition to our flake catch, keep a snapshot. Extends #13817
zanieb added a commit that referenced this pull request Jun 26, 2025
zanieb added a commit that referenced this pull request Jun 30, 2025
This fixes an obscure cache collision in Python interpreter queries, which we believe to be the root cause of CI flakes we've been seeing where a project environment is invalidated and recreated.

This work follows from the logs in [this CI run](https://github.com/astral-sh/uv/actions/runs/15934322410/job/44950599993?pr=14326), which captured one of the flakes with tracing enabled. There, we can see that the project environment is invalidated because the Python interpreter in the environment has a different version than expected:

```
DEBUG Checking for Python environment at `.venv`
TRACE Cached interpreter info for Python 3.12.9, skipping probing: .venv/bin/python3
DEBUG The interpreter in the project environment has different version (3.12.9) than it was created with (3.9.21)
```

(this message is updated to reflect #14329)

The flow is roughly:

- We create an environment with 3.12.9
- We query the environment, and cache the interpreter version for `.venv/bin/python`
- We create an environment for 3.9.21, replacing the existing one
- We query the environment, and read the cached information

The Python cache entries are keyed by the absolute path to the interpreter, and rely on the modification time (ctime, nsec resolution) of the canonicalized path to determine if the cache entry should be invalidated. The key is a hex representation of a u64 sea hasher output, which is very unlikely to collide.

After an audit of the Python query caching logic, we determined that the most likely cause of a collision in cache entries is that the modification times of the underlying interpreters are identical. This seems pretty feasible, especially if the file system does not support nanosecond precision, though it appears that the GitHub runners do support it.

The fix here is to include the canonicalized path in the cache key, which ensures we're looking at the modification time of the _same_ underlying interpreter.

This will "invalidate" all existing interpreter cache entries, but that's not a big deal.

This should also have the effect of reducing cache churn for interpreters in virtual environments. Now, when you change Python versions, we won't invalidate the previous cache entry, so if you change _back_ to the old version we can re-use our cached information.

It's a bit speculative, since we don't have a deterministic reproduction in CI, but this is the strongest candidate given the logs and should increase correctness regardless.

Closes #14160
Closes #13744
Closes #13745

Once it's confirmed the flakes are resolved, we should revert

- #14275
- #13817
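The collision described in this commit message can be illustrated with a small, self-contained sketch. This is not uv's actual cache code: the type `CachedInfo`, the functions `key_by_path`, `key_by_path_and_target`, `lookup`, and `simulate`, and all paths and timestamps are hypothetical, and the standard library's `DefaultHasher` stands in for the sea hasher mentioned above. It only shows why a key built from the environment's interpreter path alone can return stale data when two underlying interpreters happen to share a timestamp, and why adding the canonicalized path to the key avoids that.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a cached interpreter query result.
#[derive(Clone, Debug, PartialEq)]
struct CachedInfo {
    /// Timestamp of the canonicalized interpreter when the entry was written.
    timestamp_ns: u128,
    version: String,
}

/// Old scheme (as described above): the key covers only the absolute path.
fn key_by_path(absolute: &str) -> String {
    let mut hasher = DefaultHasher::new();
    absolute.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// New scheme (as described above): the canonicalized path is part of the key.
fn key_by_path_and_target(absolute: &str, canonical: &str) -> String {
    let mut hasher = DefaultHasher::new();
    absolute.hash(&mut hasher);
    canonical.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Return the cached version only if an entry exists and its timestamp matches.
fn lookup(cache: &HashMap<String, CachedInfo>, key: &str, timestamp_ns: u128) -> Option<String> {
    cache
        .get(key)
        .filter(|entry| entry.timestamp_ns == timestamp_ns)
        .map(|entry| entry.version.clone())
}

/// Replay the flow from the commit message with made-up paths and timestamps.
/// Returns whatever the cache hands back for the second (3.9.21) environment.
fn simulate(include_canonical: bool) -> Option<String> {
    let venv_python = "/project/.venv/bin/python3";
    // Assumption: both underlying interpreters share the same timestamp.
    let same_timestamp_ns: u128 = 1_750_000_000_000_000_000;
    let key = |canonical: &str| {
        if include_canonical {
            key_by_path_and_target(venv_python, canonical)
        } else {
            key_by_path(venv_python)
        }
    };

    let mut cache = HashMap::new();
    // Environment created with 3.12.9; the query result is cached.
    cache.insert(
        key("/opt/python/3.12.9/bin/python3.12"),
        CachedInfo { timestamp_ns: same_timestamp_ns, version: "3.12.9".to_string() },
    );
    // Environment recreated with 3.9.21; the next query consults the cache.
    lookup(&cache, &key("/opt/python/3.9.21/bin/python3.9"), same_timestamp_ns)
}
```

With the path-only key, `simulate(false)` returns the stale `3.12.9` entry, which matches the mismatch in the log. With the canonicalized path in the key, `simulate(true)` misses the cache, so the interpreter would be probed again.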
Investigating #13744
I tried reproducing here by running the test in a loop, but could not. I presume it's an interaction with other tests.
This drops the snapshot, but I think it's worth it to try to examine the flake?
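As a rough illustration of the approach in the title, here is a minimal sketch of a check that panics with verbose output when a result is unexpected. It is not the code from this pull request: the helper `assert_output_or_dump` and its parameters are hypothetical, and the closure stands in for whatever would collect extra diagnostics (for example, re-running the command with verbose logging).

```rust
/// Hypothetical helper: compare the observed output against the expected one
/// and, only on a mismatch, gather verbose diagnostics and panic with them.
fn assert_output_or_dump<F>(expected: &str, actual: &str, collect_verbose: F)
where
    F: FnOnce() -> String,
{
    if actual != expected {
        // The closure runs only on failure, so passing runs pay no extra cost.
        let verbose = collect_verbose();
        panic!(
            "unexpected output\n--- expected ---\n{}\n--- actual ---\n{}\n--- verbose ---\n{}",
            expected, actual, verbose
        );
    }
}
```

The trade-off is the one noted above: the failure message carries far more context than a snapshot diff would, at the cost of no longer asserting on a stored snapshot.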