Remove our dependency on nanoid by woodruffw · Pull Request #19201 · astral-sh/uv

woodruffw · 2026-04-28T18:12:19Z

Summary

This introduces a new workspace member, uv-fastid. uv-fastid provides IDs using the same alphabet as nanoid, but with a slightly smaller token size (16 instead of 21) and fewer knobs for configuration.

In practice it should also be faster, since our IDs are now trivially copyable + don't require any heap allocations to construct internally. This is probably not a huge deal in our case anyways (ID generation isn't exactly dominating our runtime), but it doesn't hurt to have a fast primitive here.

The two main APIs I've added are insecure() and secure(), whose names refer to the kind of PRNG they use internally. Right now we only use insecure(), since we don't actually need a CSPRNG for the uses of nanoid we're replacing (revision and archive IDs).

Some other notes:

Arguably we could remove the secure/insecure distinction entirely, and just keep the CSPRNG path -- with rand we have a purely userspace CSPRNG (seeded from the OS's CSPRNG), so we don't really need fastrand for performance here.
I'm guilty of using unsafe in this, although I think the usage is demonstrably safe (as long as crate-private invariants are maintained) 🙂
I think there's no breakage risk with this -- FWICT our only usage of nanoid doesn't make any format assumptions, so switching to a slightly shorter ID form here shouldn't break anything. Our previous IDs also aren't part of any public interface AFAICT.

Closes #19176.

Test Plan

Added unit tests to the new uv-fastid crate. Our existing tests should cover this transitively as well.

Signed-off-by: William Woodruff <william@astral.sh>

woodruffw · 2026-04-28T18:50:06Z

Some other notes:

I picked a token size of 16 because log2(64) * 16 = 96 bits of entropy, which I think is more than enough for our use cases (in the non-security-sensitive setting we could go even lower, but even in the security setting that's a comfortable guessing/collision margin).

Signed-off-by: William Woodruff <william@astral.sh>

woodruffw · 2026-04-28T19:39:45Z

Ah, looks like we do have a behavioral assumption around ArchiveId's format:

 thread 'tests::test_link_deserialize' (133698) panicked at crates/uv-cache/src/lib.rs:1455:9:
    assertion failed: Link::from_str("archive-v0/foo").is_ok()

(The assumption before was that any String can be an ID, but now we're strict about it being a uv_fastid::Id-formatted string.)

I think we have a few options here:

I could bump the CacheBucket for archives to archive-v1, which (IIUC) will prevent us from reading the old entries and failing.
I could change the Link::from_str tests to make them pass, under the assumption that we'll just silently ignore the now-invalid archive entries.
We could revert to ArchiveId(String) instead of ArchiveId(uv_fastid::Id), which loses us a few allocations but preserves exact compatibility with the current IDs.

Edit: This also might be minor in any case, since FWICT Link is basically dead code except one Windows path?

charliermarsh · 2026-04-28T21:01:58Z

Do we need to be able to read existing IDs from disk? Like, do we need to be able to read these into our ID format?

❯ ls archive-v0
_3OAfZctF8UydglecrX0x aDcgovMVjxtaHXNCqt5MB gDOrNLlXb_61jOj8QHVNd mptS-Rhq4XPOyS7GFLAo7 TPYObcfx552jcjLbC-Ze_
_3oBciWusDv3wJMSA2k1j aDesp9jd9o_OfiZmqQt1Q ge4B5Av6_I1AmYM_SE2n4 MQKGCSwouTUcv-18fv2ic tQkdqY0-v4mSiexMnhgIx
_4Xo0kTSe9pOtzJmIwQe1 ADEwFc5XvSpW4F_n6w3cc gemQALx3mCH46z5DdJG_1 MQxWuQXNpwxkEP_WGVqOP TR-qz4iFLjPmqB2DAzRal
_5fR2nwdv0mukbpiURbo5 AdPGqIgQ6dIcqtVY6Ukzj GfC2kpPVxWjalEumO-FTc MrHj-Y-c1IwwVts8Q-hqj TRypPoYDSpSP9hE9nikNZ
_6uAgliFRctrSMvHkcgtw AdSANrr6dclYwxb5sUoI8 gfMBbzvHc6a0nY6rVDCI3 mRS75UNUPLwHMLfuHiYRL TS167myrBkRdHhfM2poMG
_7KPaIBs7gGkMZpCdkfUW AdwlJTA4HJmFauEKavD8S gfr7BTVGbqfWntoX9UXRM MRuSt3YdrkmMC8OkDWKGy tsiuXo_IEK4p57anWCKPE
_AWeEmMq3mrf8W_aPCcje AfyEl63gGYBZyravasAFW gGhXEwIaf4gbprwV0DtJr MS7FFdYzPtIw747AkAQO9 TSKa2juut_KuN-MDZLI_u
_CH8IEno0SZOFqwzSLX4d Ag_LijIN9aG30sSg2vQKS GgiV8fiNcZYdgMckklgnr MsfFoacczQ5Sycu6B6QvH tsnnqoFF-VkINtiDfdHnx
_cy8IlLQTNmkpj3XqHHjy ahMwCMSnw_FImlGvNWwIw GGqsuRRigeQW-Fj7uaAH4 MSZG2bqP04ghGHMaBUqIr TTBwEyhLXjmUHwIR_-eaj

woodruffw · 2026-04-28T22:34:40Z

I think if we bump the archive version then no, the old ones will just become orphaned (like other bucket bumps). But I'll test that tomorrow, since it looks like the archive bucket hasn't been bumped before.

zsol · 2026-04-29T13:38:55Z

Bumping the archive version is a breaking change though. Is that warranted here? Maybe we could do that as a separate step later?

Edit: the case I mainly have in mind is a read-only mounted uv cache, where it's undesirable/impossible for the uv process to refresh it (because e.g. it's network-isolated)

woodruffw · 2026-04-29T14:22:47Z

Bumping the archive version is a breaking change though. Is that warranted here? Maybe we could do that as a separate step later?

I might be missing it, but what makes it a breaking change? IIUC in the read-only cache case like that it'll result in cache misses rather than hard failures, but I don't think we typically consider that a breaking change (since we've done bumps with e.g. the metadata cache on patch releases).

With that said I agree it's not warranted here 😅 -- this is all a bit of a micro-optimization/attempt to save a few heap allocations on archive cache keys. I think the easiest thing for me to do here is to just revert back to using String for these keys and make a note that in the future we can re-roll the type + bucket version here.

Signed-off-by: William Woodruff <william@astral.sh>

zanieb · 2026-04-29T14:36:25Z

Bumping the bucket isn't considered a breaking change.

It does seem reasonable to avoid bumping for now and to do it when we have a stronger motiviation.

Signed-off-by: William Woodruff <william@astral.sh>

…9298) (#19301) ## Summary Closes #19298. #19201 (0.11.9) switched revision IDs from 21-character `nanoid` to `uv_fastid`'s 16-character IDs, but didn't bump the cache bucket version. As a result, when 0.11.9+ opens a cache populated by an older uv, deserialization of an existing revision pointer hard-fails with: ``` Failed to deserialize cache entry invalid ID: "HM0NxJml5hc7UjbfTWT1r" (must be 16 ID characters in the alphabet) ``` This effectively poisons long-lived caches on shared CI workers after upgrade. ## Fix Mirror the existing convention already documented on `ArchiveId` in `crates/uv-cache/src/archive.rs`: ```rust /// Note: for compatibility with the existing `archive-v0` bucket, this is a newtype /// around a `String` instead of a newtype around `uv_fastid::Id`. In the future, /// we may want to bump to `archive-v1` and switch to using `uv_fastid::Id` directly. ``` `RevisionId` had the same shape but was migrated to `uv_fastid::Id` directly in #19201, which is what made the legacy 21-character pointers undecodable. Switching `RevisionId` back to `String` (with the same `uv_fastid::insecure()` source for new IDs) restores forward-compatibility with both formats. The ID is only used as an opaque path/string component, so no other call site needs to change. ## Test plan - [x] New unit test in `crates/uv-distribution/src/source/revision.rs`: - `deserialize_legacy_nanoid_revision` round-trips a `Revision` with a 21-character legacy ID through `rmp_serde` (the exact failing path from the bug report) - `round_trip_current_revision` confirms freshly-generated IDs still serialize as the new 16-character form - [x] `cargo build -p uv` clean - [x] `cargo test -p uv-distribution --lib` -> 21/21 passing (incl. the 2 new tests) - [x] `cargo test -p uv-fastid` -> 4/4 passing - [x] `cargo fmt --check` clean - [x] `cargo clippy -p uv-distribution --lib --tests` clean --------- Signed-off-by: William Woodruff <william@yossarian.net> Co-authored-by: William Woodruff <william@yossarian.net>

Remove our dependency on nanoid

39be6f3

Signed-off-by: William Woodruff <william@astral.sh>

woodruffw requested review from charliermarsh and zsol April 28, 2026 18:12

woodruffw self-assigned this Apr 28, 2026

woodruffw added the internal A refactor or improvement that is not user-facing label Apr 28, 2026

woodruffw commented Apr 28, 2026

View reviewed changes

Comment thread Cargo.toml

woodruffw commented Apr 28, 2026

View reviewed changes

Comment thread crates/uv-cache/src/archive.rs

woodruffw added 2 commits April 28, 2026 14:14

Fix cargo shear

7932b4a

Signed-off-by: William Woodruff <william@astral.sh>

Add a description

5330033

Signed-off-by: William Woodruff <william@astral.sh>

woodruffw added 4 commits April 28, 2026 14:51

Add serde support

64bccb5

Signed-off-by: William Woodruff <william@astral.sh>

Dedupe types, plumb serde use

a73c421

Signed-off-by: William Woodruff <william@astral.sh>

Clippy

0d913f9

Signed-off-by: William Woodruff <william@astral.sh>

Add some FromStr tests

adcc685

Signed-off-by: William Woodruff <william@astral.sh>

Go back to a string for ArchiveId

a78e660

Signed-off-by: William Woodruff <william@astral.sh>

woodruffw marked this pull request as ready for review April 29, 2026 14:32

infallible FromStr

408dec7

Signed-off-by: William Woodruff <william@astral.sh>

zanieb approved these changes Apr 29, 2026

View reviewed changes

Elaborate on comments

5374a33

Signed-off-by: William Woodruff <william@astral.sh>

woodruffw enabled auto-merge (squash) April 29, 2026 15:47

woodruffw merged commit 0e69f0c into main Apr 29, 2026
102 of 103 checks passed

woodruffw deleted the ww/rm-nanoid branch April 29, 2026 15:56

g3n35i5 mentioned this pull request May 6, 2026

Regression in 0.11.9: Failed to deserialize cache entry: invalid ID for IDs written by older uv versions #19298

Closed

ChrisJr404 mentioned this pull request May 6, 2026

fix(cache): accept legacy ID format from pre-0.11.9 cache entries (#19298) #19301

Merged

6 tasks

woodruffw mentioned this pull request May 8, 2026

Encapsulate fastid APIs in their type #19328

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove our dependency on nanoid#19201

Remove our dependency on nanoid#19201
woodruffw merged 10 commits intomainfrom
ww/rm-nanoid

woodruffw commented Apr 28, 2026

Uh oh!

Uh oh!

Uh oh!

woodruffw commented Apr 28, 2026

Uh oh!

woodruffw commented Apr 28, 2026 •

edited

Loading

Uh oh!

charliermarsh commented Apr 28, 2026

Uh oh!

woodruffw commented Apr 28, 2026

Uh oh!

zsol commented Apr 29, 2026 •

edited

Loading

Uh oh!

woodruffw commented Apr 29, 2026

Uh oh!

zanieb commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

woodruffw commented Apr 28, 2026

Summary

Test Plan

Uh oh!

Uh oh!

Uh oh!

woodruffw commented Apr 28, 2026

Uh oh!

woodruffw commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

charliermarsh commented Apr 28, 2026

Uh oh!

woodruffw commented Apr 28, 2026

Uh oh!

zsol commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

woodruffw commented Apr 29, 2026

Uh oh!

zanieb commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

woodruffw commented Apr 28, 2026 •

edited

Loading

zsol commented Apr 29, 2026 •

edited

Loading