Skip to content

fix(pacquet/fs): serialize concurrent CAS writes to the same path#11758

Merged
zkochan merged 7 commits into
mainfrom
claude/fix-ubuntu-ci-flaky-PD1bJ
May 20, 2026
Merged

fix(pacquet/fs): serialize concurrent CAS writes to the same path#11758
zkochan merged 7 commits into
mainfrom
claude/fix-ubuntu-ci-flaky-PD1bJ

Conversation

@KSXGitHub

@KSXGitHub KSXGitHub commented May 20, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes a flaky CI failure on Ubuntu only (Pacquet CI / Lint and Test (ubuntu-latest)) where pacquet install --frozen-lockfile aborts with:

failed to import "<store>/v11/files/65/<hash>" to ".../LICENSE":
No such file or directory (os error 2)

Root cause

Two install snapshots whose tarballs ship identical file content (e.g. the shared LICENSE between @pnpm.e2e/hello-world-js-bin and @pnpm.e2e/hello-world-js-bin-parent) hash to the same CAS path and call ensure_file concurrently. The race:

  1. Writer A: O_CREAT|O_EXCL succeeds, starts write_all.
  2. Writer B: O_CREAT|O_EXCLAlreadyExistsverify_or_rewrite sees meta.len() < content.len() (A is still writing) → write_atomic renames a temp file over the live source.
  3. A concurrent rayon worker calling link_file's reflink / fs::hard_link against the source observes a transient NotFound window during the rename.

Why Ubuntu only: ext4/tmpfs report the partial size during a concurrent write fast enough to open the window; macOS/APFS and Windows CI runners' stat cadence and lower parallelism don't reliably open it.

The existing docstring in ensure_file.rs explicitly acknowledged that pacquet had omitted pnpm's locker map under the (broken) assumption that "the hot path calls ensure_file at most once per CAS file per install." Sibling packages with shared files violate that.

Fix

Port pnpm v11's locker to a per-path Mutex (slightly stronger than pnpm's dedup-only cache). Concurrent writers of the same CAS path now serialize, so the second caller's verify_or_rewrite always takes the byte-match fast path and never rewrites.

Test plan

  • New regression test concurrent_writers_of_same_path_do_not_swap_the_inode fires 32 threads at one path with identical content and asserts the inode never swaps — the observable signal that no writer took the write_atomic rename path.
  • cargo test -p pacquet-fs — 25 passed.
  • cargo clippy --workspace --all-targets -- -D warnings clean.
  • cargo check --workspace --all-targets clean.

https://claude.ai/code/session_01UizmMquu22ZMzB7wrxswr4

Summary by CodeRabbit

  • Improvements

    • File writes to the same path are now serialized so concurrent callers no longer interfere; this ensures the fast byte-compare path is applied reliably and prevents partial or swapped writes.
  • Tests

    • Added a Unix-only multithreaded test that verifies concurrent writers succeed and that the file’s contents and on-disk identity remain consistent after simultaneous writes.

Review Change Stack

Two install snapshots whose tarballs ship identical file content
(e.g. a shared LICENSE across sibling packages like
`@pnpm.e2e/hello-world-js-bin` and `@pnpm.e2e/hello-world-js-bin-parent`)
hash to the same CAS path and call `ensure_file` concurrently. Without
serialization the second writer's `O_CREAT|O_EXCL` hits `AlreadyExists`
while the first writer is mid-`write_all`, falls into
`verify_or_rewrite`'s `meta.len() != content.len()` arm, and runs
`write_atomic` — which renames a temp file over the live source.
On Linux/ext4 the partial-size observation is fast enough that this
window opens often, surfacing as flaky CI failures of the form:

    failed to import "<store>/v11/files/65/<hash>" to ".../LICENSE":
    No such file or directory (os error 2)

emitted by `link_file` whose `reflink`/`fs::hard_link` raced the
rename. macOS/APFS and Windows tests pass because their `stat` cadence
and CI runner parallelism don't reliably open the window.

Port pnpm v11's `locker: Map<string, number>` semantics — slightly
stronger in pacquet, as a per-path `Mutex` rather than a dedup cache —
so two writers of the same CAS path serialize through it. The second
caller acquires the lock after the first writer's `write_all` has
finished, then takes the byte-match fast path inside `verify_or_rewrite`
and never has to rewrite. The previous docstring at the bottom of the
"Differences from pnpm" list explicitly acknowledged the locker
omission and predicted it would matter; this change closes that gap.

Add a regression test (`concurrent_writers_of_same_path_do_not_swap_the_inode`)
that fires 32 threads at one path with identical content and asserts
the inode never swaps — the observable signal that no writer ever took
the `write_atomic` rename path.
@coderabbitai

coderabbitai Bot commented May 20, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d7b02b3c-0190-4ebc-b7bc-ff3c44e07795

📥 Commits

Reviewing files that changed from the base of the PR and between 97b3225 and bc0d409.

📒 Files selected for processing (1)
  • pacquet/crates/fs/src/ensure_file/tests.rs
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Lint and Test (ubuntu-latest)
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Dylint
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
  • GitHub Check: Compile & Lint
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: When porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(), or streamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plain String or &str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only via TryFrom<String> and/or FromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallible From<String> (and From<&str> when convenient).
If upstream occasionally constructs without validation, expose from_str_unchecked as an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization.
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls. Fall back to manual impl only when conversion needs custom logic.
String-literal unions should become enums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Path over &PathBuf, &str over &String) when it doesn't force extra copies.
Prefer Arc::clone(&x) / Rc::clone(&x) over x.clone() for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Write use super::{Foo, bar} instead of use super::*;. Two forms stay allowed: external-crate preludes like use rayon::prelude::*; and root-of-module re-...

Files:

  • pacquet/crates/fs/src/ensure_file/tests.rs
🔇 Additional comments (1)
pacquet/crates/fs/src/ensure_file/tests.rs (1)

304-319: This no longer proves do_not_swap_the_inode.

All inode samples are taken after ensure_file returns, so a single verify_or_rewrite rename that completes before every worker reaches Line 344 will still make observed converge on the replacement inode and Lines 358-363 pass. The doc comment already says that, which means the test name / regression contract is now stronger than what the assertion actually verifies. Please either strengthen the test with an in-flight observation point or rename/re-scope it to the weaker smoke-test guarantee instead. As per coding guidelines, "Doc comments (///, //!) document the contract" and "Tests are documentation."

Also applies to: 358-363


📝 Walkthrough

Walkthrough

This PR adds a DashMap-backed, process-local per-path mutex to serialize concurrent ensure_file writers, integrates mutex acquisition into ensure_file, and adds a Unix-only concurrency test verifying inode and content invariants.

Changes

Per-path concurrency serialization

Layer / File(s) Summary
Per-path lock infrastructure
pacquet/crates/fs/Cargo.toml, pacquet/crates/fs/src/ensure_file.rs
Added dashmap workspace dependency and created cas_write_lock helper that maintains a static OnceLock<DashMap<PathBuf, Arc<Mutex<()>>>> to store and return per-path mutexes, with imports updated to include DashMap, OnceLock, and sync primitives.
Locking integration in ensure_file
pacquet/crates/fs/src/ensure_file.rs
Integrated per-path mutex acquisition at the start of ensure_file before open/verify/atomic-rewrite operations, with poisoned-lock recovery via unwrap_or_else(...into_inner()). Updated documentation to replace the prior "no locker cache" bullet with description of the new process-local per-path serialization behavior.
Concurrent writer validation
pacquet/crates/fs/src/ensure_file/tests.rs
Added Unix-only concurrency test concurrent_writers_of_same_path_do_not_swap_the_inode that spawns multiple threads synchronized by a barrier, concurrently invokes ensure_file on the same target path, then asserts all succeed and the resulting file matches expected bytes while inode and length invariants are preserved.

Sequence Diagram(s)

sequenceDiagram
  participant Writer as Writer Thread
  participant LockMap as cas_write_lock
  participant Mutex as per-path Mutex
  participant FS as Filesystem
  Writer->>LockMap: ensure_file(path) -> get_or_insert Arc<Mutex>
  LockMap->>Mutex: return Arc<Mutex> for path
  Writer->>Mutex: acquire lock (guard)
  Writer->>FS: open / verify / atomic-rewrite
  FS-->>Writer: write result
  Writer->>Mutex: release lock
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

area: supply chain security

Suggested reviewers

  • zkochan

Poem

A rabbit locks files with care and grace, 🔒
Each path gets its own synchronous place,
Threads queue like stars in a neat little line,
Inodes stay steady and contents align,
Hooray for tidy writes — hop, safe, and fine! 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding per-path serialization for concurrent CAS writes. It clearly identifies the problem being fixed and the solution.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/fix-ubuntu-ci-flaky-PD1bJ

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov-commenter

codecov-commenter commented May 20, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.64%. Comparing base (3687b0e) to head (bc0d409).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11758      +/-   ##
==========================================
- Coverage   90.12%   89.64%   -0.49%     
==========================================
  Files         146      151       +5     
  Lines       16876    17578     +702     
==========================================
+ Hits        15209    15757     +548     
- Misses       1667     1821     +154     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

`ensure_file` is public; references to private `verify_or_rewrite`,
`write_atomic`, and `cas_write_lock` only resolve under
`--document-private-items`, which trips `rustdoc::private-intra-doc-links`
under `-D warnings`. Drop the link form and keep the names as plain
backticked identifiers — the docstring still reads correctly and
`cargo doc` no longer fails.
Comment thread pacquet/crates/fs/src/ensure_file/tests.rs Outdated
Address review feedback on #11758 — single-letter constants are
opaque; `WRITER_COUNT` reads as the loop count of concurrent
writers the test fires at one CAS path.
@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.00      5.9±0.22ms   729.4 KB/sec    1.06      6.3±0.08ms   686.6 KB/sec

@KSXGitHub KSXGitHub left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the documentation more concise and focused.

Comment thread pacquet/crates/fs/src/ensure_file/tests.rs Outdated
@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.385 ± 0.109 2.313 2.688 1.03 ± 0.05
pacquet@main 2.318 ± 0.051 2.237 2.381 1.00
pnpm 4.607 ± 0.047 4.517 4.659 1.99 ± 0.05
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.3847090452,
      "stddev": 0.10875690455721122,
      "median": 2.358349492,
      "user": 2.80250764,
      "system": 3.6088811799999996,
      "min": 2.312693414,
      "max": 2.687785175,
      "times": [
        2.312693414,
        2.374138229,
        2.326695113,
        2.355275498,
        2.687785175,
        2.368928559,
        2.359946387,
        2.378330254,
        2.326545226,
        2.356752597
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.3184605131000002,
      "stddev": 0.05100345041453707,
      "median": 2.3370192475000002,
      "user": 2.67949914,
      "system": 3.5894638799999994,
      "min": 2.236840649,
      "max": 2.3811184770000002,
      "times": [
        2.3405539490000002,
        2.255330449,
        2.291562521,
        2.3811184770000002,
        2.333484546,
        2.3498498260000003,
        2.362569836,
        2.269402859,
        2.236840649,
        2.363892019
      ]
    },
    {
      "command": "pnpm",
      "mean": 4.6067262121,
      "stddev": 0.047260176659664886,
      "median": 4.61934752,
      "user": 7.818185440000001,
      "system": 4.04494098,
      "min": 4.51730027,
      "max": 4.658805881,
      "times": [
        4.654355049,
        4.608028232,
        4.5711421020000005,
        4.658805881,
        4.628032748,
        4.612539746,
        4.644116067,
        4.546786732,
        4.626155294,
        4.51730027
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 715.3 ± 91.7 671.4 974.2 1.00 ± 0.15
pacquet@main 713.9 ± 53.2 674.8 845.3 1.00
pnpm 2528.8 ± 79.0 2395.0 2644.1 3.54 ± 0.29
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.71534736988,
      "stddev": 0.09168674480439003,
      "median": 0.6855012609800001,
      "user": 0.38330262,
      "system": 1.5805075800000001,
      "min": 0.6714006919800001,
      "max": 0.97423969998,
      "times": [
        0.97423969998,
        0.6944527369800001,
        0.6832946829800001,
        0.6810138529800001,
        0.6794770239800001,
        0.6901694069800001,
        0.6714006919800001,
        0.67832783598,
        0.7133899279800001,
        0.6877078389800001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.71391133818,
      "stddev": 0.05316824439222206,
      "median": 0.6955709834800001,
      "user": 0.37812222,
      "system": 1.5865728799999999,
      "min": 0.67479088598,
      "max": 0.84527939898,
      "times": [
        0.70326112598,
        0.69740106698,
        0.67479088598,
        0.6937408999800001,
        0.84527939898,
        0.6884604029800001,
        0.7006652129800001,
        0.68018079998,
        0.7696282989800001,
        0.68570528898
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.52884831268,
      "stddev": 0.07903863866586336,
      "median": 2.52155315198,
      "user": 3.00704162,
      "system": 2.22444798,
      "min": 2.39504738898,
      "max": 2.6440563799800003,
      "times": [
        2.57754821998,
        2.41684780498,
        2.6208668789800003,
        2.56222989398,
        2.52346125598,
        2.51964504798,
        2.51747491198,
        2.39504738898,
        2.51130534398,
        2.6440563799800003
      ]
    }
  ]
}

Address review feedback on #11758: drop body-narrating prose, keep
only the contract and the non-obvious why. The test docstring no
longer references "pre-fix code" so it reads independent of this
PR's history.
@KSXGitHub KSXGitHub marked this pull request as ready for review May 20, 2026 06:06
@KSXGitHub KSXGitHub requested a review from zkochan as a code owner May 20, 2026 06:06
Copilot AI review requested due to automatic review settings May 20, 2026 06:06

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pacquet/crates/fs/src/ensure_file/tests.rs`:
- Around line 339-343: The inode assertion only compares two metadata reads
taken after the operation and can miss an inode swap that happened mid-run;
modify the test in ensure_file/tests.rs to read and store the original inode
(e.g., call fs::metadata(&*path).ino() into a variable like original_ino) before
performing the join/operation and then assert that final_meta.ino() equals
original_ino after the operation, and keep the existing content and length
assertions (use the existing identifiers path, final_meta, and content to locate
and update the test).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: bd53df16-4545-4d4a-93e6-51047fab273f

📥 Commits

Reviewing files that changed from the base of the PR and between 3a54205 and e3df5a5.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • pacquet/crates/fs/Cargo.toml
  • pacquet/crates/fs/src/ensure_file.rs
  • pacquet/crates/fs/src/ensure_file/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Agent
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Run benchmark on ubuntu-latest
🧰 Additional context used
📓 Path-based instructions (1)
pacquet/**/*.rs

📄 CodeRabbit inference engine (pacquet/AGENTS.md)

pacquet/**/*.rs: When porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(), or streamParser.write(), mirror the call site, payload, and ordering so the reporter parses pacquet's NDJSON the same way it parses pnpm's.
Declare a newtype wrapper for branded string types. Do not collapse the brand into a plain String or &str.
If upstream always validates before construction, validate in pacquet's wrapper too. The wrapper must construct only via TryFrom<String> and/or FromStr. Do not provide an infallible public constructor.
If upstream never validates, just brand for type-safety. Expose an infallible From<String> (and From<&str> when convenient).
If upstream occasionally constructs without validation, expose from_str_unchecked as an escape hatch alongside the validating constructor.
Match upstream serde behavior for branded types that cross JSON, YAML, or INI boundaries. Use #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization.
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls. Fall back to manual impl only when conversion needs custom logic.
String-literal unions should become enums, not newtype wrappers. Model closed sets of valid string values as enums.
Template literal types should be treated as branded strings with validation discipline from rules 2-5.
Choose owned vs. borrowed parameters to minimize copies. Widen to the most encompassing type (&Path over &PathBuf, &str over &String) when it doesn't force extra copies.
Prefer Arc::clone(&x) / Rc::clone(&x) over x.clone() for reference-counted types, so the cost is visible at the call site.
Follow Rust API Guidelines for naming conventions.
Do not use star imports inside module bodies. Write use super::{Foo, bar} instead of use super::*;. Two forms stay allowed: external-crate preludes like use rayon::prelude::*; and root-of-module re-...

Files:

  • pacquet/crates/fs/src/ensure_file/tests.rs
  • pacquet/crates/fs/src/ensure_file.rs
🔇 Additional comments (2)
pacquet/crates/fs/Cargo.toml (1)

14-14: LGTM!

pacquet/crates/fs/src/ensure_file.rs (1)

1-1: LGTM!

Also applies to: 8-11, 155-161, 171-174, 201-214

Comment thread pacquet/crates/fs/src/ensure_file/tests.rs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a CI-only race in pacquet’s CAS writer where concurrent ensure_file calls targeting the same CAS path can trigger an unexpected atomic rewrite and break concurrent linking.

Changes:

  • Serialize concurrent ensure_file calls per target path using a process-local lock map.
  • Add a regression test intended to cover concurrent writers to the same path.
  • Add dashmap as a dependency for the lock map implementation.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

File Description
pacquet/crates/fs/src/ensure_file.rs Adds a process-local per-path mutex (via DashMap) and acquires it in ensure_file to serialize same-path writes.
pacquet/crates/fs/src/ensure_file/tests.rs Adds a new concurrent-writer regression test (currently has correctness/compilation issues noted in comments).
pacquet/crates/fs/Cargo.toml Adds dashmap dependency for the new lock map.
Cargo.lock Updates lockfile to include dashmap for pacquet-fs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +331 to +342
})
})
.collect();

for handle in handles {
handle.join().expect("writer thread should not panic");
}

let final_meta = fs::metadata(&*path).unwrap();
assert_eq!(fs::read(&*path).unwrap(), *content);
assert_eq!(final_meta.len(), content.len() as u64);
assert_eq!(final_meta.ino(), fs::metadata(&*path).unwrap().ino());

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two parts:

  • *content move: incorrect — assert_eq!(a, b) expands to match (&a, &b) { ... } so *content is a place expression that's immediately re-borrowed via &. No move happens. The test compiles and the suite has been passing locally and in CI. Other pre-existing tests in this file use the same *arc_of_vec pattern.
  • Mid-run inode swap: agreed and applied in 5c571a7 — added per-thread observed_inodes capture plus a pre-create that gives the assertion a reference point (original_ino) taken before the contended run.

Written by an agent (Claude Code, claude-opus-4-7).


Generated by Claude Code

Comment on lines +203 to +207
/// Returning `Arc<Mutex<()>>` (rather than a `Ref` into the map) so
/// the caller `.lock()`s after the shard guard from `entry()` is
/// released — otherwise a recursive lookup on the same shard could
/// deadlock. The map is never pruned: entries are one per unique
/// CAS path the install wrote, bounded by the working set.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deliberate. Pacquet is a CLI that terminates after one install — the process lifetime equals one install's working set, so the map can't grow beyond that. The entries are small (a PathBuf + a 24-byte Arc<Mutex<()>> ≈ ~150 bytes each); even ~10k unique CAS files is well under 2 MiB, dwarfed by the rest of the install's working set.

A Weak-based cleanup pass would have to gate on "no outstanding lock holders", which means atomically checking refcount under the DashMap shard guard while holding the lock to insert — a real complexity cliff for no win on the only caller (ensure_file from a one-shot CLI). If pacquet ever grows a long-lived server mode, revisit then; until then the simpler form matches the lifecycle.


Written by an agent (Claude Code, claude-opus-4-7).


Generated by Claude Code

… test

Address CodeRabbit's and Copilot's review feedback on #11758. The
previous assert compared two metadata reads taken at the same moment
after join — tautological. Pre-creating the file gives an
`original_ino` reference taken before the contended run, and each
writer also captures the path's inode immediately after its own
`ensure_file` returns so a mid-run rename swap from another writer
is visible to at least one of those observations.
Address `perfectionist::macro_trailing_comma` warning surfaced by
the Dylint CI on #11758 — multi-line macro invocations must end with
a trailing comma. (My earlier local `dylint --all` run missed this
because the perfectionist library hadn't fully recompiled against the
new test code.)
Copilot AI review requested due to automatic review settings May 20, 2026 06:17

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

Comment on lines +320 to +327
// Pre-create so every contender below routes through
// `verify_or_rewrite` — the path that can swap the inode via
// `write_atomic`'s rename on a size mismatch. `original_ino` is
// the reference point both the post-run and per-writer inode
// observations are compared against.
ensure_file(&path, &content, None).expect("pre-create");
let original_ino = fs::metadata(&*path).unwrap().ino();

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point — applied in bc0d409: dropped the pre-create. The shape is now what you described (fresh dirent, races on the initial create/write).

A caveat I should be upfront about: any inode observation taken after ensure_file returns has already missed the rename window. In a single-rename race the late-comers all observe whatever the post-rename inode is and the test slips past. The per-thread observation does catch a multi-rename race (where renames continue happening while earlier observers have already captured), but a strict regression test would need an in-flight observer thread that polls during write_all, which I didn't add — the docstring now flags this as a smoke test rather than a tight regression.

Happy to add the polling-observer variant if you'd like a stronger test. Re: the unbounded-LOCKS comment, replied separately on that thread — deliberate given pacquet's one-shot CLI lifecycle.


Written by an agent (Claude Code, claude-opus-4-7).


Generated by Claude Code

Address Copilot review feedback on #11758. The pre-create made every
contender take the byte-match fast path against a complete file, so
the lock made no observable difference and the test couldn't even
weakly distinguish lock from no-lock. Removing it leaves the
fresh-dirent shape (`O_CREAT|O_EXCL` race + verify_or_rewrite for
the rest), and the per-thread inode observations can now diverge
under a multi-rename race without the lock.

Honest about the limitation in the docstring: any observation taken
after `ensure_file` returns has already missed the rename window, so
a single-rename race converges on one inode and slips past. The
test catches the multi-rename case and validates the "no deadlock,
all writers see correct content" baseline.
@coderabbitai coderabbitai Bot added the area: supply chain security Issues related to minimumReleaseAge, blockExoticSubdeps, build script safety, and trust policies. label May 20, 2026
@zkochan zkochan merged commit df77f64 into main May 20, 2026
28 checks passed
@zkochan zkochan deleted the claude/fix-ubuntu-ci-flaky-PD1bJ branch May 20, 2026 09:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: supply chain security Issues related to minimumReleaseAge, blockExoticSubdeps, build script safety, and trust policies.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants