Draft: Path::is_normalized and is_lexically_bounded by MikkelPaulson · Pull Request #152740 · rust-lang/rust

MikkelPaulson · 2026-02-17T03:05:18Z

Draft proposal for #134694. See parent task for discussion.

Update normalize_lexically to be infallible and return Cow
Add (for discussion) is_normalized to shortcut the distinction between Cow::Borrowed and Cow::Owned
Add (for discussion) is_lexically_bounded to broadly handle the use case of the Result previously returned by normalize_lexically

To do (if the draft is approved):

add unit tests, and move some of the excessive doctests there
add tests for non-Unix environments, and fix the inevitable bugs therefrom
clean up is_normalized, including Const path separators libs-team#744
- some logic from is_normalized may need to be moved to Components to keep things from being too brittle

Draft proposal for rust-lang#134694. * Update `normalize_lexically` to be infallible and return `Cow` * Add (for discussion) `is_normalized` to shortcut the distinction between `Cow::Borrowed` and `Cow::Owned` * Add (for discussion) `is_lexically_bounded` to broadly handle the use case of the `Result` previously returned by `normalize_lexically` TODO: doctests are handled (maybe excessively), but unit tests are needed

Update behaviour of Path::is_lexically_bounded to consider the working directory unbounded, as promised by the doc comment description but not by the doctest.

rust-log-analyzer · 2026-02-17T04:57:45Z

The job tidy failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)

fmt: checked 6703 files
tidy check
tidy [rustdoc_json (src)]: `rustdoc-json-types` modified, checking format version
tidy: Skipping binary file check, read-only filesystem
tidy [style (library)]: /checkout/library/std/src/path.rs:3361: TODO is used for tasks that should be done before merging a PR; If you want to leave a message in the codebase use FIXME
tidy [style (library)]: /checkout/library/std/src/path.rs:3389: TODO is used for tasks that should be done before merging a PR; If you want to leave a message in the codebase use FIXME
tidy [style (library)]: /checkout/library/std/src/path.rs:3413: TODO is used for tasks that should be done before merging a PR; If you want to leave a message in the codebase use FIXME
tidy [style (library)]: FAIL
removing old virtual environment
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10' and 'venv'
creating virtual environment at '/checkout/obj/build/venv' using 'python3.10' and 'virtualenv'
Requirement already satisfied: pip in ./build/venv/lib/python3.10/site-packages (25.3)
Collecting pip
---
info: ES-Check: there were no ES version matching errors!  🎉
typechecking javascript files
tidy: The following check failed: style (library)
Bootstrap failed while executing `test src/tools/tidy tidyselftest --extra-checks=py,cpp,js,spellcheck`
Command `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools-bin/rust-tidy --root-path=/checkout --cargo-path=/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo --output-dir=/checkout/obj/build --concurrency=4 --npm-path=/node/bin/yarn --extra-checks=py,cpp,js,spellcheck` failed with exit code 1
Created at: src/bootstrap/src/core/build_steps/tool.rs:1612:23
Executed at: src/bootstrap/src/core/build_steps/test.rs:1365:29

Command has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 0:02:54
  local time: Tue Feb 17 04:57:33 UTC 2026
  network time: Tue, 17 Feb 2026 04:57:33 GMT
##[error]Process completed with exit code 1.
##[group]Run echo "disk usage:"

rust-bors · 2026-03-04T04:34:15Z

☔ The latest upstream changes (presumably #153377) made this pull request unmergeable. Please resolve the merge conflicts.

ChrisDenton

I'm generally in favour of returning a Cow from normalize_lexically. Or even for PathBuf to have a version that mutates in place.

As for is_normalized and is_lexically_bounded, I think it's hard to evaluate in isolation. I'd like to see the motivation use cases and how that fits in overall. Usually when it's important to only use the result if a condition is true, we have the API return an Option or Result (or in some cases a closure but that seems less relevant here). Also usually we'd return a separate type so they can't be mixed up but I don't know if people would consider that overkill here.

View changes since this review

ChrisDenton · 2026-03-09T14:41:40Z

library/std/src/path.rs

+    ///
+    /// * Starts with either
+    ///   * a prefix followed by root (`C:\`); or
+    ///   * a prefix (`C:`); or


I realise this is a pre-existing issue but I think a "normalized" path shouldn't have some of the weirder Windows prefixes because they're harder to handle. For example, C: is technically relative but not relative to the current directory. This can break people's assumptions of what is_relative means.

My interpretation of "normalized" is just "is expressed in simplest possible terms". There is no path, however strange, that can't have a boolean response to that question. Although this is a good point for normalize_lexically, which I think in its present implementation will normalize C:..\ to C:\.

I don't think C:foo is expressed In it's simplest term. It's roughly equivalent to doing %=C:%\foo in a command prompt. Simplifying it would require expanding it to get the real path.

clarfonthey · 2026-03-09T16:46:52Z

library/std/src/path.rs

+    /// assert!(Path::new("abc/../def").is_lexically_bounded());
+    ///
+    /// assert!(!Path::new("").is_lexically_bounded());
+    /// assert!(!Path::new(".").is_lexically_bounded());


This one feels weird to me; should this not be bounded? I would expect anything below the current directory to not be bounded, but at or above to be bounded.

That was a last-minute change on my part; I previously had it implemented as you describe. But I thought the most likely use case for this would be recreating a directory structure from an untrusted source (eg. extracting at archive), where you wouldn't want the source data to be able to address the pwd directly. I'm not sure if there's a compelling use case for pwd-or-lower.

That's fair; perhaps is_strictly_lexically_bounded and is_lexically_bounded could make that distinction? They could use the same code internally.

In the extracting files case, extracting multiple files would require a strict bound, but extracting a single file could allow for a loose bound. At least, when talking about the directory the files are in.

But even when extracting a single file you wouldn't want to allow a "file" called ".".

The more elegant solution might be an enum along the lines of Inner/Pwd/Outer/Absolute. Which could ultimately stand in for is_absolute() as well. But I'm not sure we gain much from the extra complexity.

Reading your first paragraph before you added the second, I actually wasn't sure if is_strictly_lexically_bounded was meant to be the case with or without .. Which implies some naming confusion if nothing else.

Yeah, sorry, was typing up on the bus so I was trying to brief and then correcting myself as I realised how not-detailed it was.

Your suggestion of having an enum actually does sound like a good idea, since it feels similar to how classify works on floats: you have one method to get every possible class, and other methods to explicitly check which class it is. is_absolute is a special case because it can short-circuit whereas the others have to read the whole path.

Also to clarify on the strictness: I was thinking it in terms of the way "strict" is used in mathematics: a subset of a set could be the set itself, but a strict subset explicitly forbids this.

So, in this sense, it would be whether the path is a "strictly" bounded within the current directory or not, lexically. So, a strict bound excludes the bound, whereas a loose bound allows the bound itself.

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 17, 2026

This comment has been minimized.

Sign in to view

make Path::is_lexically_bounded exclude cwd

efa4b59

Update behaviour of Path::is_lexically_bounded to consider the working directory unbounded, as promised by the doc comment description but not by the doctest.

MikkelPaulson mentioned this pull request Feb 17, 2026

Tracking Issue for normalize_lexically #134694

Open

4 tasks

ChrisDenton reviewed Mar 9, 2026

View reviewed changes

clarfonthey reviewed Mar 9, 2026

View reviewed changes

Uh oh!

Conversation

MikkelPaulson commented Feb 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment has been minimized.

rust-log-analyzer commented Feb 17, 2026

Uh oh!

rust-bors bot commented Mar 4, 2026

Uh oh!

ChrisDenton left a comment • edited by rustbot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

clarfonthey Mar 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

MikkelPaulson commented Feb 17, 2026 •

edited

Loading

ChrisDenton left a comment •

edited by rustbot

Loading

clarfonthey Mar 9, 2026 •

edited

Loading