Skip to content

Allow filesystem commands to access files with glob metachars in name#10557

Closed
bobhy wants to merge 33 commits intonushell:mainfrom
bobhy:wax_glob2
Closed

Allow filesystem commands to access files with glob metachars in name#10557
bobhy wants to merge 33 commits intonushell:mainfrom
bobhy:wax_glob2

Conversation

@bobhy
Copy link
Copy Markdown
Contributor

@bobhy bobhy commented Sep 30, 2023

MEGA-EDIT:
Completely changing goals of this PR!
Work to date has found a fix for a good bug, but the original idea of supporting {} brace expansion like bash no longer seems achievable as a globbing enhancement. So this PR is now all about:

Fix problem #10571.

Description

before: Args to cp, mv, rm are assumed to be globs within each command. That means a file that looks like a glob is not matched by the glob and takes some thinking about to access. That is, the glob pattern shadows a file with the same spelling.

> ls a*
╭───┬────────┬──────┬──────┬────────────────╮
│ # │  name  │ type │ size │    modified    │
├───┼────────┼──────┼──────┼────────────────┤
│ 0 │ a[bc]d │ file │  0 B │ 34 seconds ago │
│ 1 │ abd    │ file │  0 B │ now            │
│ 2 │ acd    │ file │  0 B │ now            │
╰───┴────────┴──────┴──────┴────────────────╯

> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd

> ## Note -- a[bc]d *not* copied, and seemingly hard to access.
> cp --verbose 'a\[bc\]d' dest
Error:   × No matches found
   ╭─[entry #33:1:1]
 1 │ cp --verbose 'a\[bc\]d' dest
   ·              ─────┬────
   ·                   ╰── no matches found
   ╰────

> #.. but not completely inaccessible.
> cp --verbose 'a[[]bc[]]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d

Before_2: if file has glob metachars but isn't a valid pattern, user gets a confusing error:

> touch 'a[b'
> cp 'a[b' dest
Error:   × Pattern syntax error near position 30: invalid range pattern
   ╭─[entry #13:1:1]
 1 │ cp 'a[b' dest
   ·    ──┬──
   ·      ╰── invalid pattern
   ╰────

After: Args to cp, mv, etc. are tried first as literal files, and only as globs if not found to be files.

> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
> cp --verbose '[a][bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd

After_2: file with glob metachars but invalid pattern just works (though the name probably requires string quotes)

> cp --verbose 'a[b' dest
copied /home/bobhy/src/rust/work/r4/a[b to /home/bobhy/src/rust/work/r4/dest/a[b

So, with this fix, a file shadows a glob pattern with the same spelling. You must invent some other spelling for the glob pattern to match the desired files. I think that's less confusing to the user: if ls shows a file with a weird name, s/he'll still be able to copy, rename or delete it.

;tl;dr

Original title: Enable {} glob metachars in command arguments

... To make nu feel more like csh.
Fixes #10498.

So far, only ucp command has been touched, as a proof of concept. But we should proceed through the rest of the filesystem commands.

This PR uses crate wax for globbing instead of fixing the internal crate nu_glob. Once all the clients are upgraded, nu_glob could be deprecated.

Description

Enables glob alternation to be used in command line arguments (where arguments are supposed to refer to paths in filesystem).
e.g:

> ucp cr{awl,eep}.txt dest
> ls dest
╭───┬────────────────┬──────┬──────┬───────────────╮
│ # │      name      │ type │ size │   modified    │
├───┼────────────────┼──────┼──────┼───────────────┤
│ 0 │ dest/crawl.txt │ file │  0 B │ 2 minutes ago │
│ 1 │ dest/creep.txt │ file │  0 B │ 2 minutes ago │
╰───┴────────────────┴──────┴──────┴───────────────╯

{ and } are nu parser tokens, so must be quoted if they are not embedded within the glob.

> ucp '{crawl,hobble,sprint}.txt' dest
> ls dest
╭───┬─────────────────┬──────┬──────┬────────────────╮
│ # │      name       │ type │ size │    modified    │
├───┼─────────────────┼──────┼──────┼────────────────┤
│ 0 │ dest/crawl.txt  │ file │  0 B │ 11 seconds ago │
│ 1 │ dest/hobble.txt │ file │  0 B │ 11 seconds ago │
│ 2 │ dest/sprint.txt │ file │  0 B │ 11 seconds ago │
╰───┴─────────────────┴──────┴──────┴────────────────╯

Note that alternation works for more than 2 alternatives, as shown above.

User-Facing Changes

Currently, the new metachars can only be used in ucp command.

All uses of the old metachars should give same results as they used to, so no breaking changes.
Small changes to error messages when a glob fails to match anything.

Tests + Formatting

  • 🟢 toolkit fmt
  • 🟢 toolkit clippy
  • 🟢 toolkit test
  • 🟢 toolkit test stdlib

After Submitting

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Sep 30, 2023

Our friendly neighborhood windows CI strikes again. Without looking further, I have 2 guesses why it's failing.

  1. Paths are in double quotes so with "C:\path\to\file" all "\" become escapes.
  2. There's a mixture of slashes with 'C:\path/to/file.txt'.
    I've seen both of these scenarios fail sometimes.

Generally, I think this PR is cool.

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Sep 30, 2023

I see a couple of problems. Firing up my Windows dev environment, which is a performance hit all of its own...

Comment on lines +23 to +28
let processed_pattern = pattern
.item
.replace('\\', "/")
.replace(r#":/"#, r#"\:/"#) // must come after above
.replace('(', r#"\("#) // when path includes Program Files (x86)
.replace(')', r#"\)"#);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what effect this will have on paths with spaces like Program Files and Program Files (x86). I see that lots of things are being replaced here but I'm not sure how that affects things here with these particular paths.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Embedded spaces won't be affected by these changes. A pattern or simple path with an embedded space would have to be enclosed in quotes anyway. It has been tested and works with C:\Program Files (x86)\*!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks for the explanation. I'm wondering if it matters that this is different than how we cd to folders with spaces in them now, and that is with surrounding the path in backticks, which I think, does some bare word magic. I don't really follow how JT made it work. Just wondering if it matters here.

@sophiajt
Copy link
Copy Markdown
Contributor

sophiajt commented Oct 1, 2023

We have to be careful with adding glob syntax. Anything that's part of a valid filename on any of our platforms can't be used.

Can you confirm that { isn't part of a valid filename on Linux, Windows, and macOS?

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 1, 2023

A key point of this change is to be able to use a simple file path or a glob pattern as argument to a filesystem command like ucp. That means that normal looking file paths have to be parsable as glob patterns (they will match just one file).

That's a problem on Windows, were backslashes and colons in absolute (rooted) patterns will cause glob parse errors. This is a well-known issue: nu_glob has some adjustments, but wax does not. The .NET glob (written by Microsoft engineers) doesn't even try to support a rooted pattern.

For now, I'm including a hack for path/pattern arguments on windows only that covers many cases.

// Sanitize a glob pattern for use on windows.
// this allows filesystem commands like `cp` to accept an arg that can be a path or a glob.
// Deal with 3 aspects of windows paths that can confound glob patterns:
// * path separator: change `\` to `/`. Rust file API on windows supports this in folder separators and in UNC paths.
// * drive letter colon: change `C:` to `C\:`
// * quote parens (such as the infamous `Program Files (x86)`).
// But this means user cannot use `\` to quote other metachars.
// (Windows) users of glob should adopt the habit of quoting with single char classes, e.g instead of `\*`, use `[*]`.
// e.g transform something like: `C:\Users\me\.../{abc,def}*.txt` to `C\:/Users/me/.../{abc,def}*.txt`
//todo: investigate enhancing [wax] to support this more elegantly.
pub fn windows_pattern_hack(pat: &str) -> String {
    pat.replace('\\', "/")
        .replace(r#":/"#, r#"\:/"#) // must come after above
        .replace('(', r#"\("#) // when path includes Program Files (x86)
        .replace(')', r#"\)"#)
}

But there will always be paths you might want to copy data from that are unnatural / impossible to quote so they parse as a glob. And telling users that they have to do glob quoting to access a file with [] or {} seems rude.

Maybe we should change arg parsing to check first whether the arg has any glob metachars? But that's expensive (needs a fat regex to cover all the cases). Maybe just stat the argument and assume it's a file if that works or fails in a file-like way?

I'd like to hear how far this hack gets us... Windows users, please give this a try (remember, glob command and ucp only, for now.

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 1, 2023

@jntrnr On the contrary, I can confirm that {} are legal in both unix and windows filenames.
But we already had that problem -- glob already supported [], and those are also legal in filenames.

But paths with these special characters to cause a problem: if you do have a path with these glob chars and you specify a similar glob. Example:

> touch a[b]c abc abd
> mkdir foo
> ls
╭───┬───────┬──────┬──────┬──────────╮
│ # │ name  │ type │ size │ modified │
├───┼───────┼──────┼──────┼──────────┤
│ 0 │ a[b]c │ file │  0 B │ now      │
│ 1 │ abc   │ file │  0 B │ now      │
│ 2 │ abd   │ file │  0 B │ now      │
│ 3 │ foo   │ dir  │  2 B │ now      │
╰───┴───────┴──────┴──────┴──────────╯
> ucp a[b]c foo
> ls foo
╭───┬─────────┬──────┬──────┬──────────╮
│ # │  name   │ type │ size │ modified │
├───┼─────────┼──────┼──────┼──────────┤
│ 0 │ foo/abc │ file │  0 B │ now      │
╰───┴─────────┴──────┴──────┴──────────╯

User could reasonably complain that the command should have copied file a[b]c.

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 1, 2023

... Looking into this situation a bit more, it's a bigger problem. Opening #10571 to discuss.

I propose we merge this PR (since it's a step in a direction we (or at least, I) want to go and doesn't make the new issue any worse), then focus on resolving the new issue.

@sophiajt
Copy link
Copy Markdown
Contributor

sophiajt commented Oct 1, 2023

@jntrnr On the contrary, I can confirm that {} are legal in both unix and windows filenames.

Then let's not go this direction. I'd rather make the built-in glob conservative, and if people want a fancier glob capability, that they would use the glob command.

#10498 is not a bug. (I comment on why in the issue)

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 2, 2023

the last checkin fixes the problem of #10571, but reverts to the problem of absolute glob patterns on Windows: e.g c:\users\me/*.txt fails with glob syntax error because both : and \ are glob metacharacters. But I think I can cover that scenario too.
Marking as draft while I work on it...

@bobhy bobhy marked this pull request as draft October 2, 2023 16:04
@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 10, 2023

3rd step done!

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 10, 2023

Now that I'm down to the bit of the project that brought me here, I'm pretty sure it's a bad idea.
In traditional shells, the {} is not really a glob construct, it's what bash calls "brace expansion", and it operates entirely at the syntax level, doesn't care whether the resulting strings match files or not.
And that's the feature I want!, so that commands like mv a.out{,.old} or diff {../worktrees/main,.}/crates/nu-cmd-base/src/lib.rs can be used. I still think nushell should have this ability, but it's not something that makes sense for a globbing facility to do, and probably should use different metacharacters in deference to nu block {}.
wax does have a {} construct, but it is strictly a pattern matching thing and won't return nonexistant paths (when globbing files), so nu glob will support this more limited ability.

So, bottom line, I'm submitting this PR for review as it now stands, updating the title to match what it actually accomplishes and moving on to other work.

@bobhy bobhy changed the title Enable {} glob metachars in command arguments Allow filesystem commands to access files with glob metachars in name Oct 10, 2023
@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 10, 2023

Last note: behavior of ls command is unchanged, because it uses a different globbing function than mv, rm, cp and the rest of the filesystem commands. This might create some user confusion, but I can't wrap my head around it right now.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Oct 11, 2023

@bobhy can you update the issue body to explain what exactly this does. there have been quite a few changes and a reverse of direction here. It's difficult to consume this PR as it sits because it's hard to wade through all the subsequent steps you went through to get where you're at here. I'd like to point the core-team here and ask them to make a decision based on what the body/description says and then reviewing the code. Maybe you've already done that. Please let me know.

@bobhy
Copy link
Copy Markdown
Contributor Author

bobhy commented Oct 12, 2023

Squashing the history and reposting as a new clean PR #10694.

@bobhy bobhy closed this Oct 12, 2023
@bobhy bobhy deleted the wax_glob2 branch October 13, 2023 15:37
fdncred added a commit that referenced this pull request Oct 18, 2023
…#10694)

(squashed version of #10557, clean commit history and review thread)

Fixes #10571, also potentially: #10364, #10211, #9558, #9310,


# Description
Changes processing of arguments to filesystem commands that are source
paths or globs.
Applies to `cp, cp-old, mv, rm, du` but not `ls` (because it uses a
different globbing interface) or `glob` (because it uses a different
globbing library).

The core of the change is to lookup the argument first as a file and
only glob if it is not. That way,
a path containing glob metacharacters can be referenced without glob
quoting, though it will have to be single quoted to avoid nushell
parsing.

Before: A file path that looks like a glob is not matched by the glob
specified as a (source) argument and takes some thinking about to
access. You might say the glob pattern shadows a file with the same
spelling.
```
> ls a*
╭───┬────────┬──────┬──────┬────────────────╮
│ # │  name  │ type │ size │    modified    │
├───┼────────┼──────┼──────┼────────────────┤
│ 0 │ a[bc]d │ file │  0 B │ 34 seconds ago │
│ 1 │ abd    │ file │  0 B │ now            │
│ 2 │ acd    │ file │  0 B │ now            │
╰───┴────────┴──────┴──────┴────────────────╯

> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd

> ## Note -- a[bc]d *not* copied, and seemingly hard to access.
> cp --verbose 'a\[bc\]d' dest
Error:   × No matches found
   ╭─[entry #33:1:1]
 1 │ cp --verbose 'a\[bc\]d' dest
   ·              ─────┬────
   ·                   ╰── no matches found
   ╰────

> #.. but is accessible with enough glob quoting.
> cp --verbose 'a[[]bc[]]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
```
Before_2: if file has glob metachars but isn't a valid pattern, user
gets a confusing error:

```
> touch 'a[b'
> cp 'a[b' dest
Error:   × Pattern syntax error near position 30: invalid range pattern
   ╭─[entry #13:1:1]
 1 │ cp 'a[b' dest
   ·    ──┬──
   ·      ╰── invalid pattern
   ╰────
```

After: Args to cp, mv, etc. are tried first as literal files, and only
as globs if not found to be files.

```
> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
> cp --verbose '[a][bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd
```
After_2: file with glob metachars but invalid pattern just works.
(though Windows does not allow file name to contain `*`.).

```
> cp --verbose 'a[b' dest
copied /home/bobhy/src/rust/work/r4/a[b to /home/bobhy/src/rust/work/r4/dest/a[b
```

So, with this fix, a file shadows a glob pattern with the same spelling.
If you have such a file and really want to use the glob pattern, you
will have to glob quote some of the characters in the pattern. I think
that's less confusing to the user: if ls shows a file with a weird name,
s/he'll still be able to copy, rename or delete it.

# User-Facing Changes
Could break some existing scripts. If user happened to have a file with
a globbish name but was using a glob pattern with the same spelling, the
new version will process the file and not expand the glob.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
gaetschwartz pushed a commit to gaetschwartz/nushell that referenced this pull request Oct 20, 2023
…nushell#10694)

(squashed version of nushell#10557, clean commit history and review thread)

Fixes nushell#10571, also potentially: nushell#10364, nushell#10211, nushell#9558, nushell#9310,


# Description
Changes processing of arguments to filesystem commands that are source
paths or globs.
Applies to `cp, cp-old, mv, rm, du` but not `ls` (because it uses a
different globbing interface) or `glob` (because it uses a different
globbing library).

The core of the change is to lookup the argument first as a file and
only glob if it is not. That way,
a path containing glob metacharacters can be referenced without glob
quoting, though it will have to be single quoted to avoid nushell
parsing.

Before: A file path that looks like a glob is not matched by the glob
specified as a (source) argument and takes some thinking about to
access. You might say the glob pattern shadows a file with the same
spelling.
```
> ls a*
╭───┬────────┬──────┬──────┬────────────────╮
│ # │  name  │ type │ size │    modified    │
├───┼────────┼──────┼──────┼────────────────┤
│ 0 │ a[bc]d │ file │  0 B │ 34 seconds ago │
│ 1 │ abd    │ file │  0 B │ now            │
│ 2 │ acd    │ file │  0 B │ now            │
╰───┴────────┴──────┴──────┴────────────────╯

> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd

> ## Note -- a[bc]d *not* copied, and seemingly hard to access.
> cp --verbose 'a\[bc\]d' dest
Error:   × No matches found
   ╭─[entry nushell#33:1:1]
 1 │ cp --verbose 'a\[bc\]d' dest
   ·              ─────┬────
   ·                   ╰── no matches found
   ╰────

> #.. but is accessible with enough glob quoting.
> cp --verbose 'a[[]bc[]]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
```
Before_2: if file has glob metachars but isn't a valid pattern, user
gets a confusing error:

```
> touch 'a[b'
> cp 'a[b' dest
Error:   × Pattern syntax error near position 30: invalid range pattern
   ╭─[entry nushell#13:1:1]
 1 │ cp 'a[b' dest
   ·    ──┬──
   ·      ╰── invalid pattern
   ╰────
```

After: Args to cp, mv, etc. are tried first as literal files, and only
as globs if not found to be files.

```
> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
> cp --verbose '[a][bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd
```
After_2: file with glob metachars but invalid pattern just works.
(though Windows does not allow file name to contain `*`.).

```
> cp --verbose 'a[b' dest
copied /home/bobhy/src/rust/work/r4/a[b to /home/bobhy/src/rust/work/r4/dest/a[b
```

So, with this fix, a file shadows a glob pattern with the same spelling.
If you have such a file and really want to use the glob pattern, you
will have to glob quote some of the characters in the pattern. I think
that's less confusing to the user: if ls shows a file with a weird name,
s/he'll still be able to copy, rename or delete it.

# User-Facing Changes
Could break some existing scripts. If user happened to have a file with
a globbish name but was using a glob pattern with the same spelling, the
new version will process the file and not expand the glob.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
hardfau1t pushed a commit to hardfau1t/nushell that referenced this pull request Dec 14, 2023
…nushell#10694)

(squashed version of nushell#10557, clean commit history and review thread)

Fixes nushell#10571, also potentially: nushell#10364, nushell#10211, nushell#9558, nushell#9310,


# Description
Changes processing of arguments to filesystem commands that are source
paths or globs.
Applies to `cp, cp-old, mv, rm, du` but not `ls` (because it uses a
different globbing interface) or `glob` (because it uses a different
globbing library).

The core of the change is to lookup the argument first as a file and
only glob if it is not. That way,
a path containing glob metacharacters can be referenced without glob
quoting, though it will have to be single quoted to avoid nushell
parsing.

Before: A file path that looks like a glob is not matched by the glob
specified as a (source) argument and takes some thinking about to
access. You might say the glob pattern shadows a file with the same
spelling.
```
> ls a*
╭───┬────────┬──────┬──────┬────────────────╮
│ # │  name  │ type │ size │    modified    │
├───┼────────┼──────┼──────┼────────────────┤
│ 0 │ a[bc]d │ file │  0 B │ 34 seconds ago │
│ 1 │ abd    │ file │  0 B │ now            │
│ 2 │ acd    │ file │  0 B │ now            │
╰───┴────────┴──────┴──────┴────────────────╯

> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd

> ## Note -- a[bc]d *not* copied, and seemingly hard to access.
> cp --verbose 'a\[bc\]d' dest
Error:   × No matches found
   ╭─[entry nushell#33:1:1]
 1 │ cp --verbose 'a\[bc\]d' dest
   ·              ─────┬────
   ·                   ╰── no matches found
   ╰────

> #.. but is accessible with enough glob quoting.
> cp --verbose 'a[[]bc[]]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
```
Before_2: if file has glob metachars but isn't a valid pattern, user
gets a confusing error:

```
> touch 'a[b'
> cp 'a[b' dest
Error:   × Pattern syntax error near position 30: invalid range pattern
   ╭─[entry nushell#13:1:1]
 1 │ cp 'a[b' dest
   ·    ──┬──
   ·      ╰── invalid pattern
   ╰────
```

After: Args to cp, mv, etc. are tried first as literal files, and only
as globs if not found to be files.

```
> cp --verbose 'a[bc]d' dest
copied /home/bobhy/src/rust/work/r4/a[bc]d to /home/bobhy/src/rust/work/r4/dest/a[bc]d
> cp --verbose '[a][bc]d' dest
copied /home/bobhy/src/rust/work/r4/abd to /home/bobhy/src/rust/work/r4/dest/abd
copied /home/bobhy/src/rust/work/r4/acd to /home/bobhy/src/rust/work/r4/dest/acd
```
After_2: file with glob metachars but invalid pattern just works.
(though Windows does not allow file name to contain `*`.).

```
> cp --verbose 'a[b' dest
copied /home/bobhy/src/rust/work/r4/a[b to /home/bobhy/src/rust/work/r4/dest/a[b
```

So, with this fix, a file shadows a glob pattern with the same spelling.
If you have such a file and really want to use the glob pattern, you
will have to glob quote some of the characters in the pattern. I think
that's less confusing to the user: if ls shows a file with a weird name,
s/he'll still be able to copy, rename or delete it.

# User-Facing Changes
Could break some existing scripts. If user happened to have a file with
a globbish name but was using a glob pattern with the same spelling, the
new version will process the file and not expand the glob.

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants