Skip to content

open, rm, umv, cp, rm and du: Don't globs if inputs are variables or string interpolation#11886

Merged
WindSoilder merged 26 commits intonushell:mainfrom
WindSoilder:no_glob_variable
Feb 23, 2024
Merged

open, rm, umv, cp, rm and du: Don't globs if inputs are variables or string interpolation#11886
WindSoilder merged 26 commits intonushell:mainfrom
WindSoilder:no_glob_variable

Conversation

@WindSoilder
Copy link
Copy Markdown
Contributor

@WindSoilder WindSoilder commented Feb 18, 2024

Description

This is a follow up to #11621 (comment)

Also Fixes: #11838

About the code change

It applys the same logic when we pass variables to external commands:

match arg.expr {
// refer to `parse_dollar_expr` function
// the expression type of $variable_name, $"($variable_name)"
// will be Expr::StringInterpolation, Expr::FullCellPath
Expr::StringInterpolation(_) | Expr::FullCellPath(_) => {
arg_keep_raw.push(true)
}
_ => arg_keep_raw.push(false),
}

That is: if user input dynamic things(like variables, sub-expression, or string interpolation), it returns a quoted NuPath, then user input won't be globbed

User-Facing Changes

Given two input files: a*c.txt, abc.txt

  • let f = "a*c.txt"; rm $f will remove one file: a*c.txt.
    * let f = "a*c.txt"; rm --glob $f will remove a*c.txt and abc.txt
  • let f: glob = "a*c.txt"; rm $f will remove a*c.txt and abc.txt

Rules about globbing with variable

Given two files: a*c.txt, abc.txt

Cmd Type example Result
builtin let f = "a*c.txt"; rm $f remove a*c.txt
builtin let f: glob = "a*c.txt"; rm $f remove a*c.txt and abc.txt
builtin let f = "a*c.txt"; rm ($f | into glob) remove a*c.txt and abc.txt
custom def crm [f: glob] { rm $f }; let f = "a*c.txt"; crm $f remove a*c.txt and abc.txt
custom def crm [f: glob] { rm ($f | into string) }; let f = "a*c.txt"; crm $f remove a*c.txt
custom def crm [f: string] { rm $f }; let f = "a*c.txt"; crm $f remove a*c.txt
custom def crm [f: string] { rm $f }; let f = "a*c.txt"; crm ($f | into glob) remove a*c.txt and abc.txt

In general, if a variable is annotated with glob type, nushell will expand glob pattern. Or else, we need to use into | glob to expand glob pattern

Tests + Formatting

Done

After Submitting

I think str glob-escape command will be no-longer required. We can remove it.

@WindSoilder WindSoilder added the notes:breaking-changes This PR implies a change affecting users and has to be noted in the release notes label Feb 18, 2024
@kubouch
Copy link
Copy Markdown
Contributor

kubouch commented Feb 18, 2024

I think it is a good change to remove auto-globbing from variables and string interpolations. Also agreed that we can get rid of str glob-escape.

I'm just not sold on adding --glob to glob-capable builtin commands because it seems like a limited solution that does not solve the problem in all cases where globs can occur:

First, custom commands:

> def gcmd [g: glob] { open $g }
> 'a' > a.txt
> 'b' > b.txt
> gcmd *.txt 
# opens a.txt and b.txt
> let g = '*.txt'
> gcmd $g  # <-- How to expand the glob here?

Note: gcmd *.txt doesn't work on this PR but works on main, I think it's a bug in this PR.

Second, externals. Currently, we do expand globs for externals:

> mkdir cliargs
> cd cliargs
> cargo init
> 'fn main() {
    println!("{:?}", std::env::args().collect::<Vec<String>>());
}' o> src/main.rs
> let gl = '*'
> cargo run -- $gl
["/home/kubouch/.cargo/target/debug/cliargs", ".git", ".gitignore", "Cargo.lock", "Cargo.toml", "src"]

If we remove the variable/string interp. auto-globbing (which we should), there would be no way to allow it for externals because we can't add a --glob flag there.

Third, plugins? Any other cases?

I am thinking of three alternatives (not mutually exclusive):

  1. Add a new Value::Glob variant which could be passed around and cast to/from string (into string, into glob) to prevent glob expansion or collected into a list if needed. Example:
> let g: glob = 'a*c.txt'
> rm $g  # removes abc.txt, azc.txt, ...
> rm a*c.txt   # same
> rm ($g | into string) # removes only a*c.txt
> rm 'a*c.txt' # same
> external-command $g # passes [abc.txt, azc.txt, ...] to the external
> external-command ($g | into string) # passes only [a*c.txt] to the external
  1. Not adding a new value type, but rely more heavily on the spread operator:
> let g: string = 'a*c.txt'
> rm ...(glob $g)  # removes abc.txt, azc.txt, ...
> rm a*c.txt   # same
> rm $g # removes only a*c.txt
> rm 'a*c.txt' # same
> external-command ...(glob $g)  # passes [abc.txt, azc.txt, ...] to the external
> external-command $g  # passes only [a*c.txt] to the external
  1. Allow casting of lists to globs:
> def gcmd [g: glob] { open $g }
> let g = '*.txt'
> gcmd (glob $g)  # opens a.txt and b.txt

We have two other features related to globbing: Value::QuotedString and the glob command. So whatever we do, these should be kept in mind so that we don't duplicate functionality. In the option 1, for example, we might be able to remove both of them.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Feb 18, 2024

A much more radical approach to number 2 of @kubouch's comments above is to remove globbing from every command and only allow globs via the glob command and the used with the spread operator. It could cause people to lose their mind, but it would simplify things. (maybe remove it from all commands except ls?)

I'm up for giving most anything a try in order to get something nushell-y that we love.

@WindSoilder
Copy link
Copy Markdown
Contributor Author

WindSoilder commented Feb 18, 2024

  1. Add a new Value::Glob variant which could be passed around and cast to/from string (into string, into glob) to prevent glob expansion or collected into a list if needed. Example:

We can fell into trouble if user uses rm with each command: ["a*b.txt", "acc.txt"] | each {|it| rm $it}. I think the type of $it is unclear. Maybe we need to make a decision to say $it is a string rather than glob?

Personally I like 2nd and 3rd solution :) Although I'm not sure how to make it works with ls and du commands, because they don't have rest parameter, then we can't use spread operator on these two commands

@WindSoilder WindSoilder marked this pull request as draft February 19, 2024 13:22
@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Feb 20, 2024

How does this perform with Kubouch's glob torture tests? BTW, I put it here so we can make changes to it easily https://hackmd.io/y5yMvEnrR7WTLH3vVBQlFg?edit

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Feb 20, 2024

into glob sounds interesting. i'm looking forward to playing with it a bit.

@WindSoilder
Copy link
Copy Markdown
Contributor Author

WindSoilder commented Feb 20, 2024

I still fail into a trouble to play with the following two cases, given two files: touch a*c.txt abc.txt.

Number Custom command run result
1 def foo [g: glob] { rm $g } let f = "a*c.txt"; foo $f a*c.txt and abc.txt will be removed
2 def foo [g: string] { rm $g } let f = "a*c.txt"; foo $f only a*c.txt is removed
  • for 1st one: to not apply a glob, we need to use into string inside foo: def foo [g: glob] { rm ($g | into string) }
    * for 2nd one: to apply a glob, we need to use into glob command: def foo [g: string] { rm ($g | into glob) }

Not really sure if the behavior is ok


Edit: the 2nd is resolved. There is only 1st issue left, but sadly I don't think there is a solution.

@WindSoilder WindSoilder marked this pull request as ready for review February 21, 2024 02:36
@WindSoilder
Copy link
Copy Markdown
Contributor Author

Reply to @kubouch 's proposal:

Add a new Value::Glob variant which could be passed around and cast to/from string (into string, into glob) to prevent glob expansion or collected into a list if needed.

Done :)

Not adding a new value type, but rely more heavily on the spread operator

Nushell supports this feature on most of filesystem command except ls and du. Because they don't accept rest parameters.

Allow casting of lists to globs: def gcmd [g: glob] { open $g }; let g = '*.txt'; gcmd (glob $g) # opens a.txt and b.txt

Currently this pr doesn't implement it. We can do this by let g: glob = "*.txt"; gcmd $g or let g = "*.txt"; gcmd ($g | into glob)

@WindSoilder WindSoilder marked this pull request as draft February 21, 2024 22:08
@WindSoilder WindSoilder marked this pull request as ready for review February 21, 2024 22:57
@WindSoilder
Copy link
Copy Markdown
Contributor Author

I think we can land and iterate through it

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Feb 23, 2024

go for it, if you think it's ready.

@WindSoilder WindSoilder merged commit f7d647a into nushell:main Feb 23, 2024
@WindSoilder WindSoilder deleted the no_glob_variable branch February 23, 2024 01:52
@hustcer hustcer added this to the v0.91.0 milestone Feb 23, 2024
@matklad
Copy link
Copy Markdown

matklad commented Feb 23, 2024

let f: glob = "a*c.txt"; rm $f will remove a*c.txt and abc.txt

That's a fantastic solution to the problem, thanks!

@fdncred fdncred added the A:glob Behavior around file-system globbing with regular commands or `glob`. See also A:quoting/expansion label Feb 27, 2024
kik4444 pushed a commit to kik4444/nushell-fork that referenced this pull request Feb 28, 2024
…ariables or string interpolation (nushell#11886)

# Description
This is a follow up to
nushell#11621 (comment)

Also Fixes: nushell#11838 

## About the code change
It applys the same logic when we pass variables to external commands:


https://github.com/nushell/nushell/blob/0487e9ffcbc57c2d5feca606e10c3f8221ff5e00/crates/nu-command/src/system/run_external.rs#L162-L170

That is: if user input dynamic things(like variables, sub-expression, or
string interpolation), it returns a quoted `NuPath`, then user input
won't be globbed
 
# User-Facing Changes
Given two input files: `a*c.txt`, `abc.txt`

* `let f = "a*c.txt"; rm $f` will remove one file: `a*c.txt`. 
~* `let f = "a*c.txt"; rm --glob $f` will remove `a*c.txt` and
`abc.txt`~
* `let f: glob = "a*c.txt"; rm $f` will remove `a*c.txt` and `abc.txt`

## Rules about globbing with *variable*
Given two files: `a*c.txt`, `abc.txt`
| Cmd Type | example | Result |
| ----- | ------------------ | ------ |
| builtin | let f = "a*c.txt"; rm $f | remove `a*c.txt` |
| builtin | let f: glob = "a*c.txt"; rm $f | remove `a*c.txt` and
`abc.txt`
| builtin | let f = "a*c.txt"; rm ($f \| into glob) | remove `a*c.txt`
and `abc.txt`
| custom | def crm [f: glob] { rm $f }; let f = "a*c.txt"; crm $f |
remove `a*c.txt` and `abc.txt`
| custom | def crm [f: glob] { rm ($f \| into string) }; let f =
"a*c.txt"; crm $f | remove `a*c.txt`
| custom | def crm [f: string] { rm $f }; let f = "a*c.txt"; crm $f |
remove `a*c.txt`
| custom | def crm [f: string] { rm $f }; let f = "a*c.txt"; crm ($f \|
into glob) | remove `a*c.txt` and `abc.txt`

In general, if a variable is annotated with `glob` type, nushell will
expand glob pattern. Or else, we need to use `into | glob` to expand
glob pattern

# Tests + Formatting
Done

# After Submitting
I think `str glob-escape` command will be no-longer required. We can
remove it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A:glob Behavior around file-system globbing with regular commands or `glob`. See also A:quoting/expansion notes:breaking-changes This PR implies a change affecting users and has to be noted in the release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mv $path_containing_glob_chars /path/to/dest not working

5 participants