Path Enhancement Project #2 #3219
Description
In the spirit of my previous PRs (#2742 #3123), I propose more changes to the Nushell path system. Some of these ideas come from the discussion with @jonathandturner and @John-Goff on Discord. I first considered opening a draft PR, but the work will likely span multiple PRs, so I decided to create this tracking issue, which can be treated as a sort of mini-RFC. Without further ado:
1. Path as structured data (WIP #3256)
Currently, we have a bunch of path subcommands with hard-coded functionality. For more complex path manipulations, we need to change the Rust code or use some clumsy workaround. Instead, we could express a path as structured data, which would allow more flexibility.
Proposed changes
path parse: Break a path into structured data (what would be good fields? home? exists?):
> echo ['/home/viking/spam.txt', 'C:\Users\viking\spam.txt'] | path parse
# prefix parent           stem extension
---------------------------------------------
0        /home/viking     spam txt
1 C:     C:\Users\viking  spam txt
- It could expand all tildes and dots but wouldn't make the path absolute (this should be handled by path expand)
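To make the proposed record shape concrete, here is a minimal Python sketch of a purely lexical parse. It is not the Nushell implementation: `parse_path` and its Windows-detection heuristic are assumptions made for illustration, and the column names follow the table above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def parse_path(raw: str) -> dict:
    """Break a path string into prefix, parent, stem and extension.

    Purely lexical: nothing is expanded and the file system is never touched.
    """
    # Assumption: a backslash or a drive letter marks a Windows-style path.
    looks_windows = "\\" in raw or (len(raw) > 1 and raw[1] == ":")
    path = PureWindowsPath(raw) if looks_windows else PurePosixPath(raw)
    return {
        "prefix": path.drive,                  # "" on Unix, "C:" on Windows
        "parent": str(path.parent),
        "stem": path.stem,
        "extension": path.suffix.lstrip("."),  # stored without the dot
    }


for raw in ["/home/viking/spam.txt", r"C:\Users\viking\spam.txt"]:
    print(parse_path(raw))
```

Because each value lives in exactly one field, replacing the stem or the extension touches a single column.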
path split: Split a path into parts, without the structuring
> echo '/home/viking/spam.txt' | path split
0 /
1 home
2 viking
3 spam.txt
> echo 'C:\Users\viking\spam.txt' | path split
0 C:
1 Users
2 viking
3 spam.txt
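The splitting behavior above can be approximated in Python with `pathlib`'s `parts`. This is only a sketch: `split_path` is a hypothetical helper, and trimming the Windows anchor to a bare `C:` is an assumption made to match the listing (it loses the distinction between `C:\Users` and the drive-relative `C:Users`).

```python
from pathlib import PurePosixPath, PureWindowsPath


def split_path(raw: str, windows: bool = False) -> list:
    """Return the components of `raw` without structuring them further."""
    path = PureWindowsPath(raw) if windows else PurePosixPath(raw)
    parts = list(path.parts)
    # pathlib reports the Windows anchor as "C:\"; the listing above shows
    # a bare "C:", so the root separator is dropped here (an assumption).
    if windows and path.drive and parts:
        parts[0] = path.drive
    return parts


print(split_path("/home/viking/spam.txt"))
print(split_path(r"C:\Users\viking\spam.txt", windows=True))
```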
path join: Change the current command to be more multifunctional. It would:
- Convert a table (obtained via path parse or constructed manually) back into a path string.
  - It would look for column names (drive, dirname, etc.) and throw an error if it couldn't find any that would match the path table spec
  - Allow incomplete table (e.g. only filestem and extension columns)
  - Joins with OS-specific separator (\ vs. /). Filename + extension joins with .
- Join a list of paths/strings back into a path (similar to str collect)
- Still allow current functionality of appending a path/string passed as argument
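The three modes of the proposed join can be sketched in Python as follows. The helper names (`join_record`, `join_list`) and the exact column set are assumptions; drive handling is left out to keep the sketch short.

```python
import ntpath
import posixpath

# Assumed subset of the path table spec; the real column set is still open.
PATH_COLUMNS = ("dirname", "filestem", "extension")


def join_record(record: dict, windows: bool = False) -> str:
    """Turn a (possibly incomplete) path record back into a path string."""
    flavour = ntpath if windows else posixpath
    known = {k: v for k, v in record.items() if k in PATH_COLUMNS and v}
    if not known:
        raise ValueError("record has no columns matching the path table spec")
    # File stem and extension are glued with ".", the rest with the separator.
    name = known.get("filestem", "")
    if "extension" in known:
        name += "." + known["extension"]
    pieces = [p for p in (known.get("dirname", ""), name) if p]
    return flavour.join(*pieces)


def join_list(parts: list, windows: bool = False) -> str:
    """Join a list of path fragments with the separator of the target OS."""
    flavour = ntpath if windows else posixpath
    return flavour.join(*parts)


print(join_record({"filestem": "spam", "extension": "txt"}))
print(join_list(["/", "home", "viking"] + ["spam.txt"]))  # list plus appended argument
```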
Existing path subcommands:
- Remove extension
- Remove filestem but we lose the suffix/prefix flags functionality.
- Keep for now: basename, dirname (Why? Explained later.)
- Keep expand
- Keep type and exists or move them to the path table
Limitations
I considered having the parts (output of path split) and basename in the path table. This would let us drop path split entirely and remove path basename. However, it becomes problematic when you need to replace something: you would need to replace it in multiple places! For example, when changing the filestem, you'd need to edit the basename and the last entry of parts as well. Therefore, no duplication is allowed in the output of path parse. This could be resolved by having "dependent columns" as introduced in my other issue: #3220
Question
- Currently, we have both path and string data types in nu. Should we remove the path type once we have the path table and handle path-like strings as regular strings? Or is it possible to set the path table as the path type?
- How do we handle complex extensions (e.g., .tar.gz)? This could be a flag.
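One way such a flag could behave is sketched below in Python. `split_extension` and its `complex_ext` parameter are hypothetical names; the sketch treats every dot-separated suffix as part of the extension, which would over-split a name such as `report.v1.2.txt`, so a real implementation would likely need a stricter rule.

```python
from pathlib import PurePosixPath


def split_extension(name: str, complex_ext: bool = False) -> tuple:
    """Split a file name into (stem, extension), both without the dot."""
    path = PurePosixPath(name)
    if not complex_ext:
        # Default: only the last suffix counts as the extension.
        return path.stem, path.suffix.lstrip(".")
    # With the flag: every suffix belongs to the extension.
    suffix = "".join(path.suffixes)
    stem = path.name[: len(path.name) - len(suffix)]
    return stem, suffix.lstrip(".")


print(split_extension("spam.tar.gz"))                    # ('spam.tar', 'gz')
print(split_extension("spam.tar.gz", complex_ext=True))  # ('spam', 'tar.gz')
```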
2. (ON HOLD) Platform independence & related fixes
One of the goals of this revamp is to have platform-independent paths. It should be possible on any OS to create a path for another OS.
Proposed changes
- There could be a flag to the path command which would force the path to follow the target OS conventions (path separator, drive/root and user home folder):
  - echo ~viking/spam.txt | path --windows → C:\Users\viking\spam.txt
  - echo ~viking/spam.txt | path --unix → /home/viking/spam.txt
  - Without the flag, path would follow the host OS.
- The previous point raises the question of how to expand the drive and the user home folder. It could be controlled by directly modifying the path table or by the following:
| Option | Default | Cargo.toml | Env. var. | Flag to path command |
|---|---|---|---|---|
| drive | C: (Windows) or $nothing | drive = "..." | DRIVE | --drive letter: |
| home folder | /home (Linux), /Users (Mac), drive:\Users (Win) | home = "..." | HOME | --home |
| user | derive from current user | N/A | USER | --user |
| operating system | derive from host OS | N/A | OS | --unix or --windows |
(Right overwrites left)
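The "right overwrites left" rule is a plain precedence chain, which can be sketched in Python with `collections.ChainMap`. The option names, the default values and the `resolve_options` helper are assumptions for illustration; the sketch also assumes that environment variables and flags have already been normalized to the same lowercase keys.

```python
from collections import ChainMap

# Hypothetical defaults; the real ones would depend on the host OS and user.
DEFAULTS = {"drive": "C:", "home": "/home", "user": "viking", "os": "unix"}


def resolve_options(config: dict, env: dict, flags: dict) -> ChainMap:
    """Combine option sources so that flag > env var > config > default."""
    # ChainMap searches its maps from left to right, so the
    # highest-priority source has to come first.
    return ChainMap(flags, env, config, DEFAULTS)


opts = resolve_options(
    config={"home": "/Users"},
    env={"home": "/srv/home", "os": "windows"},
    flags={"os": "unix"},
)
print(opts["home"], opts["os"], opts["drive"])  # /srv/home unix C:
```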
- Uniform behavior of special characters on all OSes. The following should work everywhere:
  - ~, ~user, ., .., ... expansion
  - Liberal mixing of \ and /
- Path separator is always only one (back)slash
  - Translate multi-slashes into only one (e.g., \\, //, /// or \\/\\//////\ into / on Unix or \ on Windows)
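Collapsing mixed runs of separators amounts to a single regular-expression substitution. The Python sketch below uses a hypothetical `collapse_separators` helper and deliberately ignores two cases a real implementation would have to decide on: a leading `\\` denotes a UNC path on Windows, and a backslash is a legal file name character on Unix.

```python
import re


def collapse_separators(raw: str, windows: bool = False) -> str:
    """Replace every run of slashes and backslashes with one separator."""
    sep = "\\" if windows else "/"
    # A callable replacement avoids escaping issues with a lone backslash.
    return re.sub(r"[\\/]+", lambda _match: sep, raw)


print(collapse_separators(r"home//viking\\/spam.txt"))
print(collapse_separators(r"home//viking\\/spam.txt", windows=True))
```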
3. Additional features & fixes
Mostly unrelated stuff to the above but good to have IMO
- Add missing features
  - Replace prefix/suffix
  - Construct relative paths (echo /home/viking/foo.txt | path relative-to /home → viking/foo.txt)
  - Command to query a path separator
- Fix path expand
  - Currently, path expand does not expand non-existing path which is confusing (e.g., expanding ../existing-folder works properly while ../non-existing-folder just returns the same string)
  - Also, echo .. | path expand just entered an infinite loop on Windows for me, eating all my RAM...
- Look into related issues
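Both the relative-path construction and an expansion that works for non-existing paths can be done lexically, without consulting the file system. The Python sketch below illustrates the idea; `relative_to` and `expand_lexically` are hypothetical helpers, and the lexical approach does not resolve symlinks, so it is not equivalent to a canonicalizing expand.

```python
import posixpath
from pathlib import PurePosixPath


def relative_to(path: str, base: str) -> str:
    """Express `path` relative to `base`; raises ValueError if it is not under it."""
    return str(PurePosixPath(path).relative_to(base))


def expand_lexically(path: str, cwd: str) -> str:
    """Resolve "." and ".." against `cwd` whether or not the target exists."""
    return posixpath.normpath(posixpath.join(cwd, path))


print(relative_to("/home/viking/foo.txt", "/home"))                   # viking/foo.txt
print(expand_lexically("../non-existing-folder", "/home/viking"))     # /home/non-existing-folder
```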
Future outlook
I think the additional verbosity of replacing basename and dirname is a significant drawback; therefore, I propose keeping them in addition to the new commands. Later, when we have a standard library, the dirname and basename subcommands could be removed and implemented in nu on top of the new commands.
Examples, case studies & edge cases
There are a lot of edge cases to cover. I'm just writing down how the new subcommands could be used and trying to break them. It's a bit of rambling from now on.
Replicating the functionality of current subcommands
Getting a basename (path join would need to be smart enough to join file stem and extension with .). I'm not a big fan of the verbosity, but it can easily be hidden inside a custom command.
> echo ['/home/viking/spam.txt', 'C:\Users\viking\spam.txt']
| path parse
| select filestem extension
| path join
0 spam.txt
1 spam.txt
dirname, filestem and extension are direct outputs of parse.
Let's check some dirname flags. How about path dirname -n 3?
> echo '/home/viking/foo/bar/baz/spam.txt'
| path parse
| get dirname
| path split
| drop 3
| path join
0 /home/viking
Even though it is more verbose, it can be extended to allow more complex dirname manipulations, including replacement (-r flag). For example:
> echo '/home/viking/foo/bar/baz/spam.txt' | path parse | update dirname {
let parts = $(get dirname | path split)
echo [ $(echo $parts | first 2) arthur/britons $(echo $parts | last) ] | path join
} | path join
0 /home/arthur/britons/baz/spam.txt
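The same kind of replacement can be sketched in Python to show what "split, replace a slice, join" does step by step. `replace_dir_parts` is a hypothetical helper and the slice indices are chosen so that the result matches the output shown above.

```python
from pathlib import PurePosixPath


def replace_dir_parts(raw: str, start: int, stop: int, replacement: str) -> str:
    """Replace components [start:stop] of the directory part of `raw`."""
    path = PurePosixPath(raw)
    parts = list(path.parent.parts)   # ['/', 'home', 'viking', 'foo', 'bar', 'baz']
    parts[start:stop] = PurePosixPath(replacement).parts
    return str(PurePosixPath(*parts, path.name))


# Swap 'viking/foo/bar' for 'arthur/britons', keeping the root and 'baz'.
print(replace_dir_parts("/home/viking/foo/bar/baz/spam.txt", 2, 5, "arthur/britons"))
# /home/arthur/britons/baz/spam.txt
```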
This is currently impossible using plain dirname flags. We could use some better mechanism for replacing rows in nu (or I just didn't see it).
Replacing filestem and extension is trivial using the output of path parse. However, path filestem has --prefix and --suffix flags that strip a prefix/suffix from the filestem. I believe this could be better implemented by extending the str subcommands, since it might be useful for generic strings as well (and filestem is just a string after all).
expand, exists and type could be left as they are. exists and type could be potentially fields of path parse output.
Other examples
The new path join could still accept an argument:
> echo [ /home arthur britons ] | path join spam.txt
/home/arthur/britons/spam.txt
Let's check some OS-specific examples. Mixing slashes is fine:
> echo ~arthur/britons\\/spam.txt | path --unix
/home/arthur/britons/spam.txt
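A rough Python sketch of what the target-OS conversion would have to do for this input: normalize the separators, then expand `~user` with the home root of the target OS. The `HOME_ROOTS` values and the `to_target` helper are assumptions for illustration, and a bare `~` (current user) is left untouched.

```python
import re

# Assumed home roots per target OS; in the proposal these would be configurable.
HOME_ROOTS = {"unix": "/home", "windows": "C:\\Users"}


def to_target(raw: str, target: str) -> str:
    """Rewrite `raw` using the separator and home folder of `target`."""
    sep = "\\" if target == "windows" else "/"
    rooted = raw[:1] in ("/", "\\")            # remember a leading separator
    parts = [p for p in re.split(r"[\\/]+", raw) if p]
    # Expand "~user" into <home root>/<user>.
    if parts and parts[0].startswith("~") and len(parts[0]) > 1:
        parts[0:1] = [HOME_ROOTS[target], parts[0][1:]]
    return (sep if rooted else "") + sep.join(parts)


print(to_target(r"~arthur/britons\\/spam.txt", "unix"))
print(to_target("~viking/spam.txt", "windows"))
```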
Typing --windows every time can be annoying (assuming we're on Mac for example). We could have an env var session instead:
> echo ~arthur | with-env [OS windows] {
path join | autoview
... handling more windows paths
}
C:\Users\arthur
...
How about empty drive? (partial table is fine)
> echo [[drive dirname]; [$nothing usr]] | path --windows
usr # assume it's just a relative path
How would drive on Unix work?
> echo [[drive dirname]; [$nothing usr]] | path --unix
/usr # uses empty string as "drive", join with path separator
> echo [[drive dirname]; [C: usr]] | path --unix
# Should throw error
Empty string should be treated as $nothing:
> echo [[drive dirname]; ["" usr]] | path --unix
/usr
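The drive rules from the last few examples fit into a small decision function. The Python sketch below uses a hypothetical `join_drive` helper; the behavior for a non-empty drive on Windows (drive, separator, dirname) is an assumption, since the examples above do not cover it.

```python
from typing import Optional


def join_drive(drive: Optional[str], dirname: str, target: str) -> str:
    """Combine a drive and a dirname following the target OS conventions."""
    drive = drive or ""   # None and "" are both treated as "no drive"
    if target == "windows":
        # Without a drive, assume the path is simply relative.
        return f"{drive}\\{dirname}" if drive else dirname
    if drive:
        raise ValueError(f"drive {drive!r} has no meaning on Unix")
    # On Unix the empty "drive" is joined with the separator, giving the root.
    return "/" + dirname


print(join_drive(None, "usr", "windows"))  # usr
print(join_drive("", "usr", "unix"))       # /usr
```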
Some joining:
> echo [/home/arthur britons] | path --windows join
/home/arthur\britons
This could be fixed by encoding the path again. But we have another problem:
> echo [/home/arthur britons] | with-env [OS windows] { path join | path join }
\home\arthur\britons # Should it be this?
C:\Users\arthur\britons # or this?
Should we autodetect the home path without ~? I'm not sure, probably best to keep it simple and stick with the 1st option.
Should a non-existing path be treated as a file or a directory (or should it raise an error)?
> echo [foo/bar] | path parse
# drive dirname filestem extension
---------------------------------------------
0 foo/bar # this?
0 foo bar # or this?
# or throw an error?