Skip to content

Rework sorting and add cell path and closure comparators to sort-by#13154

Merged
IanManske merged 37 commits intonushell:mainfrom
132ikl:sort-custom
Oct 10, 2024
Merged

Rework sorting and add cell path and closure comparators to sort-by#13154
IanManske merged 37 commits intonushell:mainfrom
132ikl:sort-custom

Conversation

@132ikl
Copy link
Copy Markdown
Member

@132ikl 132ikl commented Jun 14, 2024

Description

Closes #12535
Implements sort-by functionality of #8322
Fixes sort-by part of #8667

This PR does two main things: add a new cell path and closure parameter to sort-by, and attempt to make Nushell's sorting behavior well-defined.

sort-by features

The columns parameter is replaced with a comparator parameter, which can be a cell path or a closure. Examples are from docs PR.

  1. Cell paths

    The basic interactive usage of sort-by is the same. For example, ls | sort-by modified still works the same as before. It is not quite a drop-in replacement, see behavior changes.

    Here's an example of how the cell path comparator might be useful:

    > let cities = [
        {name: 'New York', info: { established: 1624, population: 18_819_000 } }
        {name: 'Kyoto', info: { established: 794, population: 37_468_000 } }
        {name: 'São Paulo', info: { established: 1554, population: 21_650_000 } }
    ]
    > $cities | sort-by info.established
    ╭───┬───────────┬────────────────────────────╮
    │ # │   name    │            info            │
    ├───┼───────────┼────────────────────────────┤
    │ 0Kyoto     │ ╭─────────────┬──────────╮ │
    │   │           │ │ established │ 794      │ │
    │   │           │ │ population  │ 37468000 │ │
    │   │           │ ╰─────────────┴──────────╯ │
    │ 1São Paulo │ ╭─────────────┬──────────╮ │
    │   │           │ │ established │ 1554     │ │
    │   │           │ │ population  │ 21650000 │ │
    │   │           │ ╰─────────────┴──────────╯ │
    │ 2New York  │ ╭─────────────┬──────────╮ │
    │   │           │ │ established │ 1624     │ │
    │   │           │ │ population  │ 18819000 │ │
    │   │           │ ╰─────────────┴──────────╯ │
    ╰───┴───────────┴────────────────────────────╯
  2. Key closures

    You can supply a closure which will transform each value into a sorting key (without changing the underlying data). Here's an example of a key closure, where we want to sort a list of assignments by their average grade:

    > let assignments = [
        {name: 'Homework 1', grades: [97 89 86 92 89] }
        {name: 'Homework 2', grades: [91 100 60 82 91] }
        {name: 'Exam 1', grades: [78 88 78 53 90] }
        {name: 'Project', grades: [92 81 82 84 83] }
    ]
    > $assignments | sort-by { get grades | math avg }
    ╭───┬────────────┬───────────────────────╮
    │ # │    name    │        grades         │
    ├───┼────────────┼───────────────────────┤
    │ 0Exam 1     │ [78, 88, 78, 53, 90]  │
    │ 1Project    │ [92, 81, 82, 84, 83]  │
    │ 2Homework 2 │ [91, 100, 60, 82, 91] │
    │ 3Homework 1 │ [97, 89, 86, 92, 89]  │
    ╰───┴────────────┴───────────────────────╯
  3. Custom sort closure

    The --custom, or -c, flag will tell sort-by to interpret closures as custom sort closures. A custom sort closure has two parameters, and returns a boolean. The closure should return true if the first parameter comes before the second parameter in the sort order.

    For a simple example, we could rewrite a cell path sort as a custom sort (see here for a more complex example):

    > ls | sort-by -c {|a, b| $a.size < $b.size }
    ╭───┬─────────────────────┬──────┬──────────┬────────────────╮
    │ # │        name         │ type │   size   │    modified    │
    ├───┼─────────────────────┼──────┼──────────┼────────────────┤
    │ 0 │ my-secret-plans.txt │ file │    100 B10 minutes ago │
    │ 1 │ shopping_list.txt   │ file │    100 B2 months ago   │
    │ 2 │ myscript.nu         │ file │  1.1 KiB2 weeks ago    │
    │ 3 │ bigfile.img         │ file │ 10.0 MiB3 weeks ago    │
    ╰───┴─────────────────────┴──────┴──────────┴────────────────╯

Making sort more consistent

I think it's important for something as essential as sort to have well-defined semantics. This PR contains some changes to try to make the behavior of sort and sort-by consistent. In addition, after working with the internals of sorting code, I have a much deeper understanding of all of the edge cases. Here is my attempt to try to better define some of the semantics of sorting (if you are just interested in changes, skip to "User-Facing changes")

  • sort, sort -v, and sort-by now all work the same. Each individual sort implementation has been refactored into two functions in sort_utils.rs: sort, and sort_by. These can also be used in other parts of Nushell where values need to be sorted.
    • sort and sort-by used to handle -i and -n differently.
      • sort -n would consider all values which can't be coerced into a string to be equal
      • sort-by -i and sort-by -n would only work if all values were strings
      • In this PR, insensitive sort only affects comparison between strings, and natural sort only applies to numbers and strings (see below).
  • (not a change) Before and after this PR, sort and sort-by support sorting mixed types. There was a lot of discussion about potentially making sort and sort-by only work on lists of homogeneous types, but the general consensus was that sort should not error just because its input contains incompatible types.
    • In order to try to make working with data containing null values easier, I changed the PartialOrd order to sort Nothing values to the end of a list, regardless of what other types the list contains. Before, null would be sorted before Binary, CellPath, and Custom values.
    • (not a change) When sorted, lists of mixed types will contain sorted values of each type in order, for the most part
      • (not a change) For example, [0x[1] (date now) "a" ("yesterday" | into datetime) "b" 0x[0]] will be sorted as ["a", "b", a day ago, now, [0], [1]], where sorted strings appear first, then sorted datetimes, etc.
      • (not a change) The exception to this is Ints and Floats, which will intermix, Strings and Globs, which will intermix, and None as described above. Additionally, natural sort will intermix strings with ints and floats (see below).
  • Natural sort no longer coerce all inputs to strings.
    • I did originally make natural only apply to strings, but @fdncred pointed out that the previous behavior also allowed you to sort numeric strings with numbers. This seems like a useful feature if we are trying to support sorting with mixed types, so I settled on coercing only numbers (int, float). This can be reverted if people don't like it.
      • Here is an example of this behavior in action, which is the same before and after this PR:
        $ [1 "4" 3 "2"] | sort --natural
        ╭───┬───╮
         0  1 
         1  2 
         2  3 
         3  4 
        ╰───┴───╯

User-Facing Changes

New features

  • Replaces the columns string parameter of sort-by with a cell path or a closure.
    • The cell path parameter works exactly as you would expect
    • By default, the closure parameter acts as a "key sort"; that is, each element is transformed by the closure into a sorting key
    • With the --custom (-c) parameter, you can define a comparison function for completely custom sorting order.

Behavior changes

sort -v does not coerce record values to strings

This was a bit of a surprising behavior, and is now unified with the behavior of sort and sort-by. Here's an example where you can observe the values being implicitly coerced into strings for sorting, as they are sorted like strings rather than numbers:

Old behavior:

$ {foo: 9 bar: 10} | sort -v
╭─────┬────╮
 bar  10 
 foo  9  
╰─────┴────╯

New behavior:

$ {foo: 9 bar: 10} | sort -v
╭─────┬────╮
 foo  9  
 bar  10 
╰─────┴────╯
Changed sort-by parameters from string to cell-path or closure. Typical interactive usage is the same as before, but if passing a variable to sort-by it must be a cell path (or closure), not a string

Old behavior:

$ let sort = "modified"
$ ls | sort-by $sort
╭───┬──────┬──────┬──────┬────────────────╮
 # │ name │ type │ size │    modified    │
├───┼──────┼──────┼──────┼────────────────┤
 0  foo   file   0 B  10 hours ago   
 1  bar   file   0 B  35 seconds ago 
╰───┴──────┴──────┴──────┴────────────────╯

New behavior:

$ let sort = "modified"
$ ls | sort-by $sort
Error: nu::shell::type_mismatch

  × Type mismatch.
   ╭─[entry #10:1:14]
 1  ls | sort-by $sort
   ·              ──┬──
   ·                ╰── Cannot sort using a value which is not a cell path or closure
   ╰────
$ let sort = $."modified"
$ ls | sort-by $sort
╭───┬──────┬──────┬──────┬───────────────╮
 # │ name │ type │ size │   modified    │
├───┼──────┼──────┼──────┼───────────────┤
 0  foo   file   0 B  10 hours ago  
 1  bar   file   0 B  2 minutes ago 
╰───┴──────┴──────┴──────┴───────────────╯
Insensitve and natural sorting behavior reworked

Previously, the -i and -n worked differently for sort and sort-by (see "Making sort more consistent"). Here are examples of how these options result in different sorts now:

  1. sort -n

    • Old behavior (types other than numbers, strings, dates, and binary sorted incorrectly)
      $ [2sec 1sec] | sort -n
      ╭───┬──────╮
       0  2sec 
       1  1sec 
      ╰───┴──────╯
    • New behavior
      $ [2sec 1sec] | sort -n
      ╭───┬──────╮
       0  1sec 
       1  2sec 
      ╰───┴──────╯
  2. sort-by -i

    • Old behavior (uppercase words appear before lowercase words as they would in a typical sort, indicating this is not actually an insensitive sort)
      $ ["BAR" "bar" "foo" 2 "FOO" 1] | wrap a | sort-by -i a
      ╭───┬─────╮
       # │  a  │
      ├───┼─────┤
       0    1 
       1    2 
       2  BAR 
       3  FOO 
       4  bar 
       5  foo 
      ╰───┴─────╯
    • New behavior (strings are sorted stably, indicating this is an insensitive sort)
      $ ["BAR" "bar" "foo" 2 "FOO" 1] | wrap a | sort-by -i a
      ╭───┬─────╮
       # │  a  │
      ├───┼─────┤
       0    1 
       1    2 
       2  BAR 
       3  bar 
       4  foo 
       5  FOO 
      ╰───┴─────╯
  3. sort-by -n

    • Old behavior (natural sort does not work when data contains non-string values)
      $ ["10" 8 "9"] | wrap a | sort-by -n a
      ╭───┬────╮
       # │ a  │
      ├───┼────┤
       0   8 
       1  10 
       2  9  
      ╰───┴────╯
    • New behavior
      $ ["10" 8 "9"] | wrap a | sort-by -n a
      ╭───┬────╮
       # │ a  │
      ├───┼────┤
       0   8 
       1  9  
       2  10 
      ╰───┴────╯
Sorting a list of non-record values with a non-existent column/path now errors instead of sorting the values directly (sort should be used for this, not sort-by)

Old behavior:

$ [2 1] | sort-by foo
╭───┬───╮
 0  1 
 1  2 
╰───┴───╯

New behavior:

$ [2 1] | sort-by foo
Error: nu::shell::incompatible_path_access

  × Data cannot be accessed with a cell path
   ╭─[entry #29:1:17]
 1  [2 1] | sort-by foo
   ·                 ─┬─
   ·                  ╰── int doesn't support cell paths
   ╰────
sort and sort-by output List instead of ListStream

This isn't a meaningful change (unless I misunderstand the purpose of ListStream), since sort and sort-by both need to collect in order to do the sorting anyway, but is user observable.

Old behavior:

$ ls | sort | describe -d
╭──────────┬───────────────────╮
 type      stream            
 origin    nushell           
 subtype   {record 3 fields} 
 metadata  {record 1 field}  
╰──────────┴───────────────────╯
$ ls | sort-by name | describe -d
╭──────────┬───────────────────╮
 type      stream            
 origin    nushell           
 subtype   {record 3 fields} 
 metadata  {record 1 field}  
╰──────────┴───────────────────╯

New behavior:

ls | sort | describe -d
╭────────┬─────────────────╮
 type    list            
 length  22              
 values  [table 22 rows] 
╰────────┴─────────────────╯
$ ls | sort-by name | describe -d
╭────────┬─────────────────╮
 type    list            
 length  22              
 values  [table 22 rows] 
╰────────┴─────────────────╯
  • sort now errors when nothing is piped in (sort-by already did this)

Tests + Formatting

I added lots of unit tests on the new sort implementation to enforce new sort behaviors and prevent regressions.

After Submitting

See docs PR, which is ~2/3 finished.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Jun 14, 2024

Thanks for this. So far, so good! I'd like to see the closure parameter built into sort or sort-by or both so we don't have another command. Also, since you don't have examples yet, it would be good to put an example in the description so other people know how to use it.

@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Jun 14, 2024

Good idea! Maybe sort custom and sort-by could both be rolled into sort 🤔 I'm not sure what to do with --ignore-case, --values, and --natural. Maybe just ignore them when not relevant? That does seem to be what sort already does when passing eg. a list to sort -i or sort -v.

@NotTheDr01ds
Copy link
Copy Markdown
Contributor

Exactly what I came here (from Discord) to suggest. sort-by {closure} seems pretty good. And yes, either ignore or error when there's a conflicting arg.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Jun 14, 2024

Thanks for the examples, they go a long way to helping us to understand. Looks great!

Maybe sort custom and sort-by could both be rolled into sort 🤔

For now, I'd like to have just sort and sort-by and not combine them.

@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Jun 14, 2024

Just realized this doesn't entirely close #8322 as it doesn't cover group-by.

@cablehead
Copy link
Copy Markdown
Contributor

Thanks for picking this up! sort-by {closure} makes the most sense to me too

NotTheDr01ds and others added 3 commits June 19, 2024 12:35
…hell#13160)

# Description

Fixes nushell#13143 by returning an empty list when there are no results found
by `std help --find/-f`

# User-Facing Changes

In addition, prints a message to stderr.

# Tests + Formatting

- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
@132ikl 132ikl changed the title Add sort custom command Add cell path and closure comparators to sort-by Jun 20, 2024
@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Jun 20, 2024

Had to do some major refactoring of sort_utils to support both cell paths and closures, and I carried over the error handling idea from sort custom, definitely makes the code a lot cleaner and easier to follow (and faster!) 😄

I saw the note at the top of sort_utils suggesting that it could be moved somewhere else, and would be fine with doing that in this PR, but I'm not familiar enough with the codebase to know where else it might go. Currently the only thing that uses it is sort-by. I think I will also address #13008 by splitting up compare_cell_path into a general compare_values function, and then make compare_cell_path just follow the cell path and hand the values to compare_values. Then, sort can just use compare_values. I need to verify more carefully still, but the logic from sort seems to be copy pasted from sort_utils, so it should be fine to move all logic to sort_utils.

I'm not sure if these sort_value and sort_value_in_place functions are still needed, they aren't used anywhere except their respective tests. I'm not sure why they were introduced, the history is a bit hard to follow. Furthermore, they're a bit more unwieldy to use now that you need to construct a Vec<Comparator> instead of just a Vec<String>. If there isn't any good reason to keep them around, I can remove them.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Jun 20, 2024

I still want to be able to sort-by columns too. Does that still work?

@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Jun 20, 2024

For interactive use, you can do sort-by columns in the same way that you would before, for example ls | sort-by modified still works the same way as before, just that modified is now parsed as a cell path referring to the modified column instead of a string column name. This doesn't work if you pass a string variable to sort-by now, but I'm not sure how common of a use case that would be, and it's just a matter of converting string literals to to cell-path literals or using into cell-path.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Jun 21, 2024

ok, sounds good so far. thanks.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Aug 4, 2024

Do you have time to get the ci green?

@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Aug 4, 2024

The test failures are intentional since I was doing some test-driven development, so I'll get those passing the next time I can take a look. I'll make sure to address the spellcheck and clippy as well.

@sholderbach sholderbach linked an issue Aug 18, 2024 that may be closed by this pull request
@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Sep 11, 2024

Finally cleared out enough of my massive backlog of tasks, so I should be able to wrap this up finally! I just added a good number of unit tests to define behavior for edge cases and prevent regressions to behavior currently observable on main. I also documented the invariants which sort (from sort_utils, not the sort command) provides, with corresponding unit tests to enforce those invariants. Here's the remaining tasks for this PR:

  • Implement functionality for (newly introduced) failing unit tests
    • sort null to end
    • reintroduce proper natural sorting
  • Add key sort functionality sort-by
  • Add pipeline input to sort-by closure (so you can use $in)
  • Add examples for new features
  • Update documentation with new functionality and more details on invariants provided by sort commands

@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Oct 7, 2024

Thank you for the feedback! I'll take a look through it in the next couple days.

Regarding detecting number of closure parameters to determine key closure or closure closure, I proposed the idea in the Discord and @devyn advised against it:

Yeah I wouldn't recommend that, I think we try not to make built-in commands have argument handling behavior you couldn't do within Nushell itself

I'm still not sure how I feel about it. It would be better ergonomics, but it's a bit magic-y. It would also require removing $in.0/$in.1 for custom closures, since otherwise zero argument closures would be ambiguous.

 
I should note that I think it's possible to rewrite any priority sort as an equivalent multiple sort (using cell paths here, but same would apply for a key closure + a custom closure):

sort-by foo bar baz

is equivalent to

sort-by baz | sort-by bar | sort-by foo

with the drawback that baz has to be compared for all values regardless of whether values of bar or foo are equal.

@IanManske
Copy link
Copy Markdown
Member

IanManske commented Oct 9, 2024

Great points! I would also assume that custom comparators would be the rarer case, so having it behind a flag doesn't seem too bad. So, disregard my previous comment, the current behavior/interface looks pretty good 👍 .

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Oct 9, 2024

Are we ready to land this then?

@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Oct 9, 2024

I've got to go through Ian's review still, hopefully sometime today. Other than that, I'm good as long as no one else has any concerns before then.

@fdncred fdncred marked this pull request as draft October 9, 2024 15:47
@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Oct 9, 2024

ok, i'm marking it draft until you're done. then we can try and get it landed.

@132ikl 132ikl marked this pull request as ready for review October 9, 2024 22:36
@132ikl
Copy link
Copy Markdown
Member Author

132ikl commented Oct 9, 2024

I think we're good to go on this @fdncred

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Oct 9, 2024

ok, just need @IanManske's sign-off since he had a bunch of comments.

Copy link
Copy Markdown
Member

@IanManske IanManske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes, this looks good 🚀

@IanManske IanManske merged commit 36c1073 into nushell:main Oct 10, 2024
@hustcer hustcer added this to the v0.99.0 milestone Oct 10, 2024
@fdncred fdncred added deprecated:pr-commands (deprecated: too vague) This PR changes our commands in some way notes:mention Include the release notes summary in the "Hall of Fame" section labels Oct 10, 2024
sholderbach pushed a commit that referenced this pull request Jan 8, 2025
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

This PR adds type checking of all command input types at run-time.
Generally, these errors should be caught by the parser, but sometimes we
can't know the type of a value at parse-time. The simplest example is
using the `echo` command, which has an output type of `any`, so
prefixing a literal with `echo` will bypass parse-time type checking.

Before this PR, each command has to individually check its input types.
This can result in scenarios where the input/output types don't match
the actual command behavior. This can cause valid usage with an
non-`any` type to become a parse-time error if a command is missing that
type in its pipeline input/output (`drop nth` and `history import` do
this before this PR). Alternatively, a command may not list a type in
its input/output types, but doesn't actually reject that type in its
code, which can have unintended side effects (`get` does this on an
empty pipeline input, and `sort` used to before #13154).

After this PR, the type of the pipeline input is checked to ensure it
matches one of the input types listed in the proceeding command's
input/output types. While each of the issues in the "before this PR"
section could be addressed with each command individually, this PR
solves this issue for _all_ commands.

**This will likely cause some breakage**, as some commands have
incorrect input/output types, and should be adjusted. Also, some scripts
may have erroneous usage of commands. In writing this PR, I discovered
that `toolkit.nu` was passing `null` values to `str join`, which doesn't
accept nothing types (if folks think it should, we can adjust it in this
PR or in a different PR). I found some issues in the standard library
and its tests. I also found that carapace's vendor script had an
incorrect chaining of `get -i`:

```nushell
let expanded_alias = (scope aliases | where name == $spans.0 | get -i 0 | get -i expansion)
```

Before this PR, if the `get -i 0` ever actually did evaluate to `null`,
the second `get` invocation would error since `get` doesn't operate on
`null` values. After this PR, this is immediately a run-time error,
alerting the user to the problematic code. As a side note, we'll need to
PR this fix (`get -i 0 | get -i expansion` -> `get -i 0.expansion`) to
carapace.

A notable exception to the type checking is commands with input type of
`nothing -> <type>`. In this case, any input type is allowed. This
allows piping values into the command without an error being thrown. For
example, `123 | echo $in` would be an error without this exception.
Additionally, custom types bypass type checking (I believe this also
happens during parsing, but not certain)

I added a `is_subtype` method to `Value` and `PipelineData`. It
functions slightly differently than `get_type().is_subtype()`, as noted
in the doccomments. Notably, it respects structural typing of lists and
tables. For example, the type of a value `[{a: 123} {a: 456, b: 789}]`
is a subtype of `table<a: int>`, whereas the type returned by
`Value::get_type` is a `list<any>`. Similarly, `PipelineData` has some
special handling for `ListStream`s and `ByteStream`s. The latter was
needed for this PR to work properly with external commands.

Here's some examples.

Before:
```nu
1..2 | drop nth 1
Error: nu::parser::input_type_mismatch

  × Command does not support range input.
   ╭─[entry #9:1:8]
 1 │ 1..2 | drop nth 1
   ·        ────┬───
   ·            ╰── command doesn't support range input
   ╰────

echo 1..2 | drop nth 1
# => ╭───┬───╮
# => │ 0 │ 1 │
# => ╰───┴───╯
```

After this PR, I've adjusted `drop nth`'s input/output types to accept
range input.

Before this PR, zip accepted any value despite not being listed in its
input/output types. This caused different behavior depending on if you
triggered a parse error or not:
```nushell
1 | zip [2]
# => Error: nu::parser::input_type_mismatch
# => 
# =>   × Command does not support int input.
# =>    ╭─[entry #3:1:5]
# =>  1 │ 1 | zip [2]
# =>    ·     ─┬─
# =>    ·      ╰── command doesn't support int input
# =>    ╰────
echo 1 | zip [2]
# => ╭───┬───────────╮
# => │ 0 │ ╭───┬───╮ │
# => │   │ │ 0 │ 1 │ │
# => │   │ │ 1 │ 2 │ │
# => │   │ ╰───┴───╯ │
# => ╰───┴───────────╯
```

After this PR, it works the same in both cases. For cases like this, if
we do decide we want `zip` or other commands to accept any input value,
then we should explicitly add that to the input types.
```nushell
1 | zip [2]
# => Error: nu::parser::input_type_mismatch
# => 
# =>   × Command does not support int input.
# =>    ╭─[entry #3:1:5]
# =>  1 │ 1 | zip [2]
# =>    ·     ─┬─
# =>    ·      ╰── command doesn't support int input
# =>    ╰────
echo 1 | zip [2]
# => Error: nu:🐚:only_supports_this_input_type
# => 
# =>   × Input type not supported.
# =>    ╭─[entry #14:2:6]
# =>  2 │ echo 1 | zip [2]
# =>    ·      ┬   ─┬─
# =>    ·      │    ╰── only list<any> and range input data is supported
# =>    ·      ╰── input type: int
# =>    ╰────
```

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

**Breaking change**: The type of a command's input is now checked
against the input/output types of that command at run-time. While these
errors should mostly be caught at parse-time, in cases where they can't
be detected at parse-time they will be caught at run-time instead. This
applies to both internal commands and custom commands.

Example function and corresponding parse-time error (same before and
after PR):
```nushell
def foo []: int -> nothing {
  print $"my cool int is ($in)"
}

1 | foo
# => my cool int is 1

"evil string" | foo
# => Error: nu::parser::input_type_mismatch
# => 
# =>   × Command does not support string input.
# =>    ╭─[entry #16:1:17]
# =>  1 │ "evil string" | foo
# =>    ·                 ─┬─
# =>    ·                  ╰── command doesn't support string input
# =>    ╰────
# => 
```

Before:
```nu
echo "evil string" | foo
# => my cool int is evil string
```

After:
```nu
echo "evil string" | foo
# => Error: nu:🐚:only_supports_this_input_type
# => 
# =>   × Input type not supported.
# =>    ╭─[entry #17:1:6]
# =>  1 │ echo "evil string" | foo
# =>    ·      ──────┬──────   ─┬─
# =>    ·            │          ╰── only int input data is supported
# =>    ·            ╰── input type: string
# =>    ╰────
```

Known affected internal commands which erroneously accepted any type:
* `str join`
* `zip`
* `reduce`

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
- 🟢 `toolkit fmt`
- 🟢 `toolkit clippy`
- 🟢 `toolkit test`
- 🟢 `toolkit test stdlib`


# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
* Play whack-a-mole with the commands and scripts this will inevitably
break
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

deprecated:pr-commands (deprecated: too vague) This PR changes our commands in some way notes:mention Include the release notes summary in the "Hall of Fame" section

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can i specify the sort rule when sort an table Support sort-by with nested columns

6 participants