Skip to content

feat: added items command for Records#8640

Merged
fdncred merged 5 commits intonushell:mainfrom
Sygmei:feat/filters_items_command
Apr 14, 2023
Merged

feat: added items command for Records#8640
fdncred merged 5 commits intonushell:mainfrom
Sygmei:feat/filters_items_command

Conversation

@Sygmei
Copy link
Copy Markdown
Contributor

@Sygmei Sygmei commented Mar 27, 2023

Description

This PR adds an items command which allows the user to iterate over both columns and values of a Record<> type at the same time.

image

User-Facing Changes

No breaking changes, only a new items command.

Formatting

  • cargo fmt --all -- --check 👌
  • cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A clippy::needless_collect 👌
  • cargo test --workspace 👌

@Sygmei Sygmei changed the title feat: added items command feat: added items command for Records Mar 27, 2023
@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Mar 27, 2023

Was just playing around and had a panic.

ls | items {|k, v| print $'($k) ($v)' } 
thread 'main' panicked at 'PipelineData::ExternalStream had no span', crates\nu-command\src\filters\items.rs:122:26
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `C:\CarTar\debug\nu.exe` (exit code: 101)

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Mar 27, 2023

Good catch @fdncred, is there a reason why ls doesn't provide a span ?
Fixed with Span::unknown() when span is not found :)

@francoisthire
Copy link
Copy Markdown

francoisthire commented Mar 27, 2023

I have tried to implement items in nu directly:

> def items [f] { transpose | rename a b | each {|key| do $f $key.a $key.b } } 

and then:

 > {new: york, san: francisco} | items {|key value| echo $'($key)-($value)' }
 ╭───┬───────────────╮
│ 0 │ new-york      │
│ 1 │ san-francisco │
╰───┴───────────────╯

I have tried the example of @fdncred, and it gives an output which is probably not the expected one:

 ls | items {|k, v| print $'($k) ($v)' }                                                                                                                                                                                                                                                      03/27/2023 05:37:13 pm
name #*message*-20221028-162839#
type file
size 130 B
modified Sat, 29 Oct 2022 10:40:36 +0200 (4 months ago)

What is the expected result when items is applied on a table?

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Mar 27, 2023

I think it's important to understand what to do with tables since items is really for records (I think). So, if you look at ls it returns a table or a list of records. ls | columns = keys and ls | values = values. So, ls | items should get those same keys and values, shouldn't it?

BTW - I get this now

ls | items {|k, v| print $'($k) ($v)' } 
Error: nu::shell::only_supports_this_input_type

  × Input type not supported.
   ╭─[source:1:1]
 1 │ nu
   · ▲
   · ╰── input type: raw data
   ╰────
   ╭─[entry #1:1:1]
 1 │ ls | items {|k, v| print $'($k) ($v)' }
   ·      ──┬──
   ·        ╰── only record input data is supported
   ╰────

Another test

ls | each {|r| $r | items {|k, v| print $'key:[($k)] value:[($v)]'}}
key:[name] value:[.cargo]
key:[type] value:[dir]
key:[size] value:[0 B]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[.github]
key:[type] value:[dir]
key:[size] value:[4.1 KB]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[.gitignore]
key:[type] value:[file]
key:[size] value:[658 B]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[.typos.toml]
key:[type] value:[file]
key:[size] value:[254 B]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[CODE_OF_CONDUCT.md]
key:[type] value:[file]
key:[size] value:[3.5 KB]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[CONTRIBUTING.md]
key:[type] value:[file]
key:[size] value:[9.8 KB]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[Cargo.lock]
key:[type] value:[file]
key:[size] value:[151.9 KB]
key:[modified] value:[Mon, 27 Mar 2023 07:18:55 -0500 (3 hours ago)]
key:[name] value:[Cargo.toml]
key:[type] value:[file]
key:[size] value:[5.2 KB]
key:[modified] value:[Mon, 27 Mar 2023 07:18:55 -0500 (3 hours ago)]
key:[name] value:[Cross.toml]
key:[type] value:[file]
key:[size] value:[371 B]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
key:[name] value:[LICENSE]
key:[type] value:[file]
key:[size] value:[1.1 KB]
key:[modified] value:[Fri, 24 Mar 2023 07:15:50 -0500 (3 days ago)]
.. truncated

╭────┬────────────────╮
│  0 │ [list 4 items] │
│  1 │ [list 4 items] │
│  2 │ [list 4 items] │
│  3 │ [list 4 items] │
│  4 │ [list 4 items] │
│  5 │ [list 4 items] │
│  6 │ [list 4 items] │
│  7 │ [list 4 items] │
│  8 │ [list 4 items] │
│  9 │ [list 4 items] │
╰────┴────────────────╯
... truncated

The nushell table at the end doesn't look right to me.

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Mar 27, 2023

Just added support for table type :)
image

For your example it actually looks good to me @fdncred

You are doing an each on a ls output, hence getting a list of 9 items (your 9 files), each file record was sent into the items command returning nothing (only printing stuff on stdout), so the end result is a list of 9 lists containing each 4 empty elements (4 columns of file record).

If you expand the table you will get
image

If you return a static value for example, you'll get
image

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Mar 28, 2023

What does the python equivalent return? IIRC, this items command came up to kind of mimic that. We may want to make it look like that (if it doesn't already) but more nushell-y.

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Mar 28, 2023

The Python equivalent looks like :

for key, value in mydict.items():
    print(key, value)

Under the hood, it is a lazy generator returning a tuple containing (key, value)

If we wanted to keep the same principle, it would be something like

mydict | items | each { |keyval| echo $keyval.0 $keyval.1 }

items would then return a list of tuples containing key, value

I find it less concise and agreable to use but I'm open to feedback

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Mar 28, 2023

ok. I like the non-tuple version myself. Thanks for letting me know.

@sholderbach
Copy link
Copy Markdown
Member

While avoiding the intermediate record is also great for performance and gives the nicest each replacement given the current syntax (and to a lesser extent type system) capabilities.

I might caveat that having a command that just performs the iteration into a different structure would have benefits for other commands beyond each for basic iteration. This would compose with the other adapters like reduce / skip while but would require some pattern matching like destructuring in the closure arguments or elsewhere to be really pleasant (conflicts with the current implicitly added index in each!).

Furthermore to be as performant as this implementation some additional heavy lifting would probably be necessary, which would be another trade-off.

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Mar 28, 2023

@sholderbach the user can still transform the record into the type it wants and pipe it into something like reduce

{a: 1, b: 2} | items { |key, value| {key: $key, value: $value } | reduce  ...

We could even add later

{a: 1, b: 2} | items list
# returns [[a, 1], [b, 2]]

@sholderbach
Copy link
Copy Markdown
Member

sholderbach commented Mar 28, 2023

Yeah, I would agree. Let's not make perfect the enemy of the good. I think having the ability to quickly write something to operate over records is more valuable right now than perfect iterator design (in a pythonesque or rustic way).
We could iterate on that as soon as we have some diverse user code (cue the breaking changes :D)

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Mar 29, 2023

Is it ready to merge then :) ?

Comment on lines +107 to +118
PipelineData::ListStream(stream, ..) => {
let v: Vec<_> = stream.into_iter().collect();
let cols = get_columns(&v);
let vals = get_values(&v, call.head, input_span)?;

Ok(cols
.into_iter()
.zip(vals.into_iter())
.into_iter()
.map_while(run_for_each_item)
.into_pipeline_data(ctrlc))
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be inconsistent behavior. This accepts a table if the list is a PipelineData::ListStream but not if it is a ...::Value(Value::List{}) of records.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the type error messages I would say lets remove this match arm and only handle records and LazyRecords right now.

@kubouch
Copy link
Copy Markdown
Contributor

kubouch commented Apr 5, 2023

Instead of Span::unknown() it's better to use call.head. Unknown span points at a random location (whatever was the first input ever parsed by Nushell). The call head points at the items command call.

Otherwise, I don't mind having items, though I think it could be implemented in the standard library.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Apr 14, 2023

@Sygmei I think if you can resolve the points by @sholderbach and @kubouch we can land this.

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Apr 14, 2023

Yes ! I'll try to take some time today to finish this

@Sygmei
Copy link
Copy Markdown
Contributor Author

Sygmei commented Apr 14, 2023

@fdncred should be good :)

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Apr 14, 2023

Thanks. Let's go!

@fdncred fdncred merged commit 71611de into nushell:main Apr 14, 2023
bobhy pushed a commit to bobhy/nushell that referenced this pull request Apr 15, 2023
# Description

This PR adds an `items` command which allows the user to iterate over
both `columns` and `values` of a `Record<>` type at the same time.


![image](https://user-images.githubusercontent.com/3835355/227976277-c9badbb2-2e31-4243-8d00-7e28f2289587.png)

# User-Facing Changes

No breaking changes, only a new `items` command.

# Formatting

- `cargo fmt --all -- --check` 👌 
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A
clippy::needless_collect` 👌
- `cargo test --workspace` 👌
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants