`get` Command Produces Both Table And Error #2677

Open
JosephTLyons opened this issue Oct 17, 2020 · 6 comments
Comments

@JosephTLyons (Collaborator) commented Oct 17, 2020

Describe the bug
I was opening the main .toml file in the project root when I discovered this. The command `open Cargo.toml | get bin | get required-features` produces both an output table and an error:

> open Cargo.toml | get bin | get required-features
────┬────────────
  0 │ textview   
  1 │ inc        
  2 │ ps         
  3 │ sys        
  4 │ fetch      
  5 │ match      
  6 │ post       
  7 │ binaryview 
  8 │ tree       
  9 │ start      
 10 │ s3         
 11 │ chart      
 12 │ chart      
 13 │ xpath      
 14 │ bson       
 15 │ bson       
 16 │ sqlite     
 17 │ sqlite     
────┴────────────
error: Unknown column
  ┌─ shell:1:33
  │
1 │ open Cargo.toml | get bin | get required-features
  │                                 ^^^^^^^^^^^^^^^^^
  │                                 │
  │                                 There isn't a column named 'required-features'
  │                                 Perhaps you meant 'name'? Columns available: name, path

> 

Expected behavior
I expect to see only the table and not the error.

Screenshots
If applicable, add screenshots to help explain your problem.

Configuration (please complete the following information):

  • OS (e.g. Windows): macOS 10.15.7
  • Nu version (you can use the version command to find out): 0.21 (compiled from master)
@JosephTLyons changed the title from `get` Command On Column Name With `-` Character Produces Error to `get` Command Produces Both Table And Error on Oct 17, 2020
@fdncred (Collaborator) commented Oct 17, 2020

I'm guessing this happens because not all of the rows returned by `get bin` have a `required-features` column, since I can do this:

open Cargo.toml | get bin | flatten required-features
╭────┬─────────────────────────────┬────────────────────────────────────────────┬───────────────────╮
│ #  │ name                        │ path                                       │ required-features │
├────┼─────────────────────────────┼────────────────────────────────────────────┼───────────────────┤
│  0 │ nu_plugin_core_textview     │ src/plugins/nu_plugin_core_textview.rs     │ textview          │
│  1 │ nu_plugin_core_inc          │ src/plugins/nu_plugin_core_inc.rs          │ inc               │
│  2 │ nu_plugin_core_ps           │ src/plugins/nu_plugin_core_ps.rs           │ ps                │
│  3 │ nu_plugin_core_sys          │ src/plugins/nu_plugin_core_sys.rs          │ sys               │
│  4 │ nu_plugin_core_fetch        │ src/plugins/nu_plugin_core_fetch.rs        │ fetch             │
│  5 │ nu_plugin_core_match        │ src/plugins/nu_plugin_core_match.rs        │ match             │
│  6 │ nu_plugin_core_post         │ src/plugins/nu_plugin_core_post.rs         │ post              │
│  7 │ nu_plugin_extra_binaryview  │ src/plugins/nu_plugin_extra_binaryview.rs  │ binaryview        │
│  8 │ nu_plugin_extra_tree        │ src/plugins/nu_plugin_extra_tree.rs        │ tree              │
│  9 │ nu_plugin_extra_start       │ src/plugins/nu_plugin_extra_start.rs       │ start             │
│ 10 │ nu_plugin_extra_s3          │ src/plugins/nu_plugin_extra_s3.rs          │ s3                │
│ 11 │ nu_plugin_extra_chart_bar   │ src/plugins/nu_plugin_extra_chart_bar.rs   │ chart             │
│ 12 │ nu_plugin_extra_chart_line  │ src/plugins/nu_plugin_extra_chart_line.rs  │ chart             │
│ 13 │ nu_plugin_extra_xpath       │ src/plugins/nu_plugin_extra_xpath.rs       │ xpath             │
│ 14 │ nu_plugin_extra_from_bson   │ src/plugins/nu_plugin_extra_from_bson.rs   │ bson              │
│ 15 │ nu_plugin_extra_to_bson     │ src/plugins/nu_plugin_extra_to_bson.rs     │ bson              │
│ 16 │ nu_plugin_extra_from_sqlite │ src/plugins/nu_plugin_extra_from_sqlite.rs │ sqlite            │
│ 17 │ nu_plugin_extra_to_sqlite   │ src/plugins/nu_plugin_extra_to_sqlite.rs   │ sqlite            │
╰────┴─────────────────────────────┴────────────────────────────────────────────┴───────────────────╯
╭────┬──────┬─────────────╮
│ #  │ name │ path        │
├────┼──────┼─────────────┤
│ 18 │ nu   │ src/main.rs │
╰────┴──────┴─────────────╯
@JosephTLyons (Collaborator, Author) commented Oct 18, 2020

Ahh yeah, that makes sense to me.

@oskarskog (Contributor) commented Oct 21, 2020

Opening files with the same data but different formats gives different results:

test.json

[
    { "a": 1, "b": 2 },
    { "a": 3 }
]
nushell(master)> open test.json
───┬───┬───
 # │ a │ b
───┼───┼───
 0 │ 1 │ 2
───┴───┴───
───┬───
 # │ a
───┼───
 1 │ 3
───┴───

test.csv

a,b
1,2
3
nushell(master)> open test.csv
───┬──
 0 │
───┴──
error: Could not parse as CSV (Line 3: expected 2 fields, found 1)
  ┌─ shell:1:1
  │
1 │ open test.csv
  │ ^^^^ -------- value originates from here
  │ │
  │ input cannot be parsed as CSV

PowerShell output:

PS /Users/oskarskog/dev/src/nushell> Get-Content ./test.json | ConvertFrom-Json

a b
- -
1 2
3

PS /Users/oskarskog/dev/src/nushell> Get-Content ./test.csv | ConvertFrom-Csv

a b
- -
1 2
3

Should we add columns to the rows that are missing them to keep the shape of the data consistent?

@fdncred (Collaborator) commented Oct 21, 2020

@oskarskog You have to keep in mind that nu wants to stream data. We've had a lot of discussions about homogeneity as it relates to rows and columns, i.e., having the exact same number of columns for every row of a given csv file. In my mind, we can't really have homogeneous columns with csv files without (sometimes?) giving up streaming, because that would mean parsing the entire csv file before outputting a single row, just to make sure we have enough columns for both well-formed and malformed csv files. That said, I do think our csv handling should be more resilient.

We'd love to have some contributors help with that resiliency. We're always open to suggestions.

I haven't looked at PowerShell's `Get-Content` yet, nor do I know whether PowerShell streams data, but it would be an interesting investigation.

Generally speaking, though, with your given tests above I'd like to have a similar output, I think. It seems like it would just be more consistent that way.

For fun, I tested another data-centric shell on these files, and it responds similarly to nushell. I'm not saying either way is right or wrong, just throwing another shell into the comparison.

cat test.csv | csv:from
Error: csv: Wrong number of columns in CSV file
cat test.csv | csv:from
^^^^^^^^^^^^^^^^^^^^^^^
cat test.json | json:from
[data a=(1) b=(2), data a=(3)]
@pmarreck commented Feb 25, 2021

@fdncred A couple thoughts:

  1. If you presume that the first row of data dictates the number of columns (which, granted, may be suboptimal from a design perspective... what if the first row of data is the one missing columns? etc.), then you don't have to parse the entire file, and can simply output nulls where data later on appears to be missing. In CSV's case specifically, I believe the first line of data is the column headers, so this assumption would work well. You can still runtime-error out if a given line contains a greater number of values than there are apparent columns.
  2. You could consume the first ~10 rows, look at those and see what the general (or greatest) # of values looks like, and then error if a later streamed-in line goes over, or just output nulls if a later line comes up short. Granted, this is a bit "magic-y" but I think it would also lead to the expected behavior in 99.9% of the input cases containing grid-like or spreadsheet-like data.
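Suggestion 1 happens to match how Python's standard `csv` module already behaves: `csv.DictReader` takes the first line as the header, streams the remaining rows one at a time, and pads short rows with `restval` (`None` by default). A minimal sketch, using the ragged test.csv from the earlier comment:

```python
import csv
import io

# Ragged CSV from the thread: the header declares two columns,
# but the last row only supplies one value.
raw = "a,b\n1,2\n3\n"

# DictReader streams row by row; the header row fixes the column set,
# and missing trailing fields are filled with restval (None here).
rows = list(csv.DictReader(io.StringIO(raw), restval=None))
print(rows)  # [{'a': '1', 'b': '2'}, {'a': '3', 'b': None}]
```

Rows with *extra* fields can still be flagged (or collected under `restkey`), which gives the "error on too many, pad on too few" behavior described above without buffering the whole file.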
@fdncred (Collaborator) commented Feb 25, 2021

I tried something similar to number 2. I cached the first X rows from the stream in order to determine the column widths; the same could've been done for the column names. I chucked that code, though, because I couldn't get it to work right. I'd really like someone to revisit it because I still think it's a good idea.
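For what it's worth, the buffering idea can be sketched in a few lines. This is hypothetical illustration code, not nushell's actual implementation: buffer the first N records, take the union of their columns, then stream everything padded to that shape.

```python
import itertools

def stream_with_inferred_columns(records, probe=10):
    """Buffer the first `probe` records to infer the column set,
    then stream every record padded with None for missing columns.
    Columns that first appear after the probe window are dropped,
    which is the trade-off streaming forces."""
    records = iter(records)
    head = list(itertools.islice(records, probe))
    columns = []
    for rec in head:
        for key in rec:
            if key not in columns:
                columns.append(key)
    for rec in itertools.chain(head, records):
        yield {col: rec.get(col) for col in columns}

# The ragged rows from test.json now come out with a uniform shape:
data = [{"a": 1, "b": 2}, {"a": 3}]
print(list(stream_with_inferred_columns(data)))
# [{'a': 1, 'b': 2}, {'a': 3, 'b': None}]
```

The probe size is the knob: larger values catch more late-appearing columns at the cost of delaying the first row of output.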
