Optional chaining for cell paths (?.) #7540
|
cc: @webbedspace |
    },
    Example {
        description: "Get a column from a table, and return null for any rows where that column does not exist",
        example: "[{A: A0} {}] | get ?.A",
Consider adding an example for `get .A` as well, with text indicating that the dot is strictly optional. Such an example would clarify what the `?.` syntax "means". (Currently, it sorta looks like the `?` is another cell name, when actually it's an extension of the separator between names.)
|
Could you add tests that this works with |
|
@webbedspace |
|
Ah, very well then. |
ff5740d to d75c816
5af6182 to 7535a5d
|
I've confirmed that this works as expected with |
|
This proposal confused a few people on Discord. It's possible I didn't explain it well there, but it's also possible there might be better syntax out there for this kind of functionality. I feel that this PR is ready to go from a code quality perspective, but I'd like to hear a few more people weigh in on whether they think the syntax is OK (with alternative examples if you don't like the current syntax). |
|
Actually, now that you mention it, I just noticed this does have some peculiarities I don't know if I like. (Sorry for not scrutinising that closely earlier).
So, like, in JS and C# (and Swift), How about this alternative: Here, I've split |
|
OK, yeah, people on Discord had similar issues with the placing of the
Correct, the I'm not sure I understand all of your examples. You mention a "terminal |
|
My stab at the "terminal |
|
I'll explain further. Remember that my |
|
Thanks; I don't find that intuitive right now but maybe it'll grow on me. I think you accidentally listed |
|
If it helps, recall that this syntax evolved from the C ternary operator: |
|
Gotcha. Might want to update your examples: |
|
Ah yes, good point, sorry. |
|
Maybe it would help to recap the initial motivation for this: using

### Current Behaviour

Note that:

### Desired Behaviour

It would be a lot simpler if there were only 2 ways to handle a failed cell path access: fail, or replace missing data with

### Next Steps

I'm still not sure what the syntax should be. I think I am trying to solve a problem that is slightly different than the one JS solves with
|
The problems are very similar, however. I don't think the difference is too major. In JS, 1) access of a missing property produces Nushell is more strict than JS: 1) access of a missing property is an error, and 2) access of a property on |
|
I think this is the main thing that bothers me about your proposal: The position of the |
6b146ed to 3480837
Yeah, that's usually a safe assumption to make - something's probably gone wrong if you have mixed records and null values in the same list. I'm willing to give your proposal a try, thanks for walking me through it. I'm still not sure I have an intuitive understanding of it, but perhaps attempting to implement it will clear things up. |
|
One other thing I forgot to mention all this time: Nushell already has a notion of a suffixing |
|
I'm going to work on other things for a while; my brief attempt to implement Leon’s approach didn’t go well and I’d like to shift gears for a bit. If anyone else wants to try implementing something like this, feel free. |
|
Closing this for now. I believe we still need something like this but I wasn't able to find a syntax that makes everyone happy. Leon's suggestion might be the way to go but I wasn't able to wrap my head around it. |
|
If it were up to me, I'd land this PR even though I don't think the operators are "perfect". I'd also want to keep the If we're not deprecating anything with this PR and we're just adding new optional syntax, I'm not sure why we wouldn't land it and just keep iterating on it. I haven't looked at the details for a while so I may have forgotten something. |
|
I think enough people found the initial syntax unintuitive that there's definitely something wrong with it. I'd rather not spend time polishing and documenting this before we find a syntax people are happy with. If anyone else wants to run with this work, feel free; I've lost the motivation to work on it for now. My initial motivation for working on this was to make it easier for |
|
I want to take this branch and finish it myself… although I've been thinking a bit more about the semantics (as you might expect). Something I've been pondering is what So, I've been wondering if using this syntax for holes at all is a good idea, given that it goes against this operator's "original" design intent. Maybe a second variant is needed, with a different symbol like I do also think, in this light, that the |
|
For me, it's always been about the postfix operator meaning the item before it is optional. So, I think syntax should be closer to this. Both of these have the same result. And if you want to keep drilling down into nested structures it would be like: |
|
So, like, do you really think it's OK to have this one cell path modifier glyph produce
To me, these are different enough to merit consideration. |
I go back and forth on this. I like the simplicity of having 1 glyph in a known position suppress any error. But I can appreciate that those are meaningfully different types of errors, if we can find an ergonomic way to handle them separately let's do it. Either way, I think we probably need short-circuiting behaviour (where we stop evaluating a cell path after a failure to access one optional cell path). I initially implemented this without short-circuiting and that was bad in retrospect; it's not very useful if |
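To make the short-circuiting point concrete, here is a toy Python model (not Nushell's implementation) that contrasts the two behaviours for nested records. The names `AccessError` and `get_path` are hypothetical, and a cell path is represented as a list of `(name, optional)` pairs purely for illustration.

```python
class AccessError(Exception):
    """Stand-in for a failed cell path access (hypothetical name)."""


def get_path(record, path, short_circuit):
    """Follow (name, optional) pairs through nested dicts.

    A missing optional member yields None. With short_circuit=True the
    evaluation stops at that point; otherwise the remaining members are
    applied to None, so a later required member raises.
    """
    value = record
    for name, optional in path:
        if isinstance(value, dict) and name in value:
            value = value[name]
        elif optional and (value is None or isinstance(value, dict)):
            # The member is missing but marked optional.
            if short_circuit:
                return None
            value = None
        else:
            raise AccessError(f"cannot access {name!r}")
    return value
```

With a path equivalent to `a?.b.c`, an empty record gives `None` when short-circuiting and raises `AccessError` when not, because `b` is then looked up on `None`. Without short-circuiting, every member after the first optional one has to be marked optional as well to get `None` back.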
|
Good comment by @kubouch on Discord:
I think I agree with that. |
|
So, I think we want to try the following approach:
|
|
That sounds good to me. Cases other than optionality might be better addressed via a pattern-matching syntax; is there anything in that vein for Nushell? |
|
New PR is up for postfix |
This is a follow-up from #7540. Please provide feedback if you have the time!

## Summary

This PR lets you use `?` to indicate that a member in a cell path is optional and Nushell should return `null` if that member cannot be accessed.

Unlike the previous PR, `?` is now a _postfix_ modifier for cell path members. A cell path of `.foo?.bar` means that `foo` is optional and `bar` is not.

`?` does _not_ suppress all errors; it is intended to help in situations where data has "holes", i.e. the data types are correct but something is missing. Type mismatches (like trying to do a string path access on a date) will still fail.

### Record Examples

```bash
{ foo: 123 }.foo # returns 123
{ foo: 123 }.bar # errors
{ foo: 123 }.bar? # returns null

{ foo: 123 } | get bar # errors
{ foo: 123 } | get bar? # returns null

{ foo: 123 }.bar.baz # errors
{ foo: 123 }.bar?.baz # errors because `baz` is not present on the result from `bar?`
{ foo: 123 }.bar.baz? # errors
{ foo: 123 }.bar?.baz? # returns null
```

### List Examples

```
〉[{foo: 1} {foo: 2} {}].foo
Error: nu::shell::column_not_found

  × Cannot find column
   ╭─[entry #30:1:1]
 1 │ [{foo: 1} {foo: 2} {}].foo
   ·                    ─┬  ─┬─
   ·                     │   ╰── cannot find column 'foo'
   ·                     ╰── value originates here
   ╰────

〉[{foo: 1} {foo: 2} {}].foo?
╭───┬───╮
│ 0 │ 1 │
│ 1 │ 2 │
│ 2 │   │
╰───┴───╯

〉[{foo: 1} {foo: 2} {}].foo?.2 | describe
nothing

〉[a b c].4? | describe
nothing

〉[{foo: 1} {foo: 2} {}] | where foo? == 1
╭───┬─────╮
│ # │ foo │
├───┼─────┤
│ 0 │   1 │
╰───┴─────╯
```

# Breaking changes

1. Column names with `?` in them now need to be quoted.
2. The `-i`/`--ignore-errors` flag has been removed from `get` and `select`.
    1. After this PR, most `get` error handling can be done with `?` and/or `try`/`catch`.
3. Cell path accesses like this no longer work without a `?`:

```bash
〉[{a:1 b:2} {a:3}].b.0
2
```

We had some clever code that was able to recognize that since we only want row `0`, it's OK if other rows are missing column `b`. I removed that because it's tricky to maintain, and now that query needs to be written like:

```bash
〉[{a:1 b:2} {a:3}].b?.0
2
```

I think the regression is acceptable for now. I plan to do more work in the future to enable streaming of cell path accesses, and when that happens I'll be able to make `.b.0` work again.
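The rules in this description can be restated as a toy Python model. This is only a sketch for checking one's reading of the examples, not how Nushell implements cell paths: `CellPathError` and `follow` are hypothetical names, records are modelled as dicts, tables as lists of dicts, `null` as `None`, and a cell path as a list of `(member, optional)` pairs where a string is a column and an integer is a row.

```python
class CellPathError(Exception):
    """Stand-in for the error Nushell reports on a failed access."""


def follow(value, path):
    """Follow a list of (member, optional) pairs; str = column, int = row."""
    for member, optional in path:
        if value is None:
            # Nothing to look into: only an optional member tolerates this.
            if not optional:
                raise CellPathError(f"cannot access {member!r} on null")
        elif isinstance(member, int):
            if not isinstance(value, list):
                raise CellPathError(f"row {member} needs a list")
            if 0 <= member < len(value):
                value = value[member]
            elif optional:
                value = None
            else:
                raise CellPathError(f"row {member} is out of range")
        elif isinstance(value, list):
            # Column access on a table applies to every row.
            value = [follow(row, [(member, optional)]) for row in value]
        elif isinstance(value, dict):
            if member in value:
                value = value[member]
            elif optional:
                value = None
            else:
                raise CellPathError(f"cannot find column {member!r}")
        else:
            # Type mismatch: `?` does not suppress this.
            raise CellPathError(f"cannot access {member!r} on this value")
    return value
```

Under these assumptions, `follow({"foo": 123}, [("bar", True), ("baz", True)])` gives `None`, matching `.bar?.baz?`, while making `baz` required raises, matching `.bar?.baz`. The table `[{"foo": 1}, {"foo": 2}, {}]` with `[("foo", True)]` gives `[1, 2, None]`.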

This PR adds the ability to use `?.` in cell paths, somewhat like optional chaining in JS and C#'s null-conditional operator.

This provides a succinct way for users to specify whether a failed cell path access should return a `ShellError` or `Value::Nothing`. It is not aiming for exact compatibility with JS and C#.

## Examples

## How does this relate to `get` and `select`?

As always, any cell path access can be rewritten to use `get` (ex: `$foo.a` -> `$foo | get a`). The arguments given to `get` and `select` (ex: `foo` and `bar` in `select foo bar`) are actually cell paths.

Before this PR, cell paths used by `get` and `select` were not allowed to start with a `.`; parsing `get .a` would fail. After this PR, a `.` before the first member is (optionally) allowed. All of the following are now valid:

I think this makes sense; the user's intent is unambiguous in each case and it provides consistency whether using `get` or accessing a cell path without `get`.

As part of this PR, the `-i`/`--ignore-errors` flag has been removed from `get` and `select`. Using `?` in cell paths now offers more fine-grained control over error handling:

## Future Work

If this change lands, I should be able to follow it up with another PR that makes cell path access stream properly instead of collecting `ListStream`s.