Skip to content

Spread operator for list literals#11006

Merged
kubouch merged 22 commits intonushell:mainfrom
ysthakur:spread
Nov 22, 2023
Merged

Spread operator for list literals#11006
kubouch merged 22 commits intonushell:mainfrom
ysthakur:spread

Conversation

@ysthakur
Copy link
Copy Markdown
Member

@ysthakur ysthakur commented Nov 9, 2023

Goes towards implementing #10598

Description

This PR is for adding a spread operator that can be used when building lists. Additional functionality can be added later.

Changes:

  • Treating ... as the spread operator when it appears before $, [, or ( inside lists (no whitespace allowed in between)
  • A Spread variant in the Expr enum

Possible problem: Inside parse_list_expression, since ...$foo is a single token, the Span representing ...$foo has to be modified to only include $foo before parse_multispan_value can be called. It seems like it's merely icky and wouldn't actually cause any problems, but it's worth reviewing anyway.

... is still treated as a normal string outside lists, and even in lists, it is not treated as a spread operator when not followed immediately by a $, [, or (.

User-Facing Changes

The ... operator will be allowed only inside lists, before variables, lists, and subexpressions in parentheses.

In certain situations, ... can simplify building lists, letting you avoid ++, prepend, and append:

> let foo = ["bar", "baz", "foobar"]
> ["quux" ...$foo "xyzzy" ...[1 2] ...("ab" | split chars) ...bare]
╭───┬─────────╮
│ 0 │ quux    │
│ 1 │ bar     │
│ 2 │ baz     │
│ 3 │ foobar  │
│ 4 │ xyzzy   │
│ 5 │       1 │
│ 6 │       2 │
│ 7 │ a       │
│ 8 │ b       │
│ 9 │ ...bare │
╰───┴─────────╯

Spreading a homogeneous list inside another homogeneous list preserves the type:

> [1 2 ...[3 4] 5] | describe
list<int>
> [1 2 ...["misfit"]] | describe
list<any>

TODO more examples?

The ... bit is highlighted the same color as other operators.

Tests + Formatting

The current tests cover these cases:

  • Spreading an empty list
  • Spreading variables, list literals, and subexpressions
  • Nested spreads ([ ...[ ...[ ...[ a ] b ] c ] d ])
  • Spreading non-lists (results in an error)
  • Making sure the result is of the right type (a list<int> spread inside a list<int> evaluates to a list<int>, mixing types results in a list<any>)

After Submitting

@sholderbach
Copy link
Copy Markdown
Member

Your plan of action around the AST/Expr sounds good to me. Excited for this!

@amtoine amtoine linked an issue Nov 9, 2023 that may be closed by this pull request
@ysthakur
Copy link
Copy Markdown
Member Author

Regarding parsing, should ... be treated as a separate token in the lexer itself? This would mean that a command like foo ...bar baz... would be interpreted as foo ... "bar" "baz..." rather than foo "...bar" "baz..." (parsing rest parameters would also be slightly different but that's a minor issue). This might confuse users expecting ...bar to be a single string, but it might also confuse users who know the spread operator exists and expect ...bar to be equivalent to ..."bar".

It looks like .. is only treated as a range operator when a range/math expression is expected, so it may be better to go with that precedent.

If a user enters ...foo and ... is treated as a separate token, then there would be a can't run executable error that would point only to the ... part (rather than ...foo, as it does now). So we would be obliged to have an error like spread operator unexpected instead. This does mean that we'd need to check for stray spread operators that are outside lists (and later records and command calls), which is extra work, but it seems like we'd just need a single check in parse_call for that, and it feels more "correct." This is assuming that people generally don't name their executables starting with ..., but I feel like that's a reasonable assumption (and they can use ^ anyway).

@amtoine
Copy link
Copy Markdown
Member

amtoine commented Nov 10, 2023

excited to see this become a thing 😇

this would help for things like that in the long term i think 👌

@ysthakur
Copy link
Copy Markdown
Member Author

ysthakur commented Nov 11, 2023

This comment is outdated. Original text:

Details I made some changes to the parser to let it parse spread expressions, but I put it in a different branch in my fork and made a [PR](https://github.com/ysthakur/pull/1) in case we want to do the parsing in a different way, so we wouldn't have to revert a bunch of commits on this branch. (let me know if I should merge the PR if that makes reviewing the changes more convenient)

The changes it makes are:

  • Treating ... as a separate token whenever it appears before an expression, not just in lists (so ...foo would always be tokenized as ... foo)
  • Because of the above change, the code for parsing rest parameters had to be made a bit more complicated. A new RestParamMode variant had to be added to ParseMode
  • Most importantly, in parse_list_expression, when it encounters ..., it tries to parse an expression shaped like either a string or a list of whatever the outer list's elements are shaped like
    • If the argument to the spread operator is a list containing elements not of the same type as the outer/containing list, then the type of the outer list becomes list<any>
    • If the argument to the spread operator is a string, it's basically treated like a list of strings when it comes to determining the type of the outer list

It doesn't check for stray spread operators yet, though.

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Nov 16, 2023

@ysthakur I'm excited about this. Please feel free to ping us if you need answers/help.

@ysthakur ysthakur marked this pull request as ready for review November 17, 2023 00:05
@ysthakur ysthakur marked this pull request as draft November 18, 2023 00:03
@ysthakur ysthakur marked this pull request as ready for review November 18, 2023 02:50
@kubouch
Copy link
Copy Markdown
Contributor

kubouch commented Nov 18, 2023

I played with it and looks good. I didn't manage to break it! A few more tests covering potential edge cases would be nice, but otherwise looks good to me.

I also tried nested spread ([ ...[ ...[ ...[ a ] b ] c ] d ]) and spreading lists with other values than strings (let a = [{name: a, value: 1}]; [{name: b, value: 2}, ...$a]), it all works. Syntax highlighting also seems to work correctly with it. I like that it acts as a fine-grained flatten operator.

The ... foo and ...$foo are not currently treated as errors, although I think that's fine for this PR.

The manual span handling is a bit hacky, but we do it in other places in the parser, and in this PR it seems carefully checked. In our new parser experiment, that might eventually replace the current parser, we're trying to use a less error-prone style.

@ysthakur
Copy link
Copy Markdown
Member Author

@kubouch Do you mean that ... foo and ... $foo should be treated as errors only inside lists, or also outside?

@kubouch
Copy link
Copy Markdown
Contributor

kubouch commented Nov 19, 2023

They are actually a valid syntax, so I think they shouldn't be treated as errors.

@kubouch kubouch merged commit 823e578 into nushell:main Nov 22, 2023
@ysthakur ysthakur deleted the spread branch November 22, 2023 21:42
amtoine pushed a commit that referenced this pull request Nov 29, 2023
Goes towards implementing #10598, which asks for a spread operator in
lists, in records, and when calling commands (continuation of #11006,
which only implements it in lists)

# Description
This PR is for adding a spread operator that can be used when building
records. Additional functionality can be added later.

Changes:

- Previously, the `Expr::Record` variant held `(Expression, Expression)`
pairs. It now holds instances of an enum `RecordItem` (the name isn't
amazing) that allows either a key-value mapping or a spread operator.
- `...` will be treated as the spread operator when it appears before
`$`, `{`, or `(` inside records (no whitespace allowed in between) (not
implemented yet)
- The error message for duplicate columns now includes the column name
itself, because if two spread records are involved in such an error, you
can't tell which field was duplicated from the spans alone

`...` will still be treated as a normal string outside records, and even
in records, it is not treated as a spread operator when not followed
immediately by a `$`, `{`, or `(`.

# User-Facing Changes
Users will be able to use `...` when building records.

```
> let rec = { x: 1, ...{ a: 2 } }
> $rec
╭───┬───╮
│ x │ 1 │
│ a │ 2 │
╰───┴───╯
> { foo: bar, ...$rec, baz: blah }
╭─────┬──────╮
│ foo │ bar  │
│ x   │ 1    │
│ a   │ 2    │
│ baz │ blah │
╰─────┴──────╯
```
If you want to update a field of a record, you'll have to use `merge`
instead:
```
> { ...$rec, x: 5 }
Error: nu::shell::column_defined_twice

  × Record field or table column used twice: x
   ╭─[entry #2:1:1]
 1 │  { ...$rec, x: 5 }
   ·       ──┬─  ┬
   ·         │   ╰── field redefined here
   ·         ╰── field first defined here
   ╰────
> $rec | merge { x: 5 }
╭───┬───╮
│ x │ 5 │
│ a │ 2 │
╰───┴───╯
```

# Tests + Formatting

# After Submitting
hardfau1t pushed a commit to hardfau1t/nushell that referenced this pull request Dec 14, 2023
hardfau1t pushed a commit to hardfau1t/nushell that referenced this pull request Dec 14, 2023
Goes towards implementing nushell#10598, which asks for a spread operator in
lists, in records, and when calling commands (continuation of nushell#11006,
which only implements it in lists)

# Description
This PR is for adding a spread operator that can be used when building
records. Additional functionality can be added later.

Changes:

- Previously, the `Expr::Record` variant held `(Expression, Expression)`
pairs. It now holds instances of an enum `RecordItem` (the name isn't
amazing) that allows either a key-value mapping or a spread operator.
- `...` will be treated as the spread operator when it appears before
`$`, `{`, or `(` inside records (no whitespace allowed in between) (not
implemented yet)
- The error message for duplicate columns now includes the column name
itself, because if two spread records are involved in such an error, you
can't tell which field was duplicated from the spans alone

`...` will still be treated as a normal string outside records, and even
in records, it is not treated as a spread operator when not followed
immediately by a `$`, `{`, or `(`.

# User-Facing Changes
Users will be able to use `...` when building records.

```
> let rec = { x: 1, ...{ a: 2 } }
> $rec
╭───┬───╮
│ x │ 1 │
│ a │ 2 │
╰───┴───╯
> { foo: bar, ...$rec, baz: blah }
╭─────┬──────╮
│ foo │ bar  │
│ x   │ 1    │
│ a   │ 2    │
│ baz │ blah │
╰─────┴──────╯
```
If you want to update a field of a record, you'll have to use `merge`
instead:
```
> { ...$rec, x: 5 }
Error: nu::shell::column_defined_twice

  × Record field or table column used twice: x
   ╭─[entry nushell#2:1:1]
 1 │  { ...$rec, x: 5 }
   ·       ──┬─  ┬
   ·         │   ╰── field redefined here
   ·         ╰── field first defined here
   ╰────
> $rec | merge { x: 5 }
╭───┬───╮
│ x │ 5 │
│ a │ 2 │
╰───┴───╯
```

# Tests + Formatting

# After Submitting
dmatos2012 pushed a commit to dmatos2012/nushell that referenced this pull request Feb 20, 2024
dmatos2012 pushed a commit to dmatos2012/nushell that referenced this pull request Feb 20, 2024
Goes towards implementing nushell#10598, which asks for a spread operator in
lists, in records, and when calling commands (continuation of nushell#11006,
which only implements it in lists)

# Description
This PR is for adding a spread operator that can be used when building
records. Additional functionality can be added later.

Changes:

- Previously, the `Expr::Record` variant held `(Expression, Expression)`
pairs. It now holds instances of an enum `RecordItem` (the name isn't
amazing) that allows either a key-value mapping or a spread operator.
- `...` will be treated as the spread operator when it appears before
`$`, `{`, or `(` inside records (no whitespace allowed in between) (not
implemented yet)
- The error message for duplicate columns now includes the column name
itself, because if two spread records are involved in such an error, you
can't tell which field was duplicated from the spans alone

`...` will still be treated as a normal string outside records, and even
in records, it is not treated as a spread operator when not followed
immediately by a `$`, `{`, or `(`.

# User-Facing Changes
Users will be able to use `...` when building records.

```
> let rec = { x: 1, ...{ a: 2 } }
> $rec
╭───┬───╮
│ x │ 1 │
│ a │ 2 │
╰───┴───╯
> { foo: bar, ...$rec, baz: blah }
╭─────┬──────╮
│ foo │ bar  │
│ x   │ 1    │
│ a   │ 2    │
│ baz │ blah │
╰─────┴──────╯
```
If you want to update a field of a record, you'll have to use `merge`
instead:
```
> { ...$rec, x: 5 }
Error: nu::shell::column_defined_twice

  × Record field or table column used twice: x
   ╭─[entry nushell#2:1:1]
 1 │  { ...$rec, x: 5 }
   ·       ──┬─  ┬
   ·         │   ╰── field redefined here
   ·         ╰── field first defined here
   ╰────
> $rec | merge { x: 5 }
╭───┬───╮
│ x │ 5 │
│ a │ 2 │
╰───┴───╯
```

# Tests + Formatting

# After Submitting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce a spread operator

5 participants