Skip to content

allow lists to have type annotations#8270

Closed
1Kinoti wants to merge 9 commits intonushell:mainfrom
1Kinoti:main
Closed

allow lists to have type annotations#8270
1Kinoti wants to merge 9 commits intonushell:mainfrom
1Kinoti:main

Conversation

@1Kinoti
Copy link
Copy Markdown
Contributor

@1Kinoti 1Kinoti commented Mar 1, 2023

Description

this pr allows list type parameters to have type annotations,

example

def nums [nums: list<int>] { $nums }

unterminated annotations raise an error

Error: nu::parser::unterminated_list_type_annotation (link)

  × Unterminated list type annotation.
   ╭─[entry #42:1:1]
 1 │ def test [list: list<int] {}
   ·                 ───┬──
   ·                    ╰── use `>` to terminate it
   ╰────

type annotations can be nested

def test [list: list<list<int>>] {}

empty type annotations are allowed

def test [list: list<>] {} # this is the same as `list: list`

the original format is also allowed

def test [list: list] {}

'would be nice' polishes

  1. the error message could be better. i am not sure if the error message i used are the best

  2. we should be able to parse such a line, for better error reporting

def test [list: list< int>] {}
                ─────┬────
                      ╰── spaces are not allowed in type annotations

@1Kinoti 1Kinoti marked this pull request as draft March 1, 2023 01:46
@1Kinoti 1Kinoti marked this pull request as ready for review March 1, 2023 02:23
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 1, 2023

Codecov Report

Merging #8270 (5978c8b) into main (d614188) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #8270      +/-   ##
==========================================
+ Coverage   68.18%   68.21%   +0.02%     
==========================================
  Files         620      620              
  Lines       99306    99345      +39     
==========================================
+ Hits        67708    67764      +56     
+ Misses      31598    31581      -17     
Impacted Files Coverage Δ
crates/nu-parser/src/parser.rs 82.76% <100.00%> (+0.31%) ⬆️
src/test_bins.rs 96.88% <0.00%> (+0.44%) ⬆️
crates/nu-parser/src/lite_parser.rs 69.34% <0.00%> (+2.16%) ⬆️

@baehyunsol
Copy link
Copy Markdown
Contributor

Does it throw an error when [1, 2, true] is given when list<int> is expected?

@1Kinoti
Copy link
Copy Markdown
Contributor Author

1Kinoti commented Mar 1, 2023

yes it does. unfortunately,, it currently only underlines the first list element that does not match the specified type

def test [list: list<int>] {};
test [1 true 5]
        ─┬─
         ╰── expected int

Copy link
Copy Markdown
Member

@sholderbach sholderbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing the type system a good bit forward! We have been missing out on that for a while now.

Comment on lines +2888 to +2891
if bytes.starts_with(b"list<") {
return parse_list_shape(working_set, bytes, span);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't particularly like the separated special case. It might be confusing when you look at the larger match below.
But I can see how it makes sense with the error return in parse_list_shape().

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the review!

i had thought about using a match arm with a guard but that looked even more confusing

let result = match bytes {
  // ...
  list if list.starts_with(b"list<") => { /* ... */ }
}

Comment on lines +2859 to +2879
return (SyntaxShape::List(Box::new(SyntaxShape::Any)), None);
} else if src.is_empty() {
// list<
return (
SyntaxShape::List(Box::new(SyntaxShape::Any)),
Some(unterminated_error),
);
}

let (shape, err) = parse_shape_name(working_set, src, inner_span);
if err.is_some() {
return (SyntaxShape::List(Box::new(shape)), err);
}

let error = if !is_terminated {
Some(unterminated_error)
} else {
None
};

(SyntaxShape::List(Box::new(shape)), error)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those errors still return a SyntaxShape::List(). Is this desirable or should we reject bogus type names in there outright? (Haven't checked how the parse/error pair is handled upstream in this case)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the final return type of parse_shape_name(), any unknown type makes the function return type Any but with an Some(error). from the tests, i found that if the error section of the tuple is some, then the error is raised. so it seems that if Some(error) then the returned shape name doesn't matter,

i returned a list<any> just for consistency

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a parser expert, seems to work in the tests, which is the important bit.
If anyone of the parser experts wants to weigh in, would be welcome.

Comment on lines +482 to +486
fn list_type_annotations() -> TestResult {
let input = "def test [list: list<string>] { $list | length }; test [ 'nushell', 'nunu' ]";
let expected = "2";
run_test(input, expected)
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the thorough testing!

Comment on lines +166 to +175
fn def_with_specified_list_type_nested() {
let actual = nu!(
cwd: ".", pipeline(
r#"
def test-command [ foo: list<list<any>> ] {}
"#
));

assert!(actual.err.is_empty());
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are those tests redundant to the tests in src/tests/test_parser.rs or does it catch another potential area of issues?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before this pr there was this test, this is what i replaced.

i will remove them

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I saw that we already had one there. I checked the src/tests/test_parser.rs tests and they also check the whole nu binary so there is no issue that it is just a unit test in parser land.

Comment on lines +178 to +184
fn def_with_specified_list_type_unterminated() {
let actual = nu!(
cwd: ".", pipeline(
r#"
def test-command [ foo: list<any ] {}
"#
));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth adding a test with a bogus spec like list<foo> or invalid separators like list<foo; bar>,list<int, string>.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@sholderbach sholderbach added A:parser Issues related to parsing A:type-system Problems or features related to nushell's type system labels Mar 5, 2023
@amtoine
Copy link
Copy Markdown
Member

amtoine commented Mar 5, 2023

amazing ❤️

i do even think this PR will address the second part of #8109 😮

@1Kinoti
Copy link
Copy Markdown
Contributor Author

1Kinoti commented Mar 5, 2023

one thing i not sure about is if the error messages are okay, i couldn't come up with anything better

Copy link
Copy Markdown
Member

@sholderbach sholderbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround!

#[test]
fn list_type_annotations_nested_with_unknown_type() -> TestResult {
let input =
"def test [list: list<list<str>>] { $list | length }; test [ ['nushell'], ['nunu'] ]";
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took me a second to realize that str is not string. Damn you Rust and Python brain. :D

#[test]
fn list_type_annotations_unknown_separators() -> TestResult {
let input = "def test [list: list<int, string>] { $list | length }; test [2 5 4]";
let expected = "use `>` to terminate it";
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great if we could use something like ParseError::ExtraPositional or ExtraToken here to indicate that it is too much. Don't know if we get to see that from the lexer?

Don't know if the whitespace variants below have to be errors or if we could be tolerant to them?

Copy link
Copy Markdown
Contributor Author

@1Kinoti 1Kinoti Mar 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently, the lexer lexes that signature as two parameters, separated by a comma, so we get,
list: list<int and string> as two separate params. to really handle this, we need to have a tiny lexer that works on the signature and replaces this,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that sucks a bit. I think, if we want to repeat your work for records, we would need to be able to support that, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean something like

def test [person: record<name: string, age: int>] {}

yeah, this DEFINITELY needs the smaller lexer

@1Kinoti 1Kinoti closed this Mar 18, 2023
@amtoine
Copy link
Copy Markdown
Member

amtoine commented Mar 18, 2023

oooh no why did you close this? 😢

@1Kinoti
Copy link
Copy Markdown
Contributor Author

1Kinoti commented Mar 18, 2023

currently, the lexer lexes that signature as two parameters, separated by a comma, so we get,
list: list as two separate params. to really handle this, we need to have a tiny lexer that works on the signature and replaces this,

realized that it needs more work than i thought. i will open a more refined one when i figure out if we can rewrite the lexer, or at least have another lexer specifically meant for signatures. if we implement the latter, then we can solve all signature problems and even future ones, like:

# the lexer can be used to lex assignment type annotations
let person: record<name: string, age: int> = {name: "Name", age: 25}

# can also be used to lex return values
def run [] -> record<status: string, value: any> {
   {status: Ok, value: "some value"}
}

# and of cause param type annotations
def complex_math [args: list<int>] {}

@sholderbach
Copy link
Copy Markdown
Member

I think we can start off with the list signatures as a stepping stone to see if we run into particular issues in practice. I wouldn't mind having the partial solution while having the work in the lexer going on in the background. Sorry that the PR got under the wheels of all the other stuff going on. (Don't know if there were merge problems accumulating?)

sholderbach pushed a commit that referenced this pull request Mar 24, 2023
this pr refines #8270 and closes #8109

# description
examples:

the original syntax is okay
```nu
def okay [nums: list] {}         # the type of list will be list<any>
```

empty annotations are allowed in any variation
the last two may be caught by a future formatter, 
but do not affect `nu` code currently
```nu
def okay [nums: list<>] {}       # okay

def okay [nums: list<     >] {}  # weird but also okay

def okay [nums: list<
>] {}                            # also weird but okay
```

types are allowed (See [notes](#notes) below)
```nu
def okay [nums: list<int>] {}    # `test [a b c]` will throw an error 

def okay [nums: list< int > {}   # any amount of space within the angle brackets is okay

def err [nums: list <int>] {}    # this is not okay, `nums` and `<int>` will be parsed as
                                 # two separate params, 
```

nested annotations are allowed in many variations
```nu
def okay [items: list<list<int>>] {}

def okay [items: list<list>] {}
```

any unterminated annotation is caught
```nu
Error: nu::parser::unexpected_eof

  × Unexpected end of code.
   ╭─[source:1:1]
 1 │ def err [nums: list<int] {}
   ·                       ▲
   ·                       ╰── expected closing >
   ╰────
```

unknown types are flagged
```nu
Error: nu::parser::unknown_type

  × Unknown type.
   ╭─[source:1:1]
 1 │ def err [nums: list<str>] {}
   ·                     ─┬─
   ·                      ╰── unknown type
   ╰────

Error: nu::parser::unknown_type

  × Unknown type.
   ╭─[source:1:1]
 1 │ def err [nums: list<int, string>] {}
   ·                    ─────┬─────
   ·                          ╰── unknown type
   ╰────
```

# notes
the error message for mismatched types in not as intuitive
```nu
Error: nu::parser::parse_mismatch

  × Parse mismatch during operation.
   ╭─[source:1:1]
 1 │ def err [nums: list<int>] {}; err [a b c]
   ·                                    ┬
   ·                                    ╰── expected int
   ╰────
```
it should be something like this
```nu
Error: nu::parser::parse_mismatch

  × Parse mismatch during operation.
   ╭─[source:1:1]
 1 │ def err [nums: list<int>] {}; err [a b c]
   ·                                    ──┬──
   ·                                      ╰── expected list<int>
   ╰────
```
this is currently not implemented
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A:parser Issues related to parsing A:type-system Problems or features related to nushell's type system

Projects

None yet

Development

Successfully merging this pull request may close these issues.

list type annotations do not allow default values and specifying generic type

4 participants