Use PatternMatchAs for mapping pattern's rest node#6838
Use PatternMatchAs for mapping pattern's rest node#6838charliermarsh wants to merge 1 commit intomainfrom
PatternMatchAs for mapping pattern's rest node#6838Conversation
PR Check ResultsEcosystem✅ ecosystem check detected no changes. BenchmarkLinuxWindows |
MichaReiser
left a comment
There was a problem hiding this comment.
This is interesting for two reasons:
- I don't understand why there's a dedicated node for
*namebut not**name - It would be similar to
argumentsandparameters
The main downside that I see is that the AST now allows representing more invalid syntaxes (assuming someone mutates the AST).
Other than that, there's a limitation that **_ isn't valid syntax. The PEP only mentionts in the quick intro...
Mapping patterns: {"bandwidth": b, "latency": l} captures the "bandwidth" and "latency" values from a dict. Unlike sequence patterns, extra keys are ignored. A wildcard **rest is also supported. (But **_ would be redundant, so it is not allowed.) [source]
... case {"a": 1, "b": 2, **_}: ...
File "<stdin>", line 2
case {"a": 1, "b": 2, **_}: ...
^
SyntaxError: invalid syntax
Our parser silently parsed this before (and continues to do so now). So this isn't a specific to this PR
Nevertheless, I'm a bit hesitant from making this change because I wonder why Python decided on their AST structure. Did you do some research to find why Python structured their AST the way they did?
| pub keys: Vec<Expr>, | ||
| pub patterns: Vec<Pattern>, | ||
| pub rest: Option<Identifier>, | ||
| pub rest: Option<Box<Pattern>>, |
There was a problem hiding this comment.
Why Box<Pattern> and not limiting it to the specific node?
No, I can look into it. I assume it's because you can't have arbitrary patterns on the right-hand side of a |
|
I can't find much on it, after searching through the Discuss and the CPython repo itself, but it's hard to Google for it since there were a bunch of PEPs that went through a lot of iteration (some rejected, some superseded, etc.). However, I did notice that in the PEP, they use But we resolve CapturePattern: ast::Pattern = {
<location:@L> <name:Identifier> <end_location:@R> => ast::PatternMatchAs {
pattern: None,
name: if name.as_str() == "_" { None } else { Some(name) },
range: (location..end_location).into()
}.into(),
}This is the pattern we match when we see match foo:
case x:
...(Python also parses this as a And subsequently, we don't use MappingPattern: ast::Pattern = {
<location:@L> "{" "**" <rest:Identifier> ","? "}" <end_location:@R> => {
ast::PatternMatchMapping {
keys: vec![],
patterns: vec![],
rest: Some(rest),
range: (location..end_location).into()
}.into()
},
}Anyway, I'm not really sure either way on this one. I think this will make our lives easier, but it is a deviation from the CPython AST. |
Separately: we can fix that by validating the name -- we already do that for AsPattern: ast::Pattern = {
<location:@L> <pattern:OrPattern> "as" <name:Identifier> <end_location:@R> =>? {
if name.as_str() == "_" {
Err(LexicalError {
error: LexicalErrorType::OtherError("cannot use '_' as a target".to_string()),
location,
})?
} else {
Ok(ast::Pattern::MatchAs(
ast::PatternMatchAs {
pattern: Some(Box::new(pattern)),
name: Some(name),
range: (location..end_location).into()
},
))
}
},
} |
|
Thinking on this more, though it is a formatter-centric change, I still find it odd that |
Maybe Identifier should be a node? It already has a range |
|
That's an interesting idea, I wouldn't mind it. |
For me, changing Identifier to a node is easier to assess than whether we should change the pattern structure. I'm just not familiar enough with its representation and I would need to spend some time to get up to speed. Having a dedicated node for Identifiers would move toward my long-term goal of having a In the meantime, having I'll leave it up to you to decide whether you want to change pattern or identifier because you have more context than I. |
|
@MichaReiser - That makes sense, I will convert to draft until I have time to try it. |
|
Not a priority right now, will revisit eventually. |
Summary
In the mapping pattern match, you can have a rest argument to capture the remaining fields, like:
The
restargument has to be an identifier, and has to come last, and can't be_. Right now, we represent it as an identifier; this PR instead changes it to aPattern, usingPatternMatchAswith an emptypattern(which is equivalent to an identifier, and appears elsewhere in the pattern matching AST -- this is the pattern you get when you docase x:to match a wildcard and assign it tox).The nice thing about this change is that the
restin**restis now a node, not just an identifier, which I believe will significantly simplify some of the comment handling in #6836 (and will also enable that comment handling to look a lot more like the dictionary comment handling).Test Plan
cargo test