Conversation
Draft for @jcouv review.
|
pping @jcouv |
| There may be reasons to violate this rule. For example, the syntax model does not have a concept of precedence, but the parser uses precedence to decide how to assemble the tree. Precedence errors are therefore reported in the parser. As the syntactic shape of the tree is impacted here and we do not want to form illegal tree, diagnostics at this layer are sensible. | ||
|
|
||
| However, diagnostics should be avoided for cases where the syntactic model can be easily fitted to, but the rule is effectively a semantic one driven by external context. A good example of this are 'language version' checks. Outside of exceptional cases (like 'record parsing') parsing is not actually affected by language-version. Instead, the lang version simply states if the construct can be used or not, not if it was recognized and parsed successfully into the syntax model. These checks should happen later, like in the decl-table or binding passed of the compiler (see the DOs/DON'Ts section of this document below for reasons why). Note: an acceptable reason to still keep these checks at the parser level would be to avoid a simple and direct parsing check turning into a 'smeared out peanut butter' check higher up (e.g. where perhaps dozens of locations might need checks added). In that case, the negative cost to compiler maintainability would outweigh the positive benefits we get elsewhere. | ||
|
|
There was a problem hiding this comment.
It feels like the text changes in this PR can be simplified significantly. Please consider replacing the changes with the following. More than this seems unnecessary.
PREFER reporting diagnostics in binding when diagnostics do not affect the shape of the tree
The incremental parser does not reuse syntax nodes with diagnostics so it is preferable to report diagnostics in binding rather than parsing when diagnostics do not affect the shape of the syntax tree and when the cost of reporting diagnostics in binding or parsing is similar.
The common case is a syntax change introduced in a newer language version where the syntax tree shape is independent of language version.
|
|
||
| There may be reasons to violate this rule, for example where the language is specified to be sensitive to context. For example, `await` is treated as a keyword if the enclosing method has the `async` modifier. | ||
|
|
||
| #### Impact of not following these DOs/DON'Ts |
There was a problem hiding this comment.
This section seems unnecessary.
There was a problem hiding this comment.
I think this was necessary as the impact here was not well understood when conversing on this topic with other team members.
|
Consider whether this PR is still necessary now that the relevant code was labelled with |
|
I thnik it is still pretty relevant. Esp for someone coming into the codebase. Explaining the 'whys' is important IMO to get a good grasp of the reason we care about this. |
|
|
||
| 1. The presence of ambient context greatly impacts the ability to do incremental parsing properly. Ambient context must be tracked in some fashion and incremental reuse across disparate contexts leads to violating the invariant that incremental reparsing produces the same tree as normal parsing. This can and lead to fundamental brokenness that is only solved by a host restart. As an example, consider an edit that adds 'async' to a method. This must necessarily affect reuse of nodes within the method as they may be parsed separately. Every piece of ambient context can have this effect and it dramatically makes it harder to reason about incremental edits and can lead to subtle and hard to diagnose or repro incremental errors in the wild. | ||
|
|
||
| 2. The presense of diagnostics beyond just `"missing token" and "unexpected token"` has impact again to the incremental parser. Incremental parsing cannot reuse nodes that have errors in them, as it does not know what the root cause of the error was and if it would be fixed by an edit in a disparate part of the file. As such, any additional diagnostic forces complete reparsing of that construct and all parent nodes above it. This adds excess time to incremental parsing and can cause higher memory churn as well as new nodes must be created. Both of these problems directly impact downstream consumption in IDE scenarios. In the IDE typeing latency is a high value SLA target we need to meet. However, many features must operate within the typing window in a manner that is 'correct' wrt to the syntax tree. This includes, but is not limited to featuers like brace-insertion as well as indentation. Both of these must happen near-instantiously to the user. And both need an up-to-date SyntaxTree to determine what to do properly. As such, these trees are retrieved synchronously using incremental parsing, so as little extra work as possible must be done so we can fit all the remainder of the work into the time-slice available. Additional diagnostics interferes with this, adding extra unnecessary CPU and causing memory churn. |
There was a problem hiding this comment.
| 2. The presense of diagnostics beyond just `"missing token" and "unexpected token"` has impact again to the incremental parser. Incremental parsing cannot reuse nodes that have errors in them, as it does not know what the root cause of the error was and if it would be fixed by an edit in a disparate part of the file. As such, any additional diagnostic forces complete reparsing of that construct and all parent nodes above it. This adds excess time to incremental parsing and can cause higher memory churn as well as new nodes must be created. Both of these problems directly impact downstream consumption in IDE scenarios. In the IDE typeing latency is a high value SLA target we need to meet. However, many features must operate within the typing window in a manner that is 'correct' wrt to the syntax tree. This includes, but is not limited to featuers like brace-insertion as well as indentation. Both of these must happen near-instantiously to the user. And both need an up-to-date SyntaxTree to determine what to do properly. As such, these trees are retrieved synchronously using incremental parsing, so as little extra work as possible must be done so we can fit all the remainder of the work into the time-slice available. Additional diagnostics interferes with this, adding extra unnecessary CPU and causing memory churn. | |
| 2. The presence of diagnostics beyond just `"missing token" and "unexpected token"` has impact again to the incremental parser. Incremental parsing cannot reuse nodes that have errors in them, as it does not know what the root cause of the error was and if it would be fixed by an edit in a disparate part of the file. As such, any additional diagnostic forces complete reparsing of that construct and all parent nodes above it. This adds excess time to incremental parsing and can cause higher memory churn as well as new nodes must be created. Both of these problems directly impact downstream consumption in IDE scenarios. In the IDE typing latency is a high value SLA target we need to meet. However, many features must operate within the typing window in a manner that is 'correct' wrt to the syntax tree. This includes, but is not limited to features like brace-insertion as well as indentation. Both of these must happen near-instantaneously to the user. And both need an up-to-date SyntaxTree to determine what to do properly. As such, these trees are retrieved synchronously using incremental parsing, so as little extra work as possible must be done so we can fit all the remainder of the work into the time-slice available. Additional diagnostics interferes with this, adding extra unnecessary CPU and causing memory churn. |
Spelling fixes :)
|
IMO we should just merge as long as we don't find any of the documentation to be wrong/misleading/distracting. An imperfect document that includes useful info is better than the absence of it. |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
No description provided.