Currently, we attempt error recovery by skipping tokens until a specific token is found, i.e. a terminator or a closing delimiter. To better isolate the error recovery, we also keep track of which tokens are expected in the enclosing (nested) delimited groups, and we may stop skipping before reaching the expected token so that the skipped token range stays inside a given group. Let's call the tokens that stop the skipping synchronizing tokens.
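To make the description above concrete, here is a minimal sketch of delimiter-aware skipping. It is not the actual implementation: `Tok`, `closer_of` and `skip_until` are hypothetical names invented for illustration, and the sketch assumes the tokens expected by enclosing groups are passed in as a plain slice.

```rust
/// Hypothetical token kinds, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Tok {
    OpenBrace,
    CloseBrace,
    OpenParen,
    CloseParen,
    Semicolon,
    Other,
}

/// Returns the closing delimiter matching an opening one, if `tok` opens a group.
fn closer_of(tok: Tok) -> Option<Tok> {
    match tok {
        Tok::OpenBrace => Some(Tok::CloseBrace),
        Tok::OpenParen => Some(Tok::CloseParen),
        _ => None,
    }
}

/// Skips tokens from `start` until `expected` is found outside of any group
/// opened during skipping. `enclosing_closers` (an assumption of this sketch)
/// holds the closers of the groups we are already inside of; reaching one of
/// them stops the skipping early, so the skipped range never leaks out of the
/// current group. Returns the index where skipping stopped, or `tokens.len()`
/// if no synchronizing token was found.
fn skip_until(tokens: &[Tok], start: usize, expected: Tok, enclosing_closers: &[Tok]) -> usize {
    // Closers of the groups opened while skipping, innermost last.
    let mut local: Vec<Tok> = Vec::new();
    for (i, &tok) in tokens.iter().enumerate().skip(start) {
        match local.last().copied() {
            // The innermost locally opened group is closed: leave it.
            Some(top) if tok == top => {
                local.pop();
                continue;
            }
            // Not inside a locally opened group: check the synchronizing tokens.
            None if tok == expected || enclosing_closers.contains(&tok) => return i,
            _ => {}
        }
        if let Some(closer) = closer_of(tok) {
            local.push(closer);
        }
    }
    tokens.len()
}
```

In this sketch, a `;` nested inside a parenthesized group opened during skipping is ignored, while the `}` of the enclosing block stops the skipping even though the expected `;` has not been seen yet.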
The current approach works pretty well, but it can be improved by refining the synchronizing token set for a specific parser. For instance, in `pragma solidity` version ranges, it's necessary to stop skipping on the `||` token when recovering from a version expression (see #595). Similarly, it might be reasonable to recover on a `function` keyword (or any other token that starts a contract member definition) when recovering from an incomplete contract member definition.
The common approach is to use the FOLLOW set of the non-terminal being parsed as its synchronizing token set. This would unblock certain recoveries, such as the version expression recovery, and improve existing ones, such as incomplete or unrecognized item definitions.
A good starting point is https://github.com/AntonyBlakey/slang/tree/AntonyBlakey/first_set_dispatch, which computes the FIRST set, from which we can derive the FOLLOW sets for our non-terminal parsers.
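As a rough illustration of that derivation, the sketch below runs the textbook fixed-point FOLLOW computation. It assumes the FIRST sets and the set of nullable non-terminals are already available. `Sym`, `Production`, `Sets` and `follow_sets` are hypothetical names and are not taken from the linked branch.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A grammar symbol: a terminal or a non-terminal, identified by name.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Sym {
    T(&'static str),
    N(&'static str),
}

/// Maps a non-terminal name to a set of terminal names.
type Sets = BTreeMap<&'static str, BTreeSet<&'static str>>;

/// A single production `lhs -> rhs`; an empty `rhs` stands for epsilon.
struct Production {
    lhs: &'static str,
    rhs: Vec<Sym>,
}

/// Computes the FOLLOW sets by iterating to a fixed point.
/// `first` and `nullable` are assumed to be precomputed.
fn follow_sets(
    productions: &[Production],
    first: &Sets,
    nullable: &BTreeSet<&'static str>,
) -> Sets {
    let mut follow = Sets::new();
    for p in productions {
        follow.entry(p.lhs).or_default();
    }

    let mut changed = true;
    while changed {
        changed = false;
        for p in productions {
            // Walking the right-hand side from right to left, `trailer` holds
            // the terminals that can appear right after the current symbol.
            let mut trailer = follow[p.lhs].clone();
            for sym in p.rhs.iter().rev() {
                match *sym {
                    Sym::T(t) => trailer = BTreeSet::from([t]),
                    Sym::N(n) => {
                        let entry = follow.entry(n).or_default();
                        let before = entry.len();
                        entry.extend(trailer.iter().copied());
                        changed |= entry.len() != before;

                        let first_n = first.get(n).cloned().unwrap_or_default();
                        if nullable.contains(n) {
                            // `n` may match nothing, so what follows it can
                            // also follow the symbol to its left.
                            trailer.extend(first_n);
                        } else {
                            trailer = first_n;
                        }
                    }
                }
            }
        }
    }
    follow
}
```

For a simplified, made-up version pragma grammar where a version expression is a range followed by an optional `||`-prefixed tail, this yields a FOLLOW set for the range that contains both `||` and `;`, which is the kind of synchronizing set the version expression recovery needs.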