Description
Currently fparser does some clever magic to store global state on `fparser.two.Fortran2003.Base`: a dictionary mapping each class name to its subclasses. I propose that, rather than relying on global state, we introduce an object that encapsulates this information and acts as the driver of the recursive parse process.
- Each `Base` subclass (herein referred to as a parse element) MUST provide a new `Base.from_source` class method that receives a Parser instance as well as the source to be parsed. The existing parse-via-constructor will be deprecated, and the constructor will be used exclusively for constructing instances of that type given already constructed children (e.g. `Type_Declaration_Stmt(declaration_type_spec_instance)`). The existing `repr` of parse element instances will therefore continue to produce working code.
- The parse element will defer its subelement parsing to the Parser instance using named references (e.g. `parser.match('program-unit', source)`). The Parser instance will use the equivalent of today's `subclasses` state to find the appropriate parse element and call its `from_source` or `match` class method (as appropriate). This will break the existing direct linkage between classes, and will allow improved modularisation of the code, as well as the ability for parsers to be defined with specific extensions without the need for intrusive change to the class hierarchy.
- The parser will have at least two parse methods:
  - `parser.match(typename, source)` - return an instance of a subclass of the given typename, or `None` (equivalent of today's parse element `.match` class method)
  - `parser.parse(typename, source)` - return an instance of a subclass of the given typename, or raise a `NoMatchError` (equivalent of today's parse element constructor)
- Although out of scope for the initial implementation, it is envisaged that the parser will also hold state about the context of what has already been parsed. This will gradually allow issues where prior parse context is required, such as "fparser2 needs more context to be able to parse correctly" (#190), to be addressed.
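The design above could be sketched roughly as follows. This is a toy illustration, not fparser code: the `register` method, the simplified `Name` and `Derived_Type_Stmt` grammar, and the choice to have `from_source` return `None` on a failed match are all assumptions made to keep the example short.

```python
class NoMatchError(Exception):
    """Raised by Parser.parse when no registered element matches."""


class Parser:
    """Hypothetical driver holding the name -> parse element registry."""

    def __init__(self):
        self._elements = {}  # e.g. 'name' -> [Name]

    def register(self, typename, cls):
        # Assumed helper; the proposal does not say how elements are registered.
        self._elements.setdefault(typename, []).append(cls)

    def match(self, typename, source):
        """Return the first matching element instance, or None."""
        for cls in self._elements.get(typename, []):
            node = cls.from_source(source, self)
            if node is not None:
                return node
        return None

    def parse(self, typename, source):
        """Like match, but raise NoMatchError instead of returning None."""
        node = self.match(typename, source)
        if node is None:
            raise NoMatchError(f"{typename}: {source!r}")
        return node


class Name:
    """Terminal element: receives the parser but does not use it."""

    def __init__(self, value):
        self.value = value

    def __repr__(self):
        return f"Name({self.value!r})"

    @classmethod
    def from_source(cls, source, parser):
        text = source.strip()
        return cls(text) if text.isidentifier() else None


class Derived_Type_Stmt:
    """Simplified 'TYPE name' statement; the real grammar is richer."""

    def __init__(self, name):
        # The constructor only accepts an already constructed child.
        self.name = name

    def __repr__(self):
        return f"Derived_Type_Stmt({self.name!r})"

    @classmethod
    def from_source(cls, source, parser):
        keyword, _, rest = source.strip().partition(" ")
        if keyword.lower() != "type":
            return None
        # Sub-element parsing goes through the parser by name, not by class.
        name = parser.match("name", rest)
        return cls(name) if name is not None else None


parser = Parser()
parser.register("name", Name)
parser.register("derived-type-stmt", Derived_Type_Stmt)

stmt = parser.parse("derived-type-stmt", "type a")
print(repr(stmt))  # Derived_Type_Stmt(Name('a'))
```

Note that `Derived_Type_Stmt` never references the `Name` class directly, and that the printed `repr` is itself valid constructor code.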
Backwards compatibility
Existing code such as `Derived_Type_Stmt("type a")` will only be possible with a parser instance, either directly via `Derived_Type_Stmt.from_source("type a", parser)` or indirectly via `parser.parse('derived-type-stmt', source)`. It is worth noting that both the old and the new form require a parser to have been "created", but the new interface makes this explicit and non-global, and allows greater flexibility with regard to parser customisation and future context information.
Benefits
- Avoid global state, thus allowing multiple parsers to exist concurrently (e.g. `f2003`, `f2008`)
  - Avoids the need for global state in the unit tests
- Explicit syntax separates parsing from constructors. Removes the surprising behaviour of `Program_Unit(...)` constructing a non-`Program_Unit` instance.
  - Allows the Parser to handle subtype recursion, thus removing the need for passing `parent_cls` when parsing, and avoiding the need for the parse elements to concern themselves with recursion errors (L269, L269)
- Breaks the dependence on the actual class objects in the parse element definitions, allowing modularisation of definition (Fortran2003.py is currently ~9000 LoC), and a much easier process for swapping out definitions in custom Parser instances (e.g. `f2003-strict` vs `f2003-with-standard-compiler-extensions`)
  - Using named parse elements brings the code a step closer to the original Fortran standard definition (i.e. `Data_Component_Def_Stmt` -> `data-component-def-stmt`), and potentially allows the parse element to define subtype names in fewer places, rather than 3 in the current implementation (e.g. `Declaration_Type_Spec` in `Data_Component_Def_Stmt`)
- Provides possibility of stateful context for addressing issues such as "fparser2 needs more context to be able to parse correctly" (#190)
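To illustrate the first and third benefits, here is a minimal sketch of two parser instances coexisting with different definitions for the same named element. The `Parser` constructor taking a dictionary, the `Name_With_Dollar` class, and the choice of `$` in names as the example extension are all hypothetical.

```python
import re


class Parser:
    """Hypothetical per-instance registry; nothing is stored globally."""

    def __init__(self, elements):
        self._elements = dict(elements)

    def match(self, typename, source):
        for cls in self._elements.get(typename, ()):
            node = cls.from_source(source, self)
            if node is not None:
                return node
        return None


class Name:
    """Strict name: a letter followed by letters, digits or underscores."""

    pattern = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

    def __init__(self, value):
        self.value = value

    @classmethod
    def from_source(cls, source, parser):
        text = source.strip()
        return cls(text) if cls.pattern.fullmatch(text) else None


class Name_With_Dollar(Name):
    """Illustrative extension: additionally allow '$' inside a name."""

    pattern = re.compile(r"[A-Za-z][A-Za-z0-9_$]*")


# Same element name, different definitions, no shared state.
strict = Parser({"name": [Name]})
extended = Parser({"name": [Name_With_Dollar]})

print(strict.match("name", "a$b"))             # None
print(type(extended.match("name", "a$b")).__name__)  # Name_With_Dollar
```

Because each registry lives on its own instance, a unit test can build exactly the parser it needs without touching any other test's state.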
Implementation plan
- Telco to agree design and implementation plan
- All parser elements to receive a `from_source` class method, and all `.match` methods to make use of `.from_source` rather than the explicit constructor form (existing form continues to be fully functional)
- Creation of `Parser` class to hold the equivalent of the `Base.subclasses` state. All `from_source` calls to include the parser in the call arguments, even those that don't recurse/use the parser (e.g. terminal nodes)
- All documentation to be updated to reflect the new interface
- Parse-via-constructor to be deprecated
- All appropriate unit-tests to use a new pytest fixture which is the parser context (most of the existing class constructor tests)
- Global `Base.subclasses` state to be removed
- Parse-via-constructor to be removed
- Parse elements to gradually migrate to using `parser.parse(type_name, source)` or `parser.match(type_name, source)` within their `.match` class method
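One possible shape for the "parse-via-constructor to be deprecated" step is sketched below, assuming the constructor can tell a source string apart from an already constructed child. The `_match` helper and the simplified grammar are hypothetical, and the `parser` argument is accepted but unused in this toy.

```python
import warnings


class NoMatchError(Exception):
    """Raised when the source does not match the element."""


class Name:
    def __init__(self, value):
        self.value = value


class Derived_Type_Stmt:
    def __init__(self, name):
        if isinstance(name, str):
            # Legacy form: Derived_Type_Stmt("type a"). Still works, but warns.
            warnings.warn(
                "Derived_Type_Stmt(source) is deprecated; "
                "use Derived_Type_Stmt.from_source(source, parser)",
                DeprecationWarning,
                stacklevel=2,
            )
            name = self._match(name, parser=None)
        self.name = name

    @staticmethod
    def _match(source, parser):
        # Hypothetical helper: returns the child needed by the constructor.
        keyword, _, rest = source.strip().partition(" ")
        rest = rest.strip()
        if keyword.lower() != "type" or not rest.isidentifier():
            raise NoMatchError(source)
        return Name(rest)

    @classmethod
    def from_source(cls, source, parser):
        # New form: parse first, then construct from the resulting child.
        return cls(cls._match(source, parser))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = Derived_Type_Stmt("type a")                     # warns
    new_style = Derived_Type_Stmt.from_source("type a", None)   # silent

print(len(caught), old_style.name.value, new_style.name.value)  # 1 a a
```

Once the final "parse-via-constructor to be removed" step lands, the `isinstance` branch would simply be deleted, leaving the constructor to accept children only.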