Description
Fortran's grammar is not context free: even simple tokenization requires additional context/state. For example, in if(i.le.20.and.j.le.10) the tokenizer must decide whether "20.and" begins with the real literal "20." or with the integer "20" followed by the operator ".and.". Fparser2 already handles this tokenization example well, but some Fortran parse rules are sensitive to high-level program context, and these are harder to deal with.
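As a rough illustration of the look-ahead this needs, here is a minimal Python sketch of a tokenizer that keeps "20" as an integer when a dot-delimited operator follows it. This is not how fparser2 is implemented; the `tokenize` function, the token kinds, and the operator list are assumptions made for the example, and exponents and kind suffixes are deliberately ignored.

```python
import re

# Illustrative (not exhaustive) set of dot-delimited operators.
DOT_OP = re.compile(r"\.(?:le|lt|ge|gt|eq|ne|and|or|not|eqv|neqv)\.", re.IGNORECASE)
DIGITS = re.compile(r"\d+")
REAL = re.compile(r"\d+\.\d*|\.\d+")  # no exponent or kind support in this sketch
NAME = re.compile(r"[A-Za-z]\w*")


def tokenize(text):
    """Split `text` into (kind, value) pairs using one step of look-ahead."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = DOT_OP.match(text, pos)
        if match:
            tokens.append(("op", match.group(0).lower()))
            pos = match.end()
            continue
        digits = DIGITS.match(text, pos)
        if digits and DOT_OP.match(text, digits.end()):
            # "20.and.": the dot belongs to the operator, not to the number.
            tokens.append(("int", digits.group(0)))
            pos = digits.end()
            continue
        match = REAL.match(text, pos)
        if match:
            tokens.append(("real", match.group(0)))
            pos = match.end()
            continue
        if digits:
            tokens.append(("int", digits.group(0)))
            pos = digits.end()
            continue
        match = NAME.match(text, pos)
        if match:
            tokens.append(("name", match.group(0)))
            pos = match.end()
            continue
        # Anything else is treated as single-character punctuation.
        tokens.append(("punct", text[pos]))
        pos += 1
    return tokens


if __name__ == "__main__":
    for token in tokenize("if(i.le.20.and.j.le.10)"):
        print(token)
```

The key design choice is that the number rule consults what follows the digits before deciding whether the dot is part of the literal, which is state a purely regular, context-free tokenizer would not have.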
For example, take the following code:
module my_struct_mod
  STRUCTURE /item/
    INTEGER id
  END STRUCTURE
end module my_struct_mod

module my_func_mod
contains
  function item(id)
    integer, intent(in) :: id
    integer :: item
    item = id * 2
  end function item
end module my_func_mod

program main
  !use my_struct_mod, only: item
  !use my_func_mod, only: item
  print*, item(id=2)
end program main
The line containing the print statement will be parsed using R912 (PRINT format [, output-item-list]), and its single output-item will be parsed with:
R722 (expr)
-> R717 (level-5-expr)
-> R716 (equiv-operand)
-> R715 (or-operand)
-> R714 (and-operand)
-> R712 (level-4-expr)
-> R710 (level-3-expr)
-> R706 (level-2-expr)
-> R705 (add-operand)
-> R704 (mult-operand)
-> R702 (level-1-expr)
-> R701 (primary).
At this point, parsing the expression depends on the previously seen context: if the use of my_struct_mod is uncommented then we would parse item(id=2) as the structure-constructor alternative of R701, whereas if the use of my_func_mod is uncommented we would parse it as the function-reference alternative of R701.
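The dependence on earlier context can be sketched with a small symbol table. The following Python is only an illustration under assumed names (`SymbolKind`, `classify_primary`, and the dictionary-based symbol table are all hypothetical and are not part of fparser2); it shows the same text resolving to a different R701 alternative depending on which module was used.

```python
from enum import Enum


class SymbolKind(Enum):
    """Hypothetical kinds a name may have been declared as."""
    DERIVED_TYPE = "derived-type"
    FUNCTION = "function"


def classify_primary(name, symbol_table):
    """Pick the R701 alternative for `name(...)` from what is known about `name`.

    `symbol_table` maps lower-cased names to a SymbolKind. Fortran names are
    case insensitive, hence the lower() call.
    """
    kind = symbol_table.get(name.lower())
    if kind is SymbolKind.DERIVED_TYPE:
        return "structure-constructor"
    if kind is SymbolKind.FUNCTION:
        return "function-reference"
    # Nothing has been seen that tells us what `name` is.
    return "unresolved"


if __name__ == "__main__":
    # Symbols that "use my_struct_mod, only: item" would bring into scope.
    print(classify_primary("item", {"item": SymbolKind.DERIVED_TYPE}))
    # Symbols that "use my_func_mod, only: item" would bring into scope.
    print(classify_primary("item", {"item": SymbolKind.FUNCTION}))
    # Both use statements commented out, as in the program above.
    print(classify_primary("item", {}))
```

In a real parser the table would have to be populated from use statements and local declarations before the expression is reached, which is exactly the high-level context the grammar rules alone do not carry.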
#182 raised a number of cases where this extra contextual information is needed to parse a Primary:
- C701 (R701) The type-param-name shall be the name of a type parameter (captured in test case test_C701_no_assumed_size_array).
- C702 (R701) The designator shall not be a whole assumed-size array (captured in test case test_C702_no_assumed_size_array).
- R701 function-reference (captured in test case test_Function_Reference).
- R701 type-param-inquiry (captured in test case test_Type_Param_Inquiry).
Following on from the print*, item(id=2) example, one could imagine a parser which fails to parse item(id=2) unless item has already been declared as a derived type (for a structure-constructor) or as a function (for a function-reference). It may therefore make sense for the parser to raise a SyntaxError / NoMatchError as soon as it determines that there is no appropriate match, rather than keeping the existing behaviour of producing a structure-constructor. Naturally, this would be a breaking change to the existing behaviour, and would need to be managed appropriately.
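One possible way to manage such a change is an opt-in flag, sketched below in Python. Everything here is an assumption made for illustration: the `strict` parameter, the `match_primary` function, and the string-valued symbol kinds are hypothetical, and the `NoMatchError` class is a locally defined stand-in rather than the parser's own exception.

```python
class NoMatchError(Exception):
    """Local stand-in for a parser "no match" error (hypothetical)."""


def match_primary(name, known_symbols, strict=False):
    """Choose how to parse `name(...)` given the symbols seen so far.

    `known_symbols` maps lower-cased names to "derived-type" or "function".
    With strict=False an unknown name falls back to a structure-constructor,
    mirroring the existing behaviour described above. With strict=True an
    unknown name raises instead.
    """
    kind = known_symbols.get(name.lower())
    if kind == "derived-type":
        return "structure-constructor"
    if kind == "function":
        return "function-reference"
    if strict:
        raise NoMatchError(
            f"'{name}' is neither a known derived type nor a known function"
        )
    return "structure-constructor"


if __name__ == "__main__":
    print(match_primary("item", {}))  # legacy fallback
    try:
        match_primary("item", {}, strict=True)
    except NoMatchError as err:
        print(f"no match: {err}")
```

Keeping the fallback as the default would leave existing callers unaffected while allowing the stricter behaviour to be tried out before any default is changed.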