fparser2 needs more context to be able to parse correctly #190

@pelson

Fortran's grammar is not context free: even simple tokenization requires additional context/state (e.g. in if(i.le.20.and.j.le.10), the tokenizer has to decide whether "20.and" begins with the real literal "20." or with the integer "20" followed by ".and."). fparser2 already handles this tokenization example well, but some Fortran parse rules are sensitive to high-level program context, and those are harder to deal with.
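To make the "20.and" problem concrete, here is a minimal Python sketch of one way a tokenizer can resolve it with a lookahead. It is not how fparser2 does it; the names `OPS`, `TOKEN_RE` and `tokenize` are made up for this illustration, and the token set is deliberately tiny (no exponents, strings or kind suffixes).

```python
import re

# Names of dot-delimited operators (illustrative subset, not exhaustive).
OPS = "le|lt|ge|gt|eq|ne|and|or|not|eqv|neqv"

TOKEN_RE = re.compile(
    # A dot operator such as .le. or .and.
    rf"(?P<op>\.(?:{OPS})\.)"
    # Digits, then a decimal point ONLY if that point does not start a dot operator.
    rf"|(?P<number>\d+(?:\.(?!(?:{OPS})\.)\d*)?)"
    r"|(?P<name>[a-z]\w*)"
    r"|(?P<punct>[()=,*+\-/])",
    re.IGNORECASE,
)


def tokenize(source):
    """Split a single Fortran expression into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at position {pos}: {source[pos:]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("if(i.le.20.and.j.le.10)"):
    print(token)
```

The negative lookahead is the "extra state" here: whether the dot after 20 belongs to the number depends on what follows it. The higher-level cases described below cannot be settled by looking at neighbouring characters in this way.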

For example, take the following code:

module my_struct_mod
    STRUCTURE /item/
      INTEGER id
    END STRUCTURE
end module my_struct_mod

module my_func_mod
contains
    function item(id)
        integer, intent(in) :: id
        integer             :: item

        item = id * 2
    end function item
end module my_func_mod

program main
  !use my_struct_mod, only: item
  !use my_func_mod, only: item

  print*, item(id=2)
end program main

The line containing the print statement will be parsed using R912 (PRINT format [, output-item-list]), and the single output item, item(id=2), will be parsed with:

 R916 (output-item)
 -> R722 (expr)
 -> R717 (level-5-expr)
 -> R712 (level-4-expr)
 -> R710 (level-3-expr)
 -> R706 (level-2-expr)
 -> R702 (level-1-expr)
 -> R701 (primary).

At this point, parsing the expression depends on the previously seen context: if the "use my_struct_mod" line is uncommented, item(id=2) should be parsed as the structure-constructor alternative of R701, whereas if the "use my_func_mod" line is uncommented, it should be parsed as the function-reference alternative of R701.
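As a rough illustration of the context that would have to be carried around, the Python sketch below collects the names brought into scope by "use ..., only:" lines and uses them to pick the R701 alternative. Everything in it is hypothetical (`MODULE_EXPORTS`, `symbols_in_scope`, `primary_rule_for`); it is not fparser2 API, and a real implementation would populate the export table while parsing the modules themselves.

```python
import re

# Hypothetical record of what each module exports and which R701
# alternative a call-like use of that name corresponds to.
MODULE_EXPORTS = {
    "my_struct_mod": {"item": "structure-constructor"},
    "my_func_mod": {"item": "function-reference"},
}

# Matches "use <module>, only: <names>"; commented-out lines start with "!"
# and therefore do not match.
USE_RE = re.compile(r"^\s*use\s+(\w+)\s*,\s*only\s*:\s*(.+)$", re.IGNORECASE)


def symbols_in_scope(lines):
    """Map each imported name to the R701 alternative it implies."""
    symbols = {}
    for line in lines:
        match = USE_RE.match(line)
        if match is None:
            continue
        module = match.group(1).lower()
        for name in match.group(2).split(","):
            name = name.strip().lower()  # Fortran names are case-insensitive
            rule = MODULE_EXPORTS.get(module, {}).get(name)
            if rule is not None:
                symbols[name] = rule
    return symbols


def primary_rule_for(name, lines):
    """Return the R701 alternative for `name(...)`, or 'unknown'."""
    return symbols_in_scope(lines).get(name.lower(), "unknown")


print(primary_rule_for("item", ["use my_struct_mod, only: item"]))
print(primary_rule_for("item", ["use my_func_mod, only: item"]))
print(primary_rule_for("item", ["!use my_struct_mod, only: item"]))
```

With both use lines commented out, as in the program above, the sketch reports "unknown", which is exactly the situation where the parser currently has nothing to go on.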

#182 raised a number of cases where this extra contextual information is needed to parse the Primary type:

  1. C701 (R701) The type-param-name shall be the name of a type parameter. (captured in test case test_C701_no_assumed_size_array)
  2. C702 (R701) The designator shall not be a whole assumed-size array (captured in test case test_C702_no_assumed_size_array)
  3. R701 function-reference (captured in test case test_Function_Reference)
  4. R701 type-param-inquiry (captured in test case test_Type_Param_Inquiry)

Beyond the print*, item(id=2) example, one could imagine a parser that refuses to parse item(id=2) unless item is already known to be something that can appear in a structure-constructor or a function-reference. It may therefore make sense for the parser to raise a SyntaxError / NoMatchError as soon as it becomes clear that there is no appropriate match, rather than keeping the existing behaviour of producing a structure-constructor. Naturally, this would be a breaking change to the existing behaviour, and would need to be managed appropriately.
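One possible way to manage that breaking change is an opt-in strict mode, sketched below in Python. This is only an illustration of the idea: the `strict` flag and `resolve_primary` are hypothetical, and `NoMatchError` here is a local stand-in class defined in the snippet, not the exception fparser2 raises.

```python
class NoMatchError(Exception):
    """Local stand-in for a parser 'no match' error (defined for this sketch)."""


def resolve_primary(name, known_kinds, strict=False):
    """Choose the R701 alternative for `name(...)`.

    known_kinds maps lower-case names to "type" or "function".
    With strict=False an unknown name falls back to structure-constructor,
    which is the existing behaviour described above.
    """
    kind = known_kinds.get(name.lower())
    if kind == "type":
        return "structure-constructor"
    if kind == "function":
        return "function-reference"
    if strict:
        raise NoMatchError(f"'{name}' is neither a known type nor a known function")
    return "structure-constructor"


# Default (non-strict) mode keeps today's fallback.
print(resolve_primary("item", {}))

# Strict mode reports the missing context instead of guessing.
try:
    resolve_primary("item", {}, strict=True)
except NoMatchError as err:
    print("no match:", err)
```

Keeping the fallback as the default would let existing users opt in gradually, at the cost of carrying two behaviours for a while.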
