forked from pearu/f2py
Open
Description
fparser is still the main bottleneck when executing PSyclone: it takes 70%-80% of the execution time. There are some low code-impact modifications that can be explored without changing the main structure of the code. I have implemented some of them in a local branch and they show promising results. Some of the ideas are:
- #336, "(towards 312) Improve performance by adding caching to the tokenizer": use caching/memoisation in the tokenizer, because the same string is tokenized multiple times when recursing down a match chain.
- #337, "(towards 312) Remove imports and function declaration from hotpath (__new__ and match)": remove the imports from the hot path (`__new__` and `match`). They are there because they are cyclical imports, but other solutions can be found.
- Currently each object is instantiated twice: once with the reader object as a parameter and then, inside that call, a second time with the string from `reader.get()` as a parameter. This doubles the number of instantiations and method calls in the hot path and can probably be optimized significantly if the two are merged.
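To illustrate the first idea, here is a minimal sketch of memoising a tokenizer with `functools.lru_cache`. It is not fparser's actual code: the `tokenize` function, the `_TOKEN_RE` pattern, the cache size and the `match_chain` helper are all hypothetical stand-ins, chosen only to show how repeated tokenization of the same string collapses into a single scan.

```python
import re
from functools import lru_cache

# Hypothetical token pattern (assumption): identifiers, integers, or any
# single non-blank character. fparser's real tokenizer is more involved.
_TOKEN_RE = re.compile(r"\s*([A-Za-z_]\w*|\d+|\S)")


@lru_cache(maxsize=4096)  # cache size is an arbitrary illustrative choice
def tokenize(source: str) -> tuple:
    """Split `source` into tokens; identical inputs are served from the cache."""
    # A tuple is returned so that the cached value cannot be mutated by callers.
    return tuple(_TOKEN_RE.findall(source))


def match_chain(source: str, depth: int = 5) -> tuple:
    """Stand-in for a chain of match() calls that each tokenize the same string."""
    tokens = ()
    for _ in range(depth):
        tokens = tokenize(source)  # only the first call scans the string
    return tokens


if __name__ == "__main__":
    tokenize.cache_clear()
    print(match_chain("a = b + 1"))
    print(tokenize.cache_info())
```

Returning an immutable tuple matters here: a cached list could be modified by one caller and silently corrupt the result seen by the next. Whether a bounded or unbounded cache is the better fit would depend on how many distinct strings a real parse produces.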
Current results:
| branch | Nemo sbccpl.f90 | PSyclone gocean_opencl_trans_test.py |
|---|---|---|
| master | 2.6s | 22s |
| #337 dynamic import | 2.2s (x1.18) | 20.3s (x1.08) |
| #336 tokenizer | 1.9s (x1.36) | 17.2s (x1.27) |
| all opts | 1.6s (x1.62) | 15.7s (x1.4) |