Speed up Vasprun parsing some more #4360

Merged
shyuep merged 5 commits into materialsproject:master from kavanase:vasprun_parsing_speedup
Apr 15, 2025

@kavanase
Contributor

Me again.
From further profiling and experimenting, I found I could speed up _parse_vasp_array using numpy's parse-from-string function. _parse_vasp_array is one of the main bottlenecks when using parse_dos = True (default), parse_eigen = True (default) and/or parse_projected_eigen = True (False by default).
For example, parsing a SOC defect supercell vasprun via doped with these updates (with parse_projected_eigen=True to get eigenvalues/magnetisation) decreases the parsing time from ~8.5s to ~4.8s.

All changes here should be covered by tests already in the codebase.
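
For readers following along, here is a minimal sketch of the general idea, not the PR's actual code. It assumes "numpy's parse from string function" refers to np.fromstring in text mode, and the `rows` list is a hypothetical stand-in for the text of the numeric rows in a vasprun.xml array. It compares token-by-token conversion in Python against one numpy call over the concatenated text.

```python
import numpy as np

# Hypothetical stand-in for the text content of numeric rows in vasprun.xml.
rows = ["1.0 2.0 3.0", "4.0 5.0 6.0"]

# Baseline: split every row and convert each token with float() in Python.
baseline = np.array([[float(tok) for tok in row.split()] for row in rows])

# Assumed faster route: join the rows into one string, parse it in a single
# numpy call, then restore the 2D shape from the number of rows.
flat = np.fromstring(" ".join(rows), sep=" ")
parsed = flat.reshape(len(rows), -1)

# Both routes should yield the same array.
assert np.array_equal(baseline, parsed)
```

This version needs both the string concatenation and the reshape, since the single parse call returns a flat 1D array.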

@shyuep
Member

shyuep commented Apr 15, 2025

Thanks, but I don't think we need to use string concatenation. I believe np.loadtxt can handle the text without it.

@shyuep
Member

shyuep commented Apr 15, 2025

Example:

In [1]: import numpy as np

In [3]: np.loadtxt(["1 2", "3 4"])
Out[3]:
array([[1., 2.],
       [3., 4.]])

@shyuep
Member

shyuep commented Apr 15, 2025

Even better, no reshaping needed.
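
A slightly fuller sketch of this suggestion, under the same caveat that it is illustrative and not the merged code (`rows` is again a hypothetical stand-in for the parsed text rows). np.loadtxt accepts a list of strings and returns one array row per string. One detail worth keeping in mind: a single-row input comes back 1D unless `ndmin=2` is passed.

```python
import numpy as np

# Hypothetical stand-in for the text content of numeric rows in vasprun.xml.
rows = ["1.0 2.0 3.0", "4.0 5.0 6.0"]

# np.loadtxt treats each string in the list as one line of input,
# so the result is already 2D: no join and no reshape.
arr = np.loadtxt(rows)

# A single row is returned as a 1D array by default ...
single = np.loadtxt(["1.0 2.0 3.0"])

# ... and ndmin=2 keeps it 2D if downstream code expects rows and columns.
single_2d = np.loadtxt(["1.0 2.0 3.0"], ndmin=2)
```

Whether the single-row case matters depends on how the calling code uses the result, which this thread does not show.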

@kavanase
Contributor Author

Ah yes! Good points. Done ⬆️

@shyuep shyuep merged commit e714e2b into materialsproject:master Apr 15, 2025
43 checks passed
2 participants