refactor(lobster): Rewrite LOBSTER parsers with memory-efficient streaming architecture #4592
Merged
shyuep merged 14 commits into materialsproject:master on Feb 24, 2026
Member: Thanks!
Description
This PR introduces a complete rewrite of the LOBSTER output file parsers in pymatgen.io.lobster. The new implementation lives in pymatgen.io.lobster.future and provides a cleaner, more maintainable, and more memory-efficient architecture.
Motivation
The existing LOBSTER parsers have several limitations:
Entire files are first read into memory via _get_lines(), then parsed, resulting in duplicate data (raw text + parsed structures)
Key Changes
New Base Class Architecture
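The details of the new base classes are not reproduced here. As a rough illustration of the streaming idea behind the rewrite, the sketch below parses a file one line at a time instead of holding the raw text and the parsed result side by side. StreamingParser, ToyChargeParser, and the two-column input format are hypothetical names invented for this example; they are not the classes in pymatgen.io.lobster.future.

```python
from __future__ import annotations

import io
from abc import ABC, abstractmethod
from typing import Iterator, TextIO


class StreamingParser(ABC):
    """Hypothetical base class: consumes lines lazily, never stores the raw text."""

    def __init__(self, source: str | TextIO) -> None:
        # `source` is either a path or an already opened text stream.
        self.source = source

    def iter_lines(self) -> Iterator[str]:
        """Yield stripped, non-empty lines one at a time."""
        if isinstance(self.source, str):
            with open(self.source, encoding="utf-8") as handle:
                yield from self._clean(handle)
        else:
            yield from self._clean(self.source)

    @staticmethod
    def _clean(handle: TextIO) -> Iterator[str]:
        for line in handle:
            line = line.strip()
            if line:
                yield line

    @abstractmethod
    def parse(self):
        """Build the parsed structures from iter_lines()."""


class ToyChargeParser(StreamingParser):
    """Toy subclass reading 'label value' rows from a made-up format."""

    def parse(self) -> dict[str, float]:
        charges: dict[str, float] = {}
        for line in self.iter_lines():
            if line.startswith("#"):  # skip comment/header rows
                continue
            label, value = line.split()
            charges[label] = float(value)
        return charges


example = io.StringIO("# atom charge\nNa1 0.78\n\nCl2 -0.78\n")
charges = ToyChargeParser(example).parse()
print(charges)
```

Only the parsed dictionary survives the call; each raw line becomes garbage as soon as the next one is read, which is the property a list-of-lines approach lacks.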
New Features
The NcICOBILIST parser now fully supports orbital-resolved multi-center COBI data, which was previously ignored with a warning.
Version Processor System
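A minimal sketch of subclass auto-registration through __init_subclass__, followed by runtime selection, may help make the pattern concrete. All names here (VersionProcessor, V1Processor, V2Processor, detect_version, select_processor) and the "format-version:" header are assumptions made up for the example, not the PR's actual classes or LOBSTER's real file layout.

```python
from __future__ import annotations


class VersionProcessor:
    """Hypothetical base class; subclasses register themselves when defined."""

    registry: dict[str, type[VersionProcessor]] = {}
    version: str = ""  # subclasses set this to the version they handle

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        if cls.version:
            VersionProcessor.registry[cls.version] = cls

    def process(self, lines: list[str]) -> list[str]:
        raise NotImplementedError


class V1Processor(VersionProcessor):
    version = "1"

    def process(self, lines: list[str]) -> list[str]:
        return [line.split()[0] for line in lines]


class V2Processor(VersionProcessor):
    version = "2"

    def process(self, lines: list[str]) -> list[str]:
        return [line.split()[-1] for line in lines]


def detect_version(lines: list[str]) -> str:
    """Toy detection: an invented 'format-version: X' header, else version 1."""
    first = lines[0] if lines else ""
    if first.startswith("format-version:"):
        return first.split(":", 1)[1].strip()
    return "1"


def select_processor(lines: list[str], version: str | None = None) -> VersionProcessor:
    """Use the user-specified version if given, otherwise detect it."""
    chosen = version if version is not None else detect_version(lines)
    try:
        return VersionProcessor.registry[chosen]()
    except KeyError:
        raise ValueError(f"No processor registered for version {chosen!r}") from None


print(sorted(VersionProcessor.registry))
print(type(select_processor(["format-version: 2", "a b"])).__name__)
```

With this pattern, supporting a new file version means defining one more subclass; no central lookup table has to be edited by hand.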
Version processors are automatically registered via __init_subclass__ and selected at runtime based on file version detection or a user-specified version.
Interaction Filtering API
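The signatures of get_interactions_by_properties and get_data_by_properties are not shown in this description, so the following is only a sketch of what property-based filtering over parsed interactions could look like. The Interaction dataclass, its fields, and the filter_interactions function are hypothetical stand-ins, and the numbers are illustrative rather than taken from a real LOBSTER run.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record for one parsed interaction."""

    index: int
    centers: tuple[str, ...]  # site labels taking part in the interaction
    length: float  # interaction length in angstrom


def filter_interactions(
    interactions: list[Interaction],
    centers: list[str] | None = None,
    min_length: float | None = None,
    max_length: float | None = None,
) -> list[Interaction]:
    """Keep interactions that satisfy every criterion that was supplied."""
    selected = []
    for item in interactions:
        # All requested site labels must take part in the interaction.
        if centers is not None and not set(centers) <= set(item.centers):
            continue
        if min_length is not None and item.length < min_length:
            continue
        if max_length is not None and item.length > max_length:
            continue
        selected.append(item)
    return selected


interactions = [
    Interaction(1, ("Na1", "Cl2"), 2.85),
    Interaction(2, ("Na1", "Na3"), 4.03),
    Interaction(3, ("Cl2", "Cl4"), 4.03),
]

short_na = filter_interactions(interactions, centers=["Na1"], max_length=3.0)
print([item.index for item in short_na])
```

Treating every criterion as optional and combining them with a logical AND keeps one function usable for pair interactions and multi-center interactions alike, since centers may hold any number of labels.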
The get_interactions_by_properties and get_data_by_properties methods provide filtering across all interaction-based parsers (COXXCAR, ICOXXLIST, NcICOBILIST).
New Module Structure
Migration Path
The existing pymatgen.io.lobster.outputs module remains unchanged for backward compatibility, so users can migrate gradually.
Testing
Breaking Changes
None for existing code. The new module is in pymatgen.io.lobster.future and does not modify existing APIs.
Future Work
@JaGeo @naik-aakash