
Add LZMA file support to io.fits#17968

Merged
saimn merged 7 commits into astropy:main from kYwzor:lzmafiles
Apr 9, 2025

Conversation

@kYwzor
Member

@kYwzor kYwzor commented Apr 1, 2025

Description

This pull request adds the missing LZMA (.xz) support to io.fits. Fixes #9714

The header magic read needs to be expanded to 6 bytes, because that is the length of the xz magic number (at least according to both The Tukaani Project and Wikipedia). The existing code already uses magic.startswith, so this should not conflict with the other detections.
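A minimal sketch of the prefix-matching idea, not the actual io.fits code: read the first 6 bytes once and compare with `startswith`, so the shorter signatures still match. The names `MAGIC_PREFIXES` and `detect_compression` are hypothetical; the byte values are the standard gzip, bzip2, zip and xz signatures.

```python
import lzma

# Hypothetical table for illustration; astropy's own constants may differ.
MAGIC_PREFIXES = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"PK\x03\x04": "zip",
    b"\xfd7zXZ\x00": "lzma",  # xz container: 6-byte signature
}


def detect_compression(header: bytes):
    """Return a compression name for the leading bytes, or None if unknown."""
    magic = header[:6]  # 6 bytes covers the longest signature (xz)
    for prefix, name in MAGIC_PREFIXES.items():
        if magic.startswith(prefix):
            return name
    return None


# lzma.compress produces the xz container format by default.
print(detect_compression(lzma.compress(b"data")))  # -> lzma
```

Because every comparison is a prefix match, reading more bytes than the shortest signature needs does not change the outcome for the other formats.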

The code added is a near copy-paste of the code for bz2, but it seems to work fine. I'll add tests later.

@kYwzor kYwzor requested a review from saimn as a code owner April 1, 2025 15:43
@github-actions
Contributor

github-actions bot commented Apr 1, 2025

Thank you for your contribution to Astropy! 🌌 This checklist is meant to remind the package maintainers who will review this pull request of some common things to look for.

  • Do the proposed changes actually accomplish desired goals?
  • Do the proposed changes follow the Astropy coding guidelines?
  • Are tests added/updated as required? If so, do they follow the Astropy testing guidelines?
  • Are docs added/updated as required? If so, do they follow the Astropy documentation guidelines?
  • Is rebase and/or squash necessary? If so, please provide the author with appropriate instructions. Also see instructions for rebase and squash.
  • Did the CI pass? If no, are the failures related? If you need to run daily and weekly cron jobs as part of the PR, please apply the "Extra CI" label. Codestyle issues can be fixed by the bot.
  • Is a change log needed? If yes, did the change log check pass? If no, add the "no-changelog-entry-needed" label. If this is a manual backport, use the "skip-changelog-checks" label unless special changelog handling is necessary.
  • Is this a big PR that makes a "What's new?" entry worthwhile and if so, is (1) a "what's new" entry included in this PR and (2) the "whatsnew-needed" label applied?
  • At the time of adding the milestone, if the milestone set requires a backport to release branch(es), apply the appropriate "backport-X.Y.x" label(s) before merge.

@kYwzor
Member Author

kYwzor commented Apr 1, 2025

Side note: I don't want to touch it in this PR, to avoid mixing things up, but it seems the LZMA magic is not being properly checked in utils.data.

elif signature[:3] == b"\xfd7z":  # xz
    if not HAS_LZMA:
        for fd in close_fds:
            fd.close()
        raise ModuleNotFoundError(
            "This Python installation does not provide the lzma module."
        )
    import lzma

Only the first 3 bytes are checked, but the full xz magic is 6 bytes long. In theory this could produce false positives (probably rare; I'm not sure how much we should care about this).
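To make the concern concrete, here is a small standard-library sketch comparing the two checks. The `lookalike` bytes are a made-up input chosen only to share the first three bytes with a real xz stream; `XZ_MAGIC` is a name introduced for illustration.

```python
import lzma

XZ_MAGIC = b"\xfd7zXZ\x00"  # full 6-byte xz signature

# Real xz output begins with the full signature.
assert lzma.compress(b"payload").startswith(XZ_MAGIC)

# Hypothetical non-xz data that merely shares the first 3 bytes.
lookalike = b"\xfd7z-not-an-xz-stream"

three_byte_match = lookalike[:3] == b"\xfd7z"    # passes the short check
six_byte_match = lookalike.startswith(XZ_MAGIC)  # fails the full check

print(three_byte_match, six_byte_match)  # -> True False
```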

@saimn saimn added this to the v7.1.0 milestone Apr 3, 2025
@saimn
Contributor

saimn commented Apr 3, 2025

Thanks for the PR, looks good but just needs some tests (for which you could replicate the bz2 tests).

@kYwzor
Member Author

kYwzor commented Apr 3, 2025

Added tests now and also found some places where the documentation needed to be updated accordingly. I believe this is now ready.
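For readers who want to try something similar, a round-trip fixture can be built with the standard library alone. This is a sketch under assumptions, not the helper added in this PR: `make_xz_copy` is a hypothetical name, and the stand-in bytes below are not a valid FITS file. A real test would compress an actual FITS fixture and open the result with `fits.open`.

```python
import lzma
import os
import tempfile


def make_xz_copy(src_path, dst_path):
    """Write an xz-compressed copy of src_path to dst_path (hypothetical helper)."""
    with open(src_path, "rb") as src, lzma.open(dst_path, "wb") as dst:
        dst.write(src.read())
    return dst_path


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "test0.fits")
    # Stand-in content only; a real test would use a genuine FITS file.
    with open(original, "wb") as f:
        f.write(b"SIMPLE  =                    T" + b" " * 50)

    compressed = make_xz_copy(original, os.path.join(tmp, "test0.fits.xz"))

    # The file on disk starts with the 6-byte xz signature.
    with open(compressed, "rb") as f:
        assert f.read(6) == b"\xfd7zXZ\x00"

    # The contents survive the round trip unchanged.
    with lzma.open(compressed, "rb") as f, open(original, "rb") as ref:
        assert f.read() == ref.read()
```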

@kYwzor
Member Author

kYwzor commented Apr 7, 2025

Added a few edge cases that were missing. Ready for final review.

    raise ModuleNotFoundError(
        "This Python installation does not provide the lzma module."
    )
new_file = lzma.LZMAFile(name, mode="w")
Member Author

I just mimicked the bz2 case in _flush_resize, but I frankly don't fully understand what's happening here. I don't think we have any test case covering this for either gzip, bz2 or lzma, so this wasn't triggering any CI errors. I guess we should create a test case for when a resize is triggered, but I'm not sure what the best approach for that is.

Contributor

Hmm, we explicitly forbid update mode with bz2 (and xz), so I don't think this case can be triggered. It would need a deeper dive to check, but there is no need to hold this PR because of it.
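As background, and as an assumption about the motivation rather than a statement of astropy's rationale: the standard-library file classes for these formats have no read/write mode themselves, which can be checked directly. The file name below is arbitrary and never needs to exist, since the mode is rejected first.

```python
import bz2
import lzma

rejected = []
for opener in (bz2.BZ2File, lzma.LZMAFile):
    try:
        # "r+" (update) is not among the modes either class accepts.
        opener("does-not-need-to-exist.fits", mode="r+")
    except ValueError as exc:
        rejected.append(opener.__name__)
        print(f"{opener.__name__}: {exc}")
```

Both classes accept only read, write, exclusive-create and append modes, so an in-place update of a compressed file is not available at this level.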

Contributor

@saimn saimn left a comment

Nice addition, thanks @kYwzor !

@saimn saimn merged commit c6634ba into astropy:main Apr 9, 2025
27 of 28 checks passed
Successfully merging this pull request may close these issues.

ENH: add lzma (.xz) support to io.fits

2 participants