I'm trying to read a zip file from stdin in Python, but I keep running into issues. I want to be able to run cat test.xlsx | python3 test.py and create a valid zipfile.ZipFile object, without first writing a temporary file if possible.

My initial approach was this, but ZipFile complained that the file is not seekable:

import sys
import zipfile

zipfile.ZipFile(sys.stdin)

so I changed it around, but now it complains that this is not a valid zip file:

import io
import sys
import zipfile

zipfile.ZipFile(io.StringIO(sys.stdin.read()))

Can this be solved without writing the zip to a temporary file?

  • Obligatory, don't abuse your cat Commented Jul 27, 2023 at 14:32
  • Not that I don't abuse it often, but here it was for simulating in | ziphandler, where in comes from aerc (reading an xlsx attachment in the pager). Commented Jul 28, 2023 at 7:44

1 Answer

Zip files are binary data, not UTF-8 encoded text. Reading stdin into a str with sys.stdin.read() will typically fail right away with UnicodeDecodeError: 'utf-8' codec can't decode byte ..., and even where the decoding happens to succeed, io.StringIO holds text rather than the bytes that ZipFile needs.
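To make the point concrete, here is a minimal sketch that builds a tiny archive in memory and then tries to decode it as UTF-8, which is roughly what a text-mode read of stdin would attempt. The entry name hello.txt and the variable names are made up for illustration only.

```python
import io
import zipfile

# Build a tiny zip archive in memory so the example needs no files on disk.
raw = io.BytesIO()
with zipfile.ZipFile(raw, "w") as zf:
    zf.writestr("hello.txt", "hello")
zip_bytes = raw.getvalue()

# Try to treat the archive as UTF-8 text, as a text-mode read would.
try:
    zip_bytes.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

print(zip_bytes[:4])  # the leading bytes of the archive (the zip signature)
print(decoded_ok)     # whether the bytes happened to be valid UTF-8
```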

Instead, you can access the underlying binary buffer object to read stdin as raw bytes. Pair that with BytesIO to get an in-memory seekable file-like object:

zipfile.ZipFile(io.BytesIO(sys.stdin.buffer.read()))

Alternatively, if you provide a seekable stdin (for example, by redirecting stdin instead of streaming from a pipe), you can operate on sys.stdin.buffer directly:

zipfile.ZipFile(sys.stdin.buffer)

paired with something like

python3 test.py <test.xlsx

If you care to, you can select between the two depending on whether stdin is seekable by querying the IO object's seekable method:

if sys.stdin.buffer.seekable():
    zip_file = zipfile.ZipFile(sys.stdin.buffer)
else:
    buffer = io.BytesIO(sys.stdin.buffer.read())
    zip_file = zipfile.ZipFile(buffer)

print(zip_file.filelist)
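
The same selection can be wrapped in a small helper that takes any binary stream, which also makes it testable without a real pipe. This is only a sketch: open_zip and PipeLike are hypothetical names introduced here, and PipeLike merely imitates a non-seekable stdin.

```python
import io
import zipfile


def open_zip(stream):
    """Return a ZipFile for a binary stream, buffering it in memory if it cannot seek."""
    if stream.seekable():
        return zipfile.ZipFile(stream)
    return zipfile.ZipFile(io.BytesIO(stream.read()))


class PipeLike:
    """Hypothetical stand-in for a piped stdin: readable but not seekable."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def seekable(self):
        return False

    def read(self, size=-1):
        return self._inner.read(size)


# Build a small archive in memory to exercise both code paths.
raw = io.BytesIO()
with zipfile.ZipFile(raw, "w") as zf:
    zf.writestr("hello.txt", "hello")
zip_bytes = raw.getvalue()

names_from_file = open_zip(io.BytesIO(zip_bytes)).namelist()  # seekable path
names_from_pipe = open_zip(PipeLike(zip_bytes)).namelist()    # buffered path
print(names_from_file, names_from_pipe)
```

In test.py this would be called as open_zip(sys.stdin.buffer). Note that the fallback reads the whole archive into memory, which is fine for typical spreadsheets but may matter for very large files.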

2 Comments

Thanks, that was extremely comprehensive!
Tbh, I should've added you as a co-author on the commit :D github.com/ferdinandyb/xlsx2csv/commit/… if you give me your name/address I can still do so.
