I have a script that creates zip files of directories containing symlinks. I was surprised to find that the archives contain the targets of the links rather than the links themselves; storing the links is what I wanted and expected. Does anyone know how to get zipfile to store the links?
7 Answers
It is possible to have zipfile store symbolic links, instead of the files themselves. For an example, see here. The relevant part of the script is storing the symbolic link attribute within the zipinfo:
zipInfo = zipfile.ZipInfo(archiveRoot)
zipInfo.create_system = 3
# 0xA1ED0000 == (stat.S_IFLNK | 0o755) << 16: the symlink file type
# plus rwxr-xr-x permissions in the high 16 bits
zipInfo.external_attr = 0xA1ED0000
zipOut.writestr(zipInfo, os.readlink(fullPath))
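The magic number in the snippet is less opaque once it is decomposed. The following is my own breakdown, not taken from the linked script: the upper 16 bits of external_attr hold the Unix st_mode, and 0xA1ED is the symlink file type combined with permissions 0o755.

```python
import stat

# Upper 16 bits of external_attr carry the Unix st_mode.
symlink_mode = stat.S_IFLNK | 0o755   # file type "symlink" + rwxr-xr-x
external_attr = symlink_mode << 16

print(hex(symlink_mode))    # 0xa1ed
print(hex(external_attr))   # 0xa1ed0000
print(external_attr)        # 2716663808
```

Writing the value this way also makes it easy to pick different permission bits without recomputing the constant by hand.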
3 Comments
zipInfo.external_attr = 0xA1ED0000L should work (which is slightly more readable?)
zipInfo.external_attr |= 0xA0000000
stat.S_IFLNK
zipfile doesn't appear to support storing symbolic links directly. Storing them in a ZIP is not part of the format itself and is only available as a custom extension in some implementations. In particular, Info-ZIP's implementation supports them, so you can delegate to it instead. Make sure your decompression software can handle such archives; as I said, this feature is not standardized.
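Delegating to Info-ZIP, as suggested above, could look roughly like this. It is a minimal sketch that assumes the Info-ZIP zip executable is installed and on PATH; the helper names build_zip_command and zip_with_symlinks are hypothetical. Its --symlinks option (short form -y) stores links as links instead of following them.

```python
import shutil
import subprocess

def build_zip_command(archive_path, source_dir):
    # -r recurses into directories; --symlinks (short form -y) stores
    # symbolic links as links instead of archiving their targets.
    return ["zip", "-r", "--symlinks", archive_path, source_dir]

def zip_with_symlinks(archive_path, source_dir):
    # Hypothetical helper: requires the Info-ZIP `zip` binary on PATH.
    if shutil.which("zip") is None:
        raise RuntimeError("Info-ZIP 'zip' executable not found on PATH")
    subprocess.run(build_zip_command(archive_path, source_dir), check=True)
```

Calling zip_with_symlinks('out.zip', 'mydir') would then run zip -r --symlinks out.zip mydir and raise if the command fails.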
3 Comments
external_attr of the added zinfo contains (stat.S_IFLNK << 16) when I extract the archive with unzip.
Here is a complete, working Python example that creates a cpuinfo.zip archive containing the symbolic link cpuinfo.txt, which points to /proc/cpuinfo.
#!/usr/bin/python
import stat
import zipfile

def create_zip_with_symlink(output_zip_filename, link_source, link_target):
    zipInfo = zipfile.ZipInfo(link_source)
    zipInfo.create_system = 3  # System which created the ZIP archive: 3 = Unix, 0 = Windows
    # Symlink file type plus rwxrwxrwx permissions
    unix_st_mode = (stat.S_IFLNK | stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR
                    | stat.S_IRGRP | stat.S_IWGRP | stat.S_IXGRP
                    | stat.S_IROTH | stat.S_IWOTH | stat.S_IXOTH)
    # The Python zipfile module accepts the 16-bit "Mode" field of the ASi extra block
    # for Unix (st_mode from struct stat: user/group/other permissions, setuid/setgid,
    # symlink info, etc.) as bits 16-31 of external_attr
    zipInfo.external_attr = unix_st_mode << 16
    with zipfile.ZipFile(output_zip_filename, 'w', compression=zipfile.ZIP_DEFLATED) as zipOut:
        # The entry's data is the path the link points to
        zipOut.writestr(zipInfo, link_target)

create_zip_with_symlink('cpuinfo.zip', 'cpuinfo.txt', '/proc/cpuinfo')
You can further issue the following commands (e.g. under Ubuntu) to see how the archive unpacks to a working symbolic link:
unzip cpuinfo.zip
ls -l cpuinfo.txt
cat cpuinfo.txt
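To unpack such an archive from Python rather than with unzip, some extra work is needed: as far as I know, zipfile's own extractall() does not recreate symlinks and instead writes each link entry as a regular file containing the target path. Below is a minimal sketch of a workaround, assuming a Unix-like system and a trusted archive (it does no path sanitization for link entries, and it fails if the link already exists). The function name extract_with_symlinks is my own.

```python
import os
import stat
import zipfile

def extract_with_symlinks(zip_path, dest_dir):
    """Extract zip_path into dest_dir, recreating symlink entries as symlinks."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            mode = info.external_attr >> 16   # Unix st_mode, if the writer stored one
            if stat.S_ISLNK(mode):
                link_path = os.path.join(dest_dir, info.filename)
                os.makedirs(os.path.dirname(link_path) or ".", exist_ok=True)
                # The entry's data is the link target path.
                os.symlink(zf.read(info).decode("utf-8"), link_path)
            else:
                zf.extract(info, dest_dir)
```

For the archive created above, extract_with_symlinks('cpuinfo.zip', 'out') should leave out/cpuinfo.txt as a link to /proc/cpuinfo.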
I have defined the following method in a Zip support class:
def add_symlink(self, link, target, permissions=0o777):
    self.log('Adding a symlink: {} => {}'.format(link, target))
    permissions |= 0xA000  # stat.S_IFLNK: mark the entry as a symlink
    zi = zipfile.ZipInfo(link)
    zi.create_system = 3  # Unix
    zi.external_attr = permissions << 16
    self.zip.writestr(zi, target)
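While the ZIP specification leaves the interpretation of the external attributes field to the host system, many zip implementations use it to store generic filesystem attributes on entries. In zipfile this field is external_attr, and the high two bytes of the 4-byte value represent the Unix file mode (st_mode).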
Essentially you need to replicate ZipInfo.from_file, but without following the link or truncating the mode:
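import os
import time
import zipfile

# path: the symlink on disk; name: its archive name; out_zip: a ZipFile open for writing
st = os.lstat(path)  # lstat does not follow the link
mtime = time.localtime(st.st_mtime)
info = zipfile.ZipInfo(name, mtime[0:6])
info.file_size = st.st_size
info.external_attr = st.st_mode << 16  # full st_mode, including the S_IFLNK file type
out_zip.writestr(info, os.readlink(path))  # the entry's data is the link target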
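The mode layout described above can be checked by reading an archive back. The following is a minimal sketch; describe_entries is a name I made up, and the archive is built in memory only for the demonstration. It shifts external_attr right by 16 bits to recover st_mode and lets the stat module interpret it.

```python
import io
import stat
import zipfile

def describe_entries(zf):
    """Return (name, mode string, is_symlink) for each entry of an open ZipFile."""
    rows = []
    for info in zf.infolist():
        mode = info.external_attr >> 16   # high two bytes = st_mode
        rows.append((info.filename, stat.filemode(mode), stat.S_ISLNK(mode)))
    return rows

# Build a tiny archive in memory with one symlink entry and one regular file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    link = zipfile.ZipInfo("link.txt")
    link.create_system = 3                               # Unix
    link.external_attr = (stat.S_IFLNK | 0o777) << 16
    zf.writestr(link, "target.txt")                      # data = link target
    regular = zipfile.ZipInfo("target.txt")
    regular.create_system = 3
    regular.external_attr = (stat.S_IFREG | 0o644) << 16
    zf.writestr(regular, "hello")

with zipfile.ZipFile(buf) as zf:
    for row in describe_entries(zf):
        print(row)
```

The symlink entry should show up with a mode string starting with "l", which is the same information unzip relies on when it recreates the link.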
Here's what I've tried to improve:
- tested with Python 3.11
- iterative loop instead of recursive.
- preserve the original symlink attributes (e.g. permissions)
import zipfile
import stat
import os

def archive(source, output_path):
    def _convert_attr_to_symlink_type(external_attr):
        # Refer to https://unix.stackexchange.com/a/14727
        # zipfile external_attr is a 32-bit file attribute structure:
        # first 4 bits determine the file type
        # next 3 bits are setuid, setgid, sticky
        # next 9 bits are the read/write/execute permissions for user, group and others
        # next 8 bits are unused
        # last 8 bits are the DOS attributes

        # Preserve everything except the first 4 bits (i.e. the file type bits)
        # MASK: 00001111111111111111111111111111
        preserve_mask = (1 << 28) - 1
        external_attr &= preserve_mask

        # Overwrite the file type with the symbolic link type (modify first 4 bits)
        # MASK: 10100000000000000000000000000000
        overwrite_mask = stat.S_IFLNK << 16
        external_attr |= overwrite_mask
        return external_attr

    with zipfile.ZipFile(output_path, mode='w') as zf:
        for root, folders, files in os.walk(source):
            for folder in folders:
                folderpath = os.path.join(root, folder)
                if os.path.islink(folderpath):
                    zip_info = zipfile.ZipInfo.from_file(folderpath)
                    zip_info.filename = zip_info.filename.rstrip('/')
                    zip_info.external_attr = _convert_attr_to_symlink_type(zip_info.external_attr)
                    zf.writestr(zip_info, os.readlink(folderpath))
            for filename in files:
                filepath = os.path.join(root, filename)
                if os.path.islink(filepath):
                    zip_info = zipfile.ZipInfo.from_file(filepath)
                    zip_info.external_attr = _convert_attr_to_symlink_type(zip_info.external_attr)
                    zf.writestr(zip_info, os.readlink(filepath))
                else:
                    zf.write(filepath)

archive('testfolder', 'test.zip')
Another way of handling it is shutil.make_archive, but it has a similar problem with resolving symlinks to the actual target files. I managed to solve that with a hacky approach that temporarily patches os.walk to follow links, and it worked:
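import contextlib
import functools
import os
import shutil

@contextlib.contextmanager
def patch_os_walk():
    # Temporarily make os.walk follow symlinked directories
    original_os_walk = os.walk
    os.walk = functools.partial(original_os_walk, followlinks=True)
    try:
        yield os.walk
    finally:
        # Restore the original os.walk even if the body raises
        os.walk = original_os_walk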
Then in your client code:
with patch_os_walk():
    shutil.make_archive(...)