Skip to content

Zip command line character limit error fix#452

Merged
jefferis merged 2 commits intonatverse:masterfrom
jonmarty:patch-2
Jan 26, 2021
Merged

Zip command line character limit error fix#452
jefferis merged 2 commits intonatverse:masterfrom
jonmarty:patch-2

Conversation

@jonmarty
Copy link
Copy Markdown
Contributor

@jonmarty jonmarty commented Jan 4, 2021

The following change resolves an issue in which a sufficiently long neuronlist would lead to a large number of files being passed into the zip command, thus exceeding the command line character limit when the 'zip' command is called, causing the zip function to error out.

The following change resolves an issue in which a sufficiently long neuronlist would lead to a large number of files being passed into the zip command, thus exceeding the command line character limit when the 'zip' command is called, causing the zip function to error out.
@coveralls
Copy link
Copy Markdown

coveralls commented Jan 4, 2021

Coverage Status

Coverage increased (+0.003%) to 81.315% when pulling e220a91 on jonmarty:patch-2 into 42ca544 on natverse:master.

Copy link
Copy Markdown
Collaborator

@jefferis jefferis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @jonmarty, many thanks for your interest in nat and taking the time to make a PR – much appreciated.

I think the only implication of this change is that files may not be added to the zip file in they order that they are found in the input neuronlist. I don't think this is a big deal, but there could be some code somewhere that relies on this, so I am just hesitating.

I was wondering if you have actually encountered the error that motivates this issue? The reason is that the R zip function actually includes code that tries to fix exactly this issue, dating back two years now:

https://github.com/wch/r-source/blame/e767ea66c8684b897e07d174e90ef5ae0cdd2c87/src/library/utils/R/zip.R#L110-L120

Does this fail in your case? Thanks again,

Greg.

@jonmarty
Copy link
Copy Markdown
Contributor Author

jonmarty commented Jan 6, 2021

Hello Greg,

I did run into that issue. I was attempting to store the all the neurons
in the FCWB brain (about 19,000) in a zip file -- which is probably a
pretty extreme case. While the code does successfully write all the .swc
files to a 'dir', it does not then create 'dir.zip' before deleting
'dir'. I wrote all the .swc files to a directory and then tried running
the zip function on its own, which gave no print output and created no files for the full FCWB brain (stored in the 'dps' neuronlist)

dir = 'dps'
zip_file = 'dps.zip'
zip(zip_file, files=dir(dir, recursive=TRUE, full.names=TRUE))

And for a smaller neuronlist

dir='skel.fc'
zip_file='skel.fc.zip'
zip(zip_file, files=dir(dir, recursive=TRUE, full.names=TRUE))
adding: skel.fc/1664980698.swc (deflated 53%)
adding: skel.fc/1884625521.swc (deflated 55%)
adding: skel.fc/1975347348.swc (deflated 53%)
adding: skel.fc/2007068523.swc (deflated 53%)
adding: skel.fc/2065745704.swc (deflated 54%)
adding: skel.fc/2068801704.swc (deflated 53%)
adding: skel.fc/264083994.swc (deflated 55%)
adding: skel.fc/296544364.swc (deflated 55%)
adding: skel.fc/324846570.swc (deflated 54%)
adding: skel.fc/325529237.swc (deflated 55%)
adding: skel.fc/356818551.swc (deflated 55%)
adding: skel.fc/386834269.swc (deflated 55%)
adding: skel.fc/387166379.swc (deflated 55%)
adding: skel.fc/387944118.swc (deflated 54%)
adding: skel.fc/448260940.swc (deflated 55%)
adding: skel.fc/450034902.swc (deflated 55%)
adding: skel.fc/480029788.swc (deflated 55%)
adding: skel.fc/511051477.swc (deflated 55%)
adding: skel.fc/5813001741.swc (deflated 55%)
adding: skel.fc/5813010153.swc (deflated 55%)
adding: skel.fc/5813021192.swc (deflated 55%)
adding: skel.fc/5813022274.swc (deflated 55%)
adding: skel.fc/5813026773.swc (deflated 55%)
adding: skel.fc/5813056917.swc (deflated 55%)
adding: skel.fc/5813064789.swc (deflated 55%)
adding: skel.fc/5813069648.swc (deflated 55%)
adding: skel.fc/5813071319.swc (deflated 55%)

Also, since the zip function uses the '-r' argument on the command line,
the changed line should have the same effect as the previous one.

Thanks,
Jonathan

@jefferis
Copy link
Copy Markdown
Collaborator

jefferis commented Jan 7, 2021

Thanks for the additional info @jonmarty!

To clarify my point about sort order, when you pass a directory name to zip, the files are added in the lexographical sort order recursing through the directory tree. Whereas when you pass in a file list, then they are added in that specific order.

I still need to think about exactly why it is failing in your case. It could be that it's actually related to the zip utility's handling of many files when explicitly specified rather than a problem with argument length since that should already be fixed by the R zip function. Could I ask which OS and R version you are on? And what does

> Sys.getenv("R_ZIPCMD")
[1] "/usr/bin/zip"
> system2(Sys.getenv("R_ZIPCMD"), args = '-h')
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008). Usage:
zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
  The default action is to add or replace zipfile entries from list, which
  can include the special name - to compress standard input.
  If zipfile and list are omitted, zip compresses stdin to stdout.
  -f   freshen: only changed files  -u   update: only changed or new files
  -d   delete entries in zipfile    -m   move into zipfile (delete OS files)
  -r   recurse into directories     -j   junk (don't record) directory names
  -0   store only                   -l   convert LF to CR LF (-ll CR LF to LF)
  -1   compress faster              -9   compress better
  -q   quiet operation              -v   verbose operation/print version info
  -c   add one-line comments        -z   add zipfile comment
  -@   read names from stdin        -o   make zipfile as old as latest entry
  -x   exclude the following names  -i   include only the following names
  -F   fix zipfile (-FF try harder) -D   do not add directory entries
  -A   adjust self-extracting exe   -J   junk zipfile prefix (unzipsfx)
  -T   test zipfile integrity       -X   eXclude eXtra file attributes
  -y   store symbolic links as the link instead of the referenced file
  -e   encrypt                      -n   don't compress these suffixes
  -h2  show more help

give for you.

@jefferis
Copy link
Copy Markdown
Collaborator

Dear @jonmarty, could I just ping you re my questions above? I haven't been able to reproduce your issue on my machine, so I wonder if it might be an OS interaction or possibly about path lengths.
In addition, I've decided to switch behaviour to your fix when the neuronlist is greater than some threshold number of neurons. Do you have any idea when the problem became evident to you? 1000 neurons, 5000, 10000?

@jonmarty
Copy link
Copy Markdown
Contributor Author

jonmarty commented Jan 25, 2021

Sorry for the late response. I found that I did have an old version of zip, although updating it did not resolve the issue. I was having the issue mentioned above when I was saving all the neurons in the FCWB brain, which contains 16,000 neurons. I tested out a couple different numbers of neurons and found that the problem arises around 8,000 neurons (however, 5000 worked), with write.neurons erring with "sh: /usr/bin/zip: Argument list too long". I'm on mac OS Catalina 10.15.17 and use R version 3.4.1. The commands you specified give the following output:

> Sys.getenv("R_ZIPCMD")
[1] "/usr/bin/zip"
> system2(Sys.getenv("R_ZIPCMD"), args = '-h')
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008). Usage:
zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
  The default action is to add or replace zipfile entries from list, which
  can include the special name - to compress standard input.
  If zipfile and list are omitted, zip compresses stdin to stdout.
  -f   freshen: only changed files  -u   update: only changed or new files
  -d   delete entries in zipfile    -m   move into zipfile (delete OS files)
  -r   recurse into directories     -j   junk (don't record) directory names
  -0   store only                   -l   convert LF to CR LF (-ll CR LF to LF)
  -1   compress faster              -9   compress better
  -q   quiet operation              -v   verbose operation/print version info
  -c   add one-line comments        -z   add zipfile comment
  -@   read names from stdin        -o   make zipfile as old as latest entry
  -x   exclude the following names  -i   include only the following names
  -F   fix zipfile (-FF try harder) -D   do not add directory entries
  -A   adjust self-extracting exe   -J   junk zipfile prefix (unzipsfx)
  -T   test zipfile integrity       -X   eXclude eXtra file attributes
  -y   store symbolic links as the link instead of the referenced file
  -e   encrypt                      -n   don't compress these suffixes
  -h2  show more help

@jefferis
Copy link
Copy Markdown
Collaborator

Dear @jonmarty, thanks so much for that additional information. The key point is that you are using R 3.4.1. I almost certain that your issue is resolved by this change

https://github.com/wch/r-source/blame/e767ea66c8684b897e07d174e90ef5ae0cdd2c87/src/library/utils/R/zip.R#L110

which was included in the R 3.6.0 release (see NEWS) in April 2019. I will add a small conditional and then merge.

* zip can cope with long file lists since R 3.6.0
* out of an abundance of caution only use new behaviour before that
* See discussion at natverse#452 for details
* https://github.com/wch/r-source/blame/e767ea66c8684b897e07d174e90ef5ae0cdd2c87/src/library/utils/R/zip.R#L110
@jefferis jefferis merged commit 7f665de into natverse:master Jan 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants