Skip to content

Conversation

@MauricePasternak
Copy link
Contributor

Linked issue

Closes #1285

Comments

Tested on 1000+ files in a Population folder and then re-running the study. Noticeable improvement over the old Python version.

As in the original issue, the new version of the program can beat out native gzip in Unix under the condition that at least 3 threads are used. No logic was added for Unix, as this was meant to be a simple language & CLI change, but it wouldn't be difficult to make a check for whether is it reasonable to do so. I leave it to the discretion of reviewers on preferred logic.

Copy link
Member

@HenkMutsaerts HenkMutsaerts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I don't understand: the original SuperGZip was implemented by Michael for Windows specifically. But now you state that you created something that performs faster (un)zipping in Unix? That would be great of course, but we need to keep xASL_adm_GzipAllFiles to work for all platforms.

Minor name comments:

  • a) CamelCase would give SuperGZip.exe
  • b) Why do you call it unzip, the gzip that you replace does both zip and unzip right?

Should we improve xASL_adm_GzipNifti and xASL_adm_UnzipNifti accordingly?

@MauricePasternak
Copy link
Contributor Author

Couple of points:

  • the original SuperGzip was written by me in Python. It was only packaged for Windows. Michael presumably wrote the logic behind invoking the packaged executable from matlab, since my knowledge of version control and general programming outside of Python still wasn't that great. Despite being Python, it still offered a boost over matlab's gzip by virtue of multiprocessing.

  • this version written by me in Rust offers the benefits of using a compiled language over an interpreted one. It's a CLI tool I decided to write as a quick mini project after seeing that gzipping a couple of thousand Nifti files still takes a bit of time.

  • Yes it performs both gzip and unzip. That is part of the reason for the lowercase name difference. The other reason being that CLI-based tools that can be used on Unix should not be uppercase by convention, if I recall correctly.

  • I use Github Workflows to package this newer Rust version for all major desktop platforms.. That shouldn't be an issue.

  • It performs best when asked to batch process many files in one go. Even if Rust > Matlab in terms of speed, the initial overhead of a system call in matlab may not make the venture worthwhile just to gunzip one or two files. So with regards to adding calls to this executable in other ExploreASL functions, I'd first ask whether it'd only be used to gunzip one or two files at a time. If so, not worth it. If used to perform cleanup/startup operations on a whole subject/module/study, then the performance benefits start to shine.

@HenkMutsaerts HenkMutsaerts linked an issue Jan 30, 2023 that may be closed by this pull request
4 tasks
@HenkMutsaerts
Copy link
Member

HenkMutsaerts commented Jan 31, 2023

@MauricePasternak Thanks for clarifying this! If we make this change, it should work with both multi-core processors and single-threaded processors (if ExploreASL is ran in a server, it could be ran single-threaded but with multiple ExploreASL instances).

Can we also address #1052 in here (implementing this also for Linux and macOS)?

Copy link
Member

@HenkMutsaerts HenkMutsaerts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! See also my other comment above

glob_pat = join(['"' ROOT '/**/*.nii"']);
% Get SuperGzip path
PathToSuperGzip = fullfile(pathExternal, 'SuperGZip', 'SuperGZip_Windows.exe');
PathToSuperGzip = fullfile(pathExternal, 'SuperGZip', 'super-gunzip.exe');
Copy link
Member

@HenkMutsaerts HenkMutsaerts Jan 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still not call it super-gunzip.exe if we also (or even more frequently) use it for zipping; fine to rename to super-zip.exe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I'll keep calling it super-gunzip in the original repo & its release page. But when copying over to ExploreASL, will change name to just super-zip.

Copy link
Member

@HenkMutsaerts HenkMutsaerts Feb 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still say that that is weird. No zip package calls itself unzip as primary name right? 7zip not 7unzip. Winrar not WinUnrar. Etc. But perhaps you have another reason to call it this way.

@jan-petr jan-petr self-assigned this Feb 2, 2023
@jan-petr jan-petr self-requested a review February 2, 2023 13:01
@HenkMutsaerts HenkMutsaerts changed the title Closes #1285 Closes #1285_supergzip_rust_implementation_update Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SuperGzip Rust implementation update

4 participants