🚀 Feature
Use multiple processes when extracting ImageNet training archive.
Motivation
I recently extracted the ImageNet training archive with the code of torchvision and was surprised how long it took. I realised that after extracting the main archive, we only extract the subarchives one after another (vision/torchvision/datasets/imagenet.py, lines 183 to 184 in 3c254fb):

    for archive in archives:
        extract_archive(archive, os.path.splitext(archive)[0], remove_finished=True)
Pitch
I think we can speed that up significantly by using multiple processes to do this simultaneously. IMO doing this would have no drawbacks.
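
A minimal sketch of what this could look like, not the actual torchvision implementation. It assumes the subarchives are plain `.tar` files sitting directly in `root`, and it uses the standard library's `tarfile` in place of torchvision's `extract_archive` so that it runs on its own. The names `extract_and_remove`, `extract_subarchives`, and the `executor_cls` parameter are hypothetical and only serve the illustration.

```python
import os
import tarfile
from concurrent.futures import ProcessPoolExecutor


def extract_and_remove(archive):
    """Extract one tar subarchive into a sibling directory, then delete the archive.

    Mirrors the per-archive step of the sequential loop: the target directory
    is the archive path without its extension.
    """
    target = os.path.splitext(archive)[0]
    os.makedirs(target, exist_ok=True)
    with tarfile.open(archive) as tar:
        tar.extractall(target)
    os.remove(archive)
    return target


def extract_subarchives(root, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Extract every .tar file found directly in `root` using a pool of workers.

    `executor_cls` (hypothetical parameter) defaults to a process pool; any
    concurrent.futures executor class with the same interface can be swapped in.
    """
    archives = sorted(
        os.path.join(root, name)
        for name in os.listdir(root)
        if name.endswith(".tar")
    )
    with executor_cls(max_workers=max_workers) as executor:
        # map() preserves input order and re-raises any worker exception here.
        return list(executor.map(extract_and_remove, archives))
```

With the default process pool, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn workers instead of forking them. Each subarchive is independent of the others, so no coordination between workers is needed. How much this helps in practice likely depends on whether extraction is bound by CPU or by disk throughput, which I have not measured.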
Additional context
If we want this feature, I could take it up, albeit with a low priority.
cc @pmeier