Better parallelisation of exiftool for faster report generation

In https://github.com/mattburns/exiftool.js-test/blob/master/test.js#L66 you invoke a new instance of exiftool for every image found. This is inefficient: starting exiftool carries a large fixed overhead (Perl interpreter startup, module loading, ...), and we pay that cost again for every sample image.

Better approaches would be:
a) Invoke exiftool once and let it do the batch processing (e.g. `exiftool <OTHER OPTIONS> *.jpg -w .jpg.json`). This might require some refactoring in the report generation.
b) Use the `-stay_open` option of exiftool together with an ARGFILE to which we write the commands to run on each image. exiftool then stays in memory and executes the commands written to the ARGFILE until we write a terminate command (`-stay_open False`) there.
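Both options could be sketched in Node.js roughly as follows. This is only an illustration under some assumptions, not a drop-in patch for test.js: the helper names (`buildBatchArgs`, `indexBySourceFile`, `runBatch`, `buildStayOpenArgs`, `splitReady`, `startExiftool`) are made up for this example, `-json` stands in for whatever options test.js currently passes, and the `-stay_open` variant pipes arguments through stdin (`-@ -`) instead of writing to an ARGFILE on disk.

```javascript
const { execFile, spawn } = require('child_process');

// (a) Batch mode: a single exiftool process handles all files.
// '-json' is an assumed option; substitute whatever test.js passes today.
function buildBatchArgs(files, options = ['-json']) {
  return [...options, ...files];
}

// With -json, exiftool prints one array containing an object per file,
// each carrying a SourceFile key. Index the entries by that key.
function indexBySourceFile(jsonText) {
  const byFile = {};
  for (const entry of JSON.parse(jsonText)) byFile[entry.SourceFile] = entry;
  return byFile;
}

// Simplification: any exiftool error is passed on and the output is dropped.
function runBatch(files, callback) {
  execFile('exiftool', buildBatchArgs(files), { maxBuffer: 64 * 1024 * 1024 },
    (err, stdout) => callback(err, err ? null : indexBySourceFile(stdout)));
}

// (b) -stay_open: one argument per line, '-execute' runs the command.
function buildStayOpenArgs(imagePath, options = ['-json']) {
  return [...options, imagePath, '-execute'].join('\n') + '\n';
}

// exiftool prints "{ready}" after each executed command. Split the buffered
// stdout into completed results and the unfinished remainder.
function splitReady(buffer) {
  const parts = buffer.split('{ready}');
  const rest = parts.pop();
  return { done: parts.map((s) => s.trim()), rest };
}

// Start one long-lived exiftool; onResult receives the output of each command.
function startExiftool(onResult) {
  const child = spawn('exiftool', ['-stay_open', 'True', '-@', '-']);
  let pending = '';
  child.stdout.setEncoding('utf8');
  child.stdout.on('data', (chunk) => {
    const { done, rest } = splitReady(pending + chunk);
    pending = rest;
    done.forEach(onResult);
  });
  return {
    run: (imagePath) => child.stdin.write(buildStayOpenArgs(imagePath)),
    stop: () => child.stdin.end('-stay_open\nFalse\n'),
  };
}
```

A caller would use `runBatch(files, cb)` for option a), or `const tool = startExiftool(handleResult)` followed by `tool.run(path)` per image and a final `tool.stop()` for option b). Results arrive in the order the commands were written, so the caller has to keep track of which image each result belongs to.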

Both approaches can bring speedups of up to [60 times compared to single-command invocation](http://u88.n24.queensu.ca/exiftool/forum/index.php/topic,4134.msg19575.html?PHPSESSID=vom6t68vj55i1fee6u91pt2hs5#msg19575). Approach b) could perform even better, since we can prefork multiple "daemonized" instances of exiftool and share the work between them.
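Sharing the work between several preforked instances could look roughly like the sketch below. It assumes each worker is an object with a `run(path)` method wrapping one long-lived `exiftool -stay_open True` process. The names `partition`, `dispatch` and `defaultWorkerCount` are hypothetical, and using one instance per CPU core is a guess that has not been measured.

```javascript
const os = require('os');

// Assumption: one exiftool instance per CPU core is a reasonable default.
function defaultWorkerCount() {
  return Math.max(1, os.cpus().length);
}

// Round-robin split of the file list into one bucket per worker.
function partition(files, workerCount) {
  const buckets = Array.from({ length: workerCount }, () => []);
  files.forEach((file, i) => buckets[i % workerCount].push(file));
  return buckets;
}

// Hand each bucket to its worker. A worker is any object with run(path),
// e.g. a wrapper around one "exiftool -stay_open True" process.
function dispatch(files, workers) {
  partition(files, workers.length).forEach((bucket, i) => {
    bucket.forEach((file) => workers[i].run(file));
  });
}
```

Round-robin keeps the sketch short. If some images take much longer to process than others, a shared queue from which idle workers pull the next file would probably balance the load better.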


Better parallelisation of exiftool for faster report generation #20
