Better parallelisation of exiftool for faster report generation #20

@fbuchinger

In https://github.com/mattburns/exiftool.js-test/blob/master/test.js#L66 you invoke a new instance of exiftool for every image found. This is inefficient: starting exiftool carries a large overhead (Perl interpreter warm-up, module loading, ...), and we pay it for every sample image.

Better approaches would be:
a) Invoke exiftool once and let it do the batch processing (e.g. exiftool *.jpg -w .jpg.json). This might require some refactoring in the report generation.
b) Use exiftool's -stay_open option together with an ARGFILE to which we write the commands to run on each image. exiftool then stays in memory and executes the commands written to the ARGFILE until we write a terminate command there.
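
To make approach b) more concrete, here is a minimal sketch of a long-lived exiftool process driven from Node. It assumes exiftool is on the PATH and uses stdin as the ARGFILE (`-@ -`) rather than a file on disk. The helper names (`buildRequest`, `splitResponses`, `startExiftool`) are made up for illustration, and the `-json` argument is a guess at what the report needs, not what test.js does today.

```javascript
const { spawn } = require('child_process');

// One request = one argument per line, terminated by -execute.
// (Hypothetical helper; assumes file names contain no newlines.)
function buildRequest(file) {
  return ['-json', file, '-execute', ''].join('\n');
}

// exiftool prints "{ready}" on stdout after each -execute.
// Split the accumulated output into complete responses plus the unfinished tail.
function splitResponses(buffer) {
  const parts = buffer.split('{ready}');
  const rest = parts.pop();
  return { responses: parts.map((p) => p.trim()), rest };
}

// Start a single exiftool that stays in memory and answers requests in order.
function startExiftool(command = 'exiftool') {
  const child = spawn(command, ['-stay_open', 'True', '-@', '-']);
  const pending = []; // resolve callbacks, in the order requests were written
  let buffer = '';

  child.stdout.setEncoding('utf8');
  child.stdout.on('data', (chunk) => {
    const { responses, rest } = splitResponses(buffer + chunk);
    buffer = rest;
    for (const response of responses) {
      const resolve = pending.shift();
      if (resolve) resolve(response);
    }
  });

  return {
    // Resolves with the raw text exiftool printed for this file.
    read(file) {
      return new Promise((resolve) => {
        pending.push(resolve);
        child.stdin.write(buildRequest(file));
      });
    },
    // The "terminate command": tells exiftool to leave stay_open mode and exit.
    close() {
      child.stdin.write('-stay_open\nFalse\n');
      child.stdin.end();
    },
  };
}

// Usage (not executed here):
//   const tool = startExiftool();
//   const text = await tool.read('sample.jpg');
//   tool.close();
```

The sketch leaves out error handling (stderr, a crashed child process) and assumes the string `{ready}` never appears inside the metadata itself. A real implementation would probably want both covered.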

Both approaches can bring speedups of up to 60 times compared to single-command invocation. Approach b) could perform even better, since we can prefork multiple "daemonized" instances of exiftool and share the work between them.
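
Sharing the work between several instances could look roughly like this. To keep the sketch short, each worker here is one batch invocation (approach a) over its share of the files rather than a -stay_open process, but the partitioning would be the same either way. `partition`, `runBatch` and `readAll` are hypothetical names, the worker count of 4 is arbitrary, and exiftool is assumed to be on the PATH.

```javascript
const { execFile } = require('child_process');

// Deal the files out round-robin into workerCount buckets.
function partition(files, workerCount) {
  const buckets = Array.from({ length: workerCount }, () => []);
  files.forEach((file, i) => buckets[i % workerCount].push(file));
  return buckets;
}

// Run one exiftool process over a whole bucket.
// With -json, exiftool prints a single JSON array with one object per file.
function runBatch(files, command = 'exiftool') {
  return new Promise((resolve, reject) => {
    if (files.length === 0) return resolve([]);
    const options = { maxBuffer: 64 * 1024 * 1024 };
    execFile(command, ['-json', ...files], options, (err, stdout) => {
      // exiftool can exit non-zero when some files fail but still print
      // results for the others, so only give up if there is no output at all.
      if (err && !stdout) return reject(err);
      try {
        resolve(JSON.parse(stdout));
      } catch (parseError) {
        reject(parseError);
      }
    });
  });
}

// Process all buckets in parallel and merge the per-file results.
async function readAll(files, workerCount = 4) {
  const buckets = partition(files, workerCount);
  const batches = await Promise.all(buckets.map((bucket) => runBatch(bucket)));
  return batches.flat();
}

// Usage (not executed here):
//   const metadata = await readAll(['a.jpg', 'b.jpg', 'c.jpg'], 2);
```

Round-robin is the simplest split. If the sample images vary a lot in size, a shared queue that hands the next file to whichever instance is idle would likely balance better. Very large file lists may also hit the operating system's command-line length limit when passed as arguments.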
