Get 20K random URLs from HTTPArchive
SELECT url FROM `httparchive.summary_requests.2022_07_01_desktop` WHERE ext='jpg' ORDER BY RAND() LIMIT 20000

Save the results as urls.csv.
Split the list into smaller files, 500 URLs each:
$ split -l 500 urls.csv
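The chunk files are referenced below as ../urls1, ../urls2, which suggests some renaming after the split; with GNU split, `-d` gets close by producing numeric suffixes (urls00, urls01, ...). A runnable demo on synthetic input — the example.com URLs are fabricated stand-ins for the real urls.csv from BigQuery:

```shell
# Generate 1,200 fake URLs as a stand-in for the BigQuery export.
seq 1200 | sed 's|^|https://example.com/img/|; s|$|.jpg|' > urls.csv

# Split into 500-line chunks with numeric suffixes: urls00, urls01, urls02.
split -l 500 -d urls.csv urls

# Show the resulting chunk sizes.
wc -l urls0*
```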
Open 10 terminal tabs, start downloading and watching a movie. If anything hangs, kill it.
$ wget -T 30 -t 1 -i ../urls1 # 30-second timeout, single try, URL list from file
$ wget -T 30 -t 1 -i ../urls2
# ...

Stop at some point:
1,465,606,852 bytes (1.58 GB on disk) for 14,511 items
Rename files with indexes:
$ ls -v | cat -n | while read n f; do mv -n "$f" "$n.jpg"; done

Find non-JPEGs:
$ identify -regard-warnings *.jpg > ../log.txt
$ node nonjpeg.js > rm.sh
$ sh rm.sh

How many are progressive in the raw data:
$ identify -format "%f,%[interlace]\n" *.jpg > ../prog-or-not.csv
$ node prog-or-not.js
{ prog: 4229, base: 9896 } # 29.94% prog

Start generating optimization scripts and run them, e.g.:
$ node opt.sh.js > opt-tran-prog.sh
$ date && sh opt-tran-prog.sh && date
Sat Jul 23 12:28:31 PDT 2022
Sat Jul 23 12:34:56 PDT 2022

Gather stats for all files and sizes:
$ node stats.js > stats.csv

... and just for moz, to see how many are progressive now:
$ identify -format "%f,%b,%[quality],%[interlace]\n" *.jpg > ../stats-moz.csv




