Added audit for request compression (gzip & br) #1513
Conversation
| category: 'Performance', | ||
| name: 'uses-request-compression', | ||
| description: 'Site uses GZIP/BROTLI compression', | ||
| helpText: 'Requests should be optimized to save network bytes.', |
maybe add a learn more link? https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/optimize-encoding-and-transfer or something?
| // Filter requests that are on the same host as the page and not over h2. | ||
| const resources = networkRecords.filter(record => { | ||
| return record._resourceType && record._resourceType._isTextType && | ||
| !record._responseHeaders.find(header => |
I believe I was once told we try to have continuations like this at 4 spaces. Is that right @brendankenny ?
| "gatherers": [] | ||
| }], | ||
| "audits": [ |
was this meant to be included?
| if (resources.length > 1) { | ||
| displayValue = `${resources.length} requests were not handled with gzip/brottli compression`; | ||
| } else if (resources.length === 1) { | ||
| displayValue = `${resources.length} request was not handled with gzip/brottli compression`; |
It'd be nice if we could provide an estimate of potential byte and time savings. Even something like https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/audits/AuditRules.js?q=f:AuditRules+gzip&sq=package:chromium&l=97 SGTM as a start
| static audit(artifacts) { | ||
| const networkRecords = artifacts.networkRecords[Audit.DEFAULT_PASS]; | ||
| // Filter requests that are on the same host as the page and not over h2. |
oh wait does h2 compress everything by default @brendankenny @ebidel ? this might be a redundant audit then...
I'm not aware of the two being linked. Why not look at http1.1 responses too?
looks like h2 still shows content-encoding: gzip. I should check all http requests, it's a comment that I forgot to change 😛
|
good stuff thanks! yay for more byte efficiency audits :) |
| * limitations under the License. | ||
| */ | ||
| /* | ||
| * @fileoverview This audit determines if the images used are sufficiently larger |
oh and remove this :)
| return { | ||
| category: 'Performance', | ||
| name: 'uses-request-compression', | ||
| description: 'Site uses GZIP/BROTLI compression', |
"Server responses are compressed using GZIP or BROTLI."
| return record._resourceType && record._resourceType._isTextType && | ||
| !record._responseHeaders.find(header => | ||
| header.name.toLowerCase() === 'content-encoding' && | ||
| (header.value === 'gzip' || header.value === 'br') |
Should we check for deflate too. https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/audits/AuditRules.js?q=f:AuditRules+gzip&sq=package:chromium&l=121 does.
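The header check with deflate added might look like this sketch (`isCompressedResponse` and `ACCEPTED_ENCODINGS` are illustrative names; `indexOf` sidesteps the Node 4 `Array.prototype.includes` gap raised later in the review):

```javascript
// Encodings we accept as "already compressed" (now including deflate).
const ACCEPTED_ENCODINGS = ['gzip', 'br', 'deflate'];

function isCompressedResponse(responseHeaders) {
  return responseHeaders.some(header =>
    header.name.toLowerCase() === 'content-encoding' &&
    ACCEPTED_ENCODINGS.indexOf(header.value) !== -1);
}
```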
| const Audit = require('../audit'); | ||
| const Formatter = require('../../formatters/formatter'); | ||
| class UsesRequestCompression extends Audit { |
WDYT about CompressesResponses as a name? Or ResponsesAreCompressed?
|
Thanks for reviewing! I think this is my first audit 🎉 |
| header.name.toLowerCase() === 'content-encoding' && | ||
| (header.value === 'gzip' || header.value === 'br') | ||
| ); | ||
| record._resourceSize > 1000 && |
should we show violations for all sizes?
I'm interested in nuking this condition and instead filtering out when the savings are under some millisecond threshold.
here's WPT's logic on filtering out small wins: "allow a pass if we don't get 10% savings or less than 1400 bytes"
https://github.com/WPO-Foundation/webpagetest/blob/master/agent/wpthook/optimization_checks.cc#L225
I interpreted this as a safeguard against small files that might increase in file size due to gzipping since we aren't actually gzipping them to see if they'd be smaller, so I'm still in favor of having some sort of threshold on the size.
Ah, that would make sense. Let's add a comment if that's the case.
If we go the threshold route, should we standardize on the same threshold as the image opt audit (and future audits)? That's also 10%?
fwiw over in https://discuss.httparchive.org/t/file-size-and-compression-savings/145 doug sillars used 500 bytes for the same purpose.
should we standardize on the same threshold as the image opt audit (and future audits). That's also 10%?
well, technically the difference between 49kb and 51kb can be an extra round trip (like 30ms) whereas there shouldn't be any difference between 51kb and 99kb. but that's assuming lots of things. let's start simple, sure. :)
This might be "hard" to measure. When using h2 the TCP connection is already open, which gives us much more bandwidth at the start, so slow start is not applicable; then again, we might calculate it based on network packets and the time they took.
| record._resourceSize > 1000 && | ||
| !record._responseHeaders.find(header => | ||
| header.name.toLowerCase() === 'content-encoding' && | ||
| compressionTypes.includes(header.value) |
node 4 will have problems with .includes. You'll have to use indexOf :(
| class UsesRequestCompression extends Audit { | ||
| const KB_IN_BYTES = 1024; | ||
| const compressionTypes = ['gzip', 'br', 'deflate'] |
travis is complaining about a missing ;
| ); | ||
| }).map(record => { | ||
| const originalKb = record._resourceSize / KB_IN_BYTES; | ||
| const savings = originalKb * 2 / 3; |
can you explain this savings calculation?
that was me referencing the devtools gzip audit rule estimation, +1 to adding a comment to explain :)
| name: 'uses-request-compression', | ||
| description: 'Site uses GZIP/BROTLI compression', | ||
| helpText: 'Requests should be optimized to save network bytes.', | ||
| description: 'Server responses are compressed using GZIP or BROTLI.', |
should we include DEFLATE in all the text strings?
| ); | ||
| }).map(record => { | ||
| const originalKb = record._resourceSize / KB_IN_BYTES; | ||
| const savings = originalKb * 2 / 3; |
Maybe add a quick comment that GZIP savings is relatively predictable so we aren't confused later :)
| let displayValue = ''; | ||
| if (resources.length > 1) { | ||
| displayValue = `${resources.length} requests were not handled with GZIP/BROTTLI compression`; |
we're moving all of the byte-efficiency audits to report something like XXKB (~XXms) potential savings. I'm working on a generic time estimate, but for now just sum up the byte savings of each record and report that?
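Until the generic time estimate lands, the summed display could be sketched like this (field names are illustrative, not the audit's actual record shape):

```javascript
// Sum each record's estimated savings and report a single figure.
function formatSavings(records) {
  const totalKb = records.reduce((sum, r) => sum + r.gzipSavingsKb, 0);
  return totalKb ? `${Math.round(totalKb)} KB potential savings` : '';
}
```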
| const savings = originalKb * 2 / 3; | ||
| const percent = Math.round(savings / originalKb * 100); | ||
| const label = `${Math.round(originalKb)} KB total, GZIP savings: ${percent}%`; |
I'd like to compute the time waste measured by these requests. Take a look at computeWaste within patrick's #1497 for an idea of bytes -> milliseconds in download time.
I think your byte savings calculation is the right kind of start, but we can bring it all the way to "total download time wasted."
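A minimal bytes-to-milliseconds sketch, assuming a fixed-throughput connection (the constant below is a placeholder; computeWaste in #1497 derives throughput from the trace instead):

```javascript
// Hypothetical: assume a fixed ~1.6 Mbps connection, i.e. 200 bytes per ms.
// A real estimate should come from observed network throughput.
const ASSUMED_THROUGHPUT_BYTES_PER_MS = 200;

function wastedMs(wastedBytes) {
  return wastedBytes / ASSUMED_THROUGHPUT_BYTES_PER_MS;
}
```

Under that assumption, 100 KB of avoidable bytes costs roughly half a second of download time.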
| ); | ||
| }).map(record => { | ||
| const originalKb = record._resourceSize / KB_IN_BYTES; | ||
| const savings = originalKb * 2 / 3; |
agree with eric, citation needed here. :)
can you find any bulk data about this? maybe https://discuss.httparchive.org/ has something?
found this https://discuss.httparchive.org/t/file-size-and-compression-savings/145
check out the potential savings column. It might be interesting to look into how it was calculated, but at first glance it indicates we may want to use these numbers based on resource type rather than a broad generalization.
I do like this approach. We could keep some kind of mapping table with these values; we could also check out Brotli and show both of them like we do with images.
| // Filter requests that are text based and have gzip/br encoding. | ||
| const resources = networkRecords.filter(record => { | ||
| return record._resourceType && record._resourceType._isTextType && | ||
| record._resourceSize > 1000 && |
if you want to only consider these, let's add a comment why. (also, good to note this is an uncompressed size)
| const networkRecords = artifacts.networkRecords[Audit.DEFAULT_PASS]; | ||
| // Filter requests that are text based and have gzip/br encoding. | ||
| const resources = networkRecords.filter(record => { |
first, technically there can be multiple requests for a single resource, so I'd like to use resource, request, and response names correctly.
but let's split the filter and the map.
const uncompressedResponses = networkRecords.filter(fn);
const recordsWithWastage = uncompressedResponses.map(this.computeWaste);
| tableHeadings: { | ||
| url: 'URL', | ||
| total: 'Original (KB)', | ||
| gzipSavings: 'GZIP Savings (%)', |
since we're showing a static 66.6% it might make more sense to show it in KB :)
👍.
also, is it straightforward to use num.toLocaleString()? would be nice to get thousands delimiters on these numbers.
also, that is a BIG page you're looking at. 😀
Calculations were wrong; I forgot to divide by 1024 for bytes.
Yes, will update with formatted numbers.
| const KB_IN_BYTES = 1024; | ||
| const TOTAL_WASTED_BYTES_THRESHOLD = 100 * KB_IN_BYTES; | ||
| class CompressesResponses extends Audit { |
name is a little awkward. how about "ResponseCompression"
| static filterUnoptimizedResponses(networkRecords) { | ||
| return networkRecords.reduce((prev, record) => { | ||
| const isTextBasedResource = record._resourceType && record._resourceType._isTextType; | ||
| const isContentEncoded = isTextBasedResource && |
i'd be in favor of just checking the response headers for this boolean; leave isTextBasedResource and recordSize out of it.
Return early if it's not text-based, or if it is but has no recordSize.
| * @return {!Array<{url: string, isBase64DataUri: boolean, mimeType: string, resourceSize: number}>} | ||
| */ | ||
| static filterUnoptimizedResponses(networkRecords) { | ||
| return networkRecords.reduce((prev, record) => { |
is it important to do this inside of a reduce?
that was probably copied from my optimized-images gatherer, where it's also become unnecessary since going map -> array :)
yeah indeed, wanted to keep it the same
It looks like this reduce might be a problem.. perhaps the return prev below?
lighthouse --perf --view http://www.al.com/news/huntsville/index.ssf/2017/03/nasa_gives_mars_orbiter_a_boos.html
see the duplicate entries? AFAICT these requests do not repeat, so we're incorrect.
(screenshot of duplicate entries omitted)
When you figure out what's happening here, can you add a test to cover it?
edit: I ran again on this URL and got some repeats, but it wasn't nearly as bad.
How did you find this site? This is crazy!
My findings: the reduce is not the problem. Assuming DevTools is not broken, every request has a unique request ID, and all requests for the duplicate URLs have different start or end times.
Furthermore, loading the page in a clean profile with the same emulation settings yields entirely different network records for me! The triplicate URLs aren't even loaded once. Lighthouse shows ~600 network records while in DevTools I can only get max ~370.
My money is on the site actually requesting the same resource more than once (this doesn't have preserveQuery enabled, which masks more errors), and somehow Chrome is making distinct requests (or DevTools is reporting erroneous information) that look identical in every other respect, but I have no idea why there's such a stark difference between LH loading the page and vanilla Chrome.
Now I remove all duplicates that have the same display name and KB.
k thx for looking into this. :)
ditto on the reduce, again.
| const textRecords = ResponseCompression.filterUnoptimizedResponses(networkRecords); | ||
| return textRecords.map(record => { | ||
| // GZIP saving is predictable (@see https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/audits/AuditRules.js?q=f:AuditRules+gzip&sq=package:chromium&l=97) |
I'd change "GZIP saving is predictable" to "Compression savings can be roughly estimated".
And let's link to the discuss.httparchive thread instead, as that clearly had more research involved. :)
| // GZIP saving is predictable (@see https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/audits/AuditRules.js?q=f:AuditRules+gzip&sq=package:chromium&l=97) | ||
| const gzipSize = record.resourceSize * 2 / 3; | ||
| return Object.assign({}, { |
arr.map + Object.assign for a single property seems like overkill. Let's make this a tad simpler.
| return textRecords.map(record => { | ||
| // GZIP saving is predictable (@see https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/audits/AuditRules.js?q=f:AuditRules+gzip&sq=package:chromium&l=97) | ||
| const gzipSize = record.resourceSize * 2 / 3; |
The data from https://discuss.httparchive.org/t/file-size-and-compression-savings/145 shows this
| Type | Potential Savings |
|---|---|
| css | 47% |
| html | 32% |
| json | 50% |
| plain | -2% |
| script | 27% |
| xml | 35% |
By comparison we're saying everything should expect 33%.
I think we should use these numbers for our estimation. (Select an expected savings based on request.resourceType). We're still hand-wavy but it's nice to be on the right side of data.
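The per-type numbers above could replace the flat 2/3 estimate with a small lookup table; this sketch uses hypothetical names (`EXPECTED_SAVINGS_BY_TYPE`, `estimatedSavings`) and keeps the 2/3 rule as a fallback for unlisted types:

```javascript
// Expected compression savings by resource type, per the httparchive thread.
const EXPECTED_SAVINGS_BY_TYPE = {
  css: 0.47,
  html: 0.32,
  json: 0.50,
  plain: -0.02,
  script: 0.27,
  xml: 0.35,
};

function estimatedSavings(resourceType, resourceSize) {
  const ratio = EXPECTED_SAVINGS_BY_TYPE[resourceType];
  // Fall back to the broad 2/3 generalization for types we have no data on.
  return ratio === undefined ? resourceSize * 2 / 3 : resourceSize * ratio;
}
```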
This is awesome! I'd love to be able to report more accurate estimates based on the makeup of the file types although I'm not sure I trust the methodology behind these numbers and think we'll end up underreporting savings (aside: the current AuditRules.js also says 66% savings not 33%).
27% for script sounds really low especially when you look around at the compression ratios for various libraries and Ilya's table in the read more link.
Specifically, I distrust this assumption.
If we assume that the files >500 bytes are evenly distributed between compressed/uncompressed
I would hope, though this may be very naive, that larger, more sophisticated applications with lots of script use GZIP at a higher rate than smaller applications with less script (example: a new developer copy/pastes the CDN tag for jQuery/Angular/etc., which will be gzipped, and then writes a couple of his own files, not gzipped, at 1/10th the size of the library, which leads to the incorrect conclusion that gzip increases file size). Imprecise methodology is still better than no methodology at all, though; just wanted to flag it as needing further investigation.
I can use zlib, which is included in Node, so I can do gzip compression in Node.
| */ | ||
| static filterUnoptimizedResponses(networkRecords) { | ||
| return networkRecords.reduce((prev, record) => { | ||
| const isTextBasedResource = record._resourceType && record._resourceType._isTextType; |
record.resourceType().isTextType() should be all you need here. ;)
| const isTextBasedResource = record._resourceType && record._resourceType._isTextType; | ||
| const isContentEncoded = isTextBasedResource && | ||
| record._resourceSize && | ||
| !record._responseHeaders.find(header => |
| return networkRecords.reduce((prev, record) => { | ||
| const isTextBasedResource = record._resourceType && record._resourceType._isTextType; | ||
| const isContentEncoded = isTextBasedResource && | ||
| record._resourceSize && |
| if (isTextBasedResource && isContentEncoded) { | ||
| prev.push({ | ||
| url: record._url, |
record.url(), record.mimeType(), record.resourceSize (note the first two are methods, the last is a getter ;)
| const traceData = { | ||
| networkRecords: [ | ||
| { | ||
| _url: 'http://google.com/index.js', |
I see now that the public accessors for the properties I recommended above would make it more difficult for your tests. Hmmm.
TBH it would be best if you construct these via new SDK.NetworkRequest(). There are some examples of this in devtools' own tests: https://cs.chromium.org/search/?q=%22new+SDK.NetworkRequest%22+f:layouttests&sq=package:chromium&type=cs
This will be a little overhead for you, but good in the long run in case devtools internals change and our tests assume old APIs. (hard to detect that. :)
All I want is to learn new stuff so 👍
@paulirish yeah I've run into this a number of times now too; at first I was following the existing pattern of the private variables, but have mixed in the public accessors in later audits. Are we trying to standardize on the public ones now?
it's a little unfair to make you do this @wardpeet since every other change using network records mixes these regularly, including one landed yesterday :P But if you're willing, that would be great to have solved.
If it gets too involved, feel free to push it off to a followup PR so you can get this one in.
Also, it's a little unfair that devtools is inconsistent about what is a getter property and what is a getter method, leading to confusing mismatches. Example: it's record.url and record.mimeType but record.resourceType().
It's a bit confusing but I don't mind. After I get it right I can convert the other audits ^^
i think we can continue to use the private _ props. :)
so should I move my code around to use privates again? :)
|
@wardpeet heyoooo, how are we doing here? looks like the recent commit uses zlib, although it's not yet in package.json. are you ready for review? |
|
@paulirish zlib is inside nodejs :) so no package.json necessary. Yeah ready to review |
|
pinging @paulirish @patrickhulce @ebidel @brendankenny for a review |
patrickhulce
left a comment
Looks good overall just mostly rebase needed for a few things
| const KB_IN_BYTES = 1024; | ||
| const TOTAL_WASTED_BYTES_THRESHOLD = 100 * KB_IN_BYTES; | ||
| class ResponsesAreCompressed extends Audit { |
It's been a while and this might be a pain (sorry!), but there's a new byte-efficiency folder and base audit you can extend that should eliminate a bit of duplicated code
| 'use strict'; | ||
| const Gatherer = require('../gatherer'); | ||
| const zlib = require('zlib'); |
How does the built-in zlib browserify? Do we need to do anything special for extension/DevTools?
i'll have a look and report back
I was more speaking to the check that there's no random native dependency that won't be bundle-able, but good to be aware of the ~100k increase too :)
so should I look into alternatives to pako, or is 171kb something we won't care about?
we can manage with 171k IMO.
@wardpeet bonus points if you can pass pako through a very basic minification before it gets bundled up. (assuming there's savings there)
| "uses-optimized-images": {}, | ||
| "uses-responsive-images": {} | ||
| "uses-responsive-images": {}, | ||
| "uses-request-compression": {} |
This needs a rebase since the objects here are required now for the auto-expanding of categories to work (this also got moved to perf recently I believe)
|
@patrickhulce done with rebase :) Now the threshold is 100kb but maybe it's too low? It's around 250ms of waste |
I'm in favor of making the threshold for this one lower than the others since it's basically a no-brainer compared to the pain of webp specific loading, finding unused rules, and generating many different size of images, etc. |
| const gzipSavings = originalSize - gzipSize; | ||
| // allow a pass if we don't get 10% savings or less than 1400 bytes | ||
| if (gzipSize / originalSize > 0.9 || gzipSavings < IGNORE_THRESHOLD_IN_BYTES) { |
nit: want to throw the ratio into a const too? IGNORE_THRESHOLD_IN_PERCENT or something?
| } | ||
| return new Promise((resolve) => { | ||
| return zlib.gzip(content, (err, res) => { |
| const fakeNetworkAgent = { | ||
| enable() {} | ||
| enable() {}, | ||
| getResponseBody(requestId, onComplete) { |
kinda wonky that we have to do this opaquely through the resourceContent() instead of explicitly in the gatherer that we need it :/ @paulirish the decision was based on wanting to use NetworkRequest from devtools more aggressively right?
yah. i think we'd end up with a helper method in driver otherwise.. and it seems slightly more convenient to call a method on the record itself rather than driver.requestContent(networkRecord).then(..
so i'm supportive of mocking the real API here, yeah. :)
|
deleted stuff about double commits |
|
@patrickhulce @paulirish any luck you guys could give this a spin again? |
patrickhulce
left a comment
sorry I've been so lagging! everything a-ok on my end, but I'll defer to @paulirish for final 👍
| const IGNORE_THRESHOLD_IN_BYTES = 1400; | ||
| const IGNORE_THRESHOLD_IN_PERCENT = 0.9; | ||
| const TOTAL_WASTED_BYTES_THRESHOLD = 10 * 1024; // 10KB |
| /* eslint-env mocha */ | ||
| describe('Page uses optimized responses', () => { | ||
| it('fails when reponses are collectively unoptimized', () => { |
nit: s/reponses/responses/, same below :)
| return optimizedResponses.afterPass(options, createNetworkRequests(traceData)) | ||
| .then(artifact => { | ||
| assert.equal(artifact.length, 2); | ||
| checkSizes(artifact[0], 6, 26); |
nit: since this is only used twice and not in a bunch of functions i'd have a preference to switch to straight up asserts so it's a little clearer what's happening. two fewer lines too 🎉
| ], | ||
| }); | ||
| assert.equal(auditResult.passes, false); |
nit: assert a length on the extended info to make sure the ones you want filtered are filtered?
paulirish
left a comment
pretty sure we're good to go after this round.
| // allow a pass if we don't get 10% savings or less than 1400 bytes | ||
| if ( | ||
| gzipSize / originalSize > IGNORE_THRESHOLD_IN_PERCENT || |
little odd to flip the > on these two lines. can we keep it consistent? (probably should change the 0.9 above)
Is this ok?
IGNORE_THRESHOLD_IN_PERCENT = 0.1
1 - gzipSize / originalSize < IGNORE_THRESHOLD_IN_PERCENT
| const uncompressedResponses = artifacts.ResponseCompression; | ||
| let totalWastedBytes = 0; | ||
| const results = uncompressedResponses.reduce((results, record) => { |
i know i mentioned it before but this reduce() really kills me. let's forEach.
all patrick's fault. :)
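The forEach shape the reviewer asks for might look like this (the record fields are illustrative, based on the resourceSize/gzipSize values used elsewhere in the PR):

```javascript
// Accumulate with forEach and outer variables instead of a reduce.
function summarize(uncompressedResponses) {
  let totalWastedBytes = 0;
  const results = [];
  uncompressedResponses.forEach(record => {
    const wastedBytes = record.resourceSize - record.gzipSize;
    totalWastedBytes += wastedBytes;
    results.push({url: record.url, wastedBytes});
  });
  return {totalWastedBytes, results};
}
```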
| const gzipSize = record.gzipSize; | ||
| const gzipSavings = originalSize - gzipSize; | ||
| // allow a pass if we don't get 10% savings or less than 1400 bytes |
lets be more clear here
we require at least 10% savings off the original size AND at least 1400 bytes
if the savings is smaller than either, we don't care
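Spelled out, that condition could read as follows (the constants match the quoted code; the function name is illustrative):

```javascript
const IGNORE_THRESHOLD_IN_BYTES = 1400;
const IGNORE_THRESHOLD_IN_PERCENT = 0.1;

// Ignore a record when gzip would save less than 10% of the original size
// OR fewer than 1400 bytes; tiny files can even grow when gzipped.
function shouldIgnoreRecord(originalSize, gzipSize) {
  const gzipSavings = originalSize - gzipSize;
  const savingsRatio = 1 - gzipSize / originalSize;
  return savingsRatio < IGNORE_THRESHOLD_IN_PERCENT ||
      gzipSavings < IGNORE_THRESHOLD_IN_BYTES;
}
```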
| .ignore('url') | ||
| .ignore('debug/node'); | ||
| .ignore('debug/node') | ||
| .ignore('pako/lib/zlib/inflate.js'); |
can you also ignore pako/lib/inflate.js ?
fair. I was just looking at the source tree of pako. I guess the require that gets done in browserify is selective already.
:)
- Moved to wastedBytes approach and wastedTime
|
CLAs look good, thanks! |






Still need to add some unit tests but you can have a look if this is something we need.