new_audit(legacy-javascript): detect unnecessary polyfills and transforms#6730
new_audit(legacy-javascript): detect unnecessary polyfills and transforms#6730connorjclark wants to merge 77 commits into
Conversation
|
DZL, do a barrel roll! |
|
DZL is done! Go check it out http://lh-dzl-6730.surge.sh |
There was a problem hiding this comment.
First off, I 😍 this!!
I'll say that I think we should take a slightly stronger position on polyfills. Even if users legitimately need polyfills for old IE, there's no reason they need to serve them to their Chrome users (which we know the version of the page we're looking at is the Chrome-one). I'd be happy marking this as a failure with language that points them on how to do the philwalton-style conditional serving.
Detecting modules that reimplement native JS features, but don't actually pollyfil anything. For example, bundling lodash/startsWith for a web target, and using that instead of String.prototype.startsWith. Bundlers tend to strip module names in production, so such detection likely requires source maps. Otherwise, we're left with analyzing what these modules look like minified. This could certainly be explored further in a separate feature.
I think this is a perfect candidate for the bundle analysis initiative, agreed it should wait for another time 👍
If a polyfill appears multiple times from multiple scripts, I display a counter in the report. I imagine this would only be actionable if the polyfills are coming from the same origin - if two third party scripts are pollyfilling the same thing, there's nothing really to be done. Should we suggest action only if multiple pollyfills are coming from the same origin? If so, what's a good way to present that suggestion?
I think polyfills from third party origins is already going to be a thorny/unsolvable issue for the user, so I don't think it terribly matters. Removing any unnecessary polyfill is a win, so I think showing each of them separately like you are is 👍
Is this useful at all? Should this be removed and only added back with source map support?
I think including it is fine, agreed it's not very useful, but it's better than nothing.
| // TODO this would only work if the bundle has not been stripped of module names (and replaced with an index into the bundled modules), | ||
| // which is unlikely in a production environment. Instead, look at source map? | ||
| // TODO how to detect if all of lodash is bundled (a waste)? | ||
| // TODO maybe just punt entirely for now? |
| let subpattern = ''; | ||
|
|
||
| // String.prototype.startsWith = | ||
| subpattern += `${object || ''}.${property}\\s*=`; |
There was a problem hiding this comment.
is that meant to be an optional dot literal? \\.?
| static async audit(artifacts) { | ||
| // TODO | ||
| // how do we determine which polys are not necessary? | ||
| // ex: only reason to polyfill Array.prototype.filter is if IE <9, |
There was a problem hiding this comment.
Our general position on this in the past has been that we know we're loading the page from Chrome, even if you need them for IE, you shouldn't be serving them to your Chrome users.
There was a problem hiding this comment.
That makes a lot of sense, I did not consider that. Very cool that many ES2015+ features can be feature-detected with <script> modules.
| const url = record.url; | ||
| if (content) { | ||
| scriptContentMap[record.requestId] = content; | ||
| scriptContentMap[record.requestId] = { |
There was a problem hiding this comment.
it was done this way originally because everything you want about the record (like URL) can be matched up by the requestId. url is a super common need, so I'm not terribly opposed to having it here, but I wouldn't want that list to grow to other things that can be plucked from the network record.
There was a problem hiding this comment.
Would I necessarily need to create a gather for this then? That's what I wanted to avoid.
There was a problem hiding this comment.
Nope, shouldn't need any gatherer changes at all, just use Scripts artifact as-is and get the URL from const url = networkRecords.find(record => record.requestId === requestId).url
will have to compute network records from devtools log, but not the end of the world
There was a problem hiding this comment.
Sweet - I reverted those changes.
| let result; | ||
| let row = 0; | ||
| let rowBeginsAtIndex = 0; | ||
| while ((result = re.exec(code)) !== null) { |
There was a problem hiding this comment.
I'm gonna at least need a comment here :)
the ...matches below made me think the results were already somehow all inside one regex match, but the .exec suggests we need to iterate through them all
the ..matches is just the subgroups of each match?
There was a problem hiding this comment.
yeah, and only one subgroup is ever defined in each matches. I agree this is arcane bordering on magic. I'll comment accordingly.
|
DZL is done! Go check it out http://lh-dzl-6730.surge.sh |
|
DZL is done! Go check it out http://lh-dzl-6730.surge.sh |
patrickhulce
left a comment
There was a problem hiding this comment.
not sure where this sits on the priority chain if it's still on your plate, but I'm definitely excited :)
| const i18n = require('../lib/i18n/i18n.js'); | ||
|
|
||
| const UIStrings = { | ||
| /** Imperative title of a Lighthouse audit that tells the user about all JavaScript polyfills loaded on the page. This is displayed in a list of audit titles that Lighthouse generates. */ |
There was a problem hiding this comment.
this one's not imperative :)
| title: 'Polyfills', | ||
| /** TODO: write this */ | ||
| // eslint-disable-next-line max-len | ||
| description: 'Polyfills enable older browsers to use new JavaScript language features. However, they aren\'t always necessary. Research what browsers you must support and consider removing polyfils for features that are well supported by them.', |
There was a problem hiding this comment.
Maybe our advice here is more like Consider conditionally serving polyfills based on feature availability.?
| // only reason to polyfill String.prototype.startsWith is if IE, ... | ||
| // Is there a standard way to declare "we support these browsers"? A meta tag? | ||
| /** @type {Poly[]} */ | ||
| const polys = [ |
There was a problem hiding this comment.
maybe we could pull some of this code out of the main audit function and break it down a bit?
| {key: 'url', itemType: 'url', text: 'URL'}, | ||
| {key: 'description', itemType: 'code', text: 'Description'}, | ||
| {key: 'location', itemType: 'code', text: 'Location'}, | ||
| // TODO include estimate for size based on https://www.npmjs.com/package/mdn-polyfills#supported-polyfills ? |
There was a problem hiding this comment.
this is interesting, we could then make it an opportunity instead of a diagnostic too!
There was a problem hiding this comment.
I think it might be hilariously small of an impact, though. The average in that table is 300 bytes. Say there's a dozen - about 4KB. is that significant?
There was a problem hiding this comment.
We'd probably have to get a little more aggressive then. Some of the babel polyfills are no joke, maybe async/await IIRC, but the impact was more like ~80KB for some bundles.
There was a problem hiding this comment.
Yeah in phil's article it was ~95K minified plus some extra parse/eval time savings
https://philipwalton.com/articles/deploying-es2015-code-in-production-today/
There was a problem hiding this comment.
But do we know how much of that is generator cruft (async -> es5 code), string templates (aka language level polyfills), vs. web api polyfills?
so far I'm just considering web api polyfills, which is much more easily identifiable than language polyfills. It'd be very powerful to capture those too though. any ideas?
There was a problem hiding this comment.
Yeah these are great points!
I wasn't trying to suggest we do it right away these other categories are what I was trying to get at with be more aggressive :) I'm afraid I don't really have any obvious initial ideas we can pursue. Maybe we'll have to take a look at some babelized bundles and look for distinctive markers of cruft, but we can cross that bridge another day for sure 👍
| {id: 'errors-in-console', weight: 1}, | ||
| {id: 'image-aspect-ratio', weight: 1}, | ||
| // Manual audits | ||
| {id: 'polyfills', weight: 0}, |
There was a problem hiding this comment.
IMO this belongs in performance as a diagnostic and can be upgraded to an opportunity once we have the size data :)
|
DZL is done! Go check it out http://lh-dzl-6730.surge.sh |
patrickhulce
left a comment
There was a problem hiding this comment.
I'd personally love to see this land as a diagnostic in performance :)
From my perspective I think we just need some tests and this is good to go in a 4.1
| url, | ||
| description: `${polyStatement}${countText}`, | ||
| // TODO use sourcemaps | ||
| location: `r: ${row}, c: ${col}`, |
There was a problem hiding this comment.
we can use line instead of row everywhere, right? or is there a subtle difference I'm missing
| * @param {string} code | ||
| * @return {PolyIssue[]} | ||
| */ | ||
| static detectPolys(polys, code) { |
There was a problem hiding this comment.
nit: but polyfills is only 4 more characters and would make me feel warm and fuzzy looking at all the code :)
|
DZL is done! Go check it out http://lh-dzl-6730.surge.sh |
yeah it seems like it's getting credit for all XHR callbacks, probably polyfilling fetch or something |
|
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) ℹ️ Googlers: Go here for more info. |
|
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
patrickhulce
left a comment
There was a problem hiding this comment.
great stuff! mostly comment comments from me, feel free to tweak or disagree :)
| // 42 are from babel-polyfill | ||
| // 1 from fixtures/legacy-javascript.js | ||
| // 1 inline from fixtures/legacy-javascript.html | ||
| length: 44, |
There was a problem hiding this comment.
man, expected: 44, received: 43 is going to be a fun one to debug huh :)
is it feasible to enumerate somehow?
| let lineBeginsAtIndex = 0; | ||
| // Each pattern maps to one subgroup in the generated regex. For each iteration of RegExp.exec, | ||
| // only one subgroup will be defined. Exec until no more matches. | ||
| this.re.lastIndex = 0; |
There was a problem hiding this comment.
is this resetting for the loop if .match is called more than once? I was a bit confused by the comment immediately above it, could move a line out of the way perhaps? :)
| tableRows.push({ | ||
| url, | ||
| description, | ||
| location: `Ln: ${line}, Col: ${column}`, |
There was a problem hiding this comment.
source-location candidate? :)
There was a problem hiding this comment.
that's the dream
Co-Authored-By: Patrick Hulce <patrick.hulce@gmail.com>
Codecov Report
@@ Coverage Diff @@
## master #6730 +/- ##
==========================================
+ Coverage 91.67% 91.73% +0.05%
==========================================
Files 297 298 +1
Lines 10192 10296 +104
==========================================
+ Hits 9344 9445 +101
- Misses 848 851 +3
Continue to review full report at Codecov.
|
Co-Authored-By: Patrick Hulce <patrick.hulce@gmail.com>
|
CLAs look good, thanks! ℹ️ Googlers: Go here for more info. |
|
Because GitHub's way of eliding comments is so awful, here are the last significant comments I made summarizing things: #6730 (comment) #6730 (comment) And the spreadsheet listing all the babel plugins that should not be present in a |
|
Too many comments here, so I am closing this PR. Will reopen after making the updates discussed offline. |



EDIT 2: Design doc
EDIT: jump to here for updates.
I imagine this will go through a few iterations.
@paulirish had the idea to surface what polyfills are in use, and suggest removing them. It sounded like fun so I took pass at it.
First, let's talk detecting polyfills. I thought about a few ways to do it, but just two were viable. We could parse out an AST to extract where native functions were being redefined. But, that's likely to be slow, and would add quite a large dependency. Instead, I opted for somewhat-simple regex parsing. The grammar for setting a property (either
=orObject.defineProperty) is fairly simple to grok for, and ought to be much faster. So I went with regex.Second, let's talk what suggestions to make. I'm not sure we can concretely suggest removing any polyfills. Developers often must support an audience that uses old browsers, so suggesting removing polyfills for features present in the newest browsers would not be found agreeable. Even if a feature is found in browsers going back to IE8, it can still be the wrong call for some sites. So I suggest not attaching any score to this audit, and keeping the language non-prescriptive. Instead, we can just surface what polyfills exist and suggest that they consider researching what they actually need.
Some things worth talking about:
Detecting modules that re-implement native JS features, but don't actually polyfill anything. For example, bundling
lodash/startsWithfor a web target, and using that instead ofString.prototype.startsWith. Bundlers tend to strip module names in production, so such detection likely requires source maps. Otherwise, we're left with analyzing what these modules look like minified. This could certainly be explored further in a separate feature.If a polyfill appears multiple times from multiple scripts, I display a counter in the report. I imagine this would only be actionable if the polyfills are coming from the same origin - if two third party scripts are polyfilling the same thing, there's nothing really to be done. Should we suggest action only if multiple polyfills are coming from the same origin? If so, what's a good way to present that suggestion?
The location in the script that the polyfill occurs is provided in the report. Usually, this script is minified, so really it's just an index into an unreadable buffer... Is this useful at all? Should this be removed and only added back with source map support?
I grabbed a list of features that can be polyfilled from here. Certainly not exhaustive. From where should we source this list?
Related: #3862