report: truncate long attribute values in HTML snippets by Beytoven · Pull Request #10984 · GoogleChrome/lighthouse

Beytoven · 2020-06-17T22:02:56Z

Addresses #10717.

Before:

After:

paulirish · 2020-06-17T23:36:13Z

    });
+    for (const attributeName of clone.getAttributeNames()) {
+      let attributeValue = clone.getAttribute(attributeName);
+      if (attributeValue.length > 100) {


i'd pull this 100 up to a const towards the top of the method

paulirish · 2020-06-17T23:38:34Z

+    for (const attributeName of clone.getAttributeNames()) {
+      let attributeValue = clone.getAttribute(attributeName);
+      if (attributeValue.length > 100) {
+        attributeValue = attributeValue.slice(0, 97) + '...';


protip: option-; on mac is … which is a single character ellipsis.
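For illustration, the two review suggestions above (a named constant and a single-character ellipsis) might combine roughly as follows. This is only a sketch: `ATTRIBUTE_CHAR_LIMIT` and `truncateValue` are hypothetical names that do not come from the PR, and the function works on plain strings rather than on the cloned element.

```javascript
// Hypothetical constant name; the value 100 mirrors the limit in the diff above.
const ATTRIBUTE_CHAR_LIMIT = 100;

/**
 * Truncate one attribute value, ending it with a single-character ellipsis.
 * @param {string} value
 * @param {number} [limit]
 * @return {string}
 */
function truncateValue(value, limit = ATTRIBUTE_CHAR_LIMIT) {
  if (value.length <= limit) return value;
  // Reserve one character for '…' so the result is exactly `limit` characters long.
  return value.slice(0, limit - 1) + '…';
}
```

Because `…` is one character, the slice keeps `limit - 1` characters instead of the `limit - 3` needed for `'...'`.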

brendankenny · 2020-06-19T20:30:07Z

before/after screenshots? :)

brendankenny · 2020-06-19T20:32:35Z

> before/after screenshots? :)

oh, I see, not limiting which attributes, just making sure none of them are ridiculously long. I guess that's also not that hard to imagine :)

connorjclark · 2020-06-22T19:33:06Z

This is technically a breaking change.

Any way we could truncate in just the report but leave snippet untouched?

Beytoven · 2020-06-22T20:09:21Z

> This is technically a breaking change.
>
> Any way we could truncate in just the report but leave snippet untouched?

I could look into making this change in the snippet renderer instead. I think eventually we'd want to do it in page-functions for the next major release.

connorjclark · 2020-06-23T00:32:38Z

Spoke offline: nah not worth it.

connorjclark

Looks good to me. Holding approval until I see a before/after (cnn.com has nice big ones).

connorjclark · 2020-06-23T22:52:16Z

Maybe we should also have a maxTotalAttributeLength (random pick: 600) and once we hit it we elide more/all of the value?

Beytoven · 2020-06-23T22:58:09Z

> Maybe we should also have a maxTotalAttributeLength (random pick: 600) and once we hit it we elide more/all of the value?

That would be nice, but then we run the risk of more important identifiers being cut off because they fall behind other super long data attributes. I think the reason we opted for truncating per attribute was so that wouldn't happen. An alternative is to omit data-* attributes. They tend to be long, and I don't know if they're actually useful for element identification.

connorjclark · 2020-06-23T23:20:57Z

spoke offline:

let's count the length of the attribute name + value as we go, and once it hits ~500-600, we stop and just elide the rest of the node with …
we considered making sure we process class and id (most important for identification) first, but we figure that those attributes are typically already first in the markup, so we can just rely on the order of attributes correlating to importance.
As a follow-up PR, @Beytoven wants to explore making the node detail type expandable in the report, such that we could show the entire HTML, unelided.

paulirish · 2020-07-01T01:29:32Z

> let's count the length of the attribute name + value as we go, and once it hits ~500-600, we stop and just elide the rest of the node with …

do we need to? since this adds so much more complexity i'd rather defer until we come across a real example that necessitates it.

connorjclark · 2020-07-01T02:32:52Z

> > let's count the length of the attribute name + value as we go, and once it hits ~500-600, we stop and just elide the rest of the node with …
>
> do we need to? since this adds so much more complexity i'd rather defer until we come across a real example that necessitates it.

Can you elaborate? the example in the OP is exactly why this is necessary. Just having a cap on the size of individual attributes does not prevent a lengthy node snippet when there are many attributes.
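To make the point concrete, here is a rough calculation with invented numbers. The cap of 75 characters, the attribute names, and the attribute count are all assumptions chosen for illustration; none of them are measurements from a real page.

```javascript
// Hypothetical per-attribute cap and made-up attributes, for illustration only.
const PER_ATTRIBUTE_CAP = 75;

// Twenty attributes, each with a 300-character value.
const attributes = Array.from({length: 20}, (_, i) => [`data-attr-${i}`, 'x'.repeat(300)]);

// Cap every value individually, as the per-attribute approach does.
const capped = attributes.map(([name, value]) => [name, value.slice(0, PER_ATTRIBUTE_CAP)]);

// Serialize the opening tag the way a snippet would show it.
const snippet = `<div ${capped.map(([name, value]) => `${name}="${value}"`).join(' ')}>`;

// The values alone account for 20 * 75 = 1500 characters, before names and quotes.
console.log(snippet.length);
```

Every value respects the cap, yet the opening tag is still well over 1500 characters, which is the case a total limit is meant to handle.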

Beytoven · 2020-07-14T18:33:33Z

So the approach I ended up taking here is to establish an "attribute character budget" of sorts. We simply keep count of the number of characters used for attribute names/values and once we've exceeded that budget we remove the remaining attributes. I'm seeing 50-60% reductions in snippet length for some of the longer snippets (1200+ characters).
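A minimal sketch of how such an attribute character budget could work, assuming attributes arrive as name/value pairs in markup order. `applyAttributeBudget` and `renderOpeningTag` are hypothetical helpers, and the limit of 500 is an assumed value picked from the ~500-600 range mentioned earlier in the thread; the merged code operates on a cloned DOM element and may differ in its details.

```javascript
// Assumed value; the thread only mentions a range of roughly 500-600.
const SNIPPET_CHAR_LIMIT = 500;

/**
 * Keep attributes in order until the character budget has been exceeded.
 * @param {Array<[string, string]>} attributes name/value pairs in markup order
 * @param {number} limit
 * @return {{kept: Array<[string, string]>, dropped: number}}
 */
function applyAttributeBudget(attributes, limit) {
  const kept = [];
  let charCount = 0;
  for (const [name, value] of attributes) {
    // Once the budget is exceeded, drop this and every remaining attribute.
    if (charCount > limit) break;
    kept.push([name, value]);
    charCount += name.length + value.length;
  }
  return {kept, dropped: attributes.length - kept.length};
}

/**
 * Serialize an opening tag, marking removed attributes with an ellipsis.
 * @param {string} tagName
 * @param {Array<[string, string]>} attributes
 * @param {number} [limit]
 * @return {string}
 */
function renderOpeningTag(tagName, attributes, limit = SNIPPET_CHAR_LIMIT) {
  const {kept, dropped} = applyAttributeBudget(attributes, limit);
  const attrText = kept.map(([name, value]) => ` ${name}="${value}"`).join('');
  return `<${tagName}${attrText}${dropped > 0 ? ' …' : ''}>`;
}
```

Relying on markup order means `id` and `class` usually survive because they tend to come first, which matches the reasoning given earlier in the thread.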

connorjclark · 2020-07-14T19:30:01Z

    const reOpeningTag = /^[\s\S]*?>/;
    const match = clone.outerHTML.match(reOpeningTag);
+    if (match && match[0] && charCount > SNIPPET_CHAR_LIMIT) {
+      return match[0].slice(0, match[0].length - 1) + '…>';


can you add some tests for this part?

Beytoven · 2020-07-14

yep, currently working on it

connorjclark · 2020-07-14T20:54:41Z

+    }
+
    const reOpeningTag = /^[\s\S]*?>/;
    const match = clone.outerHTML.match(reOpeningTag);


maybe replace with

    const [match] = clone.outerHTML.match(reOpeningTag) || [];

and then drop all the match && match[0] for just match?
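A sketch of how the simplified check might read after this suggestion. It is written as a standalone function on a string so it can run outside the browser; `getOpeningTag`, the limit value, and the fallback return are assumptions, not the merged code.

```javascript
// Assumed value for illustration.
const SNIPPET_CHAR_LIMIT = 500;

/**
 * @param {string} outerHTML serialized element
 * @param {number} charCount characters consumed by attribute names and values
 * @return {string}
 */
function getOpeningTag(outerHTML, charCount) {
  const reOpeningTag = /^[\s\S]*?>/;
  // String.prototype.match returns null when nothing matches;
  // `|| []` keeps the destructuring safe and leaves `match` undefined.
  const [match] = outerHTML.match(reOpeningTag) || [];
  if (match && charCount > SNIPPET_CHAR_LIMIT) {
    // Swap the closing '>' for '…>' to signal that attributes were removed.
    return match.slice(0, match.length - 1) + '…>';
  }
  // Assumed fallback: return the opening tag, or the input if none was found.
  return match || outerHTML;
}
```

With the `|| []` fallback, a single truthiness check on `match` covers both the "no match" case and the "empty match" case that `match && match[0]` guarded against.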

connorjclark

some nits, but marking my approval anyway

Truncate long attribute values in html snippets

6fe98c2

Beytoven requested a review from a team as a code owner June 17, 2020 22:02

Beytoven requested review from connorjclark and removed request for a team June 17, 2020 22:02

googlebot added the cla: yes label Jun 17, 2020

devtools-bot assigned connorjclark Jun 17, 2020

devtools-bot added the waiting4reviewer label Jun 17, 2020

paulirish reviewed Jun 17, 2020

Move char limit to constant

cf0fe3d

vercel Bot deployed to Preview June 22, 2020 18:23 View deployment

Lint fix

6f762f6

vercel Bot deployed to Preview June 23, 2020 20:14 View deployment

connorjclark reviewed Jun 23, 2020

Shortening to 75

f046747

vercel Bot deployed to Preview June 23, 2020 22:17 View deployment

Adding total snippet length limit

16dc481

connorjclark added waiting4committer and removed waiting4reviewer labels Jul 7, 2020

vercel Bot deployed to Preview July 14, 2020 18:24 View deployment

connorjclark reviewed Jul 14, 2020

Beytoven added 2 commits July 14, 2020 13:31

Add test for total attribute length limit

163a1d4

Merge branch 'master' into shorten-html-snippet

14cd9cd

vercel Bot deployed to Preview July 14, 2020 20:32 View deployment

Beytoven added waiting4reviewer and removed waiting4committer labels Jul 14, 2020

connorjclark reviewed Jul 14, 2020

Comment thread lighthouse-core/lib/page-functions.js Outdated

connorjclark approved these changes Jul 14, 2020

Update lighthouse-core/lib/page-functions.js

1c031d9

Co-authored-by: Connor Clark <cjamcl@google.com>

vercel Bot deployed to Preview July 14, 2020 21:04 View deployment

Nit fix

e0abcc9

vercel Bot deployed to Preview July 14, 2020 21:40 View deployment

connorjclark changed the title ~~core(page-functions): truncate long attribute values in HTML snippets~~ report: truncate long attribute values in HTML snippets Jul 14, 2020

Beytoven merged commit bce9930 into master Jul 15, 2020

Beytoven deleted the shorten-html-snippet branch July 15, 2020 02:13

patrickhulce mentioned this pull request Jul 17, 2020

new_audit: add large-javascript-libraries audit #11096

Merged

connorjclark mentioned this pull request Aug 4, 2020

Reduce the length of HTML snippets #10717

Closed

joshmcarthur mentioned this pull request Aug 10, 2020

Long attribute value truncation (PR #10984) fires off requests to the truncated URLs #11244

Closed

csabapalfi mentioned this pull request Oct 1, 2020

core: prevent attribute truncation side-effects #11503

Merged
