Skip to content

report: truncate long attribute values in HTML snippets#10984

Merged
Beytoven merged 9 commits into
masterfrom
shorten-html-snippet
Jul 15, 2020
Merged

report: truncate long attribute values in HTML snippets#10984
Beytoven merged 9 commits into
masterfrom
shorten-html-snippet

Conversation

@Beytoven

@Beytoven Beytoven commented Jun 17, 2020

Copy link
Copy Markdown
Contributor

Addresses #10717.

Before:
Screen Shot 2020-07-14 at 10 49 09 AM

After:
Screen Shot 2020-07-14 at 10 41 18 AM

@Beytoven Beytoven requested a review from a team as a code owner June 17, 2020 22:02
@Beytoven Beytoven requested review from connorjclark and removed request for a team June 17, 2020 22:02
Comment thread lighthouse-core/lib/page-functions.js Outdated
});
for (const attributeName of clone.getAttributeNames()) {
let attributeValue = clone.getAttribute(attributeName);
if (attributeValue.length > 100) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd pull this 100 up to a const towards the top of the method

Comment thread lighthouse-core/lib/page-functions.js Outdated
for (const attributeName of clone.getAttributeNames()) {
let attributeValue = clone.getAttribute(attributeName);
if (attributeValue.length > 100) {
attributeValue = attributeValue.slice(0, 97) + '...';

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

protip: option-; on mac is which is a single character ellipsis.

@brendankenny

Copy link
Copy Markdown
Contributor

before/after screenshots? :)

@brendankenny

Copy link
Copy Markdown
Contributor

before/after screenshots? :)

oh, I see, not limiting which attributes, just making sure none of them are ridiculously long. I guess that's also not that hard to imagine :)

@connorjclark

Copy link
Copy Markdown
Collaborator

This is technically a breaking change.

Any way we could truncate in just the report but leave snippet untouched?

@Beytoven

Copy link
Copy Markdown
Contributor Author

This is technically a breaking change.

Any way we could truncate in just the report but leave snippet untouched?

I could look into making this change in the snippet renderer instead. I think eventually we'd want to do it in page-functions for the next major release.

@connorjclark

connorjclark commented Jun 23, 2020 via email

Copy link
Copy Markdown
Collaborator

@connorjclark connorjclark left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. pending approval on seeing a before/after (cnn.com has nice big ones).

@connorjclark

Copy link
Copy Markdown
Collaborator

Maybe we should also have a maxTotalAttributeLength (random pick: 600) and once we hit it we elide more/all of the value?

@Beytoven

Copy link
Copy Markdown
Contributor Author

Maybe we should also have a maxTotalAttributeLength (random pick: 600) and once we hit it we elide more/all of the value?

That would be nice, but then we run the risk of more important identifiers being cut off because they fall behind the other super long data attributes. I think the reason we opted for truncating per attribute was so that wouldn't happen. An alternative is that we could omit data-* attributes. They tend to be long and I don't know if they're actually useful in regards to element identification.

@connorjclark

Copy link
Copy Markdown
Collaborator

spoke offline:

  • let's count the length of the attribute name + value as we go, and once it hit's ~500-600, we stop and just elide the rest of the node with
  • considered making sure we process class and id (most important for identification) first, but we figure that those attributes are typically already first in the markup, so we can just rely on the order of attributes correlating to importance.
  • As a follow up PR, @Beytoven wants to explore making the node detail type expandable in the report, such that we could show the entire HTML, unelided.

@paulirish

Copy link
Copy Markdown
Member
  • let's count the length of the attribute name + value as we go, and once it hit's ~500-600, we stop and just elide the rest of the node with

do we need to? since this adds so much more complexity i'd rather defer until we across a real example that necessitates it.

@connorjclark

Copy link
Copy Markdown
Collaborator
  • let's count the length of the attribute name + value as we go, and once it hit's ~500-600, we stop and just elide the rest of the node with

do we need to? since this adds so much more complexity i'd rather defer until we across a real example that necessitates it.

Can you elaborate? the example in the OP is exactly why this is necessary. Just having a cap on the size of individual attributes does not prevent a lengthy node snippet when there are many attributes.

@Beytoven

Copy link
Copy Markdown
Contributor Author

So the approach I ended up taking here is to establish an "attribute character budget" of sorts. We simply keep count of the number of characters used for attribute names/values and once we've exceeded that budget we remove the remaining attributes. I'm seeing 50-60% reductions in snippet length for some of the longer snippets (1200+ characters).

Comment thread lighthouse-core/lib/page-functions.js Outdated
const reOpeningTag = /^[\s\S]*?>/;
const match = clone.outerHTML.match(reOpeningTag);
if (match && match[0] && charCount > SNIPPET_CHAR_LIMIT) {
return match[0].slice(0, match[0].length - 1) + '…>';

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some tests for this part?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, currently working on it

Comment thread lighthouse-core/lib/page-functions.js Outdated
}

const reOpeningTag = /^[\s\S]*?>/;
const match = clone.outerHTML.match(reOpeningTag);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe replace with

const [match] = clone.outerHTML.match(reOpeningTag) || [];

and then drop all the match && match[0] for just match ?

Comment thread lighthouse-core/lib/page-functions.js Outdated

@connorjclark connorjclark left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits, but marking my approval anyway

Co-authored-by: Connor Clark <cjamcl@google.com>
@connorjclark connorjclark changed the title core(page-functions): truncate long attribute values in HTML snippets report: truncate long attribute values in HTML snippets Jul 14, 2020
@Beytoven Beytoven merged commit bce9930 into master Jul 15, 2020
@Beytoven Beytoven deleted the shorten-html-snippet branch July 15, 2020 02:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants