core: gather source maps #9101

Merged
brendankenny merged 90 commits into master from gather-source-maps
Jul 25, 2019
Conversation

@connorjclark

Collaborator

#9097

Gather source maps.

Nothing done with them yet.

@connorjclark

connorjclark commented Jun 1, 2019

Collaborator Author

WIP b/c no tests, and:

I want to put source maps in a new artifact, because gathering them is a lot of network requests and parsing that should only be done if an audit explicitly needs it - which, right now, does not exist.

It could be in a new gatherer, but I'd like to use the Debugger.onScriptParsed to collect the sourceMapURL (instead of parsing it ourselves). So we can add sourceMapURL to ScriptElements, and make a computed artifact, cool. But we don't have the driver in computed artifacts (afaict), so can't make fetch requests in the page to collect the source maps.

Is there a nice way to pass the driver to a computed artifact? Or is it better to forgo Debugger.onScriptParsed and just parse for sourceMapURL?
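
For comparison, the "parse it ourselves" alternative could look roughly like the sketch below. It is a hypothetical helper, not code from this PR, and it only handles line-style magic comments. It would miss a map advertised through a response header and could be fooled by a matching line inside a string literal, which is part of why the protocol's `Debugger.scriptParsed` event (written as `Debugger.onScriptParsed` above) is attractive.

```javascript
/**
 * Returns the last `sourceMappingURL` magic comment found in a script, if any.
 * Hypothetical helper for illustration; not taken from the PR.
 * @param {string} scriptContent
 * @return {string|undefined}
 */
function findSourceMappingUrl(scriptContent) {
  // Matches both the current `//#` form and the legacy `//@` form.
  const pattern = /^\s*\/\/[#@]\s*sourceMappingURL=(\S+)\s*$/gm;
  let lastMatch;
  for (const match of scriptContent.matchAll(pattern)) {
    // Keep overwriting so the final comment in the file wins.
    lastMatch = match[1];
  }
  return lastMatch;
}
```

For a script ending in `//# sourceMappingURL=some-map.js.map`, this returns the unresolved string `some-map.js.map`; resolving it against the script URL is a separate step.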

@patrickhulce

Collaborator

> Is there a nice way to pass the driver to a computed artifact?

Sadly, no. Computed artifacts are firmly in audit phase territory and there's no way to have dependencies between artifacts unless one is a BaseArtifact

> Or is it better to forgo Debugger.onScriptParsed and just parse for sourceMapURL?

I'm not sure I understand the decision here, since it sounds like if you needed a separate gatherer either way you could have the separate one listen on Debugger.onScriptParsed for now? At some point in the future it might make sense to merge these two if the set of audits that need them overlap significantly enough.

@connorjclark

Collaborator Author

> I'm not sure I understand the decision here, since it sounds like if you needed a separate gatherer either way you could have the separate one listen on Debugger.onScriptParsed for now?

Failed to realize it's not required to do that in the script-elements. yeah separate gatherer makes 100% sense now.
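
A separate gatherer listening for parsed scripts might be sketched as follows. This is a minimal illustration, not the PR's implementation: the class name is made up, and the `driver` object is assumed to expose `on`, `off`, and `sendCommand`. `Debugger.enable`, `Debugger.disable`, and the `Debugger.scriptParsed` event are the Chrome DevTools Protocol names.

```javascript
/**
 * Collects `Debugger.scriptParsed` events that advertise a source map.
 * Hypothetical sketch: `driver` is assumed to expose `on`, `off` and `sendCommand`.
 */
class ScriptParsedCollector {
  constructor() {
    /** @type {Array<{url: string, sourceMapURL: string}>} */
    this.events = [];
    // Bind once so the same function reference can be removed later.
    this.onScriptParsed = this.onScriptParsed.bind(this);
  }

  /** @param {{url: string, sourceMapURL?: string}} event */
  onScriptParsed(event) {
    // Scripts without a source map report an empty or missing sourceMapURL.
    if (event.sourceMapURL) {
      this.events.push({url: event.url, sourceMapURL: event.sourceMapURL});
    }
  }

  async start(driver) {
    // Register the listener before enabling the domain so no event is missed.
    driver.on('Debugger.scriptParsed', this.onScriptParsed);
    await driver.sendCommand('Debugger.enable');
  }

  async stop(driver) {
    await driver.sendCommand('Debugger.disable');
    driver.off('Debugger.scriptParsed', this.onScriptParsed);
    return this.events;
  }
}
```

Keeping the collection in its own gatherer means the extra network requests only happen when a config actually asks for that artifact.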

const a = 1;
// Use this obnoxious URL because using a non-existent url from localhost will return a
// bunch of HTML, which will be parsed, and no fetch error occurs.
//# sourceMappingURL=http://www.this-will-not-exist-blah-go-rockets.com/some-map.js.map

Collaborator Author

i guess i could do http://localhost:1234 ...

Collaborator

> go-rockets

😱

haha 😉 to your actual point though, :1234 is the default for parcel which I tend to have running so that might cause something unexpected too ;) haha

how about an entirely invalid URL? or is that then testing something different you're not concerned about?

Collaborator Author

i suppose anything that results in a fetch error will work, including a malformed URL.
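
A malformed URL can even be rejected before any request is made. The sketch below is a hypothetical helper (its name and behavior are assumptions, not the PR's code) that relies only on the standard WHATWG `URL` constructor, which throws when the input cannot be parsed.

```javascript
/**
 * Resolves `url` against `base`, returning undefined when it cannot be parsed.
 * Hypothetical helper for illustration.
 * @param {string} url
 * @param {string} base
 * @return {string|undefined}
 */
function resolveUrl(url, base) {
  try {
    return new URL(url, base).href;
  } catch (err) {
    // The URL constructor throws a TypeError for unparseable input.
    return undefined;
  }
}

const scriptUrl = 'http://localhost:10200/source-maps/script.js';
console.log(resolveUrl('some-map.js.map', scriptUrl));
console.log(resolveUrl('http://', scriptUrl)); // undefined: a URL with no host is invalid
```

A well-formed URL that points nowhere still has to fail at fetch time, so a test fixture covering the fetch-error path needs one of those rather than an unparseable string.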

Collaborator Author

> go-rockets

Did that just to get a rise out of you ;)

@connorjclark connorjclark changed the title from "Gather source maps" to "core: gather source maps" on Jun 1, 2019
@codecov

codecov Bot commented Jun 1, 2019


Codecov Report

Merging #9101 into master will decrease coverage by 0.01%.
The diff coverage is 94.87%.

@@            Coverage Diff             @@
##           master    #9101      +/-   ##
==========================================
- Coverage   91.44%   91.43%   -0.02%     
==========================================
  Files         291      292       +1     
  Lines        9929     9981      +52     
==========================================
+ Hits         9080     9126      +46     
- Misses        849      855       +6
| Flag | Coverage Δ |
|---|---|
| #smoke | 84.43% <94.87%> (-0.33%) ⬇️ |
| #unit | 89.24% <15.38%> (-0.29%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| lighthouse-core/config/default-config.js | 87.5% <ø> (ø) ⬆️ |
| lighthouse-core/gather/gatherers/source-maps.js | 94.87% <94.87%> (ø) |
| lighthouse-core/audits/user-timings.js | 96% <0%> (-4%) ⬇️ |
| ...house-core/computed/metrics/lantern-speed-index.js | 97.14% <0%> (-2.86%) ⬇️ |
| lighthouse-core/gather/gather-runner.js | 99.05% <0%> (-0.95%) ⬇️ |
| lighthouse-core/lib/manifest-parser.js | 90.55% <0%> (+1.07%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0eacf06...5da78cf.

sourceMapUrl: isSourceMapADataUri ? undefined : sourceMapUrl,
// map is undefined, unless there wasn't an error.
map: undefined,
...sourceMapOrError,

Collaborator Author

idk is this a sin?

Contributor

it's definitely approaching "hard to follow" :)

Something like this might be a little easier to follow, possibly:

/**
 * @param {Driver} driver
 * @param {LH.Crdp.Debugger.ScriptParsedEvent} event
 * @return {Promise<LH.Artifacts.SourceMap>}
 */
async _retrieveMapFromScriptParsedEvent(driver, event) {
  if (!event.sourceMapURL) {
    throw new Error('precondition failed: event.sourceMapURL should exist');
  }

  // `sourceMapURL` is simply the URL found in either a magic comment or an x-sourcemap header.
  // It has not been resolved to a base url.
  const isSourceMapADataUri = event.sourceMapURL.startsWith('data:');
  const scriptUrl = event.url;
  const rawSourceMapUrl = isSourceMapADataUri ?
      event.sourceMapURL :
      this._resolveUrl(event.sourceMapURL, scriptUrl);

  if (!rawSourceMapUrl) {
    return {
      scriptUrl,
      errorMessage: `Could not resolve map url: ${event.sourceMapURL}`,
    };
  }

  // sourceMapUrl isn't included in the artifact if it was a data URL.
  const sourceMapUrl = isSourceMapADataUri ? undefined : rawSourceMapUrl;

  try {
    const map = isSourceMapADataUri ?
        this.parseSourceMapFromDataUrl(rawSourceMapUrl) :
        await this.fetchSourceMapInPage(driver, rawSourceMapUrl);

    return {
      scriptUrl,
      sourceMapUrl,
      map,
    };
  } catch (err) {
    return {
      scriptUrl,
      sourceMapUrl,
      errorMessage: err.toString(),
    };
  }
}

Collaborator Author

nice, done
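
The suggestion above calls a `parseSourceMapFromDataUrl` helper without showing it. One possible shape is sketched below, assuming the payload is JSON that is either base64-encoded or percent-encoded. This is an illustration only: the PR's actual helper may differ, and `Buffer` ties the sketch to Node.

```javascript
/**
 * Parses a source map embedded in a `data:` URL.
 * Hypothetical sketch; the PR's helper of the same name may differ.
 * @param {string} dataUrl
 * @return {object} the parsed source map JSON
 */
function parseSourceMapFromDataUrl(dataUrl) {
  const commaIndex = dataUrl.indexOf(',');
  if (!dataUrl.startsWith('data:') || commaIndex === -1) {
    throw new Error('Not a valid data URL');
  }

  // Everything between `data:` and the first comma is the media type plus options.
  const options = dataUrl.slice('data:'.length, commaIndex).split(';');
  const payload = dataUrl.slice(commaIndex + 1);

  const json = options.includes('base64') ?
      Buffer.from(payload, 'base64').toString('utf8') :
      decodeURIComponent(payload);

  // JSON.parse throws on malformed input; a caller can turn that into an errorMessage.
  return JSON.parse(json);
}
```

Because the function throws on bad input, it fits the `try`/`catch` in the suggested method, where any failure becomes an `errorMessage` entry instead of aborting the gatherer.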

@googlebot


A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

@connorjclark

connorjclark commented Jul 24, 2019

Collaborator Author

Considerable work needs to be done to support fetching arbitrary resources outside the page context for each of our environments. We have:

  1. node: easy - the http module.
  2. devtools frontend: also easy - DevTools has a host API for this.
  3. lightrider: not so easy.

Options moving forward:

a) land as-is, some maps can't be fetched due to CORS. For the audit that will come after, we can ignore fetch errors.
b) hold off landing completely until this is well supported.

I vote for a) to keep the PR smaller.

@googlebot


We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@connorjclark

Collaborator Author

alternatively, could do a hybrid: support node fully using http module and fallback to in-page fetch in other environments (or just disable).

@brendankenny

Contributor

> a) land as-is, some maps can't be fetched due to CORS. For the audit that will come after, we can ignore fetch errors.
> b) hold off landing completely until this is well supported.
>
> I vote for a) to keep the PR smaller.

I definitely don't think we should block on that, at worst it's only as bad as if they hadn't published their source maps

@brendankenny brendankenny left a comment

Contributor

looking good, I think just these last things (+ axing the logging)

}

const value = fetchError ?
Object.assign(new Error(), {message: fetchError, __failedInBrowser: true}) :

Contributor

ah, it's because of

return Promise.reject(Object.assign(new Error(), value));
(since fetchError.message isn't enumerable when message is set via the Error constructor, so it isn't copied over on that line)

fetchError could be a plain object like is done in the real pageFunctions.wrapRuntimeEvalErrorInBrowserString, but this seems good enough.
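
The enumerability difference described here can be reproduced in isolation. The snippet below is a standalone illustration with made-up variable names, not code from the PR or its tests.

```javascript
// `message` set by the Error constructor is an own, non-enumerable property.
const viaConstructor = new Error('fetch failed');

// `message` set through Object.assign is created as an ordinary enumerable property.
const viaAssign = Object.assign(new Error(), {message: 'fetch failed', __failedInBrowser: true});

// Object.assign copies only own enumerable properties from its sources.
const copyOfConstructor = Object.assign(new Error(), viaConstructor);
const copyOfAssign = Object.assign(new Error(), viaAssign);

console.log(copyOfConstructor.message === ''); // true: the message was not copied
console.log(copyOfAssign.message); // fetch failed
console.log(Object.keys(viaAssign)); // [ 'message', '__failedInBrowser' ]
```

So a mocked error has to carry `message` as an enumerable property (or be a plain object) if it is going to survive a later `Object.assign(new Error(), value)`.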

Comment thread types/artifacts.d.ts Outdated
/** Error that occurred during fetching or parsing of source map. */
errorMessage: string
/** No map on account of error. */
map: undefined;

Contributor

this can be map?: undefined so it won't have to be set to undefined manually (the code snippet I posted above relies on that, I believe)
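
As a rough illustration of why the optional form is convenient, the sketch below writes the two entry shapes as JSDoc typedefs and filters on `map`. The typedef names and the helper are assumptions made for this example and are not the definitions in `types/artifacts.d.ts`.

```javascript
/**
 * Assumed shapes of one artifact entry, written as JSDoc typedefs for illustration.
 * @typedef {{scriptUrl: string, sourceMapUrl?: string, map: object}} MapSuccess
 * @typedef {{scriptUrl: string, sourceMapUrl?: string, errorMessage: string, map?: undefined}} MapFailure
 */

/**
 * Keeps only the entries whose map was fetched and parsed.
 * @param {Array<MapSuccess|MapFailure>} sourceMaps
 * @return {Array<MapSuccess>}
 */
function withMapsOnly(sourceMaps) {
  /** @type {Array<MapSuccess>} */
  const result = [];
  for (const entry of sourceMaps) {
    // Failure entries may omit `map` entirely, so a truthiness check is enough.
    if (entry.map) result.push(entry);
  }
  return result;
}

const entries = [
  {scriptUrl: 'http://localhost/a.js', sourceMapUrl: 'http://localhost/a.js.map', map: {version: 3}},
  // No `map: undefined` needed on the error entry.
  {scriptUrl: 'http://localhost/b.js', errorMessage: 'TypeError: Failed to fetch'},
];
console.log(withMapsOnly(entries).length); // 1
```

An audit that wants to ignore fetch errors can apply the same check and skip entries that carry only an `errorMessage`.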

@connorjclark

Collaborator Author

it seems half of these comments could be @googlebot 👎

@connorjclark

Collaborator Author

@brendankenny @exterkamp y'all wanna add your personal emails to the CLA thing? click the Googler link above

@brendankenny brendankenny left a comment

Contributor

🗺 🚜
LGTM!

6 participants