Skip to content

core(source-maps): workaround CORS for fetching maps#9459

Merged
connorjclark merged 23 commits into
masterfrom
fetch-no-cors
Mar 17, 2020
Merged

core(source-maps): workaround CORS for fetching maps#9459
connorjclark merged 23 commits into
masterfrom
fetch-no-cors

Conversation

@connorjclark

@connorjclark connorjclark commented Jul 25, 2019

Copy link
Copy Markdown
Collaborator

This patch adds an interface over the Fetch domain. See comments for hows and whys.

Deferring unit tests until after feedback. I added a simple smoke test to confirm that CORS was vanquished. Not certain on the cookie stuff, definitely give that special attention.

cont. #9101
related: #9097

@patrickhulce patrickhulce left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really clever workaround I like it! still a bit unsure about needing to start with cookies at all since I think it potentially opens the door to having Lighthouse as a auto-CSRF tool 😆but OK

does devtools fetch source maps with cookies?


<body>
<script>
console.log('a');

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the second one need to have content?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neither do - I didn't meant to commit this.

Comment thread lighthouse-core/gather/driver.js Outdated
}

const responseBody = await this.sendCommand('Fetch.getResponseBody', {requestId});
if (responseBody.base64Encoded) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this just based on if chrome thinks the resource is text-based or not?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, IDK how the agent decides but there is no way to configure it from the protocol.

Comment thread lighthouse-core/gather/driver.js Outdated

const responseBody = await this.sendCommand('Fetch.getResponseBody', {requestId});
if (responseBody.base64Encoded) {
resolve(Buffer.from(responseBody.body, 'base64').toString());

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels like a tricky way to handle buffers, could we return a complex type {buffer: Buffer}|{text: string} or something?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe return both cases as buffer?

instead of the string case: Buffer.from(bufStr, 'utf8');

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we currently fetch the robots.txt, sourcemaps, the start_url (HTML)

so we dont need non-text resources at all right now. fail any base64Encoded and we'll handle it later if we need to actually get these payloads

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

Comment thread lighthouse-core/gather/driver.js Outdated

// The first requestPaused event is for the request stage. Continue it.
if (!responseStatusCode) {
// Remove same-site cookies so we aren't buying stuff on Amazon.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we're not concerned with the sort of manifest/user profile state being stored in a samesite cookie that made us want to deal with cookies at all in the first place?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idk how cookies work tbh. so you're saying this bit of code totally invalidates any source maps behind cookie-authentication?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any cookie auth that they've marked as SameSite, yeah

marking auth cookies as samesite is a decent way to prevent the annoying site-style logout/language attack

Comment thread lighthouse-core/gather/driver.js Outdated
Comment thread lighthouse-core/gather/driver.js Outdated
Comment thread lighthouse-cli/test/fixtures/source-map/source-map-tester.html

@brendankenny brendankenny left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should wait at least until there's an audit using source maps and we can see what the actual benefits would be. This isn't blocking anything in #9097 so it would be nice to have a much better picture of the tradeoffs involved before committing to this approach.

Comment thread lighthouse-core/gather/driver.js Outdated
Comment thread lighthouse-core/gather/driver.js Outdated
if (!responseStatusCode) {
// Remove same-site cookies so we aren't buying stuff on Amazon.
const sameSiteCookies = await this.sendCommand('Network.getCookies', {urls: [url]});
const sameSiteCookiesKeyValueSet = new Set();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could split into its own fn for easier testing.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

punting on testing b/c i don't really know how the cookie story plays out here, may be this is just removed and we totally delete all cookies?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to keep it simple and safe at first I would propose deleting all cookies, until we observe the need for them

Comment thread lighthouse-core/gather/driver.js Outdated

const responseBody = await this.sendCommand('Fetch.getResponseBody', {requestId});
if (responseBody.base64Encoded) {
resolve(Buffer.from(responseBody.body, 'base64').toString());

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we currently fetch the robots.txt, sourcemaps, the start_url (HTML)

so we dont need non-text resources at all right now. fail any base64Encoded and we'll handle it later if we need to actually get these payloads

Comment thread lighthouse-core/gather/driver.js Outdated
/** @type {HTMLIFrameElement} */
const iframe = document.createElement('iframe');
// Try really hard not to affect the page.
iframe.style.display = 'none';

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think you also want to set top.

 iframe.style.cssText = `
display: none;
visibility: hidden;
position: absolute;
left: -100px;
top: -100px;
width: 1px;
height: 1px;
`;

Comment thread lighthouse-core/gather/driver.js Outdated
Comment thread lighthouse-core/gather/driver.js Outdated
resolve(responseBody.body);
}

// Fail the request (from the page's perspective) so that the iframe never loads.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remind me (perhaps in a comment) why we do this?

@connorjclark connorjclark Jul 25, 2019

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't have any more to add than what the comment says. I assume Blink will do more work if the iframe actually gets its contents loaded, even if all the styling on it makes it not visible. but i'm just guessing.

Comment thread lighthouse-core/gather/driver.js Outdated
}

// Fail the request (from the page's perspective) so that the iframe never loads.
this.sendCommand('Fetch.failRequest', {requestId, errorReason: 'Aborted'});

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aborted by Lighthouse?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an enum, can't give custom messages.

Comment thread lighthouse-core/gather/fetcher.js Outdated
Comment thread lighthouse-core/gather/fetcher.js Outdated
* 3) if multiple commands to continue the same request are sent, protocol errors occur.
*
* So instead we have one global `Fetch.enable` / `Fetch.requestPaused` pair, and allow specific
* urls to be intercepted via `driver.setOnRequestPausedHandler`.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* urls to be intercepted via `driver.setOnRequestPausedHandler`.
* urls to be intercepted via `fetcher.setOnRequestPausedHandler`.

Comment thread lighthouse-core/gather/fetcher.js Outdated
* @param {string} url
* @param {(event: LH.Crdp.Fetch.RequestPausedEvent) => void} handler
*/
async setOnRequestPausedHandler(url, handler) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so...... private? :)

Comment thread lighthouse-core/gather/fetcher.js Outdated
* @return {Promise<string>}
*/
async fetchResource(url, timeoutInMs = 500) {
if (!this.driver.isDomainEnabled('Fetch')) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also if we are going to be replacing this with a simple "Page.fetchTheThing" protocol method soon, then the whole enabling/disabling domain thing from outside is going to need to go away so I definitely vote to keep that internal to this class if possible :)

Comment thread lighthouse-core/gather/fetcher.js Outdated
.mockResponse('Debugger.disable', {});
.mockResponse('Debugger.disable', {})
.mockResponse('Fetch.enable', {})
.mockResponse('Fetch.disable', {});

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplifies all this testing too :)

Comment thread lighthouse-cli/test/smokehouse/test-definitions/core-tests.js

@patrickhulce patrickhulce left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread lighthouse-core/gather/gather-runner.js
@brendankenny

Copy link
Copy Markdown
Contributor

are we still pausing on this until we get a sense of the protocol method timeline/feature set?

@patrickhulce

Copy link
Copy Markdown
Collaborator

are we still pausing on this until we get a sense of the protocol method timeline/feature set?

I don't think we were ever pausing, just writing in a way that keeps the transition smooth :)

@brendankenny

Copy link
Copy Markdown
Contributor

are we still pausing on this until we get a sense of the protocol method timeline/feature set?

I don't think we were ever pausing, just writing in a way that keeps the transition smooth :)

sorry, this was from a side conversation where (I believe) we agreed to evaluate the viability of the protocol method first

@connorjclark

Copy link
Copy Markdown
Collaborator Author

i kicked off an email thread about this. because it would be cool.
in the meantime though, lets not block this improvement. in the future i hope we can improve the Fetcher even more.
#9459 (comment)

then somewhere else I think @paulirish reached out to @sigurdschneider and learned things are still very WIP.

Looking forward to a protocol solution but don't wanna block this. It's unknown when protocol support will land.

@connorjclark connorjclark merged commit 72f0774 into master Mar 17, 2020
@connorjclark connorjclark deleted the fetch-no-cors branch March 17, 2020 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants