Skip to content

Use definitions in FileAPI spec to resolve blob URLs and their origins.#371

Merged
annevk merged 9 commits intowhatwg:masterfrom
mkruisselbrink:blob-url-origin
Jan 14, 2019
Merged

Use definitions in FileAPI spec to resolve blob URLs and their origins.#371
annevk merged 9 commits intowhatwg:masterfrom
mkruisselbrink:blob-url-origin

Conversation

@mkruisselbrink
Copy link
Contributor

@mkruisselbrink mkruisselbrink commented Feb 2, 2018

This fixes #290, fixes #127, fixes w3c/FileAPI#20 and fixes w3c/FileAPI#63.

The only observable change here is the origin for blob URLs created in unique origins (which is only observable indirectly by trying to fetch such URLs), and tests for that already exist in https://w3c-test.org/FileAPI/url/sandboxed-iframe.html.


Preview | Diff

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many many thanks for tackling this! You should add yourself to the acknowledgments section. I also have a bit of feedback.

url.bs Outdated
It is initially null.

<p class="note">This is used to support caching the origin of a "<code>blob</code>"
<a for=/>URL</a> at the time it is parsed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we call it object origin then? Also, is this observable? If not, maybe we shouldn't make it a thing to avoid folks making it observable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, object origin makes sense. It's only observable for fetches to/from unique origins. The string serialization of the origin isn't changing, just that identity comparison with unique origined blob URLs will now work correctly (especially if the blob URL was parsed/resolved before the URL was revoked, but fetched after being revoked. Without caching this origin same-origin checks would fail during fetching in unique origins).

url.bs Outdated
"object".

<p>A <a for=/>URL</a> also has an associated
<dfn for=url>cached origin</dfn> that is either null or a <a>ASCII string</a>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an ASCII*

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

url.bs Outdated
<li><p>Set <var>url</var>'s <a for=url>object</a> to the entry in the <a>Blob URL Store</a>
corresponding to <var>url</var>'s <a for=url>path</a>[0].
<li><p>Set <var>url</var>'s <a for=url>object</a> to the result of
<a for="blob URL" lt="resolve">resolving the blob URL</a> <var>url</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "resolving the blob URL" algorithm can return failure. I guess the url’s object should be set to null then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from nits this looks okay to me. Another thought I had was that we could go with a different design where File API returns an object only and URL defines the rest. That is, if return url's object's environment's origin if url's object is non-null and otherwise keep the existing steps. Maybe that would be a little cleaner?

url.bs Outdated
It is initially null.

<p class="note">This is used to support caching the origin of a "<code>blob</code>"
<a for=/>URL</a> at the time it is parsed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to add here that this is important as it might be removed from the blob URL store after that time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

url.bs Outdated

<p>A <a for=/>URL</a> also has an associated
<dfn for=url>object origin</dfn> that is either null or an <a>ASCII string</a>.
It is initially null.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, shouldn't this be an origin rather than a string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. Done.

url.bs Outdated
<a for=url>path</a>[0] is not in the <a>Blob URL Store</a>, then return <var>url</var>.
[[!FILEAPI]]
<li><p>Set <var>url</var>'s <a for=url>object origin</a> to the result of
<a for="blob URL" lt="resolve the origin">resolving the origin of the blob URL</a> <var>url</var>.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we only set this if url's object is non-null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I don't think it makes a difference in practice, but it might be clearer that way, so done.

Copy link
Contributor Author

@mkruisselbrink mkruisselbrink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for losing track of this PR, I addressed some more of your comments. WDYT?

url.bs Outdated

<p>A <a for=/>URL</a> also has an associated
<dfn for=url>object origin</dfn> that is either null or an <a>ASCII string</a>.
It is initially null.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. Done.

url.bs Outdated
It is initially null.

<p class="note">This is used to support caching the origin of a "<code>blob</code>"
<a for=/>URL</a> at the time it is parsed.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

url.bs Outdated
<a for=url>path</a>[0] is not in the <a>Blob URL Store</a>, then return <var>url</var>.
[[!FILEAPI]]
<li><p>Set <var>url</var>'s <a for=url>object origin</a> to the result of
<a for="blob URL" lt="resolve the origin">resolving the origin of the blob URL</a> <var>url</var>.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I don't think it makes a difference in practice, but it might be clearer that way, so done.

@annevk
Copy link
Member

annevk commented Jan 7, 2019

What did you think about the alternate design I offered (rephrased to fix errors)? We'd move the definition of object origin into URL: return url's object's relevant global object's origin if url's object is non-null and otherwise keep the existing steps.

Yet another alternative would be that File API returns a https://infra.spec.whatwg.org/#struct or some such wherein both the object and origin are stored. Going into File API twice for the "same thing" seems bad.

@mkruisselbrink
Copy link
Contributor Author

What did you think about the alternate design I offered (rephrased to fix errors)? We'd move the definition of object origin into URL: return url's object's relevant global object's origin if url's object is non-null and otherwise keep the existing steps.

Yet another alternative would be that File API returns a https://infra.spec.whatwg.org/#struct or some such wherein both the object and origin are stored. Going into File API twice for the "same thing" seems bad.

Oh sorry, missed this comment. I don't think the first option works (the url's object's relevant global object is not necesarily the same as the current settings object at the time createObjectURL was called), but something like the second option could definitely work. We can just store the https://w3c.github.io/FileAPI/#blob-url-entry and extract the actual blob and origin from there. Changing the URL's object from being the blob to being a struct containing the blob would have the possibly unfortunate side effect of requiring every existing reference to that in other specs to be updated as well (which would mean at least the fetch spec, not sure if other specs refer to it). Probably wouldn't be too much work though.

Store blob URL entry directly, rather than splitting out the blob and
origin separately. Will require updated other specs that refer to object
to start referring to blob entry's object instead.
@mkruisselbrink
Copy link
Contributor Author

Okay, got rid of object and replaced it with something for now called "blob entry" instead. Does this seem more in line with what you had in mind? I'll need to update the File API spec to export the relevant terms (and get rid of the no longer needed origin resolving algorithm there), but don't want to do that unless you're happy with this approach.

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I like this! I suspect we don't need to change too many places, since only Fetch is really supposed to inspect this field.

Copy link
Member

@annevk annevk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, will you make the changes to Fetch too?

@mkruisselbrink
Copy link
Contributor Author

Great, will you make the changes to Fetch too?

Yep, will prepare a PR for that today.

mkruisselbrink added a commit to mkruisselbrink/fetch that referenced this pull request Jan 10, 2019
@mkruisselbrink
Copy link
Contributor Author

Great, will you make the changes to Fetch too?

Yep, will prepare a PR for that today.

whatwg/fetch#857, although obviously that won't build until this PR lands first

@annevk
Copy link
Member

annevk commented Jan 11, 2019

I have this as a commit message:

Use File API to resolve blob: URLs and their origins

In particular some blob: URLs representing an opaque origin could still be same origin with something. That only works if the origin is stored somewhere and is not obtained from the URL.

This fixes #127, fixes #290 , fixes https://github.com/w3c/FileAPI/issues/20, and fixes https://github.com/w3c/FileAPI/issues/63.

What I should have asked about sooner is whether we have any tests around this already and if any implementation bugs need to be filed? Sorry for the additional delay, but I'd like to be sure we don't have to revisit this anytime soon.

@mkruisselbrink
Copy link
Contributor Author

What I should have asked about sooner is whether we have any tests around this already and if any implementation bugs need to be filed? Sorry for the additional delay, but I'd like to be sure we don't have to revisit this anytime soon.

As mentioned in the initial pull request message:

The only observable change here is the origin for blob URLs created in unique origins (which is only observable indirectly by trying to fetch such URLs), and tests for that already exist in https://w3c-test.org/FileAPI/url/sandboxed-iframe.html.

And from looking at https://wpt.fyi/results/FileAPI/url/sandboxed-iframe.html?label=master&label=stable&aligned there are probably some implementation bugs that should be filed if they haven't been filed already, although not clear how much that is related to this PR (which really was just trying to codify what browsers are already doing today). I.e. tests like "Blob URLs can be used in XHR and "Blob URLs can be used in fetch" passing in all browsers show that at least in some cases they use the correct origin. But then "Blob URLs can be used in iframes, and are treated same origin" fails in all browsers, so I'll have to double check what's going on there.

So the test situation is a bit tricky since it is only possible to indirectly observe the origin of a opaque origin'ed blob URL, but I think for this PR we're fine, all implementations at least do the right thing for fetch/xhr. There's definitely more work to be done here though, for example as part of whatwg/fetch#666, which will help address when exactly same/cross origin blob URLs should actually load or not load. As part of that I expect I'll be writing more tests, and probably filing implementation bugs. Does that sound reasonable?

@annevk
Copy link
Member

annevk commented Jan 14, 2019

Sounds good to me, appreciate it!

@annevk annevk merged commit d2ef633 into whatwg:master Jan 14, 2019
annevk pushed a commit to whatwg/fetch that referenced this pull request Jan 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

3 participants