Add text imports #11933

Open
eemeli wants to merge 7 commits into whatwg:main from eemeli:import-text

Conversation

@eemeli
Member

eemeli commented Nov 20, 2025

This is the HTML counterpart for the newly introduced TC39 Import Text Stage 2 proposal. It has previously been discussed in #9444, and is closely related to (but not blocked on) PR #11657, which correspondingly adds byte imports.

The type: 'text' import attribute is introduced here without any MIME type requirement, and without any validity or well-formedness checks on the returned string value. @sffc has raised some concerns about additional validation steps being desirable for text content, which would be appropriate for consideration in this spec, rather than the JS spec.

The implementer support noted below is based on feedback from delegates at the 2025 November TC39 meeting.


/acknowledgements.html ( diff )
/infrastructure.html ( diff )
/links.html ( diff )
/references.html ( diff )
/webappapis.html ( diff )

Adds support for the `{ type: 'text' }` import attribute,
which enables importing text content as a JavaScript string.
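
For illustration, a minimal sketch of what this could look like from the author's side. The import forms in the comments follow the import-attributes syntax with the `type: 'text'` value described above; the file name `./notes.txt` and the helper `emulateTextModule` are hypothetical and only emulate the described behavior (UTF-8 decoding, string as default export) so the snippet can run in engines that do not implement text imports.

```javascript
// Proposed syntax; it only works in an engine that implements text imports:
//
//   import notes from "./notes.txt" with { type: "text" };
//   const { default: notes } =
//     await import("./notes.txt", { with: { type: "text" } });
//
// emulateTextModule is a hypothetical stand-in for what such an import
// evaluates to: UTF-8 decode the response bytes and expose the resulting
// string as the module's default export.
function emulateTextModule(bytes) {
  const text = new TextDecoder("utf-8").decode(bytes);
  return Object.freeze({ default: text });
}

const sampleBytes = new TextEncoder().encode("title: Hello\n");
const textModule = emulateTextModule(sampleBytes);
console.log(typeof textModule.default); // "string"
```

No MIME type check or well-formedness check appears in the sketch, mirroring the description above.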
@annevk
Member

annevk commented Nov 20, 2025

cc @nicolo-ribaudo @bakkot

@bakkot
Contributor

bakkot commented Nov 22, 2025

Looks good. I do wonder whether there ought to be a required MIME type, especially given that the other types all have one. If type: "bytes" had a similar requirement (which IMO would be much more annoying), that would allay the concerns in this issue.

Alternatively there could be a requirement not to have certain MIME types, e.g. to disallow importing JavaScript as text.

I think I am personally inclined not to impose such requirements, but I could see the argument either way.

@zcorpan
Member

zcorpan commented Nov 25, 2025

Would this make available the same content that is already available via fetch()? Is it possible to use CSP (connect-src ?) to restrict these imports?

@bakkot
Contributor

bakkot commented Nov 25, 2025

Good question. Looks like type: "json" imports are governed by connect-src, so I assume these should be as well. This will require filing a PR against CSP updating the "Get the effective directive for request" algorithm to list requests with destination "text" as having effective directive connect-src. (Not strictly necessary because connect-src is the fallback, but it would be weird to leave it out given that "json" imports are listed explicitly.)

For clarity it might be good to rename the destination to "text-import" or something, incidentally.

Anyway, yes, this ends up letting you get the exact same content as fetch.
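
The point about the fallback can be sketched with an abbreviated, hypothetical model of the "Get the effective directive for request" lookup. The table below lists only a few destinations and ignores the request's initiator, so it is an assumption-laden illustration rather than the CSP algorithm itself; the names `FALLBACK`, `currentTable`, `proposedTable`, and `effectiveDirective` are made up for this example.

```javascript
// Fallback directive used when a destination is not listed explicitly.
const FALLBACK = "connect-src";

// Abbreviated table: only a few of the real destinations are shown.
const currentTable = new Map([
  ["image", "img-src"],
  ["font", "font-src"],
  ["json", "connect-src"], // JSON imports are listed explicitly
]);

// The same table with the explicit "text" entry suggested above.
const proposedTable = new Map([...currentTable, ["text", "connect-src"]]);

function effectiveDirective(destination, table) {
  return table.get(destination) ?? FALLBACK;
}

console.log(effectiveDirective("text", currentTable));  // "connect-src" (via fallback)
console.log(effectiveDirective("text", proposedTable)); // "connect-src" (explicit entry)
```

Both lookups give the same directive, which is why the explicit entry is a matter of consistency with "json" rather than a behavior change.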

@nicolo-ribaudo
Collaborator

Do we actually want a text destination? We added one for json because it's a specific format, but in most cases a text import will then be parsed by something else. fetch uses an empty destination because, similarly, we don't actually know how its result is going to be used.

@bakkot
Contributor

bakkot commented Nov 25, 2025

json doesn't seem any more specific than text to me, really. It's just data either way.

@annevk
Member

annevk commented Nov 25, 2025

It's specific in that we can ask the server for JSON with Accept: application/json (and we do).

Given that the destination is exposed these days through Sec-Fetch-Dest as well it would probably be useful to have a dedicated value here as well. text seems fine to me.
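
As a rough sketch of why a dedicated value could be useful to servers: a handler can branch on the `Sec-Fetch-Dest` request header. The helper `describeRequest` is hypothetical, the `text` value is the one under discussion in this thread (it may still change), and `headers` is assumed to be a plain object with lower-cased names, as Node's `http` module provides.

```javascript
// Hypothetical server-side helper that labels a request by its destination.
function describeRequest(headers) {
  const dest = (headers["sec-fetch-dest"] || "").toLowerCase();
  switch (dest) {
    case "json":
      return "JSON module import";
    case "text":
      return "text module import"; // destination value proposed in this thread
    case "empty":
      return "fetch() or another request with an empty destination";
    default:
      return "other";
  }
}

console.log(describeRequest({ "sec-fetch-dest": "text" })); // "text module import"
```

With an empty destination, a text import would be indistinguishable from a plain `fetch()` at this level.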

@bakkot
Contributor

bakkot commented Nov 25, 2025

Oh, nice, I didn't realize that it got exposed there and wasn't just editorial. In that case I agree text is fine, as long as we don't imagine later introducing a new kind of "text" destination where the distinction is important.

@eemeli
Member Author

eemeli commented Dec 17, 2025

I've now added PRs for the changes in the Fetch and CSP specs, and for WPT tests.

@eemeli
Member Author

eemeli commented Mar 11, 2026

The TC39 / ECMA-262 part of this proposal has now advanced to Stage 3.

@smaug---- added the agenda+ (To be discussed at a triage meeting) label on Mar 12, 2026
Collaborator

@noamr noamr left a comment

I think this needs to be rebased, and properly support modulepreload like CSS/JSON.

@eemeli requested a review from noamr on March 12, 2026 17:58
@noamr
Collaborator

noamr commented Mar 13, 2026

Do the tests cover module preload? The spec text seemed good AFAICT!

Collaborator

@noamr noamr left a comment

Unofficial LGTM % tests for modulepreload

@eemeli
Member Author

eemeli commented Mar 20, 2026

> Do the tests cover module preload?

@noamr They do now (I just updated the WPT PR).

Member

@annevk annevk left a comment

This looks good to me. It would be great if @nicolo-ribaudo could also do a quick skim.

@annevk
Member

annevk commented Mar 20, 2026

MDN and browser bugs should be completed for this as well before landing.

@eemeli
Member Author

eemeli commented Mar 20, 2026

> MDN and browser bugs should be completed for this as well before landing.

Done.

@eemeli requested a review from nicolo-ribaudo on March 20, 2026 17:09
@noamr
Collaborator

noamr commented Mar 24, 2026

@eemeli can you detail how this PR addresses the concerns you've linked to in tc39/proposal-import-text#8?

The PR description says they should be addressed in this PR. Maybe they are? It would be good to flesh it out.
It seems to me from this PR that response bytes for text imports are always decoded as UTF-8. Perhaps it also needs to verify that the MIME type is textual, so that it doesn't try to UTF-8-decode a binary response?

@smaug---- removed the agenda+ (To be discussed at a triage meeting) label on Mar 26, 2026
@eemeli
Member Author

eemeli commented Mar 26, 2026

@noamr The concerns raised in tc39/proposal-import-text#8 led to tc39/proposal-import-text#10, which added a non-normative qualifier that the String representing the imported module "should represent textual data". This recalls the pre-existing spec description of Strings as "generally used to represent textual data".

As you note, this proposal has imported text modules always being parsed as UTF-8, as required by the Encoding standard. And so if the imported bytes do not in fact represent text, we end up with an unusable string full of lossy � replacements, and not e.g. a base64 representation of it, or some other reversible binary-as-text mapping.
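
A small sketch of the lossy decoding described here, using the standard `TextDecoder` and `TextEncoder` APIs. The choice of the PNG file signature as sample input is only an illustration of "bytes that do not represent text"; the variable names are made up for this example.

```javascript
// The first eight bytes of a PNG file; 0x89 is not valid as a UTF-8 lead byte.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Non-fatal UTF-8 decoding replaces the invalid byte with U+FFFD.
const decoded = new TextDecoder("utf-8").decode(pngSignature);
console.log(decoded.startsWith("\uFFFDPNG")); // true

// Re-encoding does not give the original bytes back: U+FFFD becomes EF BF BD.
const reencoded = new TextEncoder().encode(decoded);
console.log(pngSignature.length, reencoded.length); // 8 10
```

The original 0x89 byte cannot be recovered from the string, which is the sense in which the mapping is not reversible.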

Once a text module has been imported, we can (and should) presume that it'll be used for something; either presented directly to the user, or parsed for some further meaning. In either case, a UTF-8 parsing of binary data will be readily apparent as not fulfilling whatever need is put to it, and the error can be addressed by the developer. However, this assertion of "readily apparent" is rather dependent on having some idea of the expected shape of the textual data, which we don't know for the general case: Maybe it's prose that's been incorrectly encoded and decoded resulting in a smattering of replacement characters; maybe it's an XML document; maybe it's base122 encoded data.

And so if we do want to identify the errors caused by parsing binary data as text a bit earlier, and in a way that would make those errors non-recoverable (you can't try-catch an import statement), I agree that filtering on the MIME type is perhaps the only available recourse. That does leave us with the problem of identifying "textual" MIME types, for which I'm not aware of a pre-existing registry or heuristic -- or is there one I've missed? Relying on the IANA registry's "Encoding Considerations" section is not really appropriate: as noted in its application form,

> If the format is based on JSON or XML, "binary" should generally be selected due to the possibility that lines could be longer than 998 octets.

Relying on such a registry would also introduce unnecessary points of divergence between implementations. For example, application/yaml was introduced in 2024, and should definitely qualify as "textual data" for the purposes here -- it's been the example I've used for the proposal from its inception. If filtering like this were used and the proposal was already available, a pre-2024 browser would probably not allow importing an application/yaml file as text, while a post-2024 browser would allow it. This seems like an unnecessary burden to put on the adoption of new MIME types, with no clear user benefit.
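
To make the divergence concrete, here is a hypothetical allowlist-style filter. Nothing like it is part of this PR or, as far as the thread establishes, any spec: `looksTextual`, `olderList`, and `newerList` are invented for illustration, and the `text/*`, `+json`, and `+xml` rules are assumptions about what such a heuristic might contain.

```javascript
// Hypothetical heuristic: is this MIME type "textual" according to a given list?
function looksTextual(mimeType, knownTextual) {
  // Drop parameters such as "; charset=utf-8" and normalise case.
  const essence = mimeType.split(";")[0].trim().toLowerCase();
  if (essence.startsWith("text/")) return true;
  if (essence.endsWith("+json") || essence.endsWith("+xml")) return true;
  return knownTextual.has(essence);
}

// Two implementations that shipped their lists at different times.
const olderList = new Set(["application/json", "application/xml"]);
const newerList = new Set([...olderList, "application/yaml"]);

console.log(looksTextual("application/yaml", olderList)); // false
console.log(looksTextual("application/yaml", newerList)); // true
```

The same response would import successfully in one implementation and fail in the other, purely because of when each list was last updated.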

To summarize, I think the concerns that have been raised are addressed as well as they can be by this proposal, and that introducing additional filtering, sniffing, or heuristics is not warranted, given the real-world limitations we need to contend with, and the negative consequences of doing so.

@noamr
Collaborator

noamr commented Mar 26, 2026

Thanks for the detailed response, @eemeli. This is resolved as far as I'm concerned.

Labels

addition/proposal (New features or enhancements)
topic: script

7 participants