perf(html): reduce allocations and speed up the experimental HTML parser #21152
Replace the per-check input.slice(...).toLowerCase() in the RCDATA / RAWTEXT / SCRIPT_DATA / SCRIPT_DATA_ESCAPED end-tag-name states with the existing allocation-free rangeEqualsLower helper, and add an exact-match fast path to decodeHtmlEntities so the common single-entity case skips a full-length prefix slice. https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
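As a rough illustration of the first half of this commit, the sketch below compares a range of the input against a lowercase name without building a substring. The signature shown for `rangeEqualsLower` is an assumption (the real helper may differ), and ASCII-only case folding is assumed because HTML tag names are matched that way.

```javascript
// Hypothetical sketch; the real rangeEqualsLower in the parser may take
// different arguments. Compares input[start, start + expectedLower.length)
// against an already-lowercase ASCII string without allocating.
function rangeEqualsLower(input, start, expectedLower) {
  const len = expectedLower.length;
  if (start < 0 || start + len > input.length) return false;
  for (let i = 0; i < len; i++) {
    let c = input.charCodeAt(start + i);
    // Fold ASCII 'A'..'Z' to 'a'..'z'; other code units are left alone.
    if (c >= 0x41 && c <= 0x5a) c |= 0x20;
    if (c !== expectedLower.charCodeAt(i)) return false;
  }
  return true;
}

// The allocating form described in the commit message, kept for comparison:
// one substring plus one lowercased copy per check.
function sliceEqualsLower(input, start, expectedLower) {
  return (
    input.slice(start, start + expectedLower.length).toLowerCase() ===
    expectedLower
  );
}
```

For ASCII input both functions agree; the first simply avoids the two temporary strings on every end-tag check.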
Replace inline array-literal `.includes()` checks in the per-token insertion-mode handlers with module-level `Set.has()` lookups (reusing TABLE_CONTEXT / HEAD_ELEMENTS and adding a few small sets), and hoist the repeated open-stack "is there an HTML <template>?" predicate so the eight `open.some(...)` calls share one function instead of allocating an arrow each time. https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
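The two patterns in this commit can be sketched as follows. The set contents, the `HTML_NS` constant, and the `{ name, namespace }` node shape are illustrative assumptions, not the parser's actual definitions.

```javascript
// Illustrative names only; the real TABLE_CONTEXT / HEAD_ELEMENTS contents
// and the element shape used by the tree builder are assumptions here.
const HTML_NS = "http://www.w3.org/1999/xhtml";
const TABLE_SECTION_TAGS = new Set(["tbody", "tfoot", "thead"]);

// Before: a fresh array literal is created on every call.
function isTableSectionSlow(name) {
  return ["tbody", "tfoot", "thead"].includes(name);
}

// After: one module-level Set is shared by every call.
function isTableSection(name) {
  return TABLE_SECTION_TAGS.has(name);
}

// Hoisted predicate: defined once and passed by reference, so callers of
// Array#some do not create a new arrow function each time.
function isHtmlTemplate(el) {
  return el.name === "template" && el.namespace === HTML_NS;
}

function hasOpenTemplate(open) {
  return open.some(isHtmlTemplate);
}
```

Both versions return the same answers; the difference is only in how many short-lived objects each token handler creates.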
The have-an-element-in-scope helpers (inScope / inButtonScope / inListItemScope / inTableScope / inScopeEl) passed one or two arrow predicates into a shared matcher, allocating closures on every call -- and these run several times per body tag. Rewrite them to walk the open stack directly with the boundary kind selected by a small int constant, and add a hoisted findAttr helper to replace the per-call `Array#find` closures used for the <input> type and annotation-xml encoding lookups. https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
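A minimal sketch of the closure-free shape described here, under several assumptions: the constant names, the `hasInScope` and `isBoundary` helpers, and the abridged boundary set are all hypothetical, and namespaces are ignored for brevity. Only `findAttr` is named in the commit, and its signature here is a guess.

```javascript
// Hypothetical scope kinds; the real constants may be named differently.
const SCOPE_DEFAULT = 0;
const SCOPE_BUTTON = 1;
const SCOPE_LIST_ITEM = 2;

// Abridged boundary set; the HTML spec's list is longer.
const BASE_BOUNDARY = new Set(["html", "table", "td", "th", "caption", "template"]);

function isBoundary(name, kind) {
  if (BASE_BOUNDARY.has(name)) return true;
  if (kind === SCOPE_BUTTON) return name === "button";
  if (kind === SCOPE_LIST_ITEM) return name === "ol" || name === "ul";
  return false;
}

// Walks the open stack from the top down; no predicate closures are created.
function hasInScope(open, target, kind) {
  for (let i = open.length - 1; i >= 0; i--) {
    const name = open[i].name;
    if (name === target) return true;
    if (isBoundary(name, kind)) return false;
  }
  return false;
}

// Plain loop in place of attrs.find((a) => a.name === name).
function findAttr(attrs, name) {
  for (let i = 0; i < attrs.length; i++) {
    if (attrs[i].name === name) return attrs[i];
  }
  return undefined;
}
```

Selecting the boundary by an integer keeps a single loop body for all scope variants, which is the design choice the commit message describes.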
`framesetOk` only ever transitions true→false, so once it is false the per-character-token `isAllWs` check in "in body" is wasted work -- guard it with `framesetOk &&` so the scan stops running after the flag flips (which happens very early in real documents). Also rewrite `isAllWs` with a charCodeAt loop, dropping the `for…of` code-point iterator, per-char string, and Set lookup; this speeds up the remaining whitespace checks in the head/table/after-body modes too. https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
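The whitespace check and the guard might look roughly like this. The `handleBodyText` wrapper and the `state` object are hypothetical stand-ins for the tree builder's "in body" character-token handling; only `isAllWs` and `framesetOk` are names from the commit.

```javascript
// True when every code unit is HTML whitespace (tab, LF, FF, CR, space).
// No iterator, per-character string, or Set lookup is involved.
function isAllWs(text) {
  for (let i = 0; i < text.length; i++) {
    const c = text.charCodeAt(i);
    if (c !== 0x20 && c !== 0x09 && c !== 0x0a && c !== 0x0c && c !== 0x0d) {
      return false;
    }
  }
  return true;
}

// Hypothetical wrapper: once framesetOk is false, the short-circuit `&&`
// means isAllWs is never called again for later text tokens.
function handleBodyText(state, text) {
  if (state.framesetOk && !isAllWs(text)) {
    state.framesetOk = false;
  }
}
```

Because the flag only moves from true to false, skipping the scan after the flip cannot change the outcome.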
The tag-name and attribute-name tokenizer states stepped one character per outer-state-machine iteration, unlike the data / RAWTEXT / attribute-value states which fast-forward the ordinary run in a tight inner loop. Add the same inner loop to both states so a run of ordinary name characters is consumed without re-entering the big state switch each character; the loop stops on every terminator and on the chars that need a per-occurrence parse error, which the outer switch re-handles, so behavior is unchanged. https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
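The inner-loop idea can be sketched for the tag-name state as below. `skipTagNameRun` is a hypothetical helper, and the exact stop set in the real tokenizer may differ (for example, it may also stop on characters that need a per-occurrence parse error beyond NUL).

```javascript
// Hypothetical helper: advances past a run of ordinary tag-name characters
// and returns the index of the first character the outer state switch must
// handle itself (whitespace, '/', '>', NUL) or input.length at end of input.
function skipTagNameRun(input, pos) {
  while (pos < input.length) {
    const c = input.charCodeAt(pos);
    if (
      c === 0x09 || // tab
      c === 0x0a || // line feed
      c === 0x0c || // form feed
      c === 0x20 || // space
      c === 0x2f || // '/'
      c === 0x3e || // '>'
      c === 0x00 // NUL, left for the outer switch to report
    ) {
      break;
    }
    pos++;
  }
  return pos;
}

// Usage: consume the name after '<' in one step, then fold case once.
const source = "<DIV class=x>";
const end = skipTagNameRun(source, 1);
const tagName = source.slice(1, end).toLowerCase();
```

The loop never consumes a terminator, so the outer state machine sees exactly the same character it would have reached one step at a time.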
🦋 Changeset detected
Latest commit: 81e37d6
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
This PR is packaged and the instant preview is available (d323aee). Install it locally:
npm i -D webpack@https://pkg.pr.new/webpack@d323aee
yarn add -D webpack@https://pkg.pr.new/webpack@d323aee
pnpm add -D webpack@https://pkg.pr.new/webpack@d323aee
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #21152 +/- ##
=======================================
Coverage 92.32% 92.33%
=======================================
Files 581 581
Lines 63288 63349 +61
Branches 17507 17518 +11
=======================================
+ Hits 58431 58491 +60
- Misses 4857 4858 +1
Merging this PR will improve performance by 37.56%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Memory | benchmark "asset-modules-bytes", scenario '{"name":"mode-development-rebuild","mode":"development","watch":true}' | 246.7 KB | 859.1 KB | -71.28% |
| ⚡ | Memory | benchmark "lodash", scenario '{"name":"mode-development-rebuild","mode":"development","watch":true}' | 858.6 KB | 126.6 KB | ×6.8 |
| ⚡ | Memory | benchmark "many-chunks-esm", scenario '{"name":"mode-production","mode":"production"}' | 10 MB | 7.5 MB | +33.63% |
Comparing perf/html-parser-perf (81e37d6) with main (9bd0b91)
Summary
Behavior-preserving allocation- and CPU-focused optimizations to the experimental HTML parser (the `walkHtmlTokens` tokenizer and the `buildHtmlAst` tree builder). The changes: replace the per-token `input.slice(...).toLowerCase()` end-tag checks in the RCDATA/RAWTEXT/script states with the allocation-free `rangeEqualsLower` helper; swap inline array-literal `.includes()` in the insertion-mode handlers for module-level `Set` lookups (reusing the existing sets) and hoist the repeated "is there an open HTML `<template>`?" predicate; make the open-stack scope checks (`inScope`/`inButtonScope`/`inListItemScope`/`inTableScope`/`inScopeEl`) closure-free; add a `findAttr` helper to drop per-call `Array#find` closures; skip the redundant per-text-token `framesetOk` whitespace scan once the flag is false (and rewrite `isAllWs` as a `charCodeAt` loop); and fast-forward the tag-name and attribute-name tokenizer states like the data/value states already do.

On 16 MB inputs this is ~17% faster on indented/pretty-printed markup, ~9% on attribute-heavy markup, and ~2% on a balanced page (where AST construction dominates total time). Retained heap is unchanged: the wins come from removing transient allocations and redundant scans, not from changing AST node shapes.
What kind of change does this PR introduce?
perf
Did you add tests for your changes?
No new tests; these are behavior-preserving performance changes. Correctness is covered by the existing `test/walkHtmlTokens.unittest.js` (253 tests), `test/buildHtmlAst.unittest.js` + `test/HtmlParser.unittest.js` (48 tests), and the full WHATWG `test/html5lib.spectest.js` conformance corpus (15,161 cases); all pass unchanged.
Does this PR introduce a breaking change?
No.
If relevant, what needs to be documented once your changes are merged or what have you already documented?
n/a
Use of AI
Yes. Implemented with Claude Code: it located the allocation/CPU hotspots, made the edits, and verified them against the existing unit tests, the html5lib conformance suite, and before/after benchmarks. Reviewed before submitting.
Generated by Claude Code