perf(html): reduce allocations and speed up the experimental HTML parser #21152

Merged
alexander-akait merged 5 commits into main from perf/html-parser-perf on Jun 9, 2026
@alexander-akait

Member

Summary

Behavior-preserving allocation- and CPU-focused optimizations to the experimental HTML parser (the walkHtmlTokens tokenizer and the buildHtmlAst tree builder). The changes:

  • Replace the per-token input.slice(...).toLowerCase() end-tag checks in the RCDATA/RAWTEXT/script states with the allocation-free rangeEqualsLower helper.
  • Swap inline array-literal .includes() in the insertion-mode handlers for module-level Set lookups (reusing the existing sets), and hoist the repeated "is there an open HTML <template>?" predicate.
  • Make the open-stack scope checks (inScope/inButtonScope/inListItemScope/inTableScope/inScopeEl) closure-free.
  • Add a findAttr helper to drop per-call Array#find closures.
  • Skip the redundant per-text-token framesetOk whitespace scan once the flag is false, and rewrite isAllWs as a charCodeAt loop.
  • Fast-forward the tag-name and attribute-name tokenizer states, as the data/value states already do.

On 16 MB inputs this is ~17% faster on indented/pretty-printed markup, ~9% on attribute-heavy markup, and ~2% on a balanced page (where AST construction dominates total time). Retained heap is unchanged — the wins come from removing transient allocations and redundant scans, not from changing AST node shapes.

What kind of change does this PR introduce?

perf

Did you add tests for your changes?

No new tests — these are behavior-preserving performance changes. Correctness is covered by the existing test/walkHtmlTokens.unittest.js (253 tests), test/buildHtmlAst.unittest.js + test/HtmlParser.unittest.js (48 tests), and the full WHATWG test/html5lib.spectest.js conformance corpus (15,161 cases); all pass unchanged.

Does this PR introduce a breaking change?

No.

If relevant, what needs to be documented once your changes are merged or what have you already documented?

n/a

Use of AI

Yes. Implemented with Claude Code: it located the allocation/CPU hotspots, made the edits, and verified them against the existing unit tests, the html5lib conformance suite, and before/after benchmarks. Reviewed before submitting.


Generated by Claude Code

Replace the per-check input.slice(...).toLowerCase() in the RCDATA /
RAWTEXT / SCRIPT_DATA / SCRIPT_DATA_ESCAPED end-tag-name states with the
existing allocation-free rangeEqualsLower helper, and add an exact-match
fast path to decodeHtmlEntities so the common single-entity case skips a
full-length prefix slice.

https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
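
The allocation-free comparison described in this commit can be sketched roughly as follows. The PR names `rangeEqualsLower` but does not show its signature, so the parameter list here is an assumption, and `sliceEqualsLower` is a hypothetical stand-in for the old slice-and-lowercase check.

```javascript
// Sketch only: the real rangeEqualsLower in webpack may take different
// arguments. Assumed contract: compare input[start, end) against
// `expected`, which must already be lowercase ASCII, without allocating.
function rangeEqualsLower(input, start, end, expected) {
  if (end - start !== expected.length) return false;
  for (let i = 0; i < expected.length; i++) {
    let code = input.charCodeAt(start + i);
    // Fold ASCII A-Z (0x41-0x5A) to a-z by setting bit 0x20.
    if (code >= 0x41 && code <= 0x5a) code |= 0x20;
    if (code !== expected.charCodeAt(i)) return false;
  }
  return true;
}

// Hypothetical "before" shape: allocates a substring and a lowercased
// copy on every check.
function sliceEqualsLower(input, start, end, expected) {
  return input.slice(start, end).toLowerCase() === expected;
}
```

For ASCII input the two functions agree, for example on `"</SCRIPT>"` with the range 2 to 8 and the expected name `"script"`. They can differ on non-ASCII characters whose `toLowerCase()` result is an ASCII letter, which is one reason an ASCII-only fold fits tag-name matching.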
Replace inline array-literal `.includes()` checks in the per-token
insertion-mode handlers with module-level `Set.has()` lookups (reusing
TABLE_CONTEXT / HEAD_ELEMENTS and adding a few small sets), and hoist the
repeated open-stack "is there an HTML <template>?" predicate so the eight
`open.some(...)` calls share one function instead of allocating an arrow
each time.

https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
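
A minimal sketch of the two changes in this commit, under stated assumptions: the element shape `{ name, namespace }`, the contents of `TABLE_CONTEXT`, and the helper names `isTableContext`, `isHtmlTemplate`, and `hasOpenTemplate` are illustrative and not taken from the webpack source.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

// Built once at module load. Illustrative contents; the real
// TABLE_CONTEXT set in buildHtmlAst.js may differ.
const TABLE_CONTEXT = new Set(["table", "tbody", "tfoot", "thead", "tr"]);

// Before (hypothetical): a fresh array literal is evaluated per call.
function isTableContextSlow(name) {
  return ["table", "tbody", "tfoot", "thead", "tr"].includes(name);
}

// After: a lookup against the shared module-level Set.
function isTableContext(name) {
  return TABLE_CONTEXT.has(name);
}

// Hoisted predicate: one function object shared by every call site,
// instead of a new arrow function passed to open.some(...) each time.
const isHtmlTemplate = (el) =>
  el.name === "template" && el.namespace === HTML_NS;

function hasOpenTemplate(open) {
  return open.some(isHtmlTemplate);
}
```

Both versions of the membership check return the same answers; only the per-call allocation changes.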
The have-an-element-in-scope helpers (inScope / inButtonScope /
inListItemScope / inTableScope / inScopeEl) passed one or two arrow
predicates into a shared matcher, allocating closures on every call --
and these run several times per body tag. Rewrite them to walk the open
stack directly with the boundary kind selected by a small int constant,
and add a hoisted findAttr helper to replace the per-call `Array#find`
closures used for the <input> type and annotation-xml encoding lookups.

https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
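
The closure-free scope walk might look something like the sketch below. The constant names, the `hasInScope` and `isBoundary` helpers, and the `{ name }` element shape are assumptions for illustration; the boundary lists follow the HTML Standard's "has an element in scope" definitions but cover HTML-namespace elements only (the MathML and SVG boundaries are omitted for brevity).

```javascript
// Small int constants select the boundary kind, so no predicate
// closures are created per call.
const SCOPE_DEFAULT = 0;
const SCOPE_BUTTON = 1;
const SCOPE_LIST_ITEM = 2;
const SCOPE_TABLE = 3;

const BASE_BOUNDARY = new Set([
  "applet", "caption", "html", "table", "td", "th",
  "marquee", "object", "template"
]);
const TABLE_BOUNDARY = new Set(["html", "table", "template"]);

function isBoundary(name, kind) {
  switch (kind) {
    case SCOPE_TABLE:
      return TABLE_BOUNDARY.has(name);
    case SCOPE_BUTTON:
      return name === "button" || BASE_BOUNDARY.has(name);
    case SCOPE_LIST_ITEM:
      return name === "ol" || name === "ul" || BASE_BOUNDARY.has(name);
    default:
      return BASE_BOUNDARY.has(name);
  }
}

// Walk the open-element stack from the top; a match wins before a
// boundary is considered, as in the spec's algorithm.
function hasInScope(open, target, kind) {
  for (let i = open.length - 1; i >= 0; i--) {
    const name = open[i].name;
    if (name === target) return true;
    if (isBoundary(name, kind)) return false;
  }
  return false;
}

// Plain loop in place of attrs.find((a) => a.name === name).
function findAttr(attrs, name) {
  for (let i = 0; i < attrs.length; i++) {
    if (attrs[i].name === name) return attrs[i];
  }
  return undefined;
}
```

With an open stack of html, body, p, button, span, the element `p` is in the default scope but not in button scope, because `button` is a boundary only for the latter.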
`framesetOk` only ever transitions true→false, so once it is false the
per-character-token `isAllWs` check in "in body" is wasted work -- guard
it with `framesetOk &&` so the scan stops running after the flag flips
(which happens very early in real documents). Also rewrite `isAllWs` with
a charCodeAt loop, dropping the `for…of` code-point iterator, per-char
string, and Set lookup; this speeds up the remaining whitespace checks in
the head/table/after-body modes too.

https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
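
As a rough illustration of the guard and the rewritten scan: the `state` object and the `onBodyText` handler below are hypothetical, and the whitespace set is the one the HTML Standard uses for this check (tab, LF, FF, CR, space).

```javascript
// charCodeAt loop: no code-point iterator, no per-character strings,
// no Set lookup.
function isAllWs(text) {
  for (let i = 0; i < text.length; i++) {
    const c = text.charCodeAt(i);
    if (c !== 0x20 && c !== 0x09 && c !== 0x0a && c !== 0x0c && c !== 0x0d) {
      return false;
    }
  }
  return true;
}

// Hypothetical "in body" text handler. framesetOk only ever goes
// true -> false, so the && short-circuit skips the scan for every
// text token after the first non-whitespace one.
function onBodyText(state, text) {
  if (state.framesetOk && !isAllWs(text)) {
    state.framesetOk = false;
  }
}
```

Once `state.framesetOk` is false, `isAllWs` is never called again from this handler, regardless of how long later text tokens are.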
The tag-name and attribute-name tokenizer states stepped one character per
outer-state-machine iteration, unlike the data / RAWTEXT / attribute-value
states which fast-forward the ordinary run in a tight inner loop. Add the
same inner loop to both states so a run of ordinary name characters is
consumed without re-entering the big state switch each character; the loop
stops on every terminator and on the chars that need a per-occurrence parse
error, which the outer switch re-handles, so behavior is unchanged.

https://claude.ai/code/session_01RbPceANkJXa5R9WQWCfH6q
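
The inner fast-forward loop could be sketched like this for the tag-name state. `skipTagNameRun` is a hypothetical helper, not the tokenizer's actual code; it assumes newlines were already normalized to LF and that the outer state machine deals with ASCII case folding and with every character the loop stops on.

```javascript
// Returns the index of the first character at or after `pos` that the
// outer state switch must handle itself (or input.length at EOF).
function skipTagNameRun(input, pos) {
  const len = input.length;
  while (pos < len) {
    const c = input.charCodeAt(pos);
    if (
      c === 0x09 || c === 0x0a || c === 0x0c || c === 0x20 || // whitespace ends the name
      c === 0x2f || // "/" may start a self-closing tag
      c === 0x3e || // ">" ends the tag
      c === 0x00 // NUL needs a parse error per occurrence
    ) {
      break;
    }
    pos++;
  }
  return pos;
}

// Usage: consume the ordinary run in one call, then let the outer
// switch see the terminator.
const html = "<section class=a>";
const end = skipTagNameRun(html, 1);
const tagName = html.slice(1, end); // "section"
```

Because the loop never consumes a terminator or an error character, the outer switch still sees each of them exactly once, which is how behavior stays unchanged.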
@changeset-bot

changeset-bot Bot commented Jun 9, 2026

🦋 Changeset detected

Latest commit: 81e37d6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
|---|---|
| webpack | Patch |

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

This PR is packaged and the instant preview is available (d323aee).

Install it locally:

  • npm
npm i -D webpack@https://pkg.pr.new/webpack@d323aee
  • yarn
yarn add -D webpack@https://pkg.pr.new/webpack@d323aee
  • pnpm
pnpm add -D webpack@https://pkg.pr.new/webpack@d323aee

@codecov

codecov Bot commented Jun 9, 2026

Codecov Report

❌ Patch coverage is 97.95918% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.33%. Comparing base (87394de) to head (81e37d6).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/html/buildHtmlAst.js | 97.14% | 2 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #21152   +/-   ##
=======================================
  Coverage   92.32%   92.33%           
=======================================
  Files         581      581           
  Lines       63288    63349   +61     
  Branches    17507    17518   +11     
=======================================
+ Hits        58431    58491   +60     
- Misses       4857     4858    +1     
| Flag | Coverage Δ |
|---|---|
| css-parsing | 28.64% <ø> (-0.02%) ⬇️ |
| html5lib | 31.05% <97.95%> (+0.02%) ⬆️ |
| integration | 88.51% <78.57%> (+0.01%) ⬆️ |
| test262 | 45.36% <ø> (+0.04%) ⬆️ |
| unit | 41.13% <81.63%> (+0.04%) ⬆️ |

Flags with carried forward coverage won't be shown.

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

Types Coverage

Coverage after merging perf/html-parser-perf into main will be 99.33%.
Coverage Report
File | Stmts | Branches | Funcs | Lines | Uncovered Lines
bin
   webpack.js98.77%100%100%98.77%91
examples
   build-common.js100%100%100%100%
   buildAll.js100%100%100%100%
   examples.js100%100%100%100%
   template-common.js98.21%100%100%98.21%72
examples/custom-javascript-parser
   test.filter.js100%100%100%100%
examples/custom-javascript-parser/internals
   acorn-parse.js100%100%100%100%
   meriyah-parse.js100%100%100%100%
   oxc-parse.js91.30%100%100%91.30%140, 142–143, 145, 147, 153–154, 161, 168, 90
examples/markdown
   webpack.config.mjs100%100%100%100%
examples/typescript
   test.filter.js100%100%100%100%
examples/typescript-non-erasable
   test.filter.js50%100%100%50%5
examples/virtual-modules
   test.filter.js100%100%100%100%
examples/wasm-bindgen-esm
   test.filter.js100%100%100%100%
examples/wasm-complex
   test.filter.js100%100%100%100%
examples/wasm-simple
   test.filter.js100%100%100%100%
examples/wasm-simple-source-phase
   test.filter.js100%100%100%100%
lib
   APIPlugin.js100%100%100%100%
   AsyncDependenciesBlock.js100%100%100%100%
   AutomaticPrefetchPlugin.js100%100%100%100%
   BannerPlugin.js100%100%100%100%
   Cache.js98.21%100%100%98.21%101
   CacheFacade.js100%100%100%100%
   Chunk.js99.72%100%100%99.72%39
   ChunkGraph.js100%100%100%100%
   ChunkGroup.js100%100%100%100%
   ChunkTemplate.js100%100%100%100%
   CleanPlugin.js99.15%100%100%99.15%206, 226
   CodeGenerationResults.js100%100%100%100%
   CompatibilityPlugin.js100%100%100%100%
   Compilation.js98.49%100%100%98.49%1577, 1873, 1880, 1888, 1910, 2806, 3249, 3924, 3954, 4007–4008, 4012, 4017, 4033–4034, 4048–4049, 4054–4055, 4532, 4558, 512, 517, 5366, 5398, 5415, 5431, 5447, 5462, 5487–5488, 5490, 5818, 5823, 5829, 5832, 5844, 5846, 5850, 5866, 5881, 5913, 5967, 5991, 6105, 731–732
   Compiler.js99.56%100%100%99.56%1135–1136, 1144
   ConcatenationScope.js98.59%100%100%98.59%189
   ConditionalInitFragment.js100%100%100%100%
   ConstPlugin.js100%100%100%100%
   ContextExclusionPlugin.js100%100%100%100%
   ContextModule.js100%100%100%100%
   ContextModuleFactory.js97.40%100%100%97.40%258, 395, 418, 420, 424, 433–434
   ContextReplacementPlugin.js100%100%100%100%
   DefinePlugin.js99%100%100%99%170–171, 187, 206, 280
   DependenciesBlock.js100%100%100%100%
   Dependency.js98.15%100%100%98.15%379, 425
   DependencyTemplate.js100%100%100%100%
   DependencyTemplates.js100%100%100%100%
   DotenvPlugin.js98.41%100%100%98.41%378, 391–392
   DynamicEntryPlugin.js100%100%100%100%
   EntryOptionPlugin.js100%100%100%100%
   EntryPlugin.js100%100%100%100%
   Entrypoint.js100%100%100%100%
   EnvironmentPlugin.js97.14%100%100%97.14%49
   ErrorHelpers.js100%100%100%100%
   EvalDevToolModulePlugin.js100%100%100%100%
   EvalSourceMapDevToolPlugin.js100%100%100%100%
   ExportsInfo.js100%100%100%100%
   ExportsInfoApiPlugin.js100%100%100%100%
   ExternalModule.js98.97%100%100%98.97%425–429, 577
   ExternalModuleFactoryPlugin.js100%100%100%100%
   ExternalsPlugin.js100%100%100%100%
   FileSystemInfo.js99.50%100%100%99.50%182, 2252–2253, 2256, 2267, 2278, 2289, 278, 3693, 3708, 3732
   FlagAllModulesAsUsedPlugin.js100%100%100%100%
   FlagDependencyExportsPlugin.js98.85%100%100%98.85%434, 436, 440
   FlagDependencyUsagePlugin.js100%100%100%100%
   FlagEntryExportAsUsedPlugin.js100%100%100%100%
   Generator.js100%100%100%100%
   HotModuleReplacementPlugin.js100%100%100%100%
   HotUpdateChunk.js100%100%100%100%
   IgnorePlugin.js100%100%100%100%
   IgnoreWarningsPlugin.js100%100%100%100%
   InitFragment.js100%100%100%100%
   JavascriptMetaInfoPlugin.js100%100%100%100%
   LibraryTemplatePlugin.js100%100%100%100%
   LoaderOptionsPlugin.js100%100%100%100%
   LoaderTargetPlugin.js100%100%100%100%
   MainTemplate.js100%100%100%100%
   ManifestPlugin.js100%100%100%100%
   Module.js98.50%100%100%98.50%1311, 1316, 1376, 1390, 1452, 1461
   ModuleFactory.js100%100%100%100%
   ModuleFilenameHelpers.js98.85%100%100%98.85%106, 108
   ModuleGraph.js99.73%100%100%99.73%1005
   ModuleGraphConnection.js100%100%100%100%
   ModuleInfoHeaderPlugin.js100%100%100%100%
   ModuleNotFoundError.js100%100%100%100%
   ModuleProfile.js100%100%100%100%
   ModuleSourceTypeConstants.js100%100%100%100%
   ModuleTemplate.js100%100%100%100%
   ModuleTypeConstants.js100%100%100%100%
   MultiCompiler.js99.69%100%100%99.69%659
   MultiStats.js100%100%100%100%
   MultiWatching.js100%100%100%100%
   NoEmitOnErrorsPlugin.js100%100%100%100%
   NodeStuffPlugin.js100%100%100%100%
   NormalModule.js97.90%100%100%97.90%1219, 1222, 1239, 1256, 1503, 1537, 1553, 1640, 1994, 2292, 2297–2307, 417, 421, 575
   NormalModuleFactory.js99.47%100%100%99.47%1083, 1392, 486, 498
   NormalModuleReplacementPlugin.js100%100%100%100%
   NullFactory.js100%100%100%100%
   OptimizationStages.js100%100%100%100%
   OptionsApply.js100%100%100%100%
   Parser.js100%100%100%100%
   PlatformPlugin.js100%100%100%100%
   PrefetchPlugin.js100%100%100%100%
   ProgressPlugin.js98.85%100%100%98.85%519–520, 525, 527, 591
   ProvidePlugin.js100%100%100%100%
   RawModule.js100%100%100%100%
   RecordIdsPlugin.js100%100%100%100%
   RequestShortener.js100%100%100%100%
   ResolverFactory.js100%100%100%100%
   RuntimeGlobals.js100%100%100%100%
   RuntimeModule.js100%100%100%100%
   RuntimePlugin.js100%100%100%100%
   RuntimeTemplate.js100%100%100%100%
   SelfModuleFactory.js100%100%100%100%
   SingleEntryPlugin.js100%100%100%100%
   SourceMapDevToolModuleOptionsPlugin.js100%100%100%100%
   SourceMapDevToolPlugin.js98.62%100%100%98.62%220, 224, 226, 419, 430, 891
   Stats.js100%100%100%100%
   Template.js100%100%100%100%
   TemplatedPathPlugin.js99.13%100%100%99.13%176–177
   UseStrictPlugin.js100%100%100%100%
   WarnCaseSensitiveModulesPlugin.js100%100%100%100%
   WarnDeprecatedOptionPlugin.js100%100%100%100%
   WarnNoModeSetPlugin.js100%100%100%100%
   WatchIgnorePlugin.js100%100%100%100%
   Watching.js100%100%100%100%
   WebpackError.js100%100%100%100%
   WebpackIsIncludedPlugin.js100%100%100%100%
   WebpackOptionsApply.js100%100%100%100%
   WebpackOptionsDefaulter.js100%100%100%100%
   buildChunkGraph.js99.87%100%100%99.87%326
   cli.js98.62%100%100%98.62%10, 119, 545, 577, 627, 897
   index.js99.72%100%100%99.72%165
   validateSchema.js94.67%100%100%94.67%100, 87, 89, 98
   webpack.js96.33%100%100%96.33%10, 198, 220, 222
lib/asset
   AssetBytesGenerator.js100%100%100%100%
   AssetBytesParser.js100%100%100%100%
   AssetGenerator.js100%100%100%100%
   AssetModulesPlugin.js97.32%100%100%97.32%283, 307, 310, 36, 362, 41
   AssetParser.js100%100%100%100%
   AssetSourceGenerator.js100%100%100%100%
   AssetSourceParser.js100%100%100%100%
   RawDataUrlModule.js100%100%100%100%
lib/async-modules
   AsyncModuleHelpers.js100%100%100%100%
   AwaitDependenciesInitFragment.js100%100%100%100%
   InferAsyncModulesPlugin.js100%100%100%100%
lib/cache
   AddBuildDependenciesPlugin.js100%100%100%100%
   AddManagedPathsPlugin.js100%100%100%100%
   IdleFileCachePlugin.js97.92%100%100%97.92%71, 83, 91
   MemoryCachePlugin.js95.83%100%100%95.83%33
   MemoryWithGcCachePlugin.js93.15%100%100%93.15%106, 113–114, 122, 89
   PackFileCacheStrategy.js96.40%100%100%96.40%1250, 1350, 1354, 1416, 628, 647, 657–659, 661, 677–678, 683, 686, 688, 693, 698, 722, 728, 762, 768, 774, 779, 790, 799, 804–805, 807, 824, 830–831, 833
   ResolverCachePlugin.js100%100%100%100%
   getLazyHashedEtag.js100%100%100%100%
   mergeEtags.js100%100%100%100%
lib/config
   browserslistTargetHandler.js100%100%100%100%
   defaults.js99.30%100%100%99.30%1429–1431, 1439, 274, 277, 282, 286
   normalization.js99.01%100%100%99.01%191–192, 258, 273
   target.js100%100%100%100%
lib/container
   ContainerEntryDependency.js100%100%100%100%
   ContainerEntryModule.js100%100%100%100%
   ContainerEntryModuleFactory.js100%100%100%100%
   ContainerExposedDependency.js100%100%100%100%
   ContainerPlugin.js100%100%100%100%
   ContainerReferencePlugin.js100%100%100%100%
 

@codspeed-hq

codspeed-hq Bot commented Jun 9, 2026

Merging this PR will improve performance by 37.56%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

⚡ 2 improved benchmarks
❌ 1 regressed benchmark
✅ 141 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Memory | benchmark "asset-modules-bytes", scenario '{"name":"mode-development-rebuild","mode":"development","watch":true}' | 246.7 KB | 859.1 KB | -71.28% |
| Memory | benchmark "lodash", scenario '{"name":"mode-development-rebuild","mode":"development","watch":true}' | 858.6 KB | 126.6 KB | ×6.8 |
| Memory | benchmark "many-chunks-esm", scenario '{"name":"mode-production","mode":"production"}' | 10 MB | 7.5 MB | +33.63% |

Comparing perf/html-parser-perf (81e37d6) with main (9bd0b91)

@alexander-akait alexander-akait merged commit d323aee into main Jun 9, 2026
126 of 127 checks passed
@alexander-akait alexander-akait deleted the perf/html-parser-perf branch June 9, 2026 19:20