Commit eb2fc4d
fix(html): address Copilot review + CI lint failure
- Fix tsc lint failure: cast the inlined `HTML_ENTITIES` to
`Readonly<Record<string, string>>` so the two `HTML_ENTITIES[name]`
call sites type-check (the frozen-literal type was too narrow).
- Copilot #1: replace the `tempBuffer` string in
`STATE_SCRIPT_DATA_DOUBLE_ESCAPE_START` / `_END` with a small
`scriptMatch` counter against the literal `"script"`. Worst-case
inputs with very long ASCII-alpha runs after `</` no longer grow a
buffer or do quadratic string concatenation. Vestigial
`tempBuffer = ""` resets in the four content-mode less-than-sign
states (which never read the buffer) are removed; the
`tempBuffer` variable is gone.
- Copilot #2: introduce a `MAX_ENTITY_NAME_LEN = 32` constant
(longest WHATWG entity name including the trailing `;`) and cap
the named-character-reference alphanumeric run at
`MAX_ENTITY_NAME_LEN - 1`. Replaces the off-by-one
`if (runLen > 32) break` with a clearer in-loop bound.
- Copilot #3: cap the `decodeHtmlEntities` longest-prefix backtrack
at `MAX_ENTITY_NAME_LEN`. Inputs like `&` followed by thousands
of alphanumerics stay linear-time; anything past the cap is
appended verbatim.
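
As an illustration of the Copilot #1 change, here is a minimal sketch of matching an ASCII-alpha run against the literal `"script"` with a counter instead of an accumulated string. It is not the code from `walkHtmlTokens.js`: the helper names `advanceScriptMatch` and `runIsScript` and the `-1` "diverged" sentinel are assumptions made for this example.

```javascript
// Hypothetical sketch; names and the -1 sentinel are illustrative assumptions.
const SCRIPT = "script";

/**
 * Advance the match counter by one input character.
 * Returns 1..6 while the run is still a prefix of "script",
 * or -1 once the run has diverged (it stays -1 for the rest of the run).
 */
function advanceScriptMatch(scriptMatch, ch) {
  if (scriptMatch < 0 || scriptMatch >= SCRIPT.length) return -1;
  return ch.toLowerCase() === SCRIPT[scriptMatch] ? scriptMatch + 1 : -1;
}

/**
 * True when the ASCII-alpha run starting at `start` spells exactly
 * "script" (case-insensitive). Uses O(1) extra memory regardless of
 * how long the run is.
 */
function runIsScript(input, start) {
  let scriptMatch = 0;
  let i = start;
  while (i < input.length && /[A-Za-z]/.test(input[i])) {
    scriptMatch = advanceScriptMatch(scriptMatch, input[i]);
    i++;
  }
  return scriptMatch === SCRIPT.length;
}
```

With this shape, a run of a million letters costs one comparison per character and no allocation, whereas appending each letter to a buffer keeps the whole run in memory only to compare it against a six-character word.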
Adds a regression test that decodes `&` followed by 1000 characters to
confirm the decoder doesn't go quadratic.
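
For readers who want to see the shape of the capped backtrack from Copilot #3, the following is a rough sketch under stated assumptions. Only `MAX_ENTITY_NAME_LEN = 32` comes from this commit; the function name `decodeEntitiesSketch` and the tiny `ENTITIES` table are stand-ins, and the real `decodeHtmlEntities` may be structured differently.

```javascript
// Illustrative sketch only; not the implementation in this commit.
const MAX_ENTITY_NAME_LEN = 32; // value from the commit; includes the trailing ";"

// Tiny stand-in table (assumption). Keys include ";" where it is required.
const ENTITIES = new Map([
  ["amp;", "&"], ["amp", "&"],
  ["lt;", "<"], ["lt", "<"],
  ["notin;", "\u2209"], ["not", "\u00AC"],
]);

function decodeEntitiesSketch(input) {
  const out = []; // collect pieces and join once at the end
  let i = 0;
  while (i < input.length) {
    if (input[i] !== "&") { out.push(input[i++]); continue; }
    // Scan at most MAX_ENTITY_NAME_LEN characters after the "&".
    const limit = Math.min(input.length, i + 1 + MAX_ENTITY_NAME_LEN);
    let end = i + 1;
    while (end < limit && /[A-Za-z0-9]/.test(input[end])) end++;
    if (end < limit && input[end] === ";") end++;
    // Longest-prefix backtrack: at most MAX_ENTITY_NAME_LEN lookups per "&".
    let matched = false;
    for (let stop = end; stop > i + 1; stop--) {
      const value = ENTITIES.get(input.slice(i + 1, stop));
      if (value !== undefined) {
        out.push(value);
        i = stop;
        matched = true;
        break;
      }
    }
    // No match within the cap: emit the "&" verbatim and move on.
    if (!matched) { out.push("&"); i++; }
  }
  return out.join("");
}

// Usage, mirroring the regression test described above.
const long = "&" + "a".repeat(1000);
console.log(decodeEntitiesSketch(long) === long); // true
console.log(decodeEntitiesSketch("a &amp; b &lt c")); // a & b < c
```

Because each `&` triggers a bounded scan and a bounded number of table lookups, the total work grows linearly with the input length; without the cap, every `&` could rescan an arbitrarily long alphanumeric run once per candidate prefix.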
236/236 tests pass, 100% coverage on `walkHtmlTokens.js`,
`yarn lint` (including tsc) clean.
https://claude.ai/code/session_01N4rd8xuv5oRaHWFL8dFpwh1
parent 5536665
3 files changed
Lines changed: 77 additions & 51 deletions