Skip to content

perf(formatter_core): add printable-ASCII fast path to TextWidth#23913

Merged
leaysgur merged 2 commits into
oxc-project:mainfrom
linyiru:perf/formatter-ascii-text-width
Jun 29, 2026
Merged

perf(formatter_core): add printable-ASCII fast path to TextWidth#23913
leaysgur merged 2 commits into
oxc-project:mainfrom
linyiru:perf/formatter-ascii-text-width

Conversation

@linyiru

@linyiru linyiru commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

What

TextWidth::from_text and TextWidth::from_non_whitespace_str both call UnicodeWidthStr::width, which is a per-character multi-level table walk with no string-level ASCII shortcut. The text they measure — identifiers, private names, JSX names, numbers, and most string-literal content — is overwhelmingly printable ASCII.

This adds a fast path: if every byte is in 0x20..=0x7E, the display width equals the byte length and the text is single-line, so the result is returned directly without consulting the Unicode width table:

if text.as_bytes().iter().all(|&b| matches!(b, 0x20..=0x7e)) {
    return Self::single(text.len() as u32);
}

The range excludes \t (0x09), \n (0x0A), every control byte and all multi-byte UTF-8, which fall through to the unchanged scan. This replaces a per-char table lookup with an autovectorizable byte-range scan; it is not an attempt to beat existing SIMD.

Correctness (byte-identical)

  • A new differential unit test asserts the fast path equals the original computation over ASCII, leading/trailing spaces, tabs, \n, \r\n, control bytes (VT/FF/DEL/US) and multi-byte/emoji-VS16 inputs (cargo test -p oxc_formatter_core).
  • The full Prettier conformance suite is unchangedjs 746/753, ts 591/601, all JSON variants and jsdoc identical, with zero snapshot diffs.

Performance

CodSpeed confirms a measurable improvement (the per-text-element table walk is removed on the hottest node types):

Benchmark BASE HEAD Efficiency
formatter[errors.ts] 613.1 µs 570.3 µs +7.5%
formatter[handle-comments.js] 2.9 ms 2.8 ms +4.17%
formatter[index.tsx] 3.9 ms 3.8 ms +3.19%
formatter[core.js] 1.6 ms 1.6 ms +3.15%

4 improved benchmarks, 0 regressed.


Disclosure: this change was prepared with AI assistance (Claude). I reviewed and tested it myself — byte-identity is verified by the differential unit test and the unchanged Prettier conformance snapshots, and the speedup above is from CodSpeed on this PR.

`TextWidth::from_text` and `from_non_whitespace_str` call
`UnicodeWidthStr::width`, which is a per-character multi-level table walk
with no string-level ASCII shortcut. The text it measures (identifiers,
private names, JSX names, numbers, and most string-literal content) is
overwhelmingly printable ASCII.

Add a fast path: if every byte is in `0x20..=0x7E` the display width equals
the byte length and the text is single-line, so we can return it directly
without consulting the Unicode width table. The range excludes `\t`, `\n`,
all control bytes and every multi-byte UTF-8 sequence, which fall through to
the unchanged scan.

Behaviour is byte-identical: a differential unit test checks the fast path
against the original computation over ASCII, tab/newline, control and
multi-byte inputs, and the full Prettier conformance suite is unchanged.
@codspeed-hq

codspeed-hq Bot commented Jun 28, 2026

Copy link
Copy Markdown

Merging this PR will improve performance by 4.54%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 4 improved benchmarks
✅ 48 untouched benchmarks
⏩ 19 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation formatter[errors.ts] 613.7 µs 570.7 µs +7.53%
Simulation formatter[handle-comments.js] 2.9 ms 2.8 ms +4.16%
Simulation formatter[index.tsx] 3.9 ms 3.8 ms +3.36%
Simulation formatter[core.js] 1.6 ms 1.6 ms +3.17%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing linyiru:perf/formatter-ascii-text-width (931afdd) with main (b4f5b4b)2

Open in CodSpeed

Footnotes

  1. 19 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on main (0b07c4c) during the generation of this report, so b4f5b4b was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@linyiru linyiru marked this pull request as ready for review June 28, 2026 23:46
@linyiru linyiru requested a review from leaysgur as a code owner June 28, 2026 23:46

@leaysgur leaysgur left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you~

@leaysgur leaysgur merged commit 4ddcba0 into oxc-project:main Jun 29, 2026
36 checks passed
camc314 added a commit that referenced this pull request Jun 29, 2026
# Oxlint
### 💥 BREAKING CHANGES

- 88f4455 str: [**BREAKING**] `Str` and `Ident` methods take
`&GetAllocator` (#23781) (overlookmotel)

### 🚀 Features

- f2091b3 ast: Unify old and new `AstBuilder`s (#23875) (overlookmotel)
- 1c8f50c linter: Add schema for `eslint/no-restricted-import` (#23642)
(Sysix)

### 🐛 Bug Fixes

- 7cb85c4 linter/eslint/no-negated-condition: Add autofix for negated
conditions (#23825) (Yagiz Nizipli)
- f7d1f50 oxlint, oxfmt: Enable `disable_old_builder` Cargo feature for
`oxc_ast` crate (#23886) (overlookmotel)
- d891990 linter/jsx-a11y/role-supports-aria-props: Ignore nullish prop
values (#23865) (Mikhail Baev)
- 94b6599 linter: Deduplicate missing plugin errors (#23853) (camc314)
- eff3eff linter/oxc/branches-sharing-code: Avoid else-if false
positives (#23843) (camc314)
- 2a2d3b9 linter/eslint/prefer-destructuring: Skip
`AssignmentExpression` autofixes (#23818) (camc314)
- ddc24ae linter/eslint/id-length: Respect checkGeneric for mapped type
keys (#23802) (bab)
- cd89202 linter/react/exhaustive-deps: Skip wrapper expression when
analyzing hook initializers (#23793) (camc314)
- 20e8285 linter/unicorn/prefer-native-coercion-function: Allow ts type
predicates (#23774) (camc314)
- d86f60b lsp: Normalize user config path to watch pattern (#23723)
(Sysix)
- 52032cf linter: Newline-terminate tsgolint errors (#23762) (Mikhail
Baev)
- 368fda7 linter/eslint/no-warning-comments: Avoid dropping generated
regex patterns (#23741) (camc314)
- ce44fbd linter/valid-title: Escape disallowed words regex (#23742)
(camc314)
- 3100d11 linter/prefer-called-exactly-once-with: Avoid out-of-bounds
slice panic at end of file (#23625) (Jerry Zhao)
- 742be36 refactor/node/handle-callback-err: Reject invalid regex config
(#23740) (camc314)
- d7be179 linter/eslint/no-restricted-globals: Handle shadowed locals
(#23736) (camc314)
- b3b1ff8 linter/vitest/expect-expect: Handle global vitest detection
correctly (#23734) (camc314)

### ⚡ Performance

- 68f9472 linter/jsx-a11y: Skip lowercasing non-aria attribute names
(#23906) (Lawrence Lin)
- b9312b4 linter/unicorn/prefer-export-from: Use keyed binding lookup
(#23893) (Marius Schulz)
- cd5204e linter/typescript/no-unsafe-declaration-merging: Use keyed
binding lookup (#23894) (Marius Schulz)
- e948498 linter/eslint/prefer-named-capture-group: Only dispatch for
relevant node types (#23868) (Connor Shea)
- 4ac7a8e linter/eslint/max-depth: Derive node types (#23896) (Connor
Shea)
- daeed09 linter/eslint/no-restricted-globals: Only scan unresolved
references (#23890) (camc314)
- e808514 linter/jest-vitest: Speed up no-standalone-expect (#23883)
(camc314)
- 8b165e5 linter/react/exhaustive-deps: Skip non-reactive calls early
(#23882) (camc314)
- 54005e7 linter/eslint/no-unused-vars: Precompute exported bindings
(#23881) (camc314)
- 9bc2f8c linter/unicorn/prefer-number-properties: Speed up global
checks (#23880) (camc314)
- 4ff104f linter: Optimize `require-hook` and `prefer-mock-*` rules to
run on specific node types (#23871) (Connor Shea)
- cc2213b linter: Run `no-underscore-dangle` only when relevant node
types are present (#23867) (Connor Shea)
- 3e55c21 linter/promise/always-return: Narrow to function node types
(#23878) (Connor Shea)
- 7136182 linter/jest-vitest: Speed up no-commented-out-tests (#23864)
(camc314)
- f138264 linter/eslint/no-script-url: Match javascript: prefix without
allocating (#23861) (Lawrence Lin)
- 7ef6895 linter/react/no-array-index-key: Delay index symbol lookup
(#23857) (camc314)
- 26bc171 linter/react/no-array-index-key: Match callback methods
directly (#23856) (camc314)
- 44fbbda linter/jsx-a11y/interactive-supports-focus: Check cheap
conditions first (#23854) (camc314)
- 84a5aa3 linter/eslint/no-extend-native: Skip lowercase references
early (#23851) (camc314)
- 88a74b2 linter/eslint/no-nonoctal-decimal-escape: Scan decimal escapes
as bytes (#23850) (camc314)
- fca69a8 linter: Skip traversal without this expressions (#23845)
(camc314)
- 838fd63 linter: Reduce preallocation for per-file diagnostics `Vec`
(#23705) (Marius Schulz)
- 417b506 linter/typescript/array-type: Remove full source text clone
(#23751) (Marius Schulz)

### 📚 Documentation

- 57e4469 linter/unicorn: Update prefer-dom-node-text-content rationale
(#23933) (Mikhail Baev)
- 3d61dea all: Correct capitalization in comments (#23887)
(overlookmotel)

### 🛡️ Security

- 3cdd18f deps: Update npm packages (#23690) (renovate[bot])
# Oxfmt
### 💥 BREAKING CHANGES

- 259e0cd oxfmt,formatter_graphql: [**BREAKING**] Support draft syntax
with removing prettier fallback (#23326) (leaysgur)
- accbc49 oxfmt: [**BREAKING**] Format `parser:css,less,scss` files +
css-in-js by `oxc_formatter_css` (#23321) (leaysgur)

### 🚀 Features

- dffa4b3 formatter_css: Implement `oxc_formatter_css` (#23320)
(leaysgur)
- 01de9ec oxfmt: Format `parser:graphql` files by
`oxc_formatter_graphql` (#23318) (leaysgur)
- 4e66212 formatter_graphql: Implement oxc_formatter_graphql (#23317)
(leaysgur)

### 🐛 Bug Fixes

- 67325ae formatter_css: Handle frontmatter language (#23819) (leaysgur)
- 3f355e5 formatter_graphql: Improve major prettier diffs (#23419)
(leaysgur)
- 48e2d78 formatter_css: Improve major prettier diffs (#23327)
(leaysgur)
- 8c07cad all: Enable `disable_old_builder` Cargo feature for `oxc_ast`
crate in tests (#23888) (overlookmotel)
- f7d1f50 oxlint, oxfmt: Enable `disable_old_builder` Cargo feature for
`oxc_ast` crate (#23886) (overlookmotel)
- d86f60b lsp: Normalize user config path to watch pattern (#23723)
(Sysix)

### ⚡ Performance

- 4ddcba0 formatter_core: Add printable-ASCII fast path to TextWidth
(#23913) (Lawrence Lin)

### 📚 Documentation

- b4d0dc9 oxfmt,formatter,formatter_css,formatter_core: Update AGENTS.md
(#23814) (leaysgur)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
Co-authored-by: Cameron <cameron.clark@hey.com>
camc314 pushed a commit that referenced this pull request Jul 3, 2026
)

## What

`TextWidth::from_text` and `TextWidth::from_non_whitespace_str` both
call `UnicodeWidthStr::width`, which is a per-character multi-level
table walk with no string-level ASCII shortcut. The text they measure —
identifiers, private names, JSX names, numbers, and most string-literal
content — is overwhelmingly printable ASCII.

This adds a fast path: if every byte is in `0x20..=0x7E`, the display
width equals the byte length and the text is single-line, so the result
is returned directly without consulting the Unicode width table:

```rust
if text.as_bytes().iter().all(|&b| matches!(b, 0x20..=0x7e)) {
    return Self::single(text.len() as u32);
}
```

The range excludes `\t` (0x09), `\n` (0x0A), every control byte and all
multi-byte UTF-8, which fall through to the unchanged scan. This
replaces a per-char table lookup with an autovectorizable byte-range
scan; it is *not* an attempt to beat existing SIMD.

## Correctness (byte-identical)

- A new differential unit test asserts the fast path equals the original
computation over ASCII, leading/trailing spaces, tabs, `\n`, `\r\n`,
control bytes (VT/FF/DEL/US) and multi-byte/emoji-VS16 inputs (`cargo
test -p oxc_formatter_core`).
- The full **Prettier conformance suite is unchanged** — `js 746/753`,
`ts 591/601`, all JSON variants and jsdoc identical, with **zero
snapshot diffs**.

## Performance

CodSpeed confirms a measurable improvement (the per-text-element table
walk is removed on the hottest node types):

| Benchmark | `BASE` | `HEAD` | Efficiency |
| --- | --- | --- | --- |
| `formatter[errors.ts]` | 613.1 µs | 570.3 µs | **+7.5%** |
| `formatter[handle-comments.js]` | 2.9 ms | 2.8 ms | +4.17% |
| `formatter[index.tsx]` | 3.9 ms | 3.8 ms | +3.19% |
| `formatter[core.js]` | 1.6 ms | 1.6 ms | +3.15% |

4 improved benchmarks, 0 regressed.

---

*Disclosure: this change was prepared with AI assistance (Claude). I
reviewed and tested it myself — byte-identity is verified by the
differential unit test and the unchanged Prettier conformance snapshots,
and the speedup above is from CodSpeed on this PR.*

---------

Co-authored-by: Yuji Sugiura <y.sugiura.0316@gmail.com>
camc314 added a commit that referenced this pull request Jul 3, 2026
# Oxlint
### 💥 BREAKING CHANGES

- 88f4455 str: [**BREAKING**] `Str` and `Ident` methods take
`&GetAllocator` (#23781) (overlookmotel)

### 🚀 Features

- f2091b3 ast: Unify old and new `AstBuilder`s (#23875) (overlookmotel)
- 1c8f50c linter: Add schema for `eslint/no-restricted-import` (#23642)
(Sysix)

### 🐛 Bug Fixes

- 7cb85c4 linter/eslint/no-negated-condition: Add autofix for negated
conditions (#23825) (Yagiz Nizipli)
- f7d1f50 oxlint, oxfmt: Enable `disable_old_builder` Cargo feature for
`oxc_ast` crate (#23886) (overlookmotel)
- d891990 linter/jsx-a11y/role-supports-aria-props: Ignore nullish prop
values (#23865) (Mikhail Baev)
- 94b6599 linter: Deduplicate missing plugin errors (#23853) (camc314)
- eff3eff linter/oxc/branches-sharing-code: Avoid else-if false
positives (#23843) (camc314)
- 2a2d3b9 linter/eslint/prefer-destructuring: Skip
`AssignmentExpression` autofixes (#23818) (camc314)
- ddc24ae linter/eslint/id-length: Respect checkGeneric for mapped type
keys (#23802) (bab)
- cd89202 linter/react/exhaustive-deps: Skip wrapper expression when
analyzing hook initializers (#23793) (camc314)
- 20e8285 linter/unicorn/prefer-native-coercion-function: Allow ts type
predicates (#23774) (camc314)
- d86f60b lsp: Normalize user config path to watch pattern (#23723)
(Sysix)
- 52032cf linter: Newline-terminate tsgolint errors (#23762) (Mikhail
Baev)
- 368fda7 linter/eslint/no-warning-comments: Avoid dropping generated
regex patterns (#23741) (camc314)
- ce44fbd linter/valid-title: Escape disallowed words regex (#23742)
(camc314)
- 3100d11 linter/prefer-called-exactly-once-with: Avoid out-of-bounds
slice panic at end of file (#23625) (Jerry Zhao)
- 742be36 refactor/node/handle-callback-err: Reject invalid regex config
(#23740) (camc314)
- d7be179 linter/eslint/no-restricted-globals: Handle shadowed locals
(#23736) (camc314)
- b3b1ff8 linter/vitest/expect-expect: Handle global vitest detection
correctly (#23734) (camc314)

### ⚡ Performance

- 68f9472 linter/jsx-a11y: Skip lowercasing non-aria attribute names
(#23906) (Lawrence Lin)
- b9312b4 linter/unicorn/prefer-export-from: Use keyed binding lookup
(#23893) (Marius Schulz)
- cd5204e linter/typescript/no-unsafe-declaration-merging: Use keyed
binding lookup (#23894) (Marius Schulz)
- e948498 linter/eslint/prefer-named-capture-group: Only dispatch for
relevant node types (#23868) (Connor Shea)
- 4ac7a8e linter/eslint/max-depth: Derive node types (#23896) (Connor
Shea)
- daeed09 linter/eslint/no-restricted-globals: Only scan unresolved
references (#23890) (camc314)
- e808514 linter/jest-vitest: Speed up no-standalone-expect (#23883)
(camc314)
- 8b165e5 linter/react/exhaustive-deps: Skip non-reactive calls early
(#23882) (camc314)
- 54005e7 linter/eslint/no-unused-vars: Precompute exported bindings
(#23881) (camc314)
- 9bc2f8c linter/unicorn/prefer-number-properties: Speed up global
checks (#23880) (camc314)
- 4ff104f linter: Optimize `require-hook` and `prefer-mock-*` rules to
run on specific node types (#23871) (Connor Shea)
- cc2213b linter: Run `no-underscore-dangle` only when relevant node
types are present (#23867) (Connor Shea)
- 3e55c21 linter/promise/always-return: Narrow to function node types
(#23878) (Connor Shea)
- 7136182 linter/jest-vitest: Speed up no-commented-out-tests (#23864)
(camc314)
- f138264 linter/eslint/no-script-url: Match javascript: prefix without
allocating (#23861) (Lawrence Lin)
- 7ef6895 linter/react/no-array-index-key: Delay index symbol lookup
(#23857) (camc314)
- 26bc171 linter/react/no-array-index-key: Match callback methods
directly (#23856) (camc314)
- 44fbbda linter/jsx-a11y/interactive-supports-focus: Check cheap
conditions first (#23854) (camc314)
- 84a5aa3 linter/eslint/no-extend-native: Skip lowercase references
early (#23851) (camc314)
- 88a74b2 linter/eslint/no-nonoctal-decimal-escape: Scan decimal escapes
as bytes (#23850) (camc314)
- fca69a8 linter: Skip traversal without this expressions (#23845)
(camc314)
- 838fd63 linter: Reduce preallocation for per-file diagnostics `Vec`
(#23705) (Marius Schulz)
- 417b506 linter/typescript/array-type: Remove full source text clone
(#23751) (Marius Schulz)

### 📚 Documentation

- 57e4469 linter/unicorn: Update prefer-dom-node-text-content rationale
(#23933) (Mikhail Baev)
- 3d61dea all: Correct capitalization in comments (#23887)
(overlookmotel)

### 🛡️ Security

- 3cdd18f deps: Update npm packages (#23690) (renovate[bot])
# Oxfmt
### 💥 BREAKING CHANGES

- 259e0cd oxfmt,formatter_graphql: [**BREAKING**] Support draft syntax
with removing prettier fallback (#23326) (leaysgur)
- accbc49 oxfmt: [**BREAKING**] Format `parser:css,less,scss` files +
css-in-js by `oxc_formatter_css` (#23321) (leaysgur)

### 🚀 Features

- dffa4b3 formatter_css: Implement `oxc_formatter_css` (#23320)
(leaysgur)
- 01de9ec oxfmt: Format `parser:graphql` files by
`oxc_formatter_graphql` (#23318) (leaysgur)
- 4e66212 formatter_graphql: Implement oxc_formatter_graphql (#23317)
(leaysgur)

### 🐛 Bug Fixes

- 67325ae formatter_css: Handle frontmatter language (#23819) (leaysgur)
- 3f355e5 formatter_graphql: Improve major prettier diffs (#23419)
(leaysgur)
- 48e2d78 formatter_css: Improve major prettier diffs (#23327)
(leaysgur)
- 8c07cad all: Enable `disable_old_builder` Cargo feature for `oxc_ast`
crate in tests (#23888) (overlookmotel)
- f7d1f50 oxlint, oxfmt: Enable `disable_old_builder` Cargo feature for
`oxc_ast` crate (#23886) (overlookmotel)
- d86f60b lsp: Normalize user config path to watch pattern (#23723)
(Sysix)

### ⚡ Performance

- 4ddcba0 formatter_core: Add printable-ASCII fast path to TextWidth
(#23913) (Lawrence Lin)

### 📚 Documentation

- b4d0dc9 oxfmt,formatter,formatter_css,formatter_core: Update AGENTS.md
(#23814) (leaysgur)

Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
Co-authored-by: Cameron <cameron.clark@hey.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants