Skip to content

perf(formatter): pre-size output buffer using source text length#22594

Merged
graphite-app[bot] merged 1 commit into
mainfrom
claude/perf-formatter-buffer-capacity
May 21, 2026
Merged

perf(formatter): pre-size output buffer using source text length#22594
graphite-app[bot] merged 1 commit into
mainfrom
claude/perf-formatter-buffer-capacity

Conversation

@Dunqing

@Dunqing Dunqing commented May 19, 2026

Copy link
Copy Markdown
Member

Summary

Printer constructs its underlying CodeBuffer (a Vec<u8>) via CodeBuffer::with_indent, which starts the buffer at cap = 0. For multi-MB outputs, that's ~log2(output_size) growth reallocations of a multi-MB buffer — each one allocates a bigger block, copies all existing bytes, and frees the old one.

Add a Printer::with_capacity(capacity, options, sorted_tailwind_classes) constructor and route Formatted::print / print_with_indent through it with source_text.len() as the hint. Formatted output is typically within a small factor of the input size, so a single pre-allocation eliminates the growth reallocs in one shot.

Printer::new is kept (delegates to with_capacity(0, ...)) for back-compat with internal callers.

Allocation impact

cargo allocs, allocs_formatter.snap:

File Size Sys reallocs before Sys reallocs after
checker.ts 2.92 MB 584 565 (−19)
cal.com.tsx 1.06 MB 787 773 (−14)
RadixUIAdoptionSection.jsx 2.5 KB 37 28 (−9)
pdf.mjs 567 KB 356 342 (−14)
antd.js 6.69 MB 4671 4652 (−19)
binder.ts 193 KB 58 43 (−15)

Sys alloc counts unchanged. The 9-19 reductions per file match log2(file_size) — exactly the doubling-growth pattern of the output buffer.

Timing

Local cargo bench --bench formatter shows small wins (~3-7%) on tightly-measured benches (errors.ts, Search.tsx, core.js, handle-comments.js). Two files (next.ts, index.tsx) have wide Criterion intervals that swallow any signal in either direction. App.tsx (the biggest at 5.97 ms) shows essentially no change — modern allocators handle large Vec<u8> reallocations via mmap remap rather than physical copy, so the wins on big files turn out smaller than the realloc count suggests.

CodSpeed (instruction-count-based) will give the deterministic answer.

Test Plan

  • cargo test -p oxc_formatter --lib --tests — 364 tests pass
  • cargo allocs — formatter snapshot shows clean reallocation reductions
  • cargo bench --bench formatter — no regressions outside noise

Side fix

The commit hook flagged a pre-existing clippy::redundant_pub_crate warning in crates/oxc_parser/src/lexer/mod.rs:38 (pub(crate) use inside a private module). One-character fix to pub use included so the hook passes.

AI disclosure: drafted with Claude Code, reviewed manually.

@Dunqing Dunqing requested a review from leaysgur as a code owner May 19, 2026 14:15
@github-actions github-actions Bot added A-parser Area - Parser A-formatter Area - Formatter labels May 19, 2026
@codspeed-hq

codspeed-hq Bot commented May 19, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

✅ 52 untouched benchmarks
⏩ 8 skipped benchmarks1


Comparing claude/perf-formatter-buffer-capacity (14bd1ac) with main (2a7562e)

Open in CodSpeed

Footnotes

  1. 8 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@Dunqing

Dunqing commented May 19, 2026

Copy link
Copy Markdown
Member Author
image

1% - 2% performance improvement. Using the input content length is fine because, in practice, most files are already formatted, so they are exactly the same.

@leaysgur leaysgur added the 0-merge Merge with Graphite Merge Queue label May 20, 2026

leaysgur commented May 20, 2026

Copy link
Copy Markdown
Member

Merge activity

  • May 20, 1:33 AM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 20, 1:53 AM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 20, 1:54 AM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 20, 1:59 AM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 21, 1:00 PM UTC: The merge label '0-merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • May 21, 1:00 PM UTC: Dunqing added this pull request to the Graphite merge queue.
  • May 21, 1:04 PM UTC: Merged by the Graphite merge queue.

@leaysgur leaysgur removed the 0-merge Merge with Graphite Merge Queue label May 20, 2026
@leaysgur

leaysgur commented May 20, 2026

Copy link
Copy Markdown
Member

@Dunqing The PR itself is LGTM, but for some reason, the CI is failing, so I cannot merge it...

Claude said it was an irrelevant change to the lexer, so I tried reverting it, but it still fails.

Why is there a difference in transformer allocs even though we only modified the formatter...?

Refs: #22614 🤷🏻

graphite-app Bot pushed a commit that referenced this pull request May 20, 2026
## Summary

Adds `kitchen-sink.tsx` — a comprehensive synthetic TypeScript+JSX fixture maintained at [oxc-project/benchmark-files](https://github.com/oxc-project/benchmark-files) — to both `TestFiles::minimal()` (bench input set) and `TestFiles::complicated()` (alloc-tracking input set). The existing files in each set are untouched; this is a strict append.

## Why

The existing bench input set didn't reliably surface general-purpose perf wins above the ~1-2% measurement noise floor:
- #22580 (semantic pre-reserve) — visible because `binder.ts` exercises it
- #22594 (formatter buffer) — visible
- #22596 (minifier `try_fold_concat`) — **not visible** on the old set
- #22599 (semantic resolve-refs no-temp-Vec) — **not visible**
- #22603 (semantic var-hoist SmallVec) — **not visible**

The kitchen-sink fixes that by exercising every AST node, every transformer plugin, every minifier optimization opportunity, and every semantic step in one large file. Verified by re-benching #22596 against this fixture: **minifier mean −1.5%, min −3.7%** — above noise, signal confirmed.

## Fixture stats (cross-checked locally)

| Metric | Value |
|---|---|
| Source size | 21,117 lines / 732.90 kB |
| AST nodes | ~133,000 |
| Scopes | ~4,750 |
| Symbols | ~7,000 |
| Resolved references | ~16,000 |
| Semantic diagnostics | 0 errors / 0 warnings |

## Snap baselines

`tasks/track_memory_allocations/allocs_*.snap` updated with the kitchen-sink row across all 5 pipelines (parser / semantic / transformer / minifier / formatter). Future PRs that change allocation behavior on this fixture will produce a snap diff in CI.

## Bench-cleaner fix

`tasks/benchmark/benches/lexer.rs`'s `SourceCleaner` was missing `visit_ts_template_literal_type` — TypeScript type-level template literals (e.g. `` `${T}-${U}` `` in conditional / mapped types) are syntactically identical to value-level template literals, so the bench-mode lexer (without parser context) cannot distinguish them. Without the cleaner converting them to plain strings, kitchen-sink's type-level templates caused the lexer bench to swallow ~1 KB spans as a single `TemplateHead` and produce spurious "Unterminated string" / "Invalid Unicode escape" errors. One-line fix to mirror the existing `visit_template_literal` handling.

AI disclosure: drafted with Claude Code, reviewed manually.
@Dunqing Dunqing force-pushed the claude/perf-formatter-buffer-capacity branch from fa8e5b8 to 14bd1ac Compare May 21, 2026 12:24
@Dunqing Dunqing added the 0-merge Merge with Graphite Merge Queue label May 21, 2026
)

## Summary

`Printer` constructs its underlying `CodeBuffer` (a `Vec<u8>`) via `CodeBuffer::with_indent`, which starts the buffer at `cap = 0`. For multi-MB outputs, that's ~`log2(output_size)` growth reallocations of a multi-MB buffer — each one allocates a bigger block, copies all existing bytes, and frees the old one.

Add a `Printer::with_capacity(capacity, options, sorted_tailwind_classes)` constructor and route `Formatted::print` / `print_with_indent` through it with `source_text.len()` as the hint. Formatted output is typically within a small factor of the input size, so a single pre-allocation eliminates the growth reallocs in one shot.

`Printer::new` is kept (delegates to `with_capacity(0, ...)`) for back-compat with internal callers.

## Allocation impact

`cargo allocs`, `allocs_formatter.snap`:

| File | Size | Sys reallocs before | Sys reallocs after |
|---|---|---|---|
| `checker.ts` | 2.92 MB | 584 | **565** (−19) |
| `cal.com.tsx` | 1.06 MB | 787 | **773** (−14) |
| `RadixUIAdoptionSection.jsx` | 2.5 KB | 37 | **28** (−9) |
| `pdf.mjs` | 567 KB | 356 | **342** (−14) |
| `antd.js` | 6.69 MB | 4671 | **4652** (−19) |
| `binder.ts` | 193 KB | 58 | **43** (−15) |

Sys alloc counts unchanged. The 9-19 reductions per file match `log2(file_size)` — exactly the doubling-growth pattern of the output buffer.

## Timing

Local `cargo bench --bench formatter` shows small wins (~3-7%) on tightly-measured benches (errors.ts, Search.tsx, core.js, handle-comments.js). Two files (next.ts, index.tsx) have wide Criterion intervals that swallow any signal in either direction. App.tsx (the biggest at 5.97 ms) shows essentially no change — modern allocators handle large `Vec<u8>` reallocations via mmap remap rather than physical copy, so the wins on big files turn out smaller than the realloc count suggests.

CodSpeed (instruction-count-based) will give the deterministic answer.

## Test Plan

- [x] `cargo test -p oxc_formatter --lib --tests` — 364 tests pass
- [x] `cargo allocs` — formatter snapshot shows clean reallocation reductions
- [x] `cargo bench --bench formatter` — no regressions outside noise

## Side fix

The commit hook flagged a pre-existing `clippy::redundant_pub_crate` warning in `crates/oxc_parser/src/lexer/mod.rs:38` (`pub(crate) use` inside a private module). One-character fix to `pub use` included so the hook passes.

AI disclosure: drafted with Claude Code, reviewed manually.
@graphite-app graphite-app Bot force-pushed the claude/perf-formatter-buffer-capacity branch from 14bd1ac to 78cf83f Compare May 21, 2026 13:01
@graphite-app graphite-app Bot merged commit 78cf83f into main May 21, 2026
29 checks passed
@graphite-app graphite-app Bot removed the 0-merge Merge with Graphite Merge Queue label May 21, 2026
@graphite-app graphite-app Bot deleted the claude/perf-formatter-buffer-capacity branch May 21, 2026 13:05
Dunqing added a commit that referenced this pull request May 26, 2026
# Oxlint
### 🚀 Features

- 10da26b linter: `no-misleading-character-class` add suggestions for
regex literal (#22681) (Sysix)
- b84941e linter/vue: Implement no-expose-after-await rule (#22675)
(bab)
- 98b98c1 linter/vue: Implement no-computed-properties-in-data rule
(#22674) (bab)
- 868e2e8 linter: Add suggestion for `no-misleading-character-class`
(#22631) (Sysix)
- 2d4c919 oxlint: Support `vite-plus/resolveConfig` for vite.config.ts
(#22456) (leaysgur)
- 2a60012 linter/vue: Implement require-render-return rule (#22613)
(bab)
- 9f227fd linter/vue: Implement no-deprecated-props-default-this rule
(#21892) (bab)
- 9cd28b3 linter: Add debug option to print files to be linted (#22546)
(camchenry)
- 87f065e linter/vue: Implement return-in-emits-validator rule (#21935)
(bab)
- ea0380c linter/unicorn: Implement `import-style` rule (#22173) (Hao
Chen)
- dde40fe linter/vue: Implement no-watch-after-await rule (#22006) (bab)
- a735eb0 linter/vue: Implement valid-next-tick rule (#22531) (bab)
- 6dc615d linter/vue: Implement no-shared-component-data rule (#21842)
(bab)
- a656418 linter/vue: Implement valid-define-options rule (#22107) (bab)
- bb6f1b2 linter/vue: Implement require-slots-as-functions rule (#22244)
(bab)
- 5fa4774 linter/n: Implement `callback-return` rule (#22470) (Mikhail
Baev)

### 🐛 Bug Fixes

- 52bd016 linter: Respect allow unused disable directives in LSP
(#22715) (camc314)
- fa7c463 semantic: Correct TS enum member symbol spans (#22689)
(camc314)
- ed445ba linter: Respect flags overrides for `RegExp(/regex/i, "u")`
(#22678) (Sysix)
- 26b9396 semantic: Resolve parameter decorators outside parameter scope
(#22623) (camc314)
- 203952d linter: `no-misleading-character-class` fix
`is_unicode_code_point_escape` check (#22655) (Sysix)
- 2f64d3d linter: `no-misleading-character-class` own diagnostic for
surrogate pairs without u flag (#22654) (Sysix)
- 0c6ebc2 linter/eslint/no-lone-blocks: Do not flag empty loops (#22649)
(Mikhail Baev)
- 2a7562e linter/no-focused-tests: Mark fixer as a suggestion (#22645)
(camc314)
- dbe644f linter: Respect args none for unused rest parameters (#22627)
(camc314)
- d0211b0 linter: Allow undefined in DummyRuleMap index (#22626)
(camc314)
- 36fc0ec oxlint/lsp: "ignore this" actions merge with existing
directive (#22604) (Sysix)
- f7f370e linter/vitest/prefer-expect-type-of: Recommend `toBeTypeOf`
instead of `expectTypeOf` (#22612) (Mikhail Baev)
- 77063e5 linter/consistent-indexed-object-style: Preserve interface
modifiers in fixes (#22579) (camc314)
- 4f33aa7 linter: Treat `TSGlobalDeclaration` as ambient in
`has_ambient_typescript_ancestor` (#22577) (camc314)

### ⚡ Performance

- c22938d linter/no-async-endpoint-handlers: Populate node types
(#22601) (camc314)
- 607486e linter/no-negated-condition: Populate node types (#22602)
(camc314)
- 4dcaa59 linter/consistent-type-imports: Populate node types (#22600)
(camc314)
- 5bd3b25 linter/no-unused-vars: Avoid cloned ancestor iterator (#22598)
(camc314)
- 97fe9ba linter/no-extra-non-null-assertion: Reduce node matches
(#22588) (camc314)
- ae98296 linter/consistent-indexed-object-style: Populate node types
(#22578) (camc314)
# Oxfmt
### 🚀 Features

- 16b8058 oxfmt: Support `vite-plus/resolveConfig` for vite.config.ts
(#22454) (leaysgur)

### 🐛 Bug Fixes

- 5a26479 formatter: Preserve import phases (#22692) (Cameron)

### ⚡ Performance

- 78cf83f formatter: Pre-size output buffer using source text length
(#22594) (Dunqing)

Co-authored-by: Dunqing <29533304+Dunqing@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-formatter Area - Formatter A-parser Area - Parser

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants