perf: Improve generator perf by liuxingbaoyu · Pull Request #14701 · babel/babel

liuxingbaoyu · 2022-06-28T00:10:22Z

Q	A
License	MIT

I love this PR, and I'm sure you do too!

This PR improves the performance of the generator by 100%, which is equivalent to a 50% reduction in time consumption.

Benchmark results are from the real world (jquery 3.6).
The running environment is nodejs 18.3 and Windows 11.

current 1 jquery 3.6: 45.99 ops/sec ±0.83% (22ms)
current 4 jquery 3.6: 11.31 ops/sec ±0.44% (88ms)
current 16 jquery 3.6: 2.74 ops/sec ±0.52% (364ms)
current 64 jquery 3.6: 0.67 ops/sec ±0.34% (1490ms)
baseline 1 jquery 3.6: 25.99 ops/sec ±3.81% (38ms)
baseline 4 jquery 3.6: 5.57 ops/sec ±0.88% (180ms)
baseline 16 jquery 3.6: 1.33 ops/sec ±2.17% (750ms)
baseline 64 jquery 3.6: 0.32 ops/sec ±6.15% (3090ms)
baseline 1 jquery 3.6: 26.15 ops/sec ±0.65% (38ms)
baseline 4 jquery 3.6: 5.55 ops/sec ±0.85% (180ms)
baseline 16 jquery 3.6: 1.3 ops/sec ±3.79% (770ms)
baseline 64 jquery 3.6: 0.33 ops/sec ±2.08% (3013ms)
current 1 jquery 3.6: 45.51 ops/sec ±0.38% (22ms)
current 4 jquery 3.6: 11.19 ops/sec ±0.49% (89ms)
current 16 jquery 3.6: 2.71 ops/sec ±0.38% (369ms)
current 64 jquery 3.6: 0.67 ops/sec ±0.29% (1495ms)

I made the following changes.

Significant performance improvements:

Optimize the string merging logic. When merging into a large string, it seems that some strange operations will greatly improve the performance. It is difficult to explain clearly with words. You can directly view the code.
Optimize the queue of printers to avoid frequent creation and destruction of objects.
Modify the row and column calculation logic to avoid searching for newlines in most cases.
Note: Now we need to add a parameter to check for newlines on elements that may wrap, but this is rare, I only found two in the current code.
In parentheses.ts and whitespace.ts, change the objects to bitwise operations. And add bounds checking to avoid reading elements outside the array and useless type judgment.

Minor performance improvements:

Add the tokenChar method, which can save some checks, and automatically modify the single-character text to the character code and call tokenChar through the compile-time plugin.
Treat the queue elements uniformly as a single character or the same character.
Note: There is an externally observable change here.
That is, we assume that indentation characters must contain only one character.
E.g:
"space+space", or "tab+tab"
instead of "space+tab".

There are also some minor modifications that may not have been mentioned, but that shouldn't matter.

It's worth mentioning that none of the tests were modified and all passed, so this should be safe overall.

What might be possible in the future:

Current bit operations cannot be optimized by constant folding, this should have little impact, but we can improve it in the future.
We can inline types.isXXX with a plugin, I don't know how much performance improvement this will bring, but maybe try.

babel-bot · 2022-06-28T00:14:10Z

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/52457/

babel-bot · 2022-06-28T00:14:10Z

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/52391/

packages/babel-generator/src/node/index.ts

packages/babel-generator/src/node/whitespace.ts

packages/babel-generator/src/printer.ts

JLHwung · 2022-06-29T15:00:03Z

babel.config.js

 }
+
+/**
+ * @param {import("@babel/core")} pluginAPI


Suggested change

* @param {import("@babel/core")} pluginAPI

* @param {import("@babel/core").PluginAPI}

Currently our babel/core type definitions are in the Microsoft repo, which does not yet have a PluginAPI.

module "C:/Users/user/AppData/Local/Microsoft/TypeScript/4.7/node_modules/@types/babel__core/index"

benchmark/babel-generator/real-case/jquery.mjs

JLHwung · 2022-06-29T15:48:22Z

packages/babel-generator/src/buffer.ts

+
+    if (++this._appendCount > 4096) {
+      // @ts-ignore
+      this._str | 0; // Unexplainable huge performance boost. Ref: https://github.com/davidmarkclements/flatstr License: MIT


The repo offered an explanation about how this approach works for V8: https://github.com/davidmarkclements/flatstr#how-does-it-work

It triggers V8's internal String::Flatten method via side effects. Because it is an engine-specific hack, in the source, the author suggests relying on the package instead of copy-pasting it:

// You may be tempted to copy and paste this, // but take a look at the commit history first, // this is a moving target so relying on the module // is the best way to make sure the optimization // method is kept up to date` and compatible with // every Node version.

Though I think +this._str should do the trick, with 3 bytes shorter than | 0.

For reference, here are my local benchmark results:

without this hack:

node --predictable ./real-case/jquery.mjs baseline 1 jquery 3.6: 64 ops/sec ±10.45% (16ms) baseline 4 jquery 3.6: 13.68 ops/sec ±1.06% (73ms) baseline 16 jquery 3.6: 2.81 ops/sec ±26.7% (355ms) baseline 64 jquery 3.6: 0.75 ops/sec ±6.47% (1330ms) current 1 jquery 3.6: 95.27 ops/sec ±10.87% (10ms) current 4 jquery 3.6: 14.31 ops/sec ±11.89% (70ms) current 16 jquery 3.6: 3.26 ops/sec ±22.33% (307ms) current 64 jquery 3.6: 0.89 ops/sec ±16.89% (1119ms)

+ approach:

$ node --predictable ./real-case/jquery.mjs baseline 1 jquery 3.6: 67.18 ops/sec ±10.51% (15ms) baseline 4 jquery 3.6: 14.05 ops/sec ±0.84% (71ms) baseline 16 jquery 3.6: 2.77 ops/sec ±31.2% (361ms) baseline 64 jquery 3.6: 0.79 ops/sec ±3.29% (1268ms) current 1 jquery 3.6: 92.05 ops/sec ±15.16% (11ms) current 4 jquery 3.6: 22.96 ops/sec ±15.59% (44ms) current 16 jquery 3.6: 5.91 ops/sec ±19.89% (169ms) current 64 jquery 3.6: 1.59 ops/sec ±1.73% (630ms)

| 0 approach:

baseline 1 jquery 3.6: 65.05 ops/sec ±10.47% (15ms) baseline 4 jquery 3.6: 13.58 ops/sec ±1.07% (74ms) baseline 16 jquery 3.6: 3.08 ops/sec ±7.63% (325ms) baseline 64 jquery 3.6: 0.72 ops/sec ±20.03% (1391ms) current 1 jquery 3.6: 93.78 ops/sec ±10.35% (11ms) current 4 jquery 3.6: 21.35 ops/sec ±15.54% (47ms) current 16 jquery 3.6: 6.28 ops/sec ±0.85% (159ms) current 64 jquery 3.6: 1.4 ops/sec ±28.66% (716ms)

Notice how the flatstr improves the performance when generating large files ( >=4 jquery ).

Yes, I know this should have something to do with Flatten, but it's still weird.🤔
At first I used Number(str), and later referenced str | 0 in this package, so I didn't refer to this package directly.
The benchmark results are amazing, we can avoid this hack with a simple if in the buffer small enough!
In addition, your benchmark seems to have a large floating range. Perhaps you should avoid running other programs that occupy the CPU at the same time. At least on my computer, the benchmark is quite unstable, and even produces a 10% error, depending on which physical core is scheduled, which is really helpless.

PS: Your computer is really fast. :)

I've only made minor changes at the moment, the small files I'll do later when I have time. (may be tomorrow)

Ah, I seem to be reading something wrong, it seems that this hack has no effect on small buffers.😰

packages/babel-generator/src/buffer.ts

packages/babel-generator/src/node/index.ts

packages/babel-generator/src/buffer.ts

packages/babel-generator/src/node/index.ts

nicolo-ribaudo · 2022-07-05T07:55:11Z

packages/babel-generator/src/node/index.ts

  node: t.Node,
  parent: t.Node,
-  type: "before" | "after",
+  type: number,


Since it's not "any number", can we use WhitespaceFlag?

Ok, I see that you cannot export WhitespaceFlag from the other file because it gets deopted. I opened #14723 to fix it, so you can use export type { WhitespaceFlag } (we can already do it in this PR, and update our @babel/ dependencies as soon as #14723 gets released).

export function needsWhitespaceBefore(node: t.Node, parent: t.Node) { return needsWhitespace(node, parent, 1); } export function needsWhitespaceAfter(node: t.Node, parent: t.Node) { return needsWhitespace(node, parent, 2); }

Unfortunately we can still only use 1 and 2 here, because type imports cannot be used as values.

Yes, but at least the type annotation will help.

Unfortunately it still doesn't seem to be possible to optimize.😰

Ah, it was because I didn't update the dependencies. Should I change @babel/preset-typescript to workspace:^?

No, because we can't built Babel with Babel itself unless Babel is already built 😛

I'll release a patch before merging this, then we can merge this PR and (in a separate PR) update our @babe/* dependencies.

EDIT: https://github.com/babel/babel/actions/runs/2635171130

packages/babel-generator/src/node/whitespace.ts

nicolo-ribaudo · 2022-07-05T08:12:51Z

packages/babel-generator/src/printer.ts

+    this._indentChar = format.indent.style.charCodeAt(0);
+    this._indentRepeat = format.indent.style.length;


This technically breaks indent.style === " \t", but no one will notice 🤫

nicolo-ribaudo · 2022-07-05T08:13:51Z

packages/babel-generator/src/printer.ts

+    // space is mandatory to avoid outputting <!--
+    // http://javascript.spec.whatwg.org/#comment-syntax
+    const lastChar = this.getLastChar();
+    if (
+      // Need spaces for operators of the same kind to avoid: `a+++b`
+      (char === charCodes.plusSign && lastChar === charCodes.plusSign) ||
+      (char === charCodes.dash && lastChar === charCodes.dash) ||
+      // Needs spaces to avoid changing '34' to '34.', which would still be a valid number.
+      (char === charCodes.dot && this._endsWithInteger)
+    ) {
+      this._space();
+    }


Q: Does the generator know if we are printing a module or a script? In modules we don't need the space, se we could skip these checks.

I'm not sure about this, I'm not too familiar with logic yet...

Ok let's keep them for now, we'll think about it in the future 👍

liuxingbaoyu · 2022-07-05T08:42:25Z

https://github.com/liuxingbaoyu/babel/blob/17e04f9059268eb335921d3197dbd68e65d7f751/packages/babel-generator/src/printer.ts#L33-L37

Also I found that we don't seem to be using the base attribute anywhere, maybe it can be removed.

Co-authored-by: Armano <armano2@users.noreply.github.com>

Co-authored-by: Nicolò Ribaudo <nicolo.ribaudo@gmail.com>

nicolo-ribaudo

Awesome, thanks!

liuxingbaoyu · 2022-07-08T10:13:41Z

4ec0304

Ah, so sorry!
I made a simple change to the master branch for CI testing in my local repo and it was mistakenly included in this PR when rebasing.😭

#14574

nicolo-ribaudo · 2022-07-08T10:38:58Z

Whops, I didn't notice it 😅

But CI still works and as you pointed out the running time isn't affected by much; that line was causing problems with yarn 3, so... let's keep it 🤷

liuxingbaoyu added pkg: generator PR: Performance 🏃‍♀️ A type of pull request used for our changelog categories labels Jun 28, 2022

armano2 reviewed Jun 28, 2022

View reviewed changes

packages/babel-generator/src/node/index.ts Outdated Show resolved Hide resolved

armano2 reviewed Jun 28, 2022

View reviewed changes

packages/babel-generator/src/node/whitespace.ts Outdated Show resolved Hide resolved

armano2 reviewed Jun 28, 2022

View reviewed changes

packages/babel-generator/src/printer.ts Outdated Show resolved Hide resolved

JLHwung reviewed Jun 29, 2022

View reviewed changes

benchmark/babel-generator/real-case/jquery.mjs Show resolved Hide resolved

JLHwung reviewed Jun 29, 2022

View reviewed changes

nicolo-ribaudo reviewed Jun 29, 2022

View reviewed changes

packages/babel-generator/src/buffer.ts Show resolved Hide resolved

liuxingbaoyu force-pushed the improve-generator-perf branch from b719b55 to 17e04f9 Compare June 30, 2022 10:09

liuxingbaoyu commented Jun 30, 2022

View reviewed changes

packages/babel-generator/src/node/index.ts Outdated Show resolved Hide resolved

nicolo-ribaudo reviewed Jul 5, 2022

View reviewed changes

JLHwung approved these changes Jul 6, 2022

View reviewed changes

liuxingbaoyu and others added 7 commits July 7, 2022 18:59

test

4ec0304

improve generator perf

2f26d3a

Update packages/babel-generator/src/printer.ts

7d4a948

Co-authored-by: Armano <armano2@users.noreply.github.com>

review

33cb769

fix

292ef75

Update packages/babel-generator/src/node/whitespace.ts

9fd8a4c

Co-authored-by: Nicolò Ribaudo <nicolo.ribaudo@gmail.com>

review

a6a50b3

liuxingbaoyu force-pushed the improve-generator-perf branch from b1ff8bc to a6a50b3 Compare July 8, 2022 08:37

nicolo-ribaudo approved these changes Jul 8, 2022

View reviewed changes

nicolo-ribaudo merged commit 6165537 into babel:main Jul 8, 2022

liuxingbaoyu mentioned this pull request Jul 8, 2022

Fix: Test exception when using global yarn3 #14574

Closed

liuxingbaoyu mentioned this pull request Jul 21, 2022

chore: expand prettier-e2e test and update typings/deps #14779

Merged

github-actions bot added the outdated A closed issue/PR that is archived due to age. Recommended to make a new issue label Oct 8, 2022

github-actions bot locked as resolved and limited conversation to collaborators Oct 8, 2022

	* @param {import("@babel/core")} pluginAPI
	* @param {import("@babel/core").PluginAPI}

		this._indentChar = format.indent.style.charCodeAt(0);
		this._indentRepeat = format.indent.style.length;

Uh oh!

Conversation

liuxingbaoyu commented Jun 28, 2022 • edited by gitpod-io bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

babel-bot commented Jun 28, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

babel-bot commented Jun 28, 2022

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

JLHwung Jun 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu Jun 29, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu Jun 30, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu Jul 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu Jul 5, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu Jul 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nicolo-ribaudo Jul 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu commented Jul 5, 2022

Uh oh!

nicolo-ribaudo left a comment

Choose a reason for hiding this comment

Uh oh!

liuxingbaoyu commented Jul 8, 2022

Uh oh!

nicolo-ribaudo commented Jul 8, 2022

Uh oh!

Reviewers

Assignees

Labels

liuxingbaoyu commented Jun 28, 2022 •

edited by gitpod-io bot

Loading

babel-bot commented Jun 28, 2022 •

edited

Loading

JLHwung Jun 29, 2022 •

edited

Loading

liuxingbaoyu Jun 29, 2022 •

edited

Loading

liuxingbaoyu Jun 30, 2022 •

edited

Loading

liuxingbaoyu Jul 8, 2022 •

edited

Loading

liuxingbaoyu Jul 5, 2022 •

edited

Loading

liuxingbaoyu Jul 8, 2022 •

edited

Loading

nicolo-ribaudo Jul 8, 2022 •

edited

Loading