perf: Improve generator perf#14701
Conversation
|
Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/52457/ |
|
Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/52391/ |
| } | ||
|
|
||
| /** | ||
| * @param {import("@babel/core")} pluginAPI |
There was a problem hiding this comment.
| * @param {import("@babel/core")} pluginAPI | |
| * @param {import("@babel/core").PluginAPI} |
There was a problem hiding this comment.
Currently our babel/core type definitions are in the Microsoft repo, which does not yet have a PluginAPI.
module "C:/Users/user/AppData/Local/Microsoft/TypeScript/4.7/node_modules/@types/babel__core/index"
|
|
||
| if (++this._appendCount > 4096) { | ||
| // @ts-ignore | ||
| this._str | 0; // Unexplainable huge performance boost. Ref: https://github.com/davidmarkclements/flatstr License: MIT |
There was a problem hiding this comment.
The repo offered an explanation about how this approach works for V8: https://github.com/davidmarkclements/flatstr#how-does-it-work
It triggers V8's internal String::Flatten method via side effects. Because it is an engine-specific hack, in the source, the author suggests relying on the package instead of copy-pasting it:
// You may be tempted to copy and paste this,
// but take a look at the commit history first,
// this is a moving target so relying on the module
// is the best way to make sure the optimization
// method is kept up to date` and compatible with
// every Node version.
Though I think +this._str should do the trick, with 3 bytes shorter than | 0.
For reference, here are my local benchmark results:
without this hack:
node --predictable ./real-case/jquery.mjs
baseline 1 jquery 3.6: 64 ops/sec ±10.45% (16ms)
baseline 4 jquery 3.6: 13.68 ops/sec ±1.06% (73ms)
baseline 16 jquery 3.6: 2.81 ops/sec ±26.7% (355ms)
baseline 64 jquery 3.6: 0.75 ops/sec ±6.47% (1330ms)
current 1 jquery 3.6: 95.27 ops/sec ±10.87% (10ms)
current 4 jquery 3.6: 14.31 ops/sec ±11.89% (70ms)
current 16 jquery 3.6: 3.26 ops/sec ±22.33% (307ms)
current 64 jquery 3.6: 0.89 ops/sec ±16.89% (1119ms)
+ approach:
$ node --predictable ./real-case/jquery.mjs
baseline 1 jquery 3.6: 67.18 ops/sec ±10.51% (15ms)
baseline 4 jquery 3.6: 14.05 ops/sec ±0.84% (71ms)
baseline 16 jquery 3.6: 2.77 ops/sec ±31.2% (361ms)
baseline 64 jquery 3.6: 0.79 ops/sec ±3.29% (1268ms)
current 1 jquery 3.6: 92.05 ops/sec ±15.16% (11ms)
current 4 jquery 3.6: 22.96 ops/sec ±15.59% (44ms)
current 16 jquery 3.6: 5.91 ops/sec ±19.89% (169ms)
current 64 jquery 3.6: 1.59 ops/sec ±1.73% (630ms)
| 0 approach:
baseline 1 jquery 3.6: 65.05 ops/sec ±10.47% (15ms)
baseline 4 jquery 3.6: 13.58 ops/sec ±1.07% (74ms)
baseline 16 jquery 3.6: 3.08 ops/sec ±7.63% (325ms)
baseline 64 jquery 3.6: 0.72 ops/sec ±20.03% (1391ms)
current 1 jquery 3.6: 93.78 ops/sec ±10.35% (11ms)
current 4 jquery 3.6: 21.35 ops/sec ±15.54% (47ms)
current 16 jquery 3.6: 6.28 ops/sec ±0.85% (159ms)
current 64 jquery 3.6: 1.4 ops/sec ±28.66% (716ms)
Notice how the flatstr improves the performance when generating large files ( >=4 jquery ).
There was a problem hiding this comment.
Yes, I know this should have something to do with Flatten, but it's still weird.🤔
At first I used Number(str), and later referenced str | 0 in this package, so I didn't refer to this package directly.
The benchmark results are amazing, we can avoid this hack with a simple if in the buffer small enough!
In addition, your benchmark seems to have a large floating range. Perhaps you should avoid running other programs that occupy the CPU at the same time. At least on my computer, the benchmark is quite unstable, and even produces a 10% error, depending on which physical core is scheduled, which is really helpless.
PS: Your computer is really fast. :)
There was a problem hiding this comment.
I've only made minor changes at the moment, the small files I'll do later when I have time. (may be tomorrow)
There was a problem hiding this comment.
Ah, I seem to be reading something wrong, it seems that this hack has no effect on small buffers.😰
b719b55 to
17e04f9
Compare
| node: t.Node, | ||
| parent: t.Node, | ||
| type: "before" | "after", | ||
| type: number, |
There was a problem hiding this comment.
Since it's not "any number", can we use WhitespaceFlag?
There was a problem hiding this comment.
There was a problem hiding this comment.
export function needsWhitespaceBefore(node: t.Node, parent: t.Node) {
return needsWhitespace(node, parent, 1);
}
export function needsWhitespaceAfter(node: t.Node, parent: t.Node) {
return needsWhitespace(node, parent, 2);
}Unfortunately we can still only use 1 and 2 here, because type imports cannot be used as values.
There was a problem hiding this comment.
Yes, but at least the type annotation will help.
There was a problem hiding this comment.
Unfortunately it still doesn't seem to be possible to optimize.😰
There was a problem hiding this comment.
Ah, it was because I didn't update the dependencies. Should I change @babel/preset-typescript to workspace:^?
There was a problem hiding this comment.
No, because we can't built Babel with Babel itself unless Babel is already built 😛
I'll release a patch before merging this, then we can merge this PR and (in a separate PR) update our @babe/* dependencies.
EDIT: https://github.com/babel/babel/actions/runs/2635171130
| this._indentChar = format.indent.style.charCodeAt(0); | ||
| this._indentRepeat = format.indent.style.length; |
There was a problem hiding this comment.
This technically breaks indent.style === " \t", but no one will notice 🤫
| // space is mandatory to avoid outputting <!-- | ||
| // http://javascript.spec.whatwg.org/#comment-syntax | ||
| const lastChar = this.getLastChar(); | ||
| if ( | ||
| // Need spaces for operators of the same kind to avoid: `a+++b` | ||
| (char === charCodes.plusSign && lastChar === charCodes.plusSign) || | ||
| (char === charCodes.dash && lastChar === charCodes.dash) || | ||
| // Needs spaces to avoid changing '34' to '34.', which would still be a valid number. | ||
| (char === charCodes.dot && this._endsWithInteger) | ||
| ) { | ||
| this._space(); | ||
| } |
There was a problem hiding this comment.
Q: Does the generator know if we are printing a module or a script? In modules we don't need the space, se we could skip these checks.
There was a problem hiding this comment.
I'm not sure about this, I'm not too familiar with logic yet...
There was a problem hiding this comment.
Ok let's keep them for now, we'll think about it in the future 👍
|
Also I found that we don't seem to be using the base attribute anywhere, maybe it can be removed. |
Co-authored-by: Armano <armano2@users.noreply.github.com>
Co-authored-by: Nicolò Ribaudo <nicolo.ribaudo@gmail.com>
b1ff8bc to
a6a50b3
Compare
|
Whops, I didn't notice it 😅 But CI still works and as you pointed out the running time isn't affected by much; that line was causing problems with yarn 3, so... let's keep it 🤷 |
I love this PR, and I'm sure you do too!
This PR improves the performance of the generator by 100%, which is equivalent to a 50% reduction in time consumption.
Benchmark results are from the real world (jquery 3.6).
The running environment is nodejs 18.3 and Windows 11.
I made the following changes.
Significant performance improvements:
Optimize the string merging logic. When merging into a large string, it seems that some strange operations will greatly improve the performance. It is difficult to explain clearly with words. You can directly view the code.
Optimize the queue of printers to avoid frequent creation and destruction of objects.
Modify the row and column calculation logic to avoid searching for newlines in most cases.
Note: Now we need to add a parameter to check for newlines on elements that may wrap, but this is rare, I only found two in the current code.
In
parentheses.tsandwhitespace.ts, change the objects to bitwise operations. And add bounds checking to avoid reading elements outside the array and useless type judgment.Minor performance improvements:
Add the
tokenCharmethod, which can save some checks, and automatically modify the single-character text to the character code and calltokenCharthrough the compile-time plugin.Treat the queue elements uniformly as a single character or the same character.
Note: There is an externally observable change here.
That is, we assume that indentation characters must contain only one character.
E.g:
"space+space", or "tab+tab"
instead of "space+tab".
There are also some minor modifications that may not have been mentioned, but that shouldn't matter.
It's worth mentioning that none of the tests were modified and all passed, so this should be safe overall.
What might be possible in the future:
Current bit operations cannot be optimized by constant folding, this should have little impact, but we can improve it in the future.
We can inline
types.isXXXwith a plugin, I don't know how much performance improvement this will bring, but maybe try.