Refactor private name tokenizing #13256

Merged
JLHwung merged 11 commits into babel:main from JLHwung:refactor-private-name-tokenizing
May 6, 2021

@JLHwung JLHwung commented May 4, 2021

| Q | A |
| --- | --- |
| Tests Added + Pass? | Yes |
| License | MIT |

This PR overhauls how the Babel parser tokenizes the private identifier #name. Currently #name is tokenized as two tokens, hash and name. In this PR we merge them into a single new privateName token whose value holds the string value of the private identifier (without the hash). We observe a performance gain of up to 18%.

$ node --predictable ./benchmark/many-class-private-properties/1-length.bench.mjs
baseline 256 length-1 private properties: 2640 ops/sec ±35.22% (0.379ms)
baseline 512 length-1 private properties: 1700 ops/sec ±1.47% (0.588ms)
baseline 1024 length-1 private properties: 844 ops/sec ±1.24% (1.186ms)
baseline 2048 length-1 private properties: 424 ops/sec ±0.42% (2.359ms)
current 256 length-1 private properties: 3123 ops/sec ±33.93% (0.32ms)
current 512 length-1 private properties: 2010 ops/sec ±0.83% (0.497ms)
current 1024 length-1 private properties: 977 ops/sec ±0.77% (1.023ms)
current 2048 length-1 private properties: 455 ops/sec ±0.85% (2.198ms)

This PR is based on the observation that Babel always does a lookahead when tokenizing #, so we can determine early whether an identifier start follows the #, and avoid an extra read of the leading identifier character.
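The lookahead idea can be illustrated with a toy tokenizer. This is only a sketch under simplifying assumptions, not the code in @babel/parser: `tokenize`, `readWord`, and the ASCII-only identifier checks are hypothetical stand-ins, and escapes and Unicode identifiers are ignored.

```javascript
// Toy tokenizer illustrating the lookahead on "#". NOT Babel's implementation.
// ASCII-only identifier checks are a simplifying assumption.
const isIdentifierStart = (ch) => ch !== undefined && /^[A-Za-z_$]$/.test(ch);
const isIdentifierPart = (ch) => ch !== undefined && /^[A-Za-z0-9_$]$/.test(ch);

// Read identifier characters starting at `pos`; return the word and its end offset.
function readWord(input, pos) {
  let end = pos;
  while (isIdentifierPart(input[end])) end++;
  return { word: input.slice(pos, end), end };
}

function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    const ch = input[pos];
    if (ch === "#") {
      const next = input[pos + 1]; // the lookahead that is performed anyway
      if (isIdentifierStart(next)) {
        // Emit one merged token; the value excludes the leading "#".
        const { word, end } = readWord(input, pos + 1);
        tokens.push({ type: "privateName", value: word, start: pos, end });
        pos = end;
      } else {
        tokens.push({ type: "hash", value: "#", start: pos, end: pos + 1 });
        pos += 1;
      }
    } else if (isIdentifierStart(ch)) {
      const { word, end } = readWord(input, pos);
      tokens.push({ type: "name", value: word, start: pos, end });
      pos = end;
    } else if (/\s/.test(ch)) {
      pos += 1; // skip whitespace
    } else {
      tokens.push({ type: "punct", value: ch, start: pos, end: pos + 1 });
      pos += 1;
    }
  }
  return tokens;
}

console.log(tokenize("this.#x").map((t) => t.type)); // [ 'name', 'punct', 'privateName' ]
```

In this sketch, `this.#x` yields three tokens instead of four, and the private name never goes through the `name` branch.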

By merging # and name we also avoid the tokenizer context update hooks for name-type tokens. We don't need to check for of, function and class on private identifiers anyway.

Since we expose tokens when options.tokens is true, we add a compat routine for tt.privateName which essentially undoes the merging. Hopefully we can remove it in Babel 8.
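A compat step of this kind could look roughly like the sketch below. It assumes plain token objects with `type`, `value`, `start`, and `end` fields; the function name `splitPrivateNameTokens` and the token shape are hypothetical and not the routine added in this PR.

```javascript
// Sketch of "undoing the merge" for exposed tokens. Hypothetical helper and
// token shape; not the actual compat routine in @babel/parser.
function splitPrivateNameTokens(tokens) {
  const result = [];
  for (const token of tokens) {
    if (token.type === "privateName") {
      const hashEnd = token.start + 1; // "#" is always one character long
      result.push({ type: "hash", value: "#", start: token.start, end: hashEnd });
      result.push({ type: "name", value: token.value, start: hashEnd, end: token.end });
    } else {
      result.push(token);
    }
  }
  return result;
}

const merged = [{ type: "privateName", value: "x", start: 5, end: 7 }];
console.log(splitPrivateNameTokens(merged).map((t) => t.type)); // [ 'hash', 'name' ]
```

Consumers that read the exposed tokens would then keep seeing the old two-token shape.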

Note that Acorn adopts the same approach, which means it is likely that @babel/eslint-parser will have to merge # and name for older Babel versions. By merging the tokens in @babel/parser we also do @babel/eslint-parser a favour.

@JLHwung JLHwung added pkg: parser PR: Performance 🏃‍♀️ A type of pull request used for our changelog categories labels May 4, 2021
codesandbox-ci bot commented May 4, 2021

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 52f533d:

Sandbox Source
babel-repl-custom-plugin Configuration
babel-plugin-multi-config Configuration

Collaborator

babel-bot commented May 4, 2021

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/45853/

@JLHwung JLHwung force-pushed the refactor-private-name-tokenizing branch from 806d076 to 97a2f60 Compare May 4, 2021 15:38
token.type = "Numeric";
token.value = `${token.value}n`;
} else if (type === tt.privateName) {
token.type = "PrivateIdentifier";
Contributor Author

We convert tt.privateName to PrivateIdentifier, which aligns with eslint/js#486.

Note that although this PR merges tt.hash and tt.name into tt.privateName, this behaviour will not be observed by @babel/eslint-parser because of the compat layer. However, if we run with BABEL_8_BREAKING=true, the ESLint parser will see tt.privateName. Instead of splitting tt.privateName back apart, we align it with the new espree behaviour.

@JLHwung JLHwung force-pushed the refactor-private-name-tokenizing branch from 3f161e3 to 1f65f5e Compare May 4, 2021 16:15
Member

@nicolo-ribaudo nicolo-ribaudo left a comment

💯

this.finishToken(tt.bracketHashL);
}
this.state.pos += 2;
} else if (isIdentifierStart(next) || next === charCodes.backslash) {
Member

Can you explain the backslash? I'm confused why #\ would be valid.

Contributor Author

It's for the escaped private names: #\u0061.
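For context, JavaScript allows Unicode escapes inside identifier names, so `#\u0061` names the same private field as `#a`, and the character right after `#` can therefore be a backslash. The sketch below shows the decoding step in isolation. `decodeIdentifierEscapes` is a hypothetical helper, not a @babel/parser function, and it does not validate that the decoded characters are legal in an identifier.

```javascript
// Hypothetical helper: decode \uXXXX and \u{X...} escapes in a raw identifier.
// It performs no validation of the resulting characters.
function decodeIdentifierEscapes(raw) {
  return raw.replace(
    /\\u(?:\{([0-9a-fA-F]+)\}|([0-9a-fA-F]{4}))/g,
    (match, braced, fixed) => String.fromCodePoint(parseInt(braced || fixed, 16))
  );
}

// The source text after "#" is the six characters \u0061.
console.log(decodeIdentifierEscapes("\\u0061")); // a
console.log(decodeIdentifierEscapes("\\u{62}c")); // bc
```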

@lweathermon

Thank you

Co-authored-by: Justin Ridgewell <justin@ridgewell.name>
Member

@existentialism existentialism left a comment

💯

@JLHwung JLHwung merged commit a387973 into babel:main May 6, 2021
@JLHwung JLHwung deleted the refactor-private-name-tokenizing branch May 6, 2021 13:46
@fedeci fedeci mentioned this pull request May 17, 2021
1 task
@github-actions github-actions bot added the outdated A closed issue/PR that is archived due to age. Recommended to make a new issue label Aug 6, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 6, 2021