feat(ast): generate node_id accessors for AST enum wrappers#21653
feat(ast): generate node_id accessors for AST enum wrappers#21653graphite-app[bot] merged 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an ast_tools generator that emits node_id() convenience accessors for AST enum wrapper types (e.g. Expression, Statement) by delegating to the inner boxed node’s node_id.
Changes:
- Register a new
DeriveGetNodeIdGeneratorinast_toolsand export it from the generators module. - Generate and wire a new
crates/oxc_ast/src/generated/derive_get_node_id.rsmodule intooxc_ast. - Provide
pub fn node_id(&self) -> NodeIdimpls on enum wrapper types where all variants ultimately contain a struct with anode_id: Cell<NodeId>field.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tasks/ast_tools/src/main.rs | Adds the new generator to the codegen runner list. |
| tasks/ast_tools/src/generators/mod.rs | Declares and re-exports the new generator module/type. |
| tasks/ast_tools/src/generators/derive_get_node_id.rs | Implements codegen logic to emit node_id() for eligible AST enums. |
| crates/oxc_ast/src/lib.rs | Includes the newly generated module in mod generated. |
| crates/oxc_ast/src/generated/derive_get_node_id.rs | Generated output: node_id() methods for AST enum wrappers. |
Merging this PR will improve performance by 3.13%
Performance Changes
Comparing Footnotes
|
There was a problem hiding this comment.
This is definitely worthwhile. Because NodeId field is in same position in all structs (offset 8), all these methods should boil down to a single instruction, for enums where all variants are Box-ed.
A few notes:
- IMO ideally we'd fold this into the existing generator which creates the
node_idmethods for structs (generators/get_id.rs). - A faster way to check if a struct has a
node_idfield isstruct_def.kind.has_kind(all structs withnode_idfield have anAstKind, those that don't do not - see #21685). - If all variants of an enum are
Box<'a, T>, the method should be marked#[inline(always)]just to make sure it does get inlined (because it should boil down to a single instruction). I'm not quite clear in what order LLVM optimizes functions, but if it decides whether to inlinenode_idmethod before it optimizes it, it might decide not to inline it because it's huge pre-optimization, before later boiling it down to a single op. - The latter would also give us a hint that we've made a mistake if we change the AST later on in a way that breaks this optimization.
None of those are blockers to merging. If you want to merge it as is, please go ahead, and you/I can tweak it in a follow-up.
I've made some changes to get_id.rs in this stack: #21686. Just waiting for CI (so sloooow) before merging it. If you do want to fold this into get_id generator, could you please stack it on top to avoid conflicts?
47ed998 to
1d5b47d
Compare
Done! We've now got a function for generating the get id methods for structs, and a separate for enums.
yep this makes sense - done!
Done! There are a couple that don't have this e.g. Going to merge - feel free to push fixes to this! |
Merge activity
|
not sure if this is the idomatic way to approach this. We don't have a use for this at present, but it seems very useful to have
a62ff15 to
678767e
Compare
Follow-on after #21653. Combine 2 loops over enums' variants into a single loop. Should be a little faster.
… all variants unboxed (#21707) Follow-on after #21653. That PR revealed that some AST enums don't have all their variants boxed. Where all variants are boxed, `node_id` methods should boil down to a single instruction, so we mark the method `#[inline(always)]`. `node_id` method will also boil down to a single instruction where all variants are _not_ boxed, so add `#[inline(always)]` to them too. It's only where there's a mix of boxed and unboxed where the method will retain branching logic and be larger. Probably compiler will inline these methods anyway, but the attributes are a useful signal as to where we can improve the AST - all enums should consistently box _all_ their variants, or none of them.
#21653 revealed that `ArrayExpressionElement` has an unboxed variant `Elision`. Before the introduction of `NodeId` fields, this was fine as `Elison` contained only a `Span` (8 bytes). But with `NodeId` field added, it's 16 bytes, which makes `ArrayExpressionElement` larger too - 24 bytes. Box `Elison` in this variant, bringing `ArrayExpressionElement` back down to 16 bytes. This also has the side-effect of making `ArrayExpressionElement`'s `node_id`, `span`, and `span_mut` methods branchless.
#21653 revealed that `JSXExpression` has an unboxed variant `EmptyExpression`. Before the introduction of `NodeId` fields, this was fine as `JSXEmptyExpression` contained only a `Span` (8 bytes). But with `NodeId` field added, it's 16 bytes, which makes `JSXExpression` larger too - 24 bytes. Box `JSXEmptyExpression` in this variant, bringing `JSXExpression` back down to 16 bytes. This also has the side-effect of making `JSXExpression`'s `node_id`, `span`, and `span_mut` methods branchless.
#21653 revealed that `TSTypePredicateName` has an unboxed variant `This`. Before the introduction of `NodeId` fields, this was fine as `TSThisType` contained only a `Span` (8 bytes). But with `NodeId` field added, it's 16 bytes, which makes `TSTypePredicateName` larger too - 24 bytes. Box `TSThisType` in this variant, bringing `TSTypePredicateName` back down to 16 bytes. This also has the side-effect of making `TSTypePredicateName`'s `node_id`, `span`, and `span_mut` methods branchless.
### 💥 BREAKING CHANGES - 502e804 ast: [**BREAKING**] Reduce size of `TSTypePredicateName` (#21711) (overlookmotel) - 5651539 ast: [**BREAKING**] Reduce size of `JSXExpression` (#21710) (overlookmotel) - c44e280 ast: [**BREAKING**] Reduce size of `ArrayExpressionElement` (#21709) (overlookmotel) - c5b3deb syntax: [**BREAKING**] Remove `CommentNodeId` (#21679) (overlookmotel) ### 🚀 Features - b738a39 allocator: Add `Allocator::cursor_ptr` method (#21773) (overlookmotel) - 678767e ast: Generate node_id accessors for AST enum wrappers (#21653) (camc314) - f091d77 minifier: Inline constant spread elements into arrays (#21095) (Armano) ### 🐛 Bug Fixes - 0d608c2 minifier: Preserve raw CR in template literals (#21645) (Dunqing) - a889ea9 minifier: Track pure functions in DCE mode (#21722) (Dunqing) - 674dfac allocator: `Arena` retry allocation when chunk size approaches maximum (#21777) (overlookmotel) - f130cc0 allocator: Fix arithmetic overflow in `Arena::new_chunk_memory_details` (#21745) (overlookmotel) - b9bf239 allocator: Fix UB in `Arena::grow_zeroed` (#21739) (overlookmotel) - d2b9389 allocator: Clippy warning when building without `testing` feature (#21681) (camc314) - 503dc86 codegen: Map sourcemaps from visible output starts (#21662) (Dunqing) - c92bd3b transformer: Use SPAN for synthesized helper calls to prevent comment misattribution (#21578) (Dunqing) - 0d80441 codegen: Add mapping before printing `#` for private ident (#21619) (camc314) ### ⚡ Performance - 9fa362e napi/parser: Do not generate tokens except in tests (#21811) (overlookmotel) - 0044392 allocator: Reduce branches when allocating new chunk (#21776) (overlookmotel) - 7896bd0 allocator: `Allocator::used_bytes` do not use chunk iterator (#21771) (overlookmotel) - a5c562f allocator: Remove check in `Arena::new_chunk_memory_details` (#21750) (overlookmotel) - 35bbe1f allocator: `Arena` use unchecked size round up where guaranteed no overflow (#21743) (overlookmotel) - ffe229b allocator: Remove unnecessary check from `Arena::try_alloc_layout_slow_impl` (#21732) (overlookmotel) - 72fece5 allocator: Use `NonNull::offset_from_unsigned` in `Arena::chunk_capacity` (#21731) (overlookmotel) - cab32ae ast: Add `#[inline(always)]` to `node_id` methods on enums with all variants unboxed (#21707) (overlookmotel) - b179688 parser: Allocate `TriviaBuilder` comments in the arena (#21512) (Boshen) - 2290f31 lexer: Fix perf of `Token::set_*` methods on Rust 1.95.0 (#21659) (overlookmotel) - 1b58029 allocator: Move code into cold path in `Arena::alloc_layout` (#21622) (overlookmotel) - 3cf7cef allocator: Reduce instructions on allocation hot path (#21510) (overlookmotel) ### 📚 Documentation - ce65070 data_structures: Document why `as_ref` and `as_mut` on `NonNullConst` and `NonNullMut` take `self` (#21800) (overlookmotel) - 93b7dbd allocator: Improve doc comments for `ChunkFooter` (#21733) (overlookmotel) - 295db8d transformer: Fix comment (#21717) (overlookmotel) - 5c93af8 ast: Add comments explaining `#[inline(always)]` to `node_id` methods on enums (#21706) (overlookmotel) - e4cea25 transform: Use the `node:` namespace in the example (#19998) (루밀LuMir) ### 🛡️ Security - d8076c9 deps: Update rolldown (#21639) (renovate) Co-authored-by: Boshen <1430279+Boshen@users.noreply.github.com>
not sure if this is the idomatic way to approach this.
We don't have a use for this at present, but it seems very useful to have