Optimize `Semantic`

Possible optimizations to `Semantic`. I'll add to this list as I learn more about how it works:

* Optimize structs-of-arrays (#11, oxc-project/oxc#4269)
* Store Semantic data in arena (#31)
* Re-design `ScopeFlags` (#16)
* `unresolved_references` doesn't need to be stored in `ScopeTree`; only a single `Vec` for `root_unresolved_references` is needed. `unresolved_references` is used internally within `Semantic` while resolving references, but at the end it holds an empty `Vec` for every scope except the root scope. Store it in `SemanticBuilder` instead and discard it at the end. See https://github.com/oxc-project/oxc/pull/4107
* `unresolved_references` does not need to retain entries for every scope. Could be a stack where we reuse hash maps from previous scopes (see https://github.com/oxc-project/oxc/pull/4107#issuecomment-2214167393).
* `unresolved_references` could be a linked list / chunked linked list instead of a `Vec`. I don't think it's ever indexed into.
* Reduce hashing when resolving references. If the current `unresolved_references` hash map already stores hashes, there is no need to hash each identifier again when finding its entry in the parent hash map. (Maybe the hash map *doesn't* store hashes; I think SwissTable-style hash maps don't. In that case we could store hashes in the entries.)
* Store binding names as `Atom<'a>` not `CompactStr`. Conversion to `CompactStr` causes unnecessary allocations. We can just reference strings in source text (as `Atom` does).
* The `Reference::name` field is unnecessary for bound references, as the name can be obtained from the `SymbolId`. It is needed for unbound references, but those could be stored elsewhere and referenced.
* Add a scope for "global" which would contain unbound references (replacing root `unresolved_references`)? Then every reference has a `SymbolId`.
* Initialize `Semantic`'s `Vec`s with sufficient capacity so they don't need to grow (see oxc-project/backlog#31).
* Get rid of `Reference`.
  * Store `SymbolId` instead of `ReferenceId` in `IdentifierReference`.
  * Store a reference count for each symbol so you can check if a symbol is referenced or not.
  * Need some way in semantic to update `SymbolId` for `IdentifierReference`s long after exiting the node. Would need a pointer-based solution, or `Cell<SymbolId>`.
  * The only thing we lose is the `AstNodeId` in `Reference`. This is probably used in the linter, but I don't know what for, or how easy it would be to replace.
  * This would allow removing the `resolved_references: IndexVec<SymbolId, Vec<ReferenceId>>` field in `SymbolTable`, which is a major source of reallocation in semantic: a `Vec<ReferenceId>` is pushed to every time an `IdentifierReference` is found, and it has an inherently unpredictable growth pattern (we can't know in advance how big it needs to be).
* Don't use `Nodes` within `SemanticBuilder`. We can get the type of the parent node etc. by maintaining stacks in the visitor. This would enable a cut-down version of `SemanticBuilder` which doesn't build `Nodes`.
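
To make the "stack of reused hash maps" idea for `unresolved_references` more concrete, here is a minimal sketch. It is not oxc's actual code: `UnresolvedStack`, its methods, and the `ReferenceId` alias are hypothetical names invented for illustration, and bindings are passed in as a plain slice of names rather than looked up in `ScopeTree`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for oxc's `ReferenceId` (assumption, not the real type).
type ReferenceId = u32;

/// One map of unresolved references per *active* scope depth.
/// Maps above `depth` stay allocated (but empty) so later scopes can reuse them.
struct UnresolvedStack<'a> {
    maps: Vec<HashMap<&'a str, Vec<ReferenceId>>>,
    /// Index of the current scope's map; 0 is the root scope.
    depth: usize,
}

impl<'a> UnresolvedStack<'a> {
    fn new() -> Self {
        Self { maps: vec![HashMap::new()], depth: 0 }
    }

    /// Enter a child scope, allocating a new map only if none is free to reuse.
    fn enter_scope(&mut self) {
        self.depth += 1;
        if self.depth == self.maps.len() {
            self.maps.push(HashMap::new());
        }
    }

    /// Record a reference seen in the current scope.
    fn add_reference(&mut self, name: &'a str, id: ReferenceId) {
        self.maps[self.depth].entry(name).or_default().push(id);
    }

    /// Leave the current scope. References whose name is in `bindings` are
    /// returned as resolved; the rest are moved into the parent scope's map.
    fn leave_scope(&mut self, bindings: &[&str]) -> Vec<(&'a str, ReferenceId)> {
        assert!(self.depth > 0, "cannot leave the root scope");
        let (parents, rest) = self.maps.split_at_mut(self.depth);
        let parent = &mut parents[self.depth - 1];
        let current = &mut rest[0];
        let mut resolved = Vec::new();
        // `drain` empties the map but keeps its allocation for the next scope.
        for (name, ids) in current.drain() {
            if bindings.iter().any(|b| *b == name) {
                resolved.extend(ids.into_iter().map(|id| (name, id)));
            } else {
                parent.entry(name).or_default().extend(ids);
            }
        }
        self.depth -= 1;
        resolved
    }

    /// What remains at the end: the root scope's unresolved references.
    fn root_unresolved(&self) -> &HashMap<&'a str, Vec<ReferenceId>> {
        &self.maps[0]
    }

    /// Number of maps ever allocated, i.e. the maximum scope depth reached plus one.
    fn allocated_maps(&self) -> usize {
        self.maps.len()
    }
}
```

With this shape, the number of hash maps allocated is bounded by the maximum nesting depth rather than by the total number of scopes, and only the root map needs to survive once building is finished.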
