Reduce memory footprint of program ast by packing booleans into a bit field

### Feature Use Case

The ast nodes on the rollup side of things consume a lot of memory for large projects. These are generally speaking already fairly small objects, but in many cases the sheer volume of nodes means that small reductions in object size for commonly appearing nodes can have larger impacts.

For example, on the project I'm currently building, there are 1,309,167 Identifier nodes that are each 128 bytes. Resulting in ~160 MiB of memory for Identifier objects alone (excluding the data they retain through references). Each boolean field on an object currently uses 4 bytes of memory for 1 bit of information.

Looking at the shape of the Identifier node, there are a 3 booleans (included, deoptimized, and isTDZAccess) so we can save 8 bytes (12 - 4) per instance by packing the boolean values into a single 4 byte number field. (Side note: isTDZAccess requires 2-bits of data as it currently is `boolean|null` to allow us to cache the result if it is currently null).

<img width="560" alt="image" src="https://github.com/rollup/rollup/assets/17907922/7fa63886-5a6b-4f31-8368-ef964883f455">

Two of those boolean values exist in the `NodeBase` and `ExpressionEntity` base classes, so this optimization reduces the size of all node classes by 4 bytes at a minimum + 4 bytes for every boolean of their own that gets packed into the field.

Additionally, there are a few other things we can do like only setting esTreeNode if it needs to be retained (most of the time it doesn't), or providing `this.context` through an accessor that that returns `return this._context ?? this.parent.context` that would allow the value to be overridden by node subclasses, but for the default case would save another 4 bytes for each of these two fields.

**I have tested this approach by modifying the rollup js output in node_modules directly, and this appears to have saved about 120mb of AST node data in total according to the heap snapshots.**

After this optimization, Identifier takes 96 bytes per instance vs 128 (32 byte reduction resulting in 40 MiB less heap data)
<img width="551" alt="image" src="https://github.com/rollup/rollup/assets/17907922/9e68d1fd-307e-4012-be77-45a8c827e45f">

This initial investigation only touched `ExpressionEntity`, `NodeBase`, `Identifier`, and `MemberExpression`, but as mentioned previously, `ExpressionEntity` and `NodeBase` are base classes for lots of other node types.

### Feature Proposal

The way I implemented this test was to add a number field `flags` to `ExpressionEntity` along with a `getFlag`/`setFlag` helper function that takes a mask and a value.

Each boolean field/member on the class is then replaced with a getter/setter pair that calls into that helper function using an enum that defines the mask values
```ts
class ExpressionEntity {
    protected flags: number = 0;
    protected getFlag(mask: number): boolean {
        return (this.flags & mask) !== 0
    }
    protected setFlag(mask: number, value: boolean): void {
        const original = this.flags;
        this.flags = value ? (original & ~mask) : (original | mask);
    }
    
    protected get included(): boolean {
        return this._getFlag(FLAG_INCLUDED);
    }
    protected set included(value: boolean): void {
        this._setFlag(FLAG_INCLUDED, value);
    }
    
    // ... more code
}
```

classes that then derive from `ExpressionEntity` then just need to provide those accessors instead of defining fields on the class. Its a little more verbose, but not too bad:

This:
```ts
class NodeBase extends ExpressionEntity {
	protected deoptimized = false;
}
```

becomes this:
```ts
class NodeBase extends ExpressionEntity {
	protected get deoptimized(): boolean {
            return this.getFlag(FLAG_DEOPTIMIZED);
        }
	protected get deoptimized(value: boolean) {
            this.setFlag(FLAG_DEOPTIMIZED, value);
        }

	constructor() {
            // if the previous initializer was !== false then we need to initialize in the constructor,
            // otherwise we can omit this:
            // this.deoptimized = true; 
        }
}
```

In my test project, I only wound up using 8 bits for the mask, but numbers still take up the full 32 bits. I don't anticipate needing more than 32 bits for this process as there aren't that many more booleans, but technically speaking the sub-type bits can overlap between node types if we need to. I'd prefer to avoid that if we can just because it leads to more confusion.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reduce memory footprint of program ast by packing booleans into a bit field #5127

Feature Use Case

Feature Proposal

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Reduce memory footprint of program ast by packing booleans into a bit field #5127

Description

Feature Use Case

Feature Proposal

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions