Summary
Provide a means to customize the AST so that additional data can be attached to each node.
Detailed Explanation
Background: I'm developing a compiler for a subset of Python and plan to use the RustPython parser to get the AST. The project requires type inference and other processing on the AST, so it would be helpful if we could attach data to each node.
I think allowing users to store custom data in AST nodes would also help other developers use the crate for other purposes, such as static analysis, or enable better optimization in the RustPython runtime by recovering more type information (not sure if this is feasible though, I am not familiar with the project).
I think it would be possible to provide an additional field using a generic parameter with a default type. For example:
    pub struct Located<T, U = ()> {
        pub location: Location,
        pub node: T,
        pub custom: Option<U>,
    }
The U parameter would then be added to every reference to Located (Expression, Statement, ... basically all structs in the AST and the parser methods).
With the default parameter, I think existing code would not break, since a bare Expression would still resolve to Expression<()>.
It may, however, break pattern matching on these structs, but fixing that only requires users to add .. to the pattern, so it should be simple.
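To make the idea concrete, here is a minimal sketch of how the default parameter and the `..` pattern could interact. The `Location` fields, the `ExpressionType` enum, and the `describe` and `annotate` helpers are hypothetical stand-ins written for this example, not the parser's actual definitions.

```rust
// Hypothetical stand-ins for the parser's types; names and fields are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Location {
    pub row: usize,
    pub column: usize,
}

#[derive(Debug, PartialEq)]
pub struct Located<T, U = ()> {
    pub location: Location,
    pub node: T,
    pub custom: Option<U>,
}

#[derive(Debug, PartialEq)]
pub enum ExpressionType {
    Number(i64),
    Identifier(String),
}

// Omitting `U` selects the default, so a bare `Expression` means `Expression<()>`.
pub type Expression<U = ()> = Located<ExpressionType, U>;

// Existing-style code: no mention of `U`, and `..` skips the new `custom` field.
pub fn describe(expr: &Expression) -> String {
    match expr {
        Located { node: ExpressionType::Number(n), .. } => format!("number {}", n),
        Located { node: ExpressionType::Identifier(name), .. } => format!("identifier {}", name),
    }
}

// New-style code: rebuild the node with an inferred type name attached.
pub fn annotate(expr: Expression, ty: &str) -> Expression<String> {
    Located {
        location: expr.location,
        node: expr.node,
        custom: Some(ty.to_string()),
    }
}

fn main() {
    let expr: Expression = Located {
        location: Location { row: 1, column: 0 },
        node: ExpressionType::Number(42),
        custom: None,
    };
    println!("{}", describe(&expr));
    let typed = annotate(expr, "int");
    println!("{:?}", typed.custom);
}
```

Code that constructs a node with a struct literal would still need to supply the new `custom` field, so construction sites are another place where existing code might need small changes.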
Drawbacks, Rationale, and Alternatives
Drawbacks:
- May break pattern matching in existing code.
- The default type U = () would increase the size of the Located struct, because Option<()> still needs space for its discriminant. One way to prevent the problem would be to use the never type !, but this requires nightly. The following code prints 16 and 8 in the Rust playground.
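    #![feature(never_type)] // nightly only: allows `!` as a type argument

    #[allow(dead_code)]
    struct Test<U = ()> {
        a: usize,
        b: Option<U>,
    }

    fn main() {
        // Default U = (): Option<()> needs a discriminant, which is padded to usize alignment.
        println!("{}", std::mem::size_of::<Test>());
        // U = !: Option<!> can only ever be None, so it takes no space.
        println!("{}", std::mem::size_of::<Test<!>>());
    }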
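For comparison, the same measurement can be sketched on stable Rust. This is only an illustration under two assumptions that are not part of the proposal above: that `std::convert::Infallible` (an uninhabited enum) is an acceptable stand-in for `!`, and that storing `U` directly instead of `Option<U>` would be an acceptable design. `WithOption` and `WithPlain` are hypothetical names, and the compiler does not formally guarantee these layouts.

```rust
use std::convert::Infallible;
use std::mem::size_of;

// Same shape as the `Test` struct above: the extra field is an `Option<U>`.
#[allow(dead_code)]
struct WithOption<U = ()> {
    a: usize,
    b: Option<U>,
}

// Hypothetical variant: store `U` directly, so the default `()` is zero-sized.
#[allow(dead_code)]
struct WithPlain<U = ()> {
    a: usize,
    b: U,
}

fn main() {
    // Default `U = ()`: the Option discriminant costs space after padding.
    println!("{}", size_of::<WithOption>());
    // Uninhabited `U`: the Option can only be `None`.
    println!("{}", size_of::<WithOption<Infallible>>());
    // Plain `U = ()`: the extra field is zero-sized.
    println!("{}", size_of::<WithPlain>());
}
```

With a plain `U` field, users who want optional data could still pick `U = Option<X>` themselves, at the cost of always having to provide a value for the field.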
An alternative would be simply not to support the additional field and to require users to store the data elsewhere, for example in a HashMap or a parallel tree. That would be harder to use and would likely perform worse.
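A rough sketch of the side-table alternative shows what users would have to maintain themselves. Keying by location is an assumption made for this example, and `Location` and `Annotations` are hypothetical types, not part of the parser.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the parser's source position type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Location {
    pub row: usize,
    pub column: usize,
}

// Side table mapping a node's location to user data (here, an inferred type name).
#[derive(Default)]
pub struct Annotations {
    by_location: HashMap<Location, String>,
}

impl Annotations {
    // Record (or overwrite) the annotation for the node at `at`.
    pub fn set(&mut self, at: Location, ty: &str) {
        self.by_location.insert(at, ty.to_string());
    }

    // Look up the annotation for the node at `at`, if any.
    pub fn get(&self, at: Location) -> Option<&str> {
        self.by_location.get(&at).map(String::as_str)
    }
}

fn main() {
    let mut annotations = Annotations::default();
    let at = Location { row: 3, column: 4 };
    annotations.set(at, "int");
    println!("{:?}", annotations.get(at));
    println!("{:?}", annotations.get(Location { row: 9, column: 0 }));
}
```

Every access goes through a hash lookup, and two nodes that start at the same position would collide on the same key, so a real version would probably need a unique id per node. Both points illustrate why this route is less convenient than a field on the node itself.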
Unresolved Questions