Rust & Typescript Cursors for the CST#540
Rust & Typescript Cursors for the CST#540AntonyBlakey merged 1 commit intoNomicFoundation:mainfrom AntonyBlakey:AntonyBlakey/node-cursors
Conversation
🦋 Changeset detectedLatest commit: 22128d5 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
| Cursor::new(self.clone()) | ||
| } | ||
|
|
||
| pub fn is_token_kind(&self, kinds: &[TokenKind]) -> Option<&Rc<TokenNode>> { |
There was a problem hiding this comment.
For is_token_kind() and is_rule_kind(), I wonder why do they take multiple kinds, rather than only one? is it an optimization for NAPI? does that mean that we won't do that if it was a native TS API?
There was a problem hiding this comment.
That is for alignment with the Cursor API, where you can search for more than one at a time without having to do multiple searches over multiple Cursors. In TS it will be is_token_kind(TokenKind, ...).
Note that it also returns the matched TokenNode, meaning no match is required, and it functions therefore as a typeguard.
Note also the following is_token_matching which is also for alignment with the cursor API.
There was a problem hiding this comment.
I think is_XX() usually indicates returning booleans, a cheap check that can be chained in a boolean condition with other conditions. Thoughts about something like as_token()? which indicates a "cast" that can succeed/fail (thus returning an option).
| self.leaf.is_none() | ||
| } | ||
|
|
||
| pub fn node_or_none(&self) -> Option<Node> { |
There was a problem hiding this comment.
regarding node() and node_or_none(), I might be missing something, but I would have expected the rust idiomatic way is to return Option<Node> only/directly for things like this, and let the user expect()/unwrap() if needed? why the extra API?
There was a problem hiding this comment.
Because there are many cases where the client know that node isn't none e.g. after a successful go_to_*, and I wanted to not require .unwrap() there, and instead catch it in one place. Happy to change the method names if you have a suggestion?
There was a problem hiding this comment.
I think this is idiomatic to let the user make that decision, and to make it explicit at the call site (a strong direction in rust). I think similar to how a vector/map/set would offer get() -> Option<T> without (the often convenient) get_and_unwrap() -> T
There was a problem hiding this comment.
trait Index provides [] for this, and it is implemented on Vec, HashMap, and BTreeMap for example.
There was a problem hiding this comment.
But in any case, solved with an explicit completed state. Also allows the cursor to be reset
| .text_range() | ||
| } | ||
|
|
||
| pub fn path_rule_nodes(&self) -> Vec<Rc<RuleNode>> { |
There was a problem hiding this comment.
not sure I understand the use case for path_rule_nodes(). I see it is currently unused?
There was a problem hiding this comment.
So that you can access the parent RuleNodes when using a cursor to navigate the tree.
| } | ||
|
|
||
| #[napi(getter, ts_return_type = "RuleNode | TokenNode | null")] | ||
| pub fn node(&self, env: Env) -> Option<JsObject> { |
There was a problem hiding this comment.
I think I'm missing something. This would convert it to a plain JS object, however, us typing it with RuleNode | TokenNode means that it will have the usual APIs on it, like the ones that need round tripping to Rust. How does that work?
this is related to other to_js() uses as well.
There was a problem hiding this comment.
In this case we have to return JsObject because the result is polymorphic. And fundamentally, JsObject is what is passed even if the Rust type were explicit.
There was a problem hiding this comment.
So on the receiver side, working in NodeJS land, they would get a plain JS object.
It wouldn't have the native Rust APIs? not sure how would calling its methods would round trip.
... I think I will wait for this to be ready and play with it for a bit to understand more.
| fn rule_exit( | ||
| &mut self, | ||
| node: &Rc<RuleNode>, | ||
| cursor: &Cursor, |
There was a problem hiding this comment.
TOL, I wonder if we should be exposing the cursor to the visitor API, even if it cannot be used/mutated by the caller? what is the use case here?
There was a problem hiding this comment.
For two reasons
- You can start a
Visitorat anyCursorposition. It is nice in theVisitorto be able to get to theCursorand do some more internal iteration. Apart from being convenient, it makes internal and external iterators nicely symmetric. - Regardless of the symmetry, it is convenient in the visitor to be able to access things like the
text_rangefrom theCursor, and also thepath_rule_nodes. As theCursorAPI is extended, theVisitorimplementations can use those facilities.
| }, | ||
| }; | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
I suggest splitting into three tests for readability.
There was a problem hiding this comment.
I thought about that, but because the 'test' nature of this is not actually for testing, but just to validate that the example is ok, I decided that the clarity I could gain by simply repeating a section of code with exactly the same result but using different API patterns was worth having the repeated functionality.
There was a problem hiding this comment.
I think it is also for "testing", since the Rust test infra is the same in all projects, this is a quick way for someone to just copy a test locally, and edit/run it for experimenting.
It also indicates to readers that these are separate, instead of having side effects on later statements.
crates/solidity/outputs/cargo/tests/src/doc_examples/cursor_api.rs
Outdated
Show resolved
Hide resolved
OmarTawfik
left a comment
There was a problem hiding this comment.
Left a few suggestions/concerns.
| return Ok(VisitorEntryResponse::StepIn); | ||
| if node.kind == RuleKind::ContractDefinition { | ||
| if let Node::Token(token) = &node.children[2] { | ||
| assert_eq!(token.kind, TokenKind::Identifier); |
There was a problem hiding this comment.
nit: ensure!() to be consistent with bail!() instead of panicking?
| ) -> Result<VisitorEntryResponse> { | ||
| if node.kind != RuleKind::ContractDefinition { | ||
| return Ok(VisitorEntryResponse::StepIn); | ||
| if node.kind == RuleKind::ContractDefinition { |
There was a problem hiding this comment.
nit: I suggest quitting early when possible (like the existing visitor) for readability, and to decrease nesting. But not a blocker at all.
if !interesting {
return;
}
do_something();rather than:
if interesting {
do_something();
}
return;There was a problem hiding this comment.
As a functional programming fan, I prefer to have a single control flow, and therefore not use guards as often.
| goToNthChild(childNumber: number): boolean; | ||
| goToNextSibling(): boolean; | ||
| goToPreviousSibling(): boolean; | ||
| findRuleKind(kinds: Array<RuleKind>): RuleNode | null; |
There was a problem hiding this comment.
TOL: thoughts about unifying findXXXX() and findXXXXWithoutRecursing() to a default argument findXXXX(recursive = true)?
There was a problem hiding this comment.
I wouldn't want to add a recurse param to find* if the use cases were not equaly distributed, which I suspect they wouldn't be.
But in any case I've removed the *WithoutRecursing forms because they cannot work - the movement to the next has to happen between calls to those methods.
OmarTawfik
left a comment
There was a problem hiding this comment.
I've a few open questions, but will wait until the docs are ready to have a better understanding of the proposed API before continuing.
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and publish to npm yourself or [setup this action to publish automatically](https://github.com/changesets/action#with-publishing). If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @nomicfoundation/slang@0.9.0 ### Minor Changes - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add a Rust Cursor API and refactor the Rust Visitor API to run on top of it. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Move Visitor et al to node:: namespace, which is where Cursor is. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Rename `range` functions that return a TextRange to `text_range` ### Patch Changes - [#543](#543) [`7a34599`](7a34599) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Move `syntax::parser::ProductionKind` to `syntax::nodes` namespace. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add TokenNode.text to the TS API. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add first pass of Typescript binding to the Cursor API, but no TS Visitor yet. - [#545](#545) [`e73658a`](e73658a) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - render EBNF grammar on top of each `ProductionKind`, `RuleKind`, and `TokenKind`. - [#558](#558) [`95bbc50`](95bbc50) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Correct versioning for `SourceUnitMember` and `ContractMember` children. ## changelog@0.9.0 ### Minor Changes - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add a Rust Cursor API and refactor the Rust Visitor API to run on top of it. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Move Visitor et al to node:: namespace, which is where Cursor is. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Rename `range` functions that return a TextRange to `text_range` ### Patch Changes - [#543](#543) [`7a34599`](7a34599) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Move `syntax::parser::ProductionKind` to `syntax::nodes` namespace. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add TokenNode.text to the TS API. - [#540](#540) [`0d04f95`](0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add first pass of Typescript binding to the Cursor API, but no TS Visitor yet. - [#545](#545) [`e73658a`](e73658a) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - render EBNF grammar on top of each `ProductionKind`, `RuleKind`, and `TokenKind`. - [#558](#558) [`95bbc50`](95bbc50) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Correct versioning for `SourceUnitMember` and `ContractMember` children. Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and publish to npm yourself or [setup this action to publish automatically](https://github.com/changesets/action#with-publishing). If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated. # Releases ## @nomicfoundation/slang@0.9.0 ### Minor Changes - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add a Rust Cursor API and refactor the Rust Visitor API to run on top of it. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Move Visitor et al to node:: namespace, which is where Cursor is. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Rename `range` functions that return a TextRange to `text_range` ### Patch Changes - [#543](NomicFoundation/slang#543) [`7a34599`](NomicFoundation/slang@7a34599) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Move `syntax::parser::ProductionKind` to `syntax::nodes` namespace. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add TokenNode.text to the TS API. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add first pass of Typescript binding to the Cursor API, but no TS Visitor yet. - [#545](NomicFoundation/slang#545) [`e73658a`](NomicFoundation/slang@e73658a) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - render EBNF grammar on top of each `ProductionKind`, `RuleKind`, and `TokenKind`. - [#558](NomicFoundation/slang#558) [`95bbc50`](NomicFoundation/slang@95bbc50) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Correct versioning for `SourceUnitMember` and `ContractMember` children. ## changelog@0.9.0 ### Minor Changes - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add a Rust Cursor API and refactor the Rust Visitor API to run on top of it. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Move Visitor et al to node:: namespace, which is where Cursor is. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Rename `range` functions that return a TextRange to `text_range` ### Patch Changes - [#543](NomicFoundation/slang#543) [`7a34599`](NomicFoundation/slang@7a34599) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Move `syntax::parser::ProductionKind` to `syntax::nodes` namespace. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add TokenNode.text to the TS API. - [#540](NomicFoundation/slang#540) [`0d04f95`](NomicFoundation/slang@0d04f95) Thanks [@AntonyBlakey](https://github.com/AntonyBlakey)! - Add first pass of Typescript binding to the Cursor API, but no TS Visitor yet. - [#545](NomicFoundation/slang#545) [`e73658a`](NomicFoundation/slang@e73658a) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - render EBNF grammar on top of each `ProductionKind`, `RuleKind`, and `TokenKind`. - [#558](NomicFoundation/slang#558) [`95bbc50`](NomicFoundation/slang@95bbc50) Thanks [@OmarTawfik](https://github.com/OmarTawfik)! - Correct versioning for `SourceUnitMember` and `ContractMember` children. Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Does not include TS Visitor - waiting for Omar's TS/Rust codegen unification PR to land before completing this task.