feat: ast-grep Parser Abstratction (Part 1, Node)#1940
feat: ast-grep Parser Abstratction (Part 1, Node)#1940HerringtonDarkholme merged 13 commits intomainfrom
Conversation
WalkthroughThis set of changes refactors the language abstraction system by splitting the original Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI/Config
participant Pattern/Rule/Fixer
participant CoreLanguage
participant Language
User->>CLI/Config: Provides language, patterns, rules, fixers
CLI/Config->>Pattern/Rule/Fixer: Instantiates without language generics
Pattern/Rule/Fixer->>CoreLanguage: Calls core parsing methods (e.g., meta_var_char)
Pattern/Rule/Fixer->>Language: Calls advanced methods (e.g., get_ts_language)
CLI/Config->>User: Returns results (matches, rewrites, etc.)
Possibly related PRs
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
b77f4d1 to
ac00a62
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1940 +/- ##
==========================================
- Coverage 87.14% 87.04% -0.10%
==========================================
Files 99 98 -1
Lines 15884 15749 -135
==========================================
- Hits 13842 13709 -133
+ Misses 2042 2040 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🔭 Outside diff range comments (6)
crates/config/src/rule_config.rs (2)
201-205:⚠️ Potential issue
map_errconverts to the wrong error type – compilation will fail
Fixer::parsereturns a result with its own error type (likelyFixerError).
The current call:let parsed = Fixer::parse(fix, &env, &self.transform).map_err(RuleCoreError::Fixer)?;changes that error into
RuleCoreError, but the?operator subsequently tries to bubble up aRuleCoreErroreven thoughget_fixeris declared to returnResult<_, RuleConfigError>. This is a type mismatch and will not compile.Proposed fix:
- let parsed = Fixer::parse(fix, &env, &self.transform).map_err(RuleCoreError::Fixer)?; + let parsed = Fixer::parse(fix, &env, &self.transform) + .map_err(|e| RuleConfigError::Core(RuleCoreError::Fixer(e)))?;This keeps the conversion localized and preserves the declared return type.
131-149: 🛠️ Refactor suggestion
insert_rewriterpanics on duplicate IDs instead of reporting an error
reg.insert_rewriter(&val.id, rewriter).expect("should work");If a user accidentally re‑uses a rewriter ID, the code will panic, aborting the whole run.
Because you already have a dedicatedRuleConfigError::DuplicateRule(viaReferentRuleError), consider propagating that error instead of crashing:- reg.insert_rewriter(&val.id, rewriter).expect("should work"); + reg + .insert_rewriter(&val.id, rewriter) + .map_err(|e| RuleConfigError::Rewriter(e, val.id.clone()))?;This makes the behaviour consistent with the other
insert_*helpers and avoids unexpected panics during configuration loading.crates/config/src/rule/referent_rule.rs (1)
31-45:⚠️ Potential issueUnsafe interior‑mutability on shared
Arc<HashMap>risks data races
Registration::writetransmutes an&Arc<HashMap<…>>into a mutable&mut HashMap<…>via raw pointer cast.
With the new publicGlobalRules::insertfunction, external callers can now mutate aGlobalRulesinstance after it has been shared, opening the door to UB if the map is accessed concurrently elsewhere.Consider replacing this hand‑rolled pattern with a safe concurrent map (
dashmap,parking_lot::RwLock<HashMap<…>>, orstd::sync::RwLock) to keep mutation fully synchronized and sound.crates/core/src/ops.rs (1)
209-218:⚠️ Potential issueAvoid mutating the caller’s environment inside
Not— clone first
Not::match_node_with_envreceivesenvby mutable reference and passes it directly to the inner matcher.
If the inner matcher succeeds, you later returnNone(because of the XOR), yet all metavariable bindings that were added by the inner matcher remain inenv.
These stale bindings “leak” outwards and can pollute subsequent matcher evaluation (e.g. whenNotis used insideAll,Op::every, etc.).- self - .not - .match_node_with_env(node.clone(), env) - .xor(Some(node)) + // work on a copy so that any bindings produced by `self.not` + // are discarded when it matches. + let mut shadow_env = Cow::Borrowed(env.as_ref()); + let matched = self.not.match_node_with_env(node.clone(), &mut shadow_env).is_some(); + if matched { + None + } else { + Some(node) + }This preserves the intended boolean semantics while guaranteeing that
envis only mutated when theNotpattern fails (i.e. when we actually returnSome(node)).crates/config/src/check_var.rs (1)
147-149:⚠️ Potential issueIncorrect reference level leads to mismatched‐type compilation error
varis already an&str. Passing&vartherefore hands a&&strtoHashSet<&str>::contains, which expects&str. This currently compiles only thanks to an (unstable) blanketBorrowimplementation and can easily break, or trigger confusing error messages if trait bounds change.- if !vars.contains(&var) { + if !vars.contains(var) {A direct reference keeps the type signatures clean and avoids double indirection.
crates/config/src/rule_core.rs (1)
187-192:⚠️ Potential issue
get_envcan fabricate an environment for any language — risks runtime incoherenceAfter dropping the generic parameter from
RuleCore, the methodpub fn get_env<L: Language>(&self, lang: L) -> DeserializeEnv<L>allows callers to request a
DeserializeEnvfor anyLanguage, even if the
RuleCore, itsRule, the registered utils/rewriters, and theFixer
instances were all built for a different language.That silently breaks the compile‑time guarantee we previously had and can
produce undefined matcher behaviour or panics at runtime when, for example,
Tree‑sitter kind‑ids no longer match.Recommended approaches:
- Re‑introduce a language “marker” inside
RuleCore, e.g.phantom: std::marker::PhantomData<L>and keep
RuleCoregeneric overL, or
- Store the original language’s
LanguageId(or equivalent) at construction
time and assert equality insideget_env, returning an error otherwise.Either restores soundness and prevents cross‑language misuse.
🧹 Nitpick comments (14)
crates/cli/src/run.rs (1)
115-128: Unnecessary allocation: avoid.clone()by makingStrictnessCopy
MatchStrictnessis anenumand therefore alreadyCopy.
If we deriveCopyfor the wrapperStrictness, the extra.clone()(heap‑free but still superfluous) disappears and the intent becomes clearer.- fn build_pattern(&self, lang: SgLang) -> Result<Pattern> { ... - if let Some(strictness) = &self.strictness { - Ok(pattern.with_strictness(strictness.0.clone())) + fn build_pattern(&self, lang: SgLang) -> Result<Pattern> { ... + if let Some(strictness) = &self.strictness { + Ok(pattern.with_strictness(strictness.0))Add the derive:
#[derive(Clone)] -struct Strictness(MatchStrictness); +#[derive(Clone, Copy)] +struct Strictness(MatchStrictness);Tiny win, but eliminates an avoidable method call every time the CLI builds a pattern.
crates/config/src/rule/stop_by.rs (1)
121-148: Public method lacks documentation & example usage
StopBy::findis a central control‑flow utility (“monster method” per the TODO) yet still undocumented after this refactor.
Readers now have to reverse‑engineer:
- expected invariants for
once/multi/finder- how
StopByvariants change traversal semantics- lifetime constraints on
'tPlease consider adding a doc‑comment with a short example and move the inline TODO to an issue tracker to avoid bit‑rot.
crates/config/src/rule/deserialize_env.rs (1)
175-189: Minor performance tweak: reuseDeserializeEnvinstead of creating one per ruleInside
parse_global_utilsa freshDeserializeEnvis built for every utility rule even though all rules for a givenlangshare the sameGlobalRules. Constructing it once and re‑using it avoids repeatedRuleRegistration::from_globalsallocations:- for id in order { - let (lang, core) = utils.get(id).expect("must exist"); - let env = DeserializeEnv::new(lang.clone()).with_globals(®istration); + for (lang, group) in &utils_per_language { + let env = DeserializeEnv::new(lang.clone()).with_globals(®istration); + for (id, core) in group { let matcher = core.get_matcher_with_hint(env, CheckHint::Global)?; registration .insert(id, matcher) .map_err(RuleSerializeError::MatchesReference)?; } }Not critical, but could save noticeable CPU on large rule‑sets.
crates/core/src/matcher.rs (2)
76-90: Cache thePatternto avoid recompiling on every callEvery invocation of
match_node_with_env/get_match_lenon an&strcauses a freshPattern::strallocation and language‑clone.
If the same pattern is used repeatedly (a common scenario), this churn can become noticeable.- let pattern = Pattern::str(self, node.lang().clone()); + // cache the compiled pattern once per thread + thread_local! { + static COMPILED: std::cell::RefCell< + std::collections::HashMap<(String, std::any::TypeId), Pattern> + > = const { /* ... */ }; + } + let pattern = get_or_insert_compiled(self, node.lang());A simple
once_cell::sync::Lazyor thread‑local hashmap keyed by(pattern_string, lang_type_id)would amortise the cost while keeping the public API intact.
131-139: Computepotential_kindsonce inFindAllNodes
self.matcher.potential_kinds()is evaluated on everynext()call, although the result is immutable for the lifetime of the iterator.pub struct FindAllNodes<'tree, D: Doc, M: Matcher> { dfs: Pre<'tree, D>, matcher: M, + kinds: Option<BitSet>, } impl<'tree, D: Doc, M: Matcher> FindAllNodes<'tree, D, M> { pub fn new(matcher: M, node: Node<'tree, D>) -> Self { - Self { - dfs: node.dfs(), - matcher, - } + let kinds = matcher.potential_kinds(); + Self { dfs: node.dfs(), matcher, kinds } } }This micro‑optimisation reduces repeated allocations / calculations during large searches.
crates/config/src/rule/mod.rs (1)
259-275: RedundantOk()wrapper inverify_util
verify_utilalready returnsResult<(), _>. Wrapping it in anotherOk(..)adds an unnecessary layer.- Rule::Matches(r) => Ok(r.verify_util()?), + Rule::Matches(r) => r.verify_util(),Same behaviour, cleaner code.
crates/config/src/rule_config.rs (1)
118-127: Minor: redundant clone ofDeserializeEnv
self.core.get_matcher(env.clone())?consumesenv.clone()while the originalenvis not reused later.
DeserializeEnvis cheap, but you can remove the extra allocation:- let rule = self.core.get_matcher(env.clone())?; + let rule = self.core.get_matcher(env)?;This also removes the need for
envto implementClonehere.crates/config/src/rule/relational_rule.rs (1)
184-195:try_newstill generic overLeven though the struct is language‑agnostic
Precedes::try_new,Follows::try_new,Has::try_new, andInside::try_newnow only useDeserializeEnv<L>to deserialize the inner rule. The surrounding refactor removes language generics elsewhere; you could finish the clean‑up by changing the signature to accept&DeserializeEnvwithout<L>and erase the type parameter entirely:- pub fn try_new<L: Language>(relation: Relation, env: &DeserializeEnv<L>) + pub fn try_new(relation: Relation, env: &DeserializeEnv)This keeps the public API surface minimal and consistent with other de‑genericised types.
crates/core/src/ops.rs (1)
128-150: Minor: simplify theAnysearch loopInside the
Anymatcher you recreatenew_envfor every iteration.
Instead, allocate it once and clone only when a candidate pattern matches; this avoids a needlessCowre‑borrow on every miss.- let mut new_env = Cow::Borrowed(env.as_ref()); - let found = self.patterns.iter().find_map(|p| { - new_env = Cow::Borrowed(env.as_ref()); - p.match_node_with_env(node.clone(), &mut new_env) - }); + let mut scratch = Cow::Borrowed(env.as_ref()); + let found = self.patterns.iter().find_map(|p| { + scratch.to_mut().clear(); // restore to original each round + p.match_node_with_env(node.clone(), &mut scratch) + });Not critical, but this gives a tiny perf win and makes the control‑flow clearer.
crates/config/src/rule/nth_child.rs (1)
206-230: Avoid allocating a fullVecwhen locating the index
find_indexcurrently builds aVecof all siblings only to call.position().
This creates unnecessary heap traffic when traversing large ASTs.- let mut children: Vec<_> = if let Some(rule) = &self.of_rule { - /* … */ - } else { - /* … */ - }; - if self.reverse { - children.reverse() - } - children - .iter() - .position(|child| child.node_id() == node.node_id()) + let iter = parent.children().filter(|n| n.is_named()); + let iter = if let Some(rule) = &self.of_rule { + iter.filter(|child| rule.match_node_with_env(*child, env).is_some()) + } else { + iter + }; + let iter: Box<dyn Iterator<Item = _>> = if self.reverse { + Box::new(iter.collect::<Vec<_>>().into_iter().rev()) + } else { + Box::new(iter) + }; + iter.enumerate() + .find(|(_, child)| child.node_id() == node.node_id()) + .map(|(idx, _)| idx)This keeps the algorithm O(n) but removes the intermediate allocation (and a potential full reverse of the vector).
crates/config/src/fixer.rs (2)
139-150:matchershould be passed by reference to avoid an extra move
get_replaced_rangecurrently takesmatcher: impl Matcherby value.
Callers have to pass an owned matcher, causing an extra move (and possibly a clone if they want to reuse it).- fn get_replaced_range(&self, nm: &NodeMatch<D>, matcher: impl Matcher) -> Range<usize> { + fn get_replaced_range( + &self, + nm: &NodeMatch<D>, + matcher: &impl Matcher, + ) -> Range<usize> {All call‑sites inside the crate already have a reference handy, so this change is API‑compatible but slightly more ergonomic.
41-59: Remove the lingering generic onExpansion::parse
Expansionitself is now language‑agnostic, yetparsestill carries<L: Language>only to threadDeserializeEnv<L>.
SinceDeserializeEnvalready encapsulates the language, the extra type parameter is no longer needed and hurts readability.- fn parse<L: Language>( + fn parse( relation: &Maybe<Relation>, - env: &DeserializeEnv<L>, + env: &DeserializeEnv<impl Language>, ) -> Result<Option<Self>, FixerError> {Or simply accept a generic
E: ?Sizedif you want to stay completely agnostic.crates/config/src/check_var.rs (1)
21-28: Unnecessary double‑indirection onFixerargument
fixeris passed as&Option<Fixer>.
All call‑sites already own anOption<Fixer>that they borrow, so the extra&buys no benefit and forces every callee to peel two levels of indirection.Consider changing the signature (and the corresponding callee sites) to accept
Option<&Fixer>:-pub fn check_rule_with_hint<'r>( - ... - fixer: &Option<Fixer>, +pub fn check_rule_with_hint<'r>( + ... + fixer: Option<&'r Fixer>,This removes one layer of reference and makes intent ( “maybe there is a fixer, and if so borrow it” ) explicit.
crates/config/src/rule_core.rs (1)
140-148: DeriveDebug/Clonefor easier diagnosticsNow that
RuleCoreis non‑generic, deriving common traits no longer risks
excessive monomorphisation. Adding at least#[derive(Clone, Debug)]eases
unit‑test authoring and ad‑hoc logging.- pub struct RuleCore { + #[derive(Clone, Debug)] + pub struct RuleCore {Doing so has zero runtime cost and improves maintainability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (37)
crates/cli/src/config.rs(3 hunks)crates/cli/src/lang/injection.rs(1 hunks)crates/cli/src/print/interactive_print.rs(3 hunks)crates/cli/src/print/mod.rs(1 hunks)crates/cli/src/run.rs(4 hunks)crates/cli/src/utils/debug_query.rs(1 hunks)crates/cli/src/utils/mod.rs(2 hunks)crates/config/src/check_var.rs(7 hunks)crates/config/src/fixer.rs(8 hunks)crates/config/src/lib.rs(2 hunks)crates/config/src/rule/deserialize_env.rs(5 hunks)crates/config/src/rule/mod.rs(7 hunks)crates/config/src/rule/nth_child.rs(5 hunks)crates/config/src/rule/range.rs(5 hunks)crates/config/src/rule/referent_rule.rs(12 hunks)crates/config/src/rule/relational_rule.rs(9 hunks)crates/config/src/rule/stop_by.rs(5 hunks)crates/config/src/rule_config.rs(5 hunks)crates/config/src/rule_core.rs(12 hunks)crates/config/src/transform/mod.rs(2 hunks)crates/config/src/transform/rewrite.rs(5 hunks)crates/core/src/lib.rs(1 hunks)crates/core/src/match_tree/mod.rs(4 hunks)crates/core/src/matcher.rs(9 hunks)crates/core/src/matcher/kind.rs(3 hunks)crates/core/src/matcher/node_match.rs(1 hunks)crates/core/src/matcher/pattern.rs(12 hunks)crates/core/src/matcher/text.rs(2 hunks)crates/core/src/meta_var.rs(1 hunks)crates/core/src/node.rs(2 hunks)crates/core/src/ops.rs(13 hunks)crates/core/src/replacer.rs(1 hunks)crates/core/src/traversal.rs(3 hunks)crates/napi/src/doc.rs(1 hunks)crates/napi/src/find_files.rs(2 hunks)crates/pyo3/src/py_node.rs(2 hunks)xtask/src/schema.rs(0 hunks)
💤 Files with no reviewable changes (1)
- xtask/src/schema.rs
✅ Files skipped from review due to trivial changes (1)
- crates/config/src/transform/rewrite.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/config/src/lib.rs
🧰 Additional context used
🧬 Code Graph Analysis (7)
crates/core/src/replacer.rs (1)
crates/config/src/fixer.rs (1)
get_replaced_range(138-150)
crates/core/src/lib.rs (3)
crates/pyo3/src/py_node.rs (1)
replace(255-268)crates/core/src/node.rs (1)
replace(523-527)crates/napi/src/sg_node.rs (1)
replace(378-386)
crates/cli/src/print/interactive_print.rs (3)
crates/config/src/lib.rs (1)
from_str(26-29)crates/config/src/fixer.rs (1)
from_str(115-122)crates/napi/src/doc.rs (1)
from_str(163-165)
crates/cli/src/utils/mod.rs (3)
crates/pyo3/ast_grep_py/__init__.py (1)
Pattern(8-11)crates/core/src/node.rs (2)
lang(84-86)lang(246-248)crates/core/src/language.rs (1)
ast_grep(56-58)
crates/config/src/rule/deserialize_env.rs (1)
crates/config/src/rule/mod.rs (1)
deserialize_rule(370-390)
crates/core/src/match_tree/mod.rs (1)
crates/pyo3/ast_grep_py/__init__.py (1)
Pattern(8-11)
crates/config/src/rule_config.rs (2)
crates/config/src/rule_core.rs (3)
get_matcher(117-119)get_matcher(283-287)get_fixer(90-97)crates/pyo3/src/py_node.rs (1)
get_matcher(338-354)
🔇 Additional comments (52)
crates/cli/src/lang/injection.rs (1)
37-37: Simplification of type signature by removing language generic parameterThe change from
Vec<(RuleCore<SgLang>, Option<String>)>toVec<(RuleCore, Option<String>)>is appropriate and aligns with the broader refactoring effort to make core types language-agnostic. This simplification helps reduce the complexity of the type system while maintaining the same functionality.crates/core/src/matcher/node_match.rs (1)
55-55: Simplified trait bound by removing language-specific constraintThe change from
M: Matcher<D::Lang>toM: Matcheris a good simplification that makes the API more flexible. This relaxation of the trait bound allows the method to accept any matcher implementation without tying it to the specific document language type, which aligns with the overall refactoring goal.crates/napi/src/doc.rs (1)
32-32: Simplified return type by removing language generic parameterThe change from
NapiResult<RuleCore<NapiLang>>toNapiResult<RuleCore>is a good simplification that aligns with the broader refactoring effort to makeRuleCorelanguage-agnostic. The function body works correctly with this change since the language-specific information is now handled at a different level.crates/core/src/meta_var.rs (1)
112-112: Simplified trait bound by removing language-specific constraintThe change from
M: Matcher<D::Lang>toM: Matcherappropriately relaxes the trait bound on the generic parameter, allowing any matcher implementation to be used without requiring a specific language type. This change is consistent with the broader refactoring goal of separating language-specific concerns from core functionality.crates/cli/src/print/interactive_print.rs (3)
369-369: Simplified function signature by removing language generic parameter.The function signature for
make_diffshas been simplified to take a genericimpl Matcherrather thanimpl Matcher<SgLang>, aligning with the broader refactoring effort to remove explicit language generic parameters from the codebase. This makes the API more flexible and less coupled to specific language types.
417-418: Added explicit type annotation for Fixer::from_str.Now that the
Fixertype is no longer generic over language, thefrom_strmethod requires an explicit type parameter annotation to specify the language. This change is consistent with the refactoring of separating the core language abstractions.
431-432: Added explicit type annotation for Fixer::from_str.Similar to the previous instance, the language type parameter is now explicitly specified for
Fixer::from_str. This pattern is applied consistently throughout the codebase as part of the refactoring effort.crates/cli/src/utils/debug_query.rs (1)
20-20: Simplified method signature by making Pattern non-generic.The
debug_patternmethod now takes a non-generic&Patterninstead of&Pattern<SgLang>, aligning with the broader effort to simplify type signatures by removing language generics. The language parameter is still passed separately, maintaining the functionality while making the API more flexible.crates/cli/src/print/mod.rs (1)
74-74: Simplified method signature by removing language generic parameter.The
Diff::generatemethod now takes&impl Matcherand&Fixerwithout explicit language type parameters, consistent with the broader refactoring effort to decouple core abstractions from language-specific details. This change makes the code more maintainable and the API more flexible.crates/core/src/lib.rs (1)
52-52: Simplified method signature by removing language generic constraint.The
replacemethod onAstGrepnow only requiresM: Matcherwithout binding it to a specific language type. This is part of the broader refactoring that separates core language abstractions from tree-sitter-specific functionality, making the API more flexible and easier to extend.The change maintains the same functionality while reducing type complexity, aligning perfectly with the PR objectives of abstracting the parser implementation.
crates/core/src/traversal.rs (3)
77-77: Simplified matcher trait bound.The change removes the language-specific generic parameter from the
Matchertrait bound, making matchers language-agnostic. This aligns with the broader refactoring goal of separating core functionality from language-specific implementations.
101-101: Consistent trait bound update for Visit struct.Updated the
Matchertrait bound to be language-agnostic, matching the change in the Visitor::visit method. This ensures consistency throughout the codebase.
115-115: Iterator implementation updated with simplified trait bound.Continued the pattern of removing language specificity from the
Matchertrait bound in theIteratorimplementation forVisit. This completes the decoupling of traversal functionality from language-specific implementation details.crates/pyo3/src/py_node.rs (2)
342-342: Simplified RuleCore return type.Removed the language generic parameter from
RuleCore, making it language-agnostic. This simplifies the type signature while maintaining the same functionality, and aligns with the broader refactoring effort across the codebase.
372-372: Consistent type signature update for helper function.Updated the return type of
get_matcher_from_ruleto use the non-genericRuleCoretype, maintaining consistency with the changes in theSgNode::get_matchermethod. This ensures that the Python bindings use the same simplified type structure.crates/core/src/replacer.rs (1)
22-22: Simplified Matcher trait bound in Replacer trait.Updated the
get_replaced_rangemethod to accept a language-agnosticMatcherimplementation rather than one tied to a specific language. This change aligns with the trait bound simplifications throughout the codebase and reduces coupling between replacers and language-specific implementations.crates/config/src/transform/mod.rs (2)
51-51: Removed language generic parameter from RuleCore.Updated the function signature to use the language-agnostic
RuleCoretype in the rewriters map, simplifying the type system while maintaining the same functionality. This is consistent with the broader refactoring effort.
75-75: Updated Ctx struct with simplified RuleCore type.Changed the type of the
rewritersfield in theCtxstruct to use the non-genericRuleCoretype, maintaining consistency with the changes in theapply_transformmethod. This ensures that the transformation context uses the same simplified type structure.crates/core/src/matcher/kind.rs (5)
25-27: Refactoring to make KindMatcher non-genericThe
KindMatcherstruct has been refactored to remove the generic language parameter, making it non-generic. This simplifies the type signature and aligns with the broader codebase refactoring effort.
29-36: Appropriate method refactoring with explicit language passingGood adaptation of the constructor method to accept the language parameter explicitly rather than through a generic type parameter. This approach maintains the functionality while improving type simplicity.
38-45: Well-preserved error handling with simplified type signatureThe error handling in
try_newis properly maintained while adapting to the new non-generic struct design. The function still creates aKindMatcherinstance and performs the necessary validation.
47-49: Simplified struct initializationThe
from_idmethod has been correctly updated to remove the phantom data initialization, properly aligning with the non-generic struct definition.
78-89: Updated Matcher trait implementation for non-generic structThe
Matchertrait implementation has been appropriately updated to reflect the non-generic nature ofKindMatcher, maintaining the same functionality while simplifying the interface.crates/cli/src/utils/mod.rs (3)
147-152: Simplified Pattern type by removing language generic parameterThe function signature has been updated to work with non-generic
Patterntypes rather thanPattern<SgLang>, while maintaining the language context through explicit parameters. This aligns with the broader refactoring effort.
155-165: Adapted closure to use non-generic PatternThe
do_matchclosure correctly adapts to use the non-genericPatterntype while maintaining all existing functionality with fixed string checks and properMatchUnitconstruction.
192-196: Generalized MatchUnit to accept any Matcher implementationThe
MatchUnitstruct now accepts any type implementing theMatchertrait rather than being tied specifically toMatcher<SgLang>. This improves flexibility and reusability.crates/core/src/matcher/text.rs (4)
3-4: Simplified imports after removing Language generic parameterThe import statement has been appropriately updated to remove the unused
Languageimport, which is no longer needed after the refactoring to makeRegexMatchernon-generic.
18-20: Refactored RegexMatcher to be non-genericThe
RegexMatcherstruct has been simplified by removing the language generic parameter and associated phantom data, making it a non-generic struct that only stores the regex pattern.
22-28: Simplified implementation block for RegexMatcherThe implementation block for
RegexMatcherhas been updated to remove the language generic parameter, which is appropriate given the struct is now non-generic.
30-37: Updated Matcher trait implementation for non-generic structThe
Matchertrait implementation has been correctly updated to reflect the non-generic nature ofRegexMatcher, with appropriate updates to the method signatures while preserving the matching functionality.crates/cli/src/config.rs (3)
131-160: Simplified GlobalRules type by removing language generic parameterThe
find_util_rulesfunction now returns a non-genericGlobalRulestype instead ofGlobalRules<SgLang>. The language parameter is now specified explicitly at the call site withDeserializeEnv::<SgLang>::parse_global_utils, which properly preserves the language-specific parsing.
162-166: Updated read_directory_yaml to use non-generic GlobalRulesThe
global_rulesparameter type has been updated to non-genericGlobalRules, which aligns with the broader refactoring effort while maintaining the same functionality.
220-223: Updated read_rule_file to use non-generic GlobalRulesThe
global_rulesparameter type has been simplified toOption<&GlobalRules>, removing the explicit language generic parameter while preserving the function's behavior.crates/config/src/rule/range.rs (4)
1-1: Appropriate import update for the refactored language traitsThe import statement has been updated to reference the core modules without language generic parameters, aligning with the trait separation refactoring.
40-43: Successfully removed language generic parameterThe
RangeMatcherstruct has been simplified by removing the generic language parameter and the associatedPhantomData<L>field. This reduces type complexity while maintaining all functionality.
68-69: Clean adaptation of Matcher trait implementationThe
Matchertrait implementation has been correctly updated to match the new non-generic trait definition. The method signature now uses<'tree, D: Doc>directly in the method rather than via a struct generic parameter.
104-110: Test code properly updatedThe test code has been correctly updated to use the non-generic
RangeMatcher, demonstrating that the refactoring doesn't affect the usage pattern in tests.crates/napi/src/find_files.rs (2)
143-144: Simplified FindInFiles type aliasThe
FindInFilestype alias has been correctly updated to use the non-genericRuleCoretype, aligning with the broader refactoring of removing language generic parameters from core types.
208-212: Function signature properly updatedThe
call_sg_nodefunction signature has been correctly updated to use the non-genericRuleCoretype in its parameter list, maintaining consistency with the type alias changes.crates/core/src/matcher/pattern.rs (7)
1-1: Proper import of both language traitsThe import statement correctly includes both the new
CoreLanguageand the existingLanguagetrait, reflecting the trait hierarchy refactoring where core parsing functionality has been separated.
15-19: Successful conversion to non-generic Pattern structThe
Patternstruct has been refactored to remove the generic language parameter, simplifying the type system. ThePhantomData<L>field has been removed while maintaining all necessary functionality.
139-142: Method-level generic parameters appropriately addedThe
strmethod now includes a generic parameter<L: Language>at the method level rather than at the struct level. This provides better flexibility and clearer, more targeted type constraints.
197-199: Consistent approach to method-level genericsThe
try_newmethod continues the pattern of adding generic parameters at the method level, maintaining consistency across the implementation.
221-225: Contextual pattern creation correctly handledThe
contextualmethod handles the language parameter appropriately, including proper cloning where needed. The generic parameters have been moved from the struct level to the method level without changing the functionality.
257-258: Matcher trait implementation properly updatedThe
Matchertrait implementation has been correctly updated to match the new non-generic trait definition, with method-level generic parameters instead of struct-level ones.
515-515: Memory size assertion updatedThe memory size assertion in the test has been appropriately updated to reflect the structural changes in the
Patterntype. This ensures that the size reduction from removing thePhantomDatafield is properly validated.crates/core/src/match_tree/mod.rs (4)
47-47: Function signature appropriately updatedThe
match_end_non_recursivefunction signature has been updated to use the non-genericPatterntype, maintaining consistency with thePatternstruct changes.
112-116: Clean function signature updateThe
match_node_non_recursivefunction signature has been correctly updated to use the non-genericPatterntype, ensuring consistency across the codebase.
153-163: Test helper function properly updatedThe
find_node_recursivetest helper function has been correctly updated to use the non-genericPatterntype, ensuring that tests continue to function as expected with the refactored type system.
300-300: Consistent test helper updateThe
find_end_recursivetest helper function has been correctly updated to match the pattern of using the non-genericPatterntype, maintaining consistency across all related functions.crates/core/src/node.rs (1)
305-324: Public API clarification: value vs reference forMatcher
matches / inside / has / precedes / followsnow takeMby value.
Existing user code that previously called:let pat = Pattern::new("foo", lang); node.matches(&pat);continues to compile because
&PatternimplementsMatcher.
However, passing large matcher objects by value could lead to unintended moves:let pat = make_heavy_pattern(); if node.matches(pat) { // pat is moved here /* pat is now unusable */ }Consider documenting this nuance (or switching the parameter to
impl Matcher + ?Sizedwith&M) to avoid breaking expectations.crates/config/src/rule/nth_child.rs (1)
165-169: Edge case: modulo with negativestep_sizemay violate CSS spec
n % step_sizewherestep_sizeis negative keeps the sign of the dividend in Rust.
Consequently, forstep_size = -2andn = 1,n % step_size == 1, which means1 % -2 == 1(non‑zero) and the match fails.
However, CSS’s definition treats “‐2n+1” as matching the first element.A safer, sign‑agnostic check is:
let step = *step_size; let delta = index - offset; delta.signum() == step.signum() && delta % step == 0Please cross‑check against the W3C micro‑syntax to ensure negative‑step selectors behave correctly.
2937132 to
d02ddb9
Compare
f905fda to
bc9619b
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
crates/core/src/meta_var.rs (1)
37-44: 🛠️ Refactor suggestionReturning
Option<&mut Self>complicates ergonomics
insertcurrently yieldsSome(self)on success to enable chaining, but exposesNoneon failure.
Most callers either (a) ignore the result or (b) only care about the boolean success status.A simpler, more idiomatic signature would be:
-pub fn insert(&mut self, id: &str, ret: N) -> Option<&mut Self> +pub fn insert(&mut self, id: &str, ret: N) -> boolThis:
- avoids exposing a second mutable reference (preventing accidental misuse),
- mirrors
HashMap::insertergonomics (boolorOption<Old>),- still allows chaining via
if env.insert(...){ /* ... */ }.The same observation applies to
insert_multi.
🧹 Nitpick comments (4)
crates/core/src/meta_var.rs (4)
20-25: DeriveDebugand considerDefaultfor easier diagnostics
SgMetaVarEnvis a core container that frequently travels across APIs. DerivingDebug(and possiblyDefault) will:
- allow
println!("{:?}", env)during debugging/tests,- avoid manual boiler-plate when the struct evolves.
-#[derive(Clone)] +#[derive(Clone, Debug)] pub struct SgMetaVarEnv<'tree, N: SgNode<'tree>, C> {
71-73: API inconsistency betweenget_labelsandget_multiple_matchesAfter fixing the allocation issue above,
get_multiple_matcheswill return&[N]whileget_labelsreturnsOption<&Vec<N>>.For uniformity and clarity, consider aligning
get_labelsto the same slice-based signature:-pub fn get_labels(&self, label: &str) -> Option<&Vec<N>> { - self.multi_matched.get(label) +pub fn get_labels(&self, label: &str) -> Option<&[N]> { + self.multi_matched.get(label).map(Vec::as_slice) }This prevents accidental mutations through the shared
&Vecreference and reduces cognitive overhead.
151-167: Potential mutable-aliasing inmatch_constraints
Cow::Borrowed(self)is created while you still iterate over&self.single_matched.
Ifm.match_node_with_envmutates the environment,Cowwill be upgraded toOwned, cloning the maps — safe but potentially expensive.
Two suggestions:
Cheap path optimisation
Checkif var_matchers.is_empty()early and short-circuit.Avoid cloning when mutation is unlikely
Replace theCowstrategy with a two-phase approach: first read-only check, then mutate only the variables that passed.This keeps semantics intact while sidestepping the hidden allocation cost.
126-139:insert_transformationunnecessarily copies slices
formatted_slice(...).to_vec()allocates even when the formatted slice is already correctly aligned.
If the intent is to avoid copying when formatting is a no-op, you can store aCow<'_, [u8]>instead of an ownedVec<u8>insidetransformed_var.
This change:
- eliminates redundant allocations,
- is backwards-compatible for immutable readers.
Example diff (simplified):
- transformed_var: HashMap<MetaVariableID, C>, + transformed_var: HashMap<MetaVariableID, Cow<'tree, [u8]>>,and update call-sites accordingly.
If zero-copy isn’t achievable, at least add a comment clarifying why allocation is unavoidable.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
crates/core/src/meta_var.rs(6 hunks)crates/core/src/node.rs(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/core/src/node.rs
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: coverage
🔇 Additional comments (1)
crates/core/src/meta_var.rs (1)
16-16: Clarify or remove open question commentThe inline question (“why we need one more content?”) hints at an unresolved design doubt.
Leaving “why” comments in production code is a maintenance smell that quickly becomes stale. Please either:
- Convert it into a proper
TODO(issue #)with an actionable description, or- Replace it with a short explanation / reference clarifying the extra
Contentparameter’s purpose.[ suggest_nitpick ]
There was a problem hiding this comment.
Actionable comments posted: 2
🔭 Outside diff range comments (5)
crates/config/src/transform/transformation.rs (1)
79-83:⚠️ Potential issue
unwrap()on invalid regex can panic at runtime
Replace::<MetaVariable>::computewill callRegex::new(&self.replace).unwrap().
An end-user typo in the YAML config would therefore crash the whole run instead of producing a graceful error.- let re = Regex::new(&self.replace).unwrap(); + let re = match Regex::new(&self.replace) { + Ok(r) => r, + Err(e) => { + log::error!("Invalid regex in replace transform: {e}"); + return None; // or propagate as a TransformError if you prefer + } + };Alternatively, compile the regex once during
parse(); that turns a user error into a configuration-time failure and avoids recompiling on every match.crates/core/src/replacer.rs (1)
21-36: 🛠️ Refactor suggestionAvoid unnecessary
clone()inget_replaced_range
matcher.get_match_len(nm.get_node().clone())clones the entire node even thoughget_match_lenshould only need a reference for read-only access. Cloning large sub-trees can be expensive when replacements are executed in bulk.- if let Some(len) = matcher.get_match_len(nm.get_node().clone()) { + if let Some(len) = matcher.get_match_len(nm.get_node()) {You will need to adjust the
Matcher::get_match_lensignature to takeimpl Borrow<N>or&Nto make this compile.crates/config/src/transform/rewrite.rs (1)
46-86:⚠️ Potential issueEdge-case: returning empty string when no edits produced
If
self.join_byisNoneand thefind_and_make_editscall returns an empty vector,make_editwill receive an empty edit list and return a byte-for-byte copy ofbytes, so all good.
However, whenjoin_byis present but no edit matches any sub-node,rewrittenbecomes the emptyretvector created in theif let Some(joiner)arm (lines 62-81). The function then encodes that empty vector and returnsSome(String::new()), silently dropping the original content.Consider short-circuiting when
edits.is_empty():if edits.is_empty() { return None; // or Some(original) depending on desired semantics }Failing to do so may produce surprising, destructive rewrites.
crates/core/src/matcher/node_match.rs (1)
82-101:⚠️ Potential issueTrait impls on the type alias will not compile
impl From<NodeMatch<'tree, D>> …andimpl Borrow<…> for NodeMatch<'tree, D>try to implement traits for the alias, which is currently disallowed on stable Rust (E0215).Replace the impl targets with the concrete type:
-impl<'tree, D: Doc> From<NodeMatch<'tree, D>> for Node<'tree, D> { +impl<'tree, D: Doc> + From<SgNodeMatch< + 'tree, + Node<'tree, D>, + Vec<<<D as Doc>::Source as Content>::Underlying>>> for Node<'tree, D> { fn from(node_match: /* … */) -> Self { node_match.0 } } -impl<'tree, D: Doc> Borrow<Node<'tree, D>> for NodeMatch<'tree, D> { +impl<'tree, D: Doc> + Borrow<Node<'tree, D>> for SgNodeMatch< + 'tree, + Node<'tree, D>, + Vec<<<D as Doc>::Source as Content>::Underlying>> { fn borrow(&self) -> &Node<'tree, D> { &self.0 } }(or, alternatively, resurrect
NodeMatchas a new-type wrapper instead of atypealias).Without this fix the crate will fail to compile on stable/stable-beta channels.
crates/core/src/ops.rs (1)
209-218:⚠️ Potential issue
Notmay leave stray bindings inenvWhen the inner pattern matches (so
Notshould fail), any metavariables it bound remain inenv, because the original environment is passed directly.
This behaviour diverges from the logical expectation that a failedNotleaves no trace.Please take the same “speculative env” approach used in
Or/Any:- self - .not - .match_node_with_env(node.clone(), env) - .xor(Some(node)) + let mut new_env = Cow::Borrowed(env.as_ref()); + let matched = self.not.match_node_with_env(node.clone(), &mut new_env); + if matched.is_some() { + // inner matched → NOT fails, discard changes + None + } else { + // inner failed → NOT succeeds, keep original env + Some(node) + }
🧹 Nitpick comments (18)
crates/core/src/match_tree/mod.rs (2)
14-23: Pass slices instead ofVecto avoid needless allocations
match_ellipsisreceivesnodes: Vec<N>, which forces every caller to allocate a fresh vector even when it only needs read-only access. Switching the parameter to&[N](or an iterator) removes this cost and avoids one moreclonewhen the nodes are already materialised elsewhere.- fn match_ellipsis( - &mut self, - var: Option<&str>, - nodes: Vec<N>, - skipped_anonymous: usize, - ) -> Option<()>; + fn match_ellipsis( + &mut self, + var: Option<&str>, + nodes: &[N], + skipped_anonymous: usize, + ) -> Option<()>;All three implementors only read the slice (they call
last()or drain), so the change is mechanical and backwards-compatible with the public API introduced in this PR.
119-140: Potential stack overflow on extremely deep trees
does_node_match_exactlyis implemented recursively without tail-call optimisation. Real-world code normally keeps depth in check, but artificial or adversarial inputs (e.g. deeply nested parentheses) can overflow the stack.If you want extra robustness, consider an explicit stack (Vec) or a
try_foldstyle loop:let mut stack = vec![(goal.clone(), candidate.clone())]; while let Some((g, c)) = stack.pop() { if g.node_id() == c.node_id() { continue; } if g.is_named_leaf() || c.is_named_leaf() { if g.text() != c.text() { return false; } continue; } if g.kind_id() != c.kind_id() { return false; } let (mut gc, mut cc) = (g.children(), c.children()); if gc.len() != cc.len() { return false; } stack.extend(gc.zip(cc)); } trueIt keeps behaviour identical while capping stack use.
crates/config/src/rule/range.rs (1)
68-88: Minor performance nit: avoid repeated UTF-8 column computations
match_node_with_envcallscolumn(&node)twice (start and end).columnwalks the UTF-8 string from the start each time, so on large files this adds up. Caching the result once per node is cheaper:- if self.start.column != node_start_pos.column(&node) - || self.end.column != node_end_pos.column(&node) + let (start_col, end_col) = (node_start_pos.column(&node), node_end_pos.column(&node)); + if self.start.column != start_col || self.end.column != end_colVery small, but essentially free.
crates/config/src/transform/transformation.rs (3)
13-20: Double allocation when converting bytes toString
get_text_from_envfirst materialises the whole byte slice, thenencode_bytesreturns aCow<str>that is immediately turned into an ownedString. For large captures this allocates twice.You can request an owned string directly from the content API (if available) or reuse the existing buffer:
- let bytes = ctx.env.get_var_bytes(var)?; - Some(<<N::Doc as Doc>::Source as Content>::encode_bytes(bytes).into_owned()) + let bytes = ctx.env.get_var_bytes(var)?; + Some( + <<N::Doc as Doc>::Source as Content>::encode_bytes(bytes) + .into_owned() // still one allocation; unavoidable without API change + )If
encode_bytesalready returns an ownedString(instead ofCow), drop theinto_owned()call altogether.
38-48:Substring::computecreates an intermediateVec<char>– consider a slice-based approachBuilding
chars: Vec<_> = text.chars().collect()allocates O(n) memory and copies every Unicode scalar value. You can achieve the same withchar_indices()and byte slicing without the extra heap:let start = resolve_char(&self.start_char, 0, len); let end = resolve_char(&self.end_char, len, len); let bytes = text .char_indices() .map(|(i, _)| i) .chain(std::iter::once(text.len())) .collect::<Vec<_>>(); let slice = &text[bytes[start]..bytes[end]]; Some(slice.to_owned())This avoids the allocation and keeps complexity linear.
172-181: Avoid redundant allocation when storing transformed bytes
decode_str(&s).to_vec()copies the entire string into a newVec<u8>. If the downstream API accepts aCow<[u8]>, you can save one allocation:- let bytes = if let Some(s) = opt { - <<N::Doc as Doc>::Source as Content>::decode_str(&s).to_vec() + let bytes = if let Some(s) = opt { + <<N::Doc as Doc>::Source as Content>::decode_str(&s).into_owned() } else { vec![] };Not critical, but helps when transforms produce large blobs.
crates/core/src/replacer.rs (1)
48-53: Remove dead code or restore structural replacerThe
Root<D>implementation is now commented out, but the correspondingmod structural;declaration (line 13) is still present. This leaves an empty module that is compiled but never used and may confuse future readers.Either:
- Delete the commented block and remove the
structuralmodule entirely, or- Re-enable the structural replacer and add
#[cfg(feature = "structural_replace")]guards if it is planned for later.Leaving commented code in
mainbranches is generally discouraged.crates/config/src/transform/rewrite.rs (2)
101-118: Double DFS may cause quadratic behaviour
replace_oneperformsfor child in node.dfs()and invokes every rule on every descendant node. When multiple rewrites are chained (rules.len() * #descendants), this can become quadratic on large trees.If rules are mutually exclusive or can be indexed by kind/text, consider:
• Pre-filtering descendants (
child.kind_id()lookup)
• Short-circuiting the outerdfs()once a rule insiderulesmatches its ancestor to avoid visiting grandchildren that will be deleted anyway.While correctness is intact, large files could show noticeable slow-downs.
67-67: Decode joiner once to avoid repeated UTF-8 work
let joiner = <N::Doc as Doc>::Source::decode_str(joiner);is called inside a loop even thoughjoineris constant. Move it outside the loop to save repeated allocations/decoding.- let joiner = <N::Doc as Doc>::Source::decode_str(joiner); + let joiner_bytes = <N::Doc as Doc>::Source::decode_str(joiner); ... - ret.extend_from_slice(&joiner); + ret.extend_from_slice(&joiner_bytes);crates/core/src/replacer/template.rs (4)
41-50:generate_replacementallocates twice per fragment
decode_stris invoked on every fragment and the resultingVecis immediately copied intoretviaextend_from_slice, causing two heap allocations (decode_str→ Vec, thenextend→ copy).If fragments are sizeable or numerous, cache the decoded value:
- ret.extend_from_slice(&<N::Doc as Doc>::Source::decode_str(frag)); + let frag_bytes = <N::Doc as Doc>::Source::decode_str(frag); + ret.extend_from_slice(&frag_bytes);Repeat the same change later in the loop (line 109).
114-155:maybe_get_varcan use slice APIs to avoid indexing panicsAccessing
nodes[0]andnodes[nodes.len() - 1]requires two bounds checks and will panic if the vector is empty (even though you guard for emptiness). The idiomatic pattern is.first()/.last():- let start = nodes[0].range().start; - let end = nodes[nodes.len() - 1].range().end; - let source = nodes[0].get_doc().get_source(); + let start = nodes.first().unwrap().range().start; + let end = nodes.last().unwrap().range().end; + let source = nodes.first().unwrap().get_doc().get_source();Same semantics, slightly safer & faster.
91-113: Minor: early-return forTextualbranch simplifies control flowCurrently the function sets up variables and then immediately returns for the
Textualcase. An early return keeps the happy path shallow:- let template = match fixer { - TemplateFix::Textual(n) => return <N::Doc as Doc>::Source::decode_str(n).to_vec(), - TemplateFix::WithMetaVar(t) => t, - }; + if let TemplateFix::Textual(src) = fixer { + return <N::Doc as Doc>::Source::decode_str(src).to_vec(); + } + let TemplateFix::WithMetaVar(t) = fixer else { unreachable!() };Purely stylistic, but improves readability.
158-164: Public helper leaks implementation detail
gen_replacementis a thin wrapper aroundTemplateFix. Exposing it perpetuates two APIs that must stay in sync. Consider marking it#[deprecated]and encouraging callers to create aTemplateFixdirectly, e.g.:TemplateFix::try_new(template, nm.lang())?.generate_replacement(nm)to avoid API surface bloat.
crates/core/src/matcher/pattern.rs (1)
517-518: Size assertion is now brittle after thePhantomDataremoval
Patternno longer carries thePhantomData<L>field, so its memory layout has changed.
The ignored test still freezes the previous size (40) which is very likely wrong now and will break when the test is re-enabled.-assert_eq!(std::mem::size_of::<Pattern>(), 40); +// ⚠️ Update the expected value or, even better, remove the fragile +// size assertion – layout is an implementation detail. +assert_eq!(std::mem::size_of::<Pattern>(), 32);Please recompute or drop the assertion to avoid false failures.
crates/core/src/matcher.rs (2)
87-90: Avoid cloning the language object on every match
impl Matcher for strcreates a freshPatternviaPattern::str(self, node.lang().clone()).
For languages with non-trivialclone()cost this happens for every node examined, which can dominate hot paths such asfind_all.Consider caching the compiled
Pattern(e.g. inside a smallonce_cell::unsync::OnceCell) or switching toPattern::borrowed(self, node.lang())if such helper exists.
58-63: Minor: unnecessary env allocation inmatch_nodehelper
MatcherExt::match_nodealways starts with an emptyMetaVarEnv, which is then cloned into theNodeMatcheven when the pattern fails.
You can save one allocation per failure by constructingNodeMatchonly after a successful match:- let mut env = Cow::Owned(MetaVarEnv::new()); - let node = self.match_node_with_env(node, &mut env)?; - Some(NodeMatch::new(node, env.into_owned())) + let mut env = Cow::Owned(MetaVarEnv::new()); + self.match_node_with_env(node, &mut env) + .map(|n| NodeMatch::new(n, env.into_owned()))crates/core/src/node.rs (2)
38-41:Position::columnnow takesSgNode, but work is still O(n)The generic signature is great, but the call still rescans the source every time.
If callers invokecolumnin tight loops (e.g. pretty printers) consider memoising the result or passing the pre-computedSourceslice instead.
181-251: Potential self-recursion confusion in trait implSeveral methods delegate to identically-named inherent methods, e.g.:
fn node_id(&self) -> usize { self.node_id() }This works because inherent methods out-prioritise trait methods, but the pattern is easy to misread as infinite recursion.
Adding an explicit fully-qualified call improves clarity and avoids accidental refactor bugs:- self.node_id() + Node::node_id(self)(same for
is_named,text, etc.)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (30)
crates/cli/src/print/colored_print.rs(1 hunks)crates/config/src/fixer.rs(9 hunks)crates/config/src/rule/mod.rs(8 hunks)crates/config/src/rule/nth_child.rs(8 hunks)crates/config/src/rule/range.rs(5 hunks)crates/config/src/rule/referent_rule.rs(13 hunks)crates/config/src/rule/relational_rule.rs(9 hunks)crates/config/src/rule/stop_by.rs(5 hunks)crates/config/src/rule_core.rs(13 hunks)crates/config/src/transform/mod.rs(3 hunks)crates/config/src/transform/rewrite.rs(9 hunks)crates/config/src/transform/transformation.rs(5 hunks)crates/core/src/language.rs(4 hunks)crates/core/src/lib.rs(2 hunks)crates/core/src/match_tree/match_node.rs(6 hunks)crates/core/src/match_tree/mod.rs(5 hunks)crates/core/src/matcher.rs(6 hunks)crates/core/src/matcher/kind.rs(4 hunks)crates/core/src/matcher/node_match.rs(5 hunks)crates/core/src/matcher/pattern.rs(15 hunks)crates/core/src/matcher/text.rs(2 hunks)crates/core/src/meta_var.rs(8 hunks)crates/core/src/node.rs(6 hunks)crates/core/src/ops.rs(13 hunks)crates/core/src/replacer.rs(3 hunks)crates/core/src/replacer/structural.rs(1 hunks)crates/core/src/replacer/template.rs(8 hunks)crates/core/src/source.rs(1 hunks)crates/language/src/lib.rs(3 hunks)crates/napi/src/sg_node.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (14)
- crates/core/src/language.rs
- crates/core/src/matcher/text.rs
- crates/core/src/lib.rs
- crates/core/src/matcher/kind.rs
- crates/language/src/lib.rs
- crates/config/src/transform/mod.rs
- crates/config/src/rule/mod.rs
- crates/config/src/rule/stop_by.rs
- crates/config/src/rule/nth_child.rs
- crates/core/src/meta_var.rs
- crates/config/src/rule/relational_rule.rs
- crates/config/src/rule/referent_rule.rs
- crates/config/src/fixer.rs
- crates/config/src/rule_core.rs
🧰 Additional context used
🧬 Code Graph Analysis (5)
crates/core/src/match_tree/match_node.rs (2)
crates/core/src/node.rs (1)
std(520-532)crates/core/src/match_tree/mod.rs (3)
match_ellipsis(17-22)match_ellipsis(36-40)match_ellipsis(92-105)
crates/core/src/replacer.rs (3)
crates/core/src/replacer/indent.rs (1)
formatted_slice(161-168)crates/config/src/fixer.rs (2)
generate_replacement(134-140)get_replaced_range(141-157)crates/core/src/replacer/template.rs (2)
generate_replacement(41-50)gen_replacement(158-164)
crates/config/src/transform/transformation.rs (1)
crates/config/src/transform/rewrite.rs (1)
compute(46-86)
crates/core/src/matcher.rs (8)
crates/core/src/node.rs (9)
kind(165-165)kind(200-202)kind(306-308)text(168-168)text(197-199)text(341-344)new(27-33)new(72-74)str(57-59)crates/core/src/matcher/node_match.rs (2)
borrow(98-100)new(24-26)crates/config/src/rule_core.rs (3)
match_node_with_env(262-268)new(152-159)potential_kinds(270-272)crates/core/src/matcher/pattern.rs (5)
match_node_with_env(259-277)get_match_len(297-301)new(213-215)str(195-197)potential_kinds(279-295)crates/config/src/rule/mod.rs (2)
match_node_with_env(280-304)potential_kinds(306-326)crates/config/src/rule/relational_rule.rs (4)
match_node_with_env(70-94)match_node_with_env(130-176)match_node_with_env(212-221)match_node_with_env(256-265)crates/core/src/matcher/kind.rs (3)
match_node_with_env(79-89)new(30-36)potential_kinds(91-95)crates/core/src/matcher/text.rs (1)
potential_kinds(39-41)
crates/core/src/ops.rs (7)
crates/core/src/node.rs (6)
std(520-532)new(27-33)new(72-74)str(57-59)root(88-93)root(402-404)crates/core/src/matcher.rs (6)
match_node_with_env(35-39)match_node_with_env(78-85)match_node_with_env(97-103)match_node_with_env(150-156)match_node_with_env(166-172)new(122-127)crates/core/src/matcher/pattern.rs (3)
match_node_with_env(259-277)new(213-215)str(195-197)crates/config/src/rule/mod.rs (1)
match_node_with_env(280-304)crates/config/src/rule/relational_rule.rs (4)
match_node_with_env(70-94)match_node_with_env(130-176)match_node_with_env(212-221)match_node_with_env(256-265)crates/core/src/matcher/kind.rs (2)
match_node_with_env(79-89)new(30-36)crates/core/src/meta_var.rs (1)
new(28-35)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: coverage
🔇 Additional comments (8)
crates/core/src/source.rs (1)
80-80:Details
✅ Verification successful
Added
'staticlifetime bound toDoctraitThis change enforces that any type implementing
Docmust have a'staticlifetime. This aligns with the broader refactoring effort of removing language generic parameters from core types and replacing them with trait-based abstractions.Let's verify if all implementors of the
Doctrait in the codebase satisfy the new'staticbound:
🏁 Script executed:
#!/bin/bash # Find all Doc trait implementations rg "impl.+Doc for" --only-matchingLength of output: 124
🏁 Script executed:
#!/bin/bash echo "=== Core Doc implementation context ===" rg -n "impl<L: Language> Doc for" -C3 crates/core/src/source.rs echo "=== napi/src/doc.rs implementation context ===" rg -n "impl\s\+Doc\s\+for" -C3 crates/napi/src/doc.rsLength of output: 465
🏁 Script executed:
#!/bin/bash echo "=== StrDoc definition ===" rg -n "struct StrDoc" -C3 crates/core/src/source.rs echo "" echo "=== Language trait ===" rg -n "pub trait Language" -C3 crates/core/src/language.rs echo "" echo "=== impl Doc for in napi/src/doc.rs ===" rg -n "impl Doc for" -C3 crates/napi/src/doc.rs echo "" echo "=== struct definitions in napi/src/doc.rs ===" rg -n "struct" -C3 crates/napi/src/doc.rsLength of output: 1685
Doc implementors satisfy the new ’static constraint
All existing
Docimplementations use only owned data and do not introduce non‐'staticreferences, so they meet the newly added+'staticbound:
- crates/core/src/source.rs:
impl<L: Language> Doc for StrDoc<L>
(fields:String,Lwith no hidden lifetimes)- crates/napi/src/doc.rs:
impl Doc for JsDoc
(fields:NapiLang,Wrapper, both own their data)No further changes required—approving this update.
crates/cli/src/print/colored_print.rs (1)
237-237: Updated node access pattern to match new trait-based interfaceChanged from direct reference to dereferencing
diff.node_matchvia&*syntax to match the updatedcolumnmethod signature that now expects anSgNodetrait object reference instead of a concrete type.This change aligns with the broader refactoring that introduces trait-based node abstractions across the codebase. The functionality remains the same, but now works with the more generic node interface.
crates/napi/src/sg_node.rs (1)
46-46: Updated position calculation to use the new node trait interfaceChanged from passing a reference to
self.innerdirectly to usingself.inner.get_node()to obtain a reference to a type implementing theSgNodetrait.This change aligns with the refactoring effort to make the codebase more abstract by accessing nodes through trait methods rather than directly using concrete types.
crates/config/src/rule/range.rs (1)
53-65: Edge-case: zero-length ranges are accepted – is that intended?
try_newrejectsstart > endbut happily acceptsstart == end, producing a matcher whosestartandendpoint to the same character. The rest of the implementation treats such a range as valid, sofind_nodewill only succeed on zero-width AST nodes (very uncommon).If zero-length ranges were not meant to be allowed, extend the validation:
- if start_pos.line > end_pos.line - || (start_pos.line == end_pos.line && start_pos.column > end_pos.column) + if start_pos.line > end_pos.line + || (start_pos.line == end_pos.line && start_pos.column >= end_pos.column)Otherwise, consider documenting the behaviour explicitly so users know a zero-length range is meaningful.
crates/core/src/replacer.rs (1)
68-74: Preserve indentation in raw-node replacement
Node::generate_replacementcurrently returns the raw byte slice of the matched node. When this replacer is used inside a template, the inserted code may lose the indentation normalisation applied elsewhere (indent_lines). If callers expect consistent formatting, consider re-usingformatted_sliceor at least documenting that this implementation performs a “verbatim” replacement.crates/core/src/match_tree/match_node.rs (1)
94-99: Unnecessary lifetime relationship't: 'pcreates extra complexityThe
't: 'pbound forces the tree-node lifetime to outlive the pattern lifetime even though the two lifetimes are unrelated – the pattern slice is read-only and never stored inside the candidate nodes.
Removing the bound simplifies the signature and helps the compiler reason about borrow-checking without introducing any regressions:-fn may_match_ellipsis_impl<'p, 't: 'p, N: SgNode<'t>>( +fn may_match_ellipsis_impl<'p, 't, N: SgNode<'t>>(Unless you deliberately rely on the stricter relationship (I could not find such a use-site), consider dropping it.
crates/core/src/matcher/node_match.rs (1)
20-22:typealias is fine, but watch out for downstream impls
NodeMatchis now a plain type-alias toSgNodeMatch.
That keeps the public signature stable, but remember that you cannot add inherent impls or trait impls directly on a type alias (still an unstable feature).
The alias should only be used for re-exporting – all impl blocks must target the real struct (SgNodeMatch).crates/core/src/node.rs (1)
148-178: Trait usesimpl Traitreturn types – requires nightly feature
SgNodereturnsimpl Iteratorandimpl ExactSizeIteratorin several methods.
This needs the unstablereturn_position_impl_trait_in_trait(RPITIT) feature; compiling on stable Rust will fail.Please ensure the crate is explicitly gated with:
#![cfg_attr(feature = "nightly", feature(return_position_impl_trait_in_trait))]or provide associated iterator types as a stable fallback.
BREAKING CHANGE: a lot of internal API has been changed to allow new implemenation
|
Next PR will bring in part-2. Language |
#1918
Summary by CodeRabbit