Skip to content

new rule: PUSH_LITERAL#1092

Merged
tomtau merged 28 commits intopest-parser:masterfrom
yshavit:push_other
Mar 24, 2025
Merged

new rule: PUSH_LITERAL#1092
tomtau merged 28 commits intopest-parser:masterfrom
yshavit:push_other

Conversation

@yshavit
Copy link
Contributor

@yshavit yshavit commented Mar 21, 2025

Add a rule PUSH_LITERAL("a"). This rule always matches and never consumes any input, and it pushes "a" to the stack. The argument must be a literal.

Note

I haven't added any unit tests or documentation yet; I wanted to get a sense of whether you'd be amenable to this new rule in principle, before I did that. But, I tested it locally on a grammar I have, and it does work.

This new rule accomplishes the goals of #880 (in a slightly different way). Instead of PUSH_OTHER("-", " "), this is just PUSH_LITERAL(" "). If you wanted to match A and then push B, you just combine them via a sequence: "-" ~ PUSH_LITERAL(" ").

This can be used to generalize quoted strings to a pair of matching but non-equal delimiters. For example, many programming languages let you express strings using either double or single quotes, and don't require escaping of the other:

string = ${ PUSH( "\"" | "'") ~ quoted_char* ~ POP }
quoted_char = @{!PEEK ~ ANY } // simple version without escape sequences

// will parse "a'b" or 'a"b'

This happens to work because the starting and ending char are the same, but that's not always desirable. For example, we may want to allow strings to be delimited by either ( ... ) or < ... >, with similar rules as above, such that (a>) and <a)> are both valid.

With PUSH_LITERAL, this becomes:

string = ${ quote_open ~ quoted_char* ~ POP }
quote_open = @{
  ( "(" ~ PUSH_LITERAL(")") )
| ( "<" ~ PUSH_LITERAL(">") )
}
quoted_char = @{!PEEK ~ ANY } // simple version without escape sequences

Happy to hear your thoughts!

Summary by CodeRabbit

  • New Features

    • Introduced literal string push operations, enhancing how literal values are handled throughout the system.
    • Expanded the grammar and parsing capabilities to support literal push expressions and flexible quote delimiters.
  • Tests

    • Added comprehensive tests to verify the new literal push behavior and quote parsing, ensuring robust handling and proper error detection.

This accomplishes the goals of pest-parser#880 (in a slightly
different way).

I haven't added any unit tests or documentation yet; this is a preview.
@yshavit yshavit requested a review from a team as a code owner March 21, 2025 06:36
@yshavit yshavit requested review from tomtau and removed request for a team March 21, 2025 06:36
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 21, 2025

Walkthrough

The changes add a new literal push functionality across the codebase. A PushLiteral variant is introduced in several expression enums (AST, Optimizer, Parser, VM) and the corresponding display, conversion, and validation logic. The grammar is updated with a new _push_literal terminal and additional surrounding rules. New helper methods (e.g., stack_push_literal) and enums to support efficient string handling are added to the parser state. Additional tests verify these changes, and certain features are conditionally compiled using the grammar-extras flag.

Changes

File(s) Change Summary
generator/src/generator.rs
meta/src/ast.rs
meta/src/optimizer/mod.rs
vm/src/lib.rs
Added a new PushLiteral variant to expression enums, updated conversion functions, display implementations, and processing logic to support pushing literal strings.
meta/src/grammar.pest Added a new terminal rule _push_literal and updated the terminal rule to include this new literal push expression.
meta/src/parser.rs
meta/src/validator.rs
Introduced PushLiteral support in parsing and validation logic; updated matching, conversion (e.g., in convert_node), and added new tests.
pest/src/parser_state.rs Added new enums (BorrowedOrArc, SpanOrLiteral) and a stack_push_literal method; updated the stack type for improved string and literal handling.
derive/tests/surround.pest
derive/tests/surround.rs
Introduced new grammar rules and a corresponding test parser (SurroundParser) to support and test quoted string handling with various delimiters.
vm/tests/surround.pest
vm/tests/surround.rs
Added new test files referencing existing grammar rules to validate the new functionality without altering existing logic.

Sequence Diagram(s)

sequenceDiagram
    participant Src as Source Code
    participant P as Parser
    participant A as AST
    participant O as Optimizer
    participant G as Generator
    participant VM as Virtual Machine
    participant PS as ParserState

    Src->>P: Parse input (_push_literal rule)
    P->>A: Create PushLiteral expression
    A->>O: Convert AST node to optimized variant
    O->>G: Generate code for PushLiteral
    G->>PS: Call stack_push_literal with literal string
    PS->>VM: Execute literal push operation
Loading

Suggested reviewers

  • tomtau

Poem

In a meadow of code where new features bloom,
I, a little rabbit, dance and zoom.
With literal pushes that gracefully glide,
Our stack sings a tale with joy inside.
Hopping through tests, grammar, and state,
I celebrate changes that truly create!
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
meta/src/parser.rs (1)

1-1797: Consider adding unit tests for PUSH_LITERAL functionality.

While the implementation is solid, there are no specific unit tests for the new PUSH_LITERAL functionality. Adding tests would ensure the feature works as expected and prevent regression issues.

Consider adding a test case similar to the existing push test (lines 952-969) to verify that PUSH_LITERAL parses correctly and another test to verify that it generates the expected AST.

#[test]
fn push_literal() {
    parses_to! {
        parser: PestParser,
        input: "PUSH_LITERAL ( \"a\" )",
        rule: Rule::_push_literal,
        tokens: [
            _push_literal(0, 19, [
                opening_paren(13, 14),
                string(15, 18, [
                    quote(15, 16),
                    inner_str(16, 17),
                    quote(17, 18)
                ]),
                closing_paren(18, 19)
            ])
        ]
    };
}

#[test]
fn ast_push_literal() {
    let input = "rule = _{ PUSH_LITERAL(\"a\") }";
    
    let pairs = PestParser::parse(Rule::grammar_rules, input).unwrap();
    let ast = consume_rules_with_spans(pairs).unwrap();
    let ast: Vec<_> = ast.into_iter().map(convert_rule).collect();
    
    assert_eq!(
        ast,
        vec![AstRule {
            name: "rule".to_owned(),
            ty: RuleType::Silent,
            expr: Expr::PushLiteral("a".to_owned()),
        }]
    );
}
pest/src/parser_state.rs (1)

1133-1139: stack_push_literal method is coherent.
Pushing a literal without consuming input is a key feature. Consider adding dedicated unit tests for this method to ensure correctness in edge cases (e.g., empty strings, large strings).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 17a7fa0 and 2eeb379.

📒 Files selected for processing (8)
  • generator/src/generator.rs (2 hunks)
  • meta/src/ast.rs (2 hunks)
  • meta/src/grammar.pest (2 hunks)
  • meta/src/optimizer/mod.rs (3 hunks)
  • meta/src/parser.rs (6 hunks)
  • meta/src/validator.rs (2 hunks)
  • pest/src/parser_state.rs (9 hunks)
  • vm/src/lib.rs (1 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
pest/src/parser_state.rs (5)
pest/src/position.rs (2) (2)
  • new (45-47)
  • span (98-107)
pest/src/stack.rs (1) (1)
  • new (51-57)
pest/src/span.rs (4) (4)
  • new (47-53)
  • as_str (197-200)
  • start (100-102)
  • end (118-120)
pest/src/iterators/pair.rs (2) (2)
  • new (48-60)
  • as_str (114-120)
pest/src/iterators/pairs.rs (3) (3)
  • new (45-87)
  • as_str (116-125)
  • as_str (570-574)
🔇 Additional comments (28)
meta/src/validator.rs (2)

403-403: Correctly handles PushLiteral as non-progressing

The addition of the PushLiteral(_) case in the is_non_progressing function ensures that pushing literal strings to the stack is properly recognized as an operation that doesn't consume input. This is consistent with the behavior of similar operations like Push.


498-498: Correctly handles PushLiteral as non-failing

The addition of the PushLiteral(_) case in the is_non_failing function correctly identifies that this operation is guaranteed to succeed regardless of input. This matches the expected semantics of PUSH_LITERAL which should never fail during parsing.

meta/src/ast.rs (2)

97-98: New expression variant for pushing literal strings

The new PushLiteral(String) variant appropriately extends the AST to support pushing literal strings onto the stack without matching input, which is the core functionality described in the PR objectives. The doc comment clearly explains its purpose.


324-324: Display formatting for the new PushLiteral variant

The implementation for formatting PushLiteral in the Display trait is consistent with the format used for other expression types. Using {:?} for the string ensures proper escaping of special characters.

generator/src/generator.rs (2)

573-577: Generator implementation for PushLiteral

The code generation for PushLiteral correctly calls state.stack_push_literal(#string), which will push the literal string onto the parser state stack. This implementation is consistent with the other stack operations and fulfills the requirements for the new rule.


763-767: Atomic generator implementation for PushLiteral

This implementation mirrors the non-atomic version, ensuring consistent behavior when PushLiteral is used in atomic contexts. The duplicate implementation is necessary because atomic and non-atomic parsing have different code generation patterns in pest.

vm/src/lib.rs (1)

239-239: VM implementation for PushLiteral

The implementation in the VM correctly maps the OptimizedExpr::PushLiteral variant to state.stack_push_literal(string.to_owned()). The to_owned() call is necessary because the VM needs to take ownership of the string when pushing it onto the stack.

meta/src/grammar.pest (2)

84-84: Added new terminal rule for PUSH_LITERAL operation.

The terminal rule has been correctly updated to include the new _push_literal option, expanding the available operations in the grammar.


148-149: Well-structured rule definition for PUSH_LITERAL.

The new _push_literal rule is properly defined with appropriate documentation and follows the pattern established by other similar operations like _push. This rule correctly matches the string "PUSH_LITERAL" followed by an argument enclosed in parentheses, specifically a literal string.

meta/src/optimizer/mod.rs (3)

72-72: Added variant to handle PushLiteral conversion in optimizer.

The to_optimized function has been correctly updated to handle the new Expr::PushLiteral(string) variant, converting it to OptimizedExpr::PushLiteral(string).


155-156: Clear documentation for the new PushLiteral variant.

The new PushLiteral(String) variant has been added to the OptimizedExpr enum with appropriate documentation that clearly explains its purpose: "Pushes a literal string to the stack, e.g. push_literal("")".


331-331: Implemented Display trait for the new variant.

The Display implementation has been extended to handle the PushLiteral variant, correctly formatting it as PUSH_LITERAL({:?}) which will properly display the enclosed string with escaped characters.

meta/src/parser.rs (5)

169-170: Added well-documented PushLiteral variant to ParserExpr.

The new PushLiteral(String) variant has been added to the ParserExpr enum with clear documentation explaining its purpose and usage pattern.


209-209: Updated convert_node to handle the new expression variant.

The convert_node function now correctly handles the new ParserExpr::PushLiteral(string) variant, creating the corresponding Expr::PushLiteral(string) AST node.


233-233: Added PUSH_LITERAL to meta rule renaming function.

The rename_meta_rule function has been updated to handle the new _push_literal rule, returning the user-friendly name "PUSH_LITERAL".


400-409: Implemented PUSH_LITERAL parser logic.

The implementation for handling the _push_literal rule in the consume_expr function is well-structured:

  1. It correctly extracts the string content from the rule
  2. It unescapes the string literals properly
  3. It creates a ParserNode with the PushLiteral expression
  4. It uses the appropriate span for error reporting

1333-1333: Updated test error message to include the new rule.

The test case error message has been properly updated to include the new _push_literal rule in the list of expected rules when a term is missing.

pest/src/parser_state.rs (11)

13-13: No concerns about the new import.
The addition of ToOwned is consistent with the usage of owned/borrowed strings.


217-221: Introducing BorrowedOrRc is a clear improvement.
This enum nicely handles both borrowed & owned strings without complicating the parser’s memory management.


223-227: From<&'static str> impl looks good.
The conversion logic is straightforward and aligns well with BorrowedOrRc<'_>.


229-233: From<String> impl is concise.
Wrapping the String inside Rc fits the parser VM’s needs for reference-counted string literals.


235-245: Adequate docs for BorrowedOrRc.
Documenting the typical usage (&'static str vs Rc<String>) clarifies any confusion about the enum’s purpose.


246-260: SpanOrLiteral enum effectively unifies two data representations.
This design simplifies pushing either spans or literal strings onto the stack.


1498-1498: Storing spans as SpanOrLiteral::Span is correct.
No functional issues identified here.


1530-1531: Passing the borrowed or owned string to match_string is consistent.
Looks correct and should work seamlessly.


1559-1560: Reusing as_borrowed_or_rc here is fine.
No issues with matching strings from the popped value.


1606-1606: Inline closure usage for matching is correct.
No logical problems detected.


1667-1667: Applying match_string to as_borrowed_or_rc().as_str() is straightforward.
No issues found.

Copy link
Contributor

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. This would need to be feature-guarded #[cfg(feature = "grammar-extras")], because otherwise the different mixing of pest dependencies that cargo may resolve can fail to compile.
  2. there was also this issue #541 -- not sure if making possible to push insensitive literals can be helpful here?

@yshavit
Copy link
Contributor Author

yshavit commented Mar 21, 2025

  1. This would need to be feature-guarded #[cfg(feature = "grammar-extras")], because otherwise the different mixing of pest dependencies that cargo may resolve can fail to compile.

Happy to do that!

  1. there was also this issue Case-insensitive PUSH / PEEK / POP #541 -- not sure if making possible to push insensitive literals can be helpful here?

I don't think it's possible to combine this feature with that one, though I'm open if you have suggestions. Case-insensitivity feels pretty different, and I don't see any reasonably simple abstraction that combines it with matching delims.

But it sounds like, feature guarding aside (and I'd need to get the CIs passing, docs, etc.), this is something you'd be interested in accepting?

@yshavit
Copy link
Contributor Author

yshavit commented Mar 21, 2025

  1. there was also this issue Case-insensitive PUSH / PEEK / POP #541 -- not sure if making possible to push insensitive literals can be helpful here?

Actually, the SpanOrLiteral I have could be extended pretty easily to handle case-insensitivity. I'll hop over to that other issue for a clarifying question, though.

@tomtau
Copy link
Contributor

tomtau commented Mar 21, 2025

But it sounds like, feature guarding aside (and I'd need to get the CIs passing, docs, etc.), this is something you'd be interested in accepting?

I generally don't feel comfortable with all the ad-hoc stack operations in pest for the extra complexity, but I understand why they were introduced... In terms of extending them / adding new ones, I'm open to it as long as the extension solves several compelling unaddressed use cases (vs one niche use case), so that the increased complexity is somewhat justified.
So that's why I asked if it could be potentially adjusted to address that case insensitivity in some way -- but maybe there's not much intersection with that problem.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
pest/src/parser_state.rs (2)

217-221: Enum definition is well-structured.
SpanOrLiteral correctly unifies Span<'i> and Cow<'i, str>. Consider deriving PartialEq if comparisons of SpanOrLiteral might be needed in the future.


223-231: Consider renaming for clarity and avoiding unnecessary clones.
as_borrowed_or_rc() works but may cause extra cloned allocations. Optionally, you might rename it (e.g., as_cow()) and explore ways to avoid cloning when the Cow is already owned.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2eeb379 and fbe2950.

📒 Files selected for processing (3)
  • meta/src/grammar.pest (2 hunks)
  • meta/src/optimizer/mod.rs (3 hunks)
  • pest/src/parser_state.rs (10 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • meta/src/grammar.pest
  • meta/src/optimizer/mod.rs
🧰 Additional context used
🧬 Code Definitions (1)
pest/src/parser_state.rs (6)
pest/src/iterators/tokens.rs (3) (3)
  • pairs (144-144)
  • pairs (146-146)
  • new (33-58)
pest/src/position.rs (2) (2)
  • new (45-47)
  • span (98-107)
pest/src/stack.rs (1) (1)
  • new (51-57)
pest/src/span.rs (3) (3)
  • new (47-53)
  • start (100-102)
  • end (118-120)
pest/src/iterators/pairs.rs (1) (1)
  • new (45-87)
pest/src/iterators/flat_pairs.rs (1) (1)
  • new (32-46)
🔇 Additional comments (9)
pest/src/parser_state.rs (9)

13-20: Imports look consistent.
These new imports align well with the newly added types and functionalities, and no overshadowing or redundancy is apparent.


31-31: Use of Deref is appropriate.
The import of std::ops::Deref is correctly utilized when calling deref() on Cow instances.


442-442: Stack type update is coherent.
Switching the parser stack to Stack<SpanOrLiteral<'i>> makes sense for pushing string literals and spans uniformly.


1083-1112: New stack_push_literal method looks good.
It offers a clear, always-successful push of a literal. Adequate doc comments and inline tests illustrate usage.


1471-1471: Updated push usage aligns with SpanOrLiteral.
Switching from directly pushing Span to wrapping it in SpanOrLiteral::Span is consistent and maintains existing functionality.


1503-1504: stack_peek adaptation is solid.
Using as_borrowed_or_rc() followed by deref() correctly handles both borrowed and owned cases.


1532-1533: stack_pop adaptation is consistent.
Mirroring the stack_peek logic ensures the popped value works seamlessly as a literal or span.


1579-1580: Slice matching updated correctly.
Iterating over SpanOrLiteral and matching via as_borrowed_or_rc() is straightforward and preserves functionality for partial stack matches.


1641-1641: stack_match_pop behavior is consistent.
Replacing direct Span usage with SpanOrLiteral integration remains functionally valid.

@yshavit yshavit marked this pull request as draft March 22, 2025 04:48
@yshavit
Copy link
Contributor Author

yshavit commented Mar 22, 2025

Bumping this PR to Draft status until it's ready for a re-review.

@yshavit yshavit marked this pull request as ready for review March 22, 2025 21:49
@yshavit
Copy link
Contributor Author

yshavit commented Mar 22, 2025

Should be ready now! I added tests and the grammar-extra cfg.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
pest/src/parser_state.rs (1)

90-1161: 💡 Verification agent

🧩 Analysis chain

Verify that this new feature is guarded with #[cfg(feature = "grammar-extras")].

According to the PR comments, this new functionality should be feature-guarded with #[cfg(feature = "grammar-extras")] to prevent potential compilation issues due to mixed pest dependencies. However, this feature guard is missing from the implementation.

Please verify if the entire new PUSH_LITERAL functionality, including the Literal enum, BorrowedOrRc enum, and stack_push_literal method, should be conditionally compiled with this feature flag.


🏁 Script executed:

#!/bin/bash
# Check if other similar functionality uses feature guards
grep -r "#\[cfg(feature = \"grammar-extras\")]" --include="*.rs" .

Length of output: 3960


Feature Guard Missing: Please wrap new literal functionality with #[cfg(feature = "grammar-extras")].

After verifying similar functionality via the grep search, we see that comparable features (in modules like generator, meta, and vm) are conditionally compiled using #[cfg(feature = "grammar-extras")]. The new PUSH_LITERAL functionality—including the Literal enum, the BorrowedOrRc enum, and the stack_push_literal method—is currently missing this guard. To ensure compatibility with projects that do not enable the "grammar-extras" feature and to maintain consistency with the rest of the codebase, the new code should be conditionally compiled using this feature flag.

  • Action Required:
    • Wrap the definition of Literal and its From implementations with #[cfg(feature = "grammar-extras")].
    • Similarly, guard the BorrowedOrRc enum and its related conversion code.
    • Apply the feature guard to the stack_push_literal method in ParserState.

**

🧹 Nitpick comments (4)
pest/src/parser_state.rs (4)

24-24: Add missing documentation comment for Deref import.

The std::ops::Deref import is used for the BorrowedOrRc implementation but lacks a documentation comment to explain its purpose, unlike other imports in this file.


238-242: Consider adding a reason for using Rc instead of Cow in documentation.

The BorrowedOrRc enum is well-designed but the reasoning behind using Rc<String> instead of Cow<'i, str> is not clear from the documentation. Based on previous comments in the PR, this choice was made to avoid copying when cloning strings, but this context would be valuable in the code comments.

-enum BorrowedOrRc<'i> {
+/// A helper enum that provides efficient string handling without unnecessary copying.
+/// We use `Rc<String>` instead of `Cow<'i, str>` to avoid copying of owned strings when cloning,
+/// as `Rc::clone` only increments a reference count.
+enum BorrowedOrRc<'i> {

1152-1152: Consider removing or documenting #[allow(private_bounds)] attribute.

The #[allow(private_bounds)] attribute is used but its purpose is not clear. Consider adding a brief comment explaining why this attribute is needed or remove it if it's not necessary.


90-1161: Add comprehensive test coverage for the new PUSH_LITERAL functionality.

While the PR indicates local testing has been done, it would be beneficial to add comprehensive unit tests for the new PUSH_LITERAL functionality, covering both static and owned string literals. This would ensure the feature works as expected and remains stable as the codebase evolves.

Consider adding tests that:

  1. Verify stack_push_literal works with both static and owned strings
  2. Test interaction between stack_push_literal and other stack operations
  3. Verify that sequences combining normal character matching with literal pushing work correctly
  4. Test edge cases like empty strings and Unicode characters
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbe2950 and cdb7315.

📒 Files selected for processing (11)
  • generator/src/generator.rs (2 hunks)
  • grammars/Cargo.toml (1 hunks)
  • grammars/src/grammars/surround.pest (1 hunks)
  • grammars/src/lib.rs (1 hunks)
  • grammars/tests/surround.rs (1 hunks)
  • meta/src/ast.rs (3 hunks)
  • meta/src/optimizer/mod.rs (4 hunks)
  • meta/src/parser.rs (7 hunks)
  • meta/src/validator.rs (4 hunks)
  • pest/src/parser_state.rs (10 hunks)
  • vm/src/lib.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • generator/src/generator.rs
  • meta/src/validator.rs
🧰 Additional context used
🧬 Code Definitions (3)
meta/src/optimizer/mod.rs (2)
meta/src/ast.rs (2)
  • string (451-453)
  • push_literal (691-696)
meta/src/parser.rs (1)
  • string (1065-1078)
meta/src/ast.rs (1)
meta/src/optimizer/mod.rs (1)
  • push_literal (1098-1103)
pest/src/parser_state.rs (6)
pest/src/error.rs (3)
  • from (83-85)
  • from (89-92)
  • start (498-503)
pest/src/span.rs (4)
  • as_str (197-200)
  • new (47-53)
  • start (100-102)
  • end (118-120)
pest/src/iterators/pair.rs (2)
  • as_str (114-120)
  • new (48-60)
pest/src/iterators/pairs.rs (3)
  • as_str (116-125)
  • as_str (570-574)
  • new (45-87)
pest/src/stack.rs (1)
  • new (51-57)
pest/src/position.rs (2)
  • new (45-47)
  • span (98-107)
🔇 Additional comments (33)
grammars/Cargo.toml (1)

18-18: Correctly enables the "grammar-extras" feature.

The addition of the features = ["grammar-extras"] attribute to the pest_derive dependency is correctly implemented. This follows the convention mentioned in the PR comments about feature-guarding the new PUSH_LITERAL rule to prevent potential compilation issues with mixed pest dependencies.

grammars/src/lib.rs (1)

67-85: Well-documented implementation of the surround module.

The newly added surround module provides a clear example of how to use the new PUSH_LITERAL functionality with different delimiter pairs. The documentation is comprehensive, explaining the purpose of the parser and its limitations. This module directly demonstrates the PR objective of supporting quoted strings with different start and end delimiters.

grammars/tests/surround.rs (3)

17-27: Test cases verify parentheses functionality.

The test correctly verifies that the surround parser handles strings enclosed in parentheses, validating a key use case for the new PUSH_LITERAL functionality.


29-39: Test cases verify angle brackets functionality.

The test correctly verifies that the surround parser handles strings enclosed in angle brackets, providing good coverage of the different delimiter types supported.


41-51: Good edge case testing for delimiter mismatch.

This test properly verifies that the parser rejects strings with mismatched delimiters (starting with one type and ending with another), which is an important validation of the grammar's correctness.

vm/src/lib.rs (1)

239-241: Correctly implements the PUSH_LITERAL functionality.

The implementation of the PushLiteral variant is:

  1. Properly feature-guarded with #[cfg(feature = "grammar-extras")]
  2. Integrated correctly into the expression parser
  3. Implemented to take ownership of the string that needs to be pushed onto the stack

This directly implements the core functionality described in the PR objectives, allowing for pushing literal values onto the stack without consuming input.

grammars/src/grammars/surround.pest (1)

1-9: Well-designed grammar example for the new PUSH_LITERAL feature.

This grammar effectively demonstrates the use case for the new PUSH_LITERAL feature by implementing a flexible quoted text matcher. It elegantly handles different delimiter pairs by pushing the appropriate closing delimiter onto the stack based on the detected opening delimiter.

meta/src/ast.rs (3)

97-99: Good implementation with proper feature gating.

The new PushLiteral variant is correctly added to the Expr enum and properly feature-gated with #[cfg(feature = "grammar-extras")], which aligns with the PR feedback.


326-327: Display implementation is correct.

The implementation of the Display trait for the new variant follows the established pattern in the codebase and correctly formats the PushLiteral variant.


689-696: Comprehensive test case.

The test case effectively verifies that the string representation works correctly, including proper handling of special characters and escaping of quotes within the string.

meta/src/parser.rs (6)

169-171: Good implementation with proper feature gating.

The new PushLiteral variant is correctly added to the ParserExpr enum and properly feature-gated with #[cfg(feature = "grammar-extras")].


210-212: Proper conversion implementation.

The conversion from ParserExpr::PushLiteral to Expr::PushLiteral is correctly implemented in the convert_node function.


235-235: Rule renaming is properly handled.

The rename_meta_rule function is updated to include a case for the new _push_literal rule, returning a user-friendly name.


402-422: Well-implemented parsing logic with proper error handling.

The implementation for handling the _push_literal rule is thorough and includes:

  1. Proper string unescaping
  2. Correct node creation
  3. Appropriate error handling when the feature is not enabled

The code follows the established patterns in the codebase for parsing rules.


1346-1346: Test error message updated correctly.

The test error message is updated to include the new _push_literal rule in the list of expected rules.


1543-1579: Comprehensive test coverage.

The tests thoroughly cover both successful parsing of the PUSH_LITERAL syntax and error conditions when invalid input is provided. The error test specifically verifies that a descriptive error message is shown when the PUSH_LITERAL syntax is used incorrectly.

meta/src/optimizer/mod.rs (4)

73-73: Proper optimization handling.

The to_optimized function is correctly updated to handle the conversion from Expr::PushLiteral to OptimizedExpr::PushLiteral.


156-158: Good implementation with proper feature gating.

The new PushLiteral variant is correctly added to the OptimizedExpr enum and properly feature-gated with #[cfg(feature = "grammar-extras")].


334-335: Display implementation is correct.

The implementation of the Display trait for the new variant follows the established pattern in the codebase and correctly formats the PushLiteral variant.


1096-1103: Comprehensive test case.

The test case effectively verifies that the string representation works correctly, including proper handling of special characters and escaping of quotes within the string.

pest/src/parser_state.rs (13)

90-96: New Literal enum provides flexibility for static and dynamic strings.

The new Literal enum is well-designed, allowing for both static strings (preferred for derived parsers) and owned strings (for VM contexts). This approach optimizes memory usage by avoiding unnecessary string allocations when static strings are available.


98-108: Good implementation of From traits for Literal.

The implementations of From<&'static str> and From<String> for Literal follow best practices, enabling ergonomic conversion from both string types. This will make the API more convenient to use.


245-253: Well-implemented as_str method for BorrowedOrRc.

The as_str method is correctly implemented to return a string slice regardless of the variant, making it convenient to work with either borrowed or owned strings uniformly.


255-262: Efficient conversion from Literal to BorrowedOrRc.

The implementation of From<Literal> for BorrowedOrRc correctly maps the variants and wraps the owned string in an Rc to avoid deep copying when cloned later.


264-268: Good design of SpanOrLiteral enum.

The SpanOrLiteral enum effectively unifies handling of both spans from the input and literal strings, which is essential for the new PUSH_LITERAL functionality.


270-278: Efficient implementation of as_borrowed_or_rc for SpanOrLiteral.

The as_borrowed_or_rc method correctly handles both variants, converting a span to a borrowed string and cloning a literal reference efficiently without deep copying the string content when it's in an Rc.


489-489: Updated stack field type to support literals.

The stack field has been correctly updated from Stack<Span<'i>> to Stack<SpanOrLiteral<'i>> to accommodate both spans and literals. This is a key change that enables the new PUSH_LITERAL functionality.


1130-1161: Well-implemented stack_push_literal method.

The new stack_push_literal method is well-designed:

  1. It accepts any type that can be converted to a Literal using impl Into<Literal>, providing flexibility.
  2. It correctly pushes the literal onto the stack after converting it to a SpanOrLiteral::Literal.
  3. The implementation is concise and follows the existing code style.

The doc comments and examples are also thorough and follow the pattern of other methods.


1520-1520: Updated stack_push method to work with SpanOrLiteral.

The stack_push method has been correctly updated to push a SpanOrLiteral::Span instead of just a Span.


1552-1553: Updated stack_peek method to work with SpanOrLiteral.

The stack_peek method has been correctly updated to use as_borrowed_or_rc() to get a string representation of either a span or literal. The change is minimal and maintains the existing logic.


1581-1582: Updated stack_pop method to work with SpanOrLiteral.

The stack_pop method has been correctly updated to use as_borrowed_or_rc() to get a string representation of either a span or literal, maintaining consistency with the stack_peek method.


1628-1629: Updated stack_match_peek_slice method to work with SpanOrLiteral.

The matcher closure in stack_match_peek_slice has been correctly updated to use as_borrowed_or_rc() to get a string representation of either a span or literal. The change is minimal and preserves the existing logic.


1690-1690: Updated stack_match_pop method to work with SpanOrLiteral.

The string matching in stack_match_pop has been correctly updated to use as_borrowed_or_rc() to get a string representation of either a span or literal, maintaining consistency with other methods.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
derive/tests/surround.pest (1)

3-7: Consider making the grammar more extensible for future delimiter pairs.

The current implementation works well for parentheses and angle brackets. If you plan to support more delimiter pairs in the future, you might want to consider a more maintainable structure.

_QuoteStart = _{
    ( "(" ~ PUSH_LITERAL(")") )
    | ( "<" ~ PUSH_LITERAL(">") )
+   // Easy to add more delimiter pairs:
+   // | ( "[" ~ PUSH_LITERAL("]") )
+   // | ( "{" ~ PUSH_LITERAL("}") )
+   // | ( "/*" ~ PUSH_LITERAL("*/") )
}
derive/tests/surround.rs (1)

35-72: Consider adding more comprehensive test cases.

While the current tests cover basic functionality, consider adding tests for:

  1. Nested quotes (e.g., "(outer <inner>)").
  2. Empty quotes (e.g., "()" and "<>").
  3. Quotes containing their counterpart delimiters (e.g., "(contains > symbol)" or "<contains ) symbol>").
  4. Edge cases with whitespace or special characters.
#[test]
#[cfg(feature = "grammar-extras")]
fn nested_quotes() {
    parses_to! {
        parser: SurroundParser,
        input: "(outer <inner>)",
        rule: Rule::Quote,
        tokens: [
            QuoteChars(1, 14)
        ]
    }
}

#[test]
#[cfg(feature = "grammar-extras")]
fn empty_quotes() {
    parses_to! {
        parser: SurroundParser,
        input: "()",
        rule: Rule::Quote,
        tokens: [
            QuoteChars(1, 1)
        ]
    }
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0ffc568 and 7f3e851.

📒 Files selected for processing (4)
  • derive/tests/surround.pest (1 hunks)
  • derive/tests/surround.rs (1 hunks)
  • pest/src/parser_state.rs (10 hunks)
  • vm/src/lib.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • vm/src/lib.rs
🧰 Additional context used
🧬 Code Definitions (2)
derive/tests/surround.rs (1)
pest/src/parser_state.rs (1)
  • rule (677-801)
pest/src/parser_state.rs (7)
pest/src/span.rs (4)
  • as_str (197-200)
  • new (47-53)
  • start (100-102)
  • end (118-120)
pest/src/iterators/pairs.rs (3)
  • as_str (116-125)
  • as_str (570-574)
  • new (45-87)
pest/src/iterators/pair.rs (2)
  • as_str (114-120)
  • new (48-60)
pest/src/stack.rs (1)
  • new (51-57)
pest/src/position.rs (2)
  • new (45-47)
  • span (98-107)
pest/src/iterators/tokens.rs (1)
  • new (33-58)
pest/src/iterators/flat_pairs.rs (1)
  • new (32-46)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Unit, Style, and Lint Testing
  • GitHub Check: check for semver compatibility
  • GitHub Check: cargo hack check --feature-powerset
  • GitHub Check: check for no_std compatibility
  • GitHub Check: Test Coverage
  • GitHub Check: Documentation check
  • GitHub Check: Minimal Versions Testing
  • GitHub Check: Fuzzing
🔇 Additional comments (20)
derive/tests/surround.pest (2)

1-10: Grammar implementation looks good and demonstrates the PUSH_LITERAL feature well.

This grammar showcases a practical use case for the new PUSH_LITERAL feature by implementing a quote parser that supports different delimiter pairs. The implementation is elegant and demonstrates how PUSH_LITERAL can simplify working with paired delimiters compared to alternative approaches.

The grammar correctly:

  1. Captures opening delimiters and pushes their closing counterparts
  2. Uses PEEK to match content until the closing delimiter
  3. Consumes the closing delimiter with POP

9-10: Verify PEEK usage with appropriate tests.

The !PEEK ~ ANY pattern is key to this grammar's functionality. It ensures characters are consumed only if they're not the expected closing delimiter. This is a good demonstration of combining PUSH_LITERAL with PEEK to create a flexible matching pattern.

Make sure to test this grammar with nested quotes and quotes containing delimiters, such as:

  • "(hello (world))"
  • "<text with > inside>"
    to verify the grammar behaves as expected in edge cases.
derive/tests/surround.rs (5)

14-17: Correctly feature-guarded the implementation.

Good use of the #[cfg(feature = "grammar-extras")] feature guard for the parser implementation as discussed in the PR comments. This will prevent compilation issues due to mixed pest dependencies.


18-33: Well-documented parser with clear examples.

The documentation for the SurroundParser is clear and concise. It effectively explains the purpose of the grammar and provides examples of the supported formats.


35-46: Test case validates parenthesis delimiter functionality.

This test confirms the parser correctly handles text surrounded by parentheses, which is one of the key use cases for the new PUSH_LITERAL feature.


48-59: Test case validates angle bracket delimiter functionality.

This test confirms the parser correctly handles text surrounded by angle brackets, demonstrating the flexibility of the PUSH_LITERAL approach for different delimiter pairs.


61-72: Test case validates error handling for mismatched delimiters.

Good negative test case that verifies the parser correctly rejects input with mismatched opening and closing delimiters.

pest/src/parser_state.rs (13)

22-22: Good: Adding the needed import for Deref

The addition of core::ops::Deref is appropriate for use in the BorrowedOrRc.as_str method.


90-96: Well-designed enum for handling string literals

The Literal enum provides a clean way to handle both static and owned strings, with proper documentation explaining when to use each variant.


98-108: Good implementations for conversion

These From implementations provide convenient, zero-cost conversions from both static string references and owned strings to the Literal enum.


238-258: Well-designed helper for efficient string handling

The BorrowedOrRc enum provides an efficient way to handle string references, using Rc<String> to avoid unnecessary copying when cloning the Owned variant. The documentation clearly explains the rationale for this design choice.

The as_str method implementation correctly handles both variants, using deref() on the Rc<String> to get a string slice.


260-267: Clean conversion implementation

The conversion from Literal to BorrowedOrRc is implemented correctly, with appropriate handling for both variants.


269-283: Good design for unifying spans and literals

The SpanOrLiteral enum allows the parser to handle both spans and literals in a uniform way. The as_borrowed_or_rc method provides a clean interface for working with either variant.

The use of #[inline] is appropriate for this small, frequently-used method.


494-494: Correct stack type update

Updating the stack type from Stack<Span<'i>> to Stack<SpanOrLiteral<'i>> is necessary to support the new literal functionality.


1135-1165: Well-implemented literal push method

The stack_push_literal method is well-designed and implemented, with clear documentation and examples. The use of impl Into<Literal> provides flexibility for callers.

The method correctly returns Ok(self) since the operation always succeeds.

Since the PR description mentions you haven't added tests yet, consider adding unit tests for this new functionality to verify it works correctly with different input types.


1524-1525: Appropriate stack_push update

The update to push SpanOrLiteral::Span instead of Span is necessary to work with the new stack type.


1556-1558: Correct stack_peek adaptation

The method correctly adapts to the new SpanOrLiteral type, using as_borrowed_or_rc to get a string representation regardless of whether the stack entry is a span or a literal.


1585-1587: Proper stack_pop adaptation

Similar to stack_peek, this method correctly adapts to the SpanOrLiteral type, maintaining the existing functionality while supporting the new literal capability.


1632-1634: Good matcher implementation for stack_match_peek_slice

The matcher function is correctly updated to work with the SpanOrLiteral type, ensuring that the method works with both spans and literals.


1694-1695: Proper stack_match_pop adaptation

The method is correctly updated to work with the SpanOrLiteral type, maintaining the existing functionality.

@yshavit yshavit requested a review from tomtau March 23, 2025 05:36
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pest/src/lib.rs (1)

271-275: Consider adding feature guard information to documentation

The documentation clearly explains what the PUSH_LITERAL rule does and provides a helpful example. However, based on the PR comments, this feature should be guarded with #[cfg(feature = "grammar-extras")], but this constraint isn't mentioned in the documentation.

Consider adding a note about the feature requirement:

 `PUSH_LITERAL("a")` pushes the argument to the stack without considering the input. The
 argument must be a literal string. This is often useful in conjunction with another rule before
 it. For example, `"[" ~ PUSH_LITERAL("]")` will look for an opening bracket `[` and, if it
 finds one, will push a closing bracket `]` to the stack.
+
+ Note: This feature requires the `grammar-extras` feature to be enabled.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aa6b1a3 and f25f47d.

📒 Files selected for processing (2)
  • derive/tests/surround.rs (1 hunks)
  • pest/src/lib.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • derive/tests/surround.rs
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: cargo hack check --feature-powerset
  • GitHub Check: check for semver compatibility
  • GitHub Check: Test Coverage
  • GitHub Check: Minimal Versions Testing
  • GitHub Check: check for no_std compatibility
  • GitHub Check: Unit, Style, and Lint Testing
  • GitHub Check: Documentation check
  • GitHub Check: Fuzzing
🔇 Additional comments (2)
pest/src/lib.rs (2)

233-234: Documentation for the new PUSH_LITERAL rule looks good.

The addition of the PUSH_LITERAL special rule documentation is clear and follows the same format as other special rules in this section. This makes it discoverable alongside other stack manipulation features.


266-266: Updated section header correctly includes the new rule.

The section header now properly includes the new PUSH_LITERAL rule alongside the existing stack manipulation rules.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
generator/src/generator.rs (1)

1059-1070: Test covers basic functionality.

The test verifies that the code generator produces the expected output for the PushLiteral variant. This ensures that the implementation is working correctly.

Consider adding a test that combines PushLiteral with other expressions to verify more complex scenarios, such as the use case mentioned in the PR: matching one character and then pushing another.

vm/tests/surround.rs (2)

1-37: Good test setup for the new functionality.

The test correctly verifies a use case of the new PUSH_LITERAL rule by testing a quoted string with parentheses. The test structure follows the pattern used in other pest VM tests.

Consider adding more test cases to verify the functionality with different delimiter pairs (like angle brackets) since the PR mentions this as a key use case. For example:

 #[test]
 fn quote() {
     parses_to! {
         parser: vm(),
         input: "(abc)",
         rule: "Quote",
         tokens: [
             QuoteChars(1, 4)
         ]
     };
 }
+
+#[test]
+fn angle_quote() {
+    parses_to! {
+        parser: vm(),
+        input: "<abc>",
+        rule: "AngleQuote",
+        tokens: [
+            QuoteChars(1, 4)
+        ]
+    };
+}

18-18: Consider including the grammar content for clarity.

It would be helpful to add a comment describing the grammar rules in surround.pest or include a simplified version directly in this file to make the test more self-contained and easier to understand.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd27f98 and 04c23de.

📒 Files selected for processing (3)
  • generator/src/generator.rs (3 hunks)
  • vm/tests/surround.pest (1 hunks)
  • vm/tests/surround.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • vm/tests/surround.pest
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: cargo hack check --feature-powerset
  • GitHub Check: check for no_std compatibility
  • GitHub Check: Test Coverage
  • GitHub Check: Minimal Versions Testing
  • GitHub Check: check for semver compatibility
  • GitHub Check: Unit, Style, and Lint Testing
  • GitHub Check: Documentation check
  • GitHub Check: Fuzzing
🔇 Additional comments (2)
generator/src/generator.rs (2)

573-578: Good implementation with proper feature guarding.

The implementation of the PushLiteral variant in generate_expr is clean and properly feature-guarded with #[cfg(feature = "grammar-extras")] as discussed in the PR comments. This will prevent compilation issues with mixed pest dependencies.


764-769: Correctly implemented for atomic expressions as well.

The implementation for generate_expr_atomic follows the same pattern as the regular variant, which is consistent with how other expression types are handled in the codebase.

Copy link
Contributor

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems there's a semver breaking change due to the auto-traits not being implemented:

type ParserState is no longer Send, in pest/src/parser_state.rs:448
type ParserState is no longer Sync, in pest/src/parser_state.rs:448

Comment on lines -426 to +474
stack: Stack<Span<'i>>,
stack: Stack<SpanOrLiteral<'i>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems this is a semver-breaking change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just because of the Send/Sync, right? (I want to make sure I didn't miss any other aspect -- I wouldn't think so, since this is all just private to parser_state.rs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, looks like it. There are of course semver breaking parts on the Rule enum but that's not usually externally consumed, so that should be ok (it's got semver broken since early version 2.X and noone reported any issues due to it)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
pest/src/parser_state.rs (2)

218-265: Consider adding feature flag for grammar-extras

Based on the discussions in the PR comments, these new changes should be guarded by #[cfg(feature = "grammar-extras")] to prevent compatibility issues with mixed pest dependencies. This will help maintain backward compatibility with existing codebases.

+ #[cfg(feature = "grammar-extras")]
enum BorrowedOrArc<'i> {
    Borrowed(&'i str),
    Owned(Arc<String>),
}

+ #[cfg(feature = "grammar-extras")]
enum SpanOrLiteral<'i> {
    Span(Span<'i>),
    Literal(BorrowedOrArc<'i>),
}

// And similar for the implementation blocks and the stack_push_literal method

Also applies to: 476-476, 1117-1147, 1506-1507, 1538-1539, 1567-1568, 1614-1615, 1676-1676


476-476: Consider backward compatibility concerns

Changing the stack field type from Stack<Span<'i>> to Stack<SpanOrLiteral<'i>> could affect API consumers that may be directly accessing this field through unsafe code. Consider whether this change needs to be communicated in release notes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04c23de and 52694ae.

📒 Files selected for processing (1)
  • pest/src/parser_state.rs (9 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
pest/src/parser_state.rs (10)
pest/src/error.rs (5)
  • fmt (719-721)
  • fmt (725-730)
  • from (83-85)
  • from (89-92)
  • start (498-503)
pest/src/position.rs (3)
  • fmt (445-447)
  • new (45-47)
  • span (98-107)
pest/src/iterators/pairs.rs (5)
  • fmt (456-458)
  • fmt (462-471)
  • as_str (116-125)
  • as_str (570-574)
  • new (45-87)
pest/src/span.rs (5)
  • fmt (274-280)
  • as_str (197-200)
  • new (47-53)
  • start (100-102)
  • end (118-120)
pest/src/iterators/pair.rs (4)
  • fmt (328-338)
  • fmt (342-363)
  • as_str (114-120)
  • new (48-60)
pest/src/iterators/flat_pairs.rs (2)
  • fmt (152-156)
  • new (32-46)
pest/src/iterators/tokens.rs (2)
  • fmt (129-131)
  • new (33-58)
pest/src/stack.rs (1)
  • new (51-57)
pest/src/iterators/line_index.rs (1)
  • new (18-31)
pest/src/pratt_parser.rs (1)
  • new (211-219)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: check for no_std compatibility
  • GitHub Check: check for semver compatibility
  • GitHub Check: Test Coverage
  • GitHub Check: cargo hack check --feature-powerset
  • GitHub Check: Unit, Style, and Lint Testing
  • GitHub Check: Documentation check
  • GitHub Check: Minimal Versions Testing
  • GitHub Check: Fuzzing
🔇 Additional comments (12)
pest/src/parser_state.rs (12)

13-14: New imports added for literal handling support

These new imports support the functionality to handle both borrowed string slices and owned Arc strings efficiently, with Deref being used to access the string content from Arc wrappers.

Also applies to: 18-19, 23-23


218-240: New BorrowedOrArc enum for efficient string handling

The enum provides an elegant solution to avoid unnecessary string copying by using Arc<String> instead of Cow<'i, str> when cloning. This design ensures thread safety by implementing Send and Sync traits.

The comment clearly explains why this approach was chosen over Cow - to avoid copying when the Owned variant is cloned, since Arc::clone only increments a reference count.


242-249: Implementation of From for BorrowedOrArc

The implementation is concise and correctly handles both borrowed and owned variants of the input Cow, preserving the efficiency goal by wrapping owned strings in Arc.


251-265: SpanOrLiteral enum and implementation

This enum is the core addition that enables PUSH_LITERAL functionality by allowing the stack to hold both Spans (matched input) and Literals (pushed values). The as_borrowed_or_rc method properly handles both variants and returns a cloned BorrowedOrArc to avoid borrowing issues when calling methods that require mutable access to the parser state.


476-476: Updated stack field type in ParserState

The field type has been changed from Stack<Span<'i>> to Stack<SpanOrLiteral<'i>> to accommodate both matched spans and literal values on the stack.


1117-1147: New stack_push_literal method

This method implements the core PUSH_LITERAL functionality by pushing a literal string onto the stack without consuming any input. The implementation is straightforward and well-documented with usage examples.

The method signature takes impl Into<Cow<'static, str>> which provides flexibility while enforcing that literals must have a 'static lifetime.


1506-1507: Updated stack_push method to use SpanOrLiteral

The stack.push call has been updated to wrap the span in a SpanOrLiteral::Span variant to accommodate the new stack type.


1538-1539: Updated stack_peek method

The method now correctly handles both Span and Literal variants by using the as_borrowed_or_rc method to get a string representation of the stack item.


1567-1568: Updated stack_pop method

Similar to stack_peek, this method has been updated to handle both Span and Literal variants.


1614-1615: Updated stack_match_peek_slice method

The matcher closure has been updated to work with SpanOrLiteral items by extracting the string representation using the as_borrowed_or_rc method.


1676-1676: Updated stack_match_pop method

The string matching logic has been updated to handle both Span and Literal variants on the stack.


218-265:

Details

❓ Verification inconclusive

Verify Send and Sync implications

Based on previous review comments, the change to use Arc instead of Rc may have semver-breaking implications due to the Send/Sync requirements. Ensure this is properly documented and communicated to users.


🏁 Script executed:

#!/bin/bash
# Check for implementations of Send/Sync before this change
grep -r "impl.*Send" --include="*.rs" .
grep -r "impl.*Sync" --include="*.rs" .

Length of output: 83


Action: Verify and Document the Send/Sync Behavior Impact

The change from using Rc to using Arc for the BorrowedOrArc type leverages Rust’s auto trait implementations for Send and Sync. The grep searches did not reveal any explicit implementations of Send or Sync, which is expected since these are automatically derived by the compiler for types like Arc<String>. However, because this change might have semver implications when consumers rely on specific thread-safety guarantees, please ensure that this behavior is clearly documented in the changelog or migration guide.

  • Confirm that the documentation or release notes detail that Arc is used to guarantee Send and Sync traits, contrasting with the previous Rc usage.
  • If further confidence is needed, perform a manual review of the surrounding codebase to ensure no unintended consequences arise from this change.

@yshavit
Copy link
Contributor Author

yshavit commented Mar 23, 2025

Updated parser_state.rs to use Arc instead of Rc. cargo semver-checks now passes for me (I was able to repro the failure before I made that change), except for pest-meta, which fails to build because of windows path issues in importing the grammars.

@yshavit yshavit requested a review from tomtau March 23, 2025 20:17
Copy link
Contributor

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomtau tomtau merged commit 2b30b8d into pest-parser:master Mar 24, 2025
9 checks passed
@yshavit
Copy link
Contributor Author

yshavit commented Mar 24, 2025

Will do!

yshavit added a commit to yshavit/pest-book that referenced this pull request Mar 24, 2025
@yshavit yshavit deleted the push_other branch March 24, 2025 01:33
tomtau pushed a commit to pest-parser/book that referenced this pull request Mar 25, 2025
* add documentation for PUSH_LITERAL

See pest-parser/pest#1092

* fill out the rest of the stack docs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants