refactor(markdown-parser): decompose parse_link_or_image into phase helpers#9452
Conversation
…elpers Extract four helpers from the monolithic `parse_link_or_image` function: - `parse_bracketed_link_text`: parses `[text]` or `![alt]`, reports facts - `parse_inline_link_tail`: parses `(destination "title")` after `]` - `parse_reference_link_tail`: handles full, collapsed, and shortcut references - `abort_link_parse`: centralized rollback with `RollbackKind` enum The coordinator (`parse_link_or_image`) now owns orchestration and policy (nested-link rejection, inline validation dispatch) while helpers report facts and return success/rollback results. Key improvements: - Replaced `expect_with_context(L_PAREN)` with `eat_with_context` to eliminate diagnostic emission for internal preconditions - Centralized rollback ensures consistent state restoration across all failure paths (marker abandonment, checkpoint rewind, conditional relex) - Zero `unwrap`/`expect`/`panic!`/`unreachable!` in the touched flow Pure structural refactor — no behavioral changes.
|
WalkthroughThis change refactors the markdown link parsing logic in Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/biome_markdown_parser/src/syntax/inline/links.rs`:
- Around line 215-220: The rollback path in the match handling
parse_inline_link_tail incorrectly uses RollbackKind::AfterLexSwitch even when
eat_with_context(L_PAREN, MarkdownLexContext::LinkDefinition) failed (no
lex-context switch occurred); update the LinkTailResult::Rollback branch in the
match (the call to abort_link_parse(p, checkpoint, m,
RollbackKind::AfterLexSwitch)) to use RollbackKind::BeforeLexSwitch instead so
abort_link_parse accurately reflects that no lex switch happened (refer to
parse_inline_link_tail, abort_link_parse, eat_with_context, and the RollbackKind
enum).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 693b12a5-cd1d-4ab9-989a-4539016e9931
📒 Files selected for processing (1)
crates/biome_markdown_parser/src/syntax/inline/links.rs
| match parse_inline_link_tail(p, kind, opening_range) { | ||
| LinkTailResult::Parsed => Present(m.complete(p, kind.inline_kind())), | ||
| LinkTailResult::Rollback => { | ||
| abort_link_parse(p, checkpoint, m, RollbackKind::AfterLexSwitch) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the implementation of eat_with_context to understand context-switching semantics
ast-grep --pattern $'fn eat_with_context($$$) {
$$$
}'
# Also check bump_with_context for comparison
rg -n -A10 'fn eat_with_context' --type rustRepository: biomejs/biome
Length of output: 705
🏁 Script executed:
sed -n '376,395p' crates/biome_parser/src/lib.rsRepository: biomejs/biome
Length of output: 519
🏁 Script executed:
rg -n -A15 'fn bump_with_context' crates/biome_parser/src/lib.rs | head -40Repository: biomejs/biome
Length of output: 561
🏁 Script executed:
rg -n 'RollbackKind' crates/biome_markdown_parser/src/syntax/ -A2 -B2Repository: biomejs/biome
Length of output: 3516
🏁 Script executed:
rg -n 'parse_bracketed_link_text' crates/biome_markdown_parser/src/syntax/inline/ -A20 | head -50Repository: biomejs/biome
Length of output: 4464
🏁 Script executed:
sed -n '240,280p' crates/biome_markdown_parser/src/syntax/inline/links.rsRepository: biomejs/biome
Length of output: 1604
The use of RollbackKind::AfterLexSwitch when eat_with_context fails is semantically incorrect.
When eat_with_context(L_PAREN, MarkdownLexContext::LinkDefinition) returns false at line 277, no token was consumed and no context switch occurred—the function returns early without calling do_bump_with_context. However, the coordinator uses RollbackKind::AfterLexSwitch at line 218, which causes force_relex_regular() to be called, implying a context switch that never happened.
In practice, this is a no-op (re-lexing in Regular context when already in Regular context changes nothing), but for correctness, consider using RollbackKind::BeforeLexSwitch instead, since no lex-context switch actually occurred.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/biome_markdown_parser/src/syntax/inline/links.rs` around lines 215 -
220, The rollback path in the match handling parse_inline_link_tail incorrectly
uses RollbackKind::AfterLexSwitch even when eat_with_context(L_PAREN,
MarkdownLexContext::LinkDefinition) failed (no lex-context switch occurred);
update the LinkTailResult::Rollback branch in the match (the call to
abort_link_parse(p, checkpoint, m, RollbackKind::AfterLexSwitch)) to use
RollbackKind::BeforeLexSwitch instead so abort_link_parse accurately reflects
that no lex switch happened (refer to parse_inline_link_tail, abort_link_parse,
eat_with_context, and the RollbackKind enum).
There was a problem hiding this comment.
Inline tail rollback is conservatively treated as AfterLexSwitch because failures can occur both before and after eat_with_context; the latter requires regular-context re-lex.
There was a problem hiding this comment.
Tip @jfmcdowell : start the sentence with @coderabbitai and it will pick up your message (you need to at it)
There was a problem hiding this comment.
@jfmcdowell Thanks for the explanation — the conservative AfterLexSwitch approach makes sense. Since parse_inline_link_tail can fail at multiple points (before or after eat_with_context), the coordinator can't distinguish which happened, and calling force_relex_regular() when already in Regular context is indeed a no-op. No change needed here.
🧠 Learnings used
Learnt from: ematipico
Repo: biomejs/biome PR: 8031
File: crates/biome_html_parser/src/syntax/svelte.rs:140-147
Timestamp: 2025-11-09T12:47:46.298Z
Learning: In the Biome HTML parser, `expect` and `expect_with_context` consume the current token and then lex the next token. The context parameter in `expect_with_context` controls how the next token (after the consumed one) is lexed, not the current token being consumed. For example, in Svelte parsing, after `bump_with_context(T!["{:"], HtmlLexContext::Svelte)`, the next token is already lexed in the Svelte context, so `expect(T![else])` is sufficient unless the token after `else` also needs to be lexed in a specific context.
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `p.eat(token)` for optional tokens, `p.expect(token)` for required tokens, `parse_rule(p).ok(p)` for optional nodes, and `parse_rule(p).or_add_diagnostic(p, error)` for required nodes
Learnt from: ematipico
Repo: biomejs/biome PR: 8537
File: crates/biome_js_analyze/src/lint/nursery/no_leaked_render.rs:167-210
Timestamp: 2025-12-22T09:27:13.161Z
Learning: In crates/biome_analyze/**/*analyze/src/**/*.rs, the `fix_kind` field in `declare_lint_rule!` should only be specified when the rule implements the `action` function. Rules that only emit diagnostics without providing code fixes should not include `fix_kind` in their metadata.
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Implement error recovery in list parsing using `or_recover()` to wrap unparseable tokens in a `BOGUS_*` node and consume tokens until a recovery token is found
Learnt from: dyc3
Repo: biomejs/biome PR: 8901
File: crates/biome_analyze_macros/src/group_macro.rs:9-9
Timestamp: 2026-01-29T22:10:30.888Z
Learning: Do not import xtask inside biome crates. In the biomeps repository, ensure that biome crates (e.g., crates/biome_*/...) do not have imports like use xtask_glue::... or other xtask-related crates. The correct dependency direction is for xtask to depend on and import biome crates, not the reverse, since xtask contains heavier utilities (e.g., git2) that can inflate build times when included in biome crates. When reviewing, verify dependency declarations (Cargo.toml) and any use statements to confirm xtask crates are not pulled into biome crates; prefer having xtask depend on biome crates and export needed functionality through stable interfaces.
Note
AI Assistance Disclosure: This PR was developed with assistance from Claude Code.
Summary
Decompose
parse_link_or_imageinto four focused helpers so link/image parsing is easier to follow and review without changing markdown parsing behavior.parse_bracketed_link_textfor[text]/![alt]parsing.parse_inline_link_tailfor(destination "title")parsing.parse_reference_link_tailfor full, collapsed, and shortcut reference handling.abort_link_parsewithRollbackKindenum to centralize all rollback paths (marker abandonment, checkpoint rewind, conditional re-lex).expect_with_context(L_PAREN, ...)witheat_with_contextto eliminate panicking code and spurious diagnostics on internal preconditions.Test Plan
just test-crate biome_markdown_parserjust test-markdown-conformance(652/652, 100%)just f && just lDocs
N/A — internal parser refactor with no user-facing behavior change.