Skip to content

refactor: replace shell-utils string parsing with tree-sitter AST#68

Open
BingqingLyu wants to merge 6 commits into
mainfrom
fork-pr-2652-feat-shell-utils-ast
Open

refactor: replace shell-utils string parsing with tree-sitter AST#68
BingqingLyu wants to merge 6 commits into
mainfrom
fork-pr-2652-feat-shell-utils-ast

Conversation

@BingqingLyu

@BingqingLyu BingqingLyu commented Apr 27, 2026

Copy link
Copy Markdown
Owner

TLDR

Replace all string-based shell command parsing in shell-utils.ts, rule-parser.ts, and shell-semantics.ts with tree-sitter AST-first approach, using the existing shellAstParser.ts infrastructure. This significantly improves parsing robustness for quoted strings, heredocs, nested constructs, and complex compound commands.

Screenshots / Video Demo

N/A — no user-facing change. This is an internal refactor of shell command parsing logic.

Dive Deeper

Architecture: Sync/Async Hybrid with Lazy Initialization

The key challenge is that tree-sitter WASM initialization is async (Parser.init() + Language.load()), but most callers are synchronous. The solution:

  1. ensureParserInitStarted() — All sync AST functions call this when the parser isn't ready, triggering a fire-and-forget initParser(). The current call falls back to the legacy implementation, but the parser starts loading in the background.
  2. Subsequent calls — Once the WASM loads (typically within milliseconds), all sync functions use the AST path.
  3. Legacy fallback preserved — Original string-based implementations are retained as *Legacy() functions, ensuring correctness even if the parser fails to load.

Changes by File

File Change
shellAstParser.ts Added sync AST functions: splitCommandsAST, getCommandRootAST, getCommandRootsAST, detectCommandSubstitutionAST, tokenizeCommandAST, isShellCommandReadOnlySync, ensureParserInitStarted
shell-utils.ts Migrated splitCommands, getCommandRoot, getCommandRoots, detectCommandSubstitution, isCommandNeedsPermission to AST-first with legacy fallback. Added BASIC_READ_ONLY_COMMANDS lightweight fallback set.
rule-parser.ts Migrated splitCompoundCommand to AST-first with legacy fallback
shell-semantics.ts Migrated extractShellOperations to AST-first tokenization with legacy fallback
shellReadOnlyChecker.ts Deleted (superseded by isShellCommandReadOnlySync in shellAstParser)
shellAstParser.test.ts Added 42 new tests for sync AST functions
shell-semantics.test.ts Added 29 new AST-path integration tests

Three-tier fallback for isCommandNeedsPermission

  1. AST full analysis (isShellCommandReadOnlySync) — when parser is ready
  2. BASIC_READ_ONLY_COMMANDS set (getCommandRoot + set lookup) — lightweight sync fallback
  3. Conservative deny — requires permission if neither of the above can determine safety

Reviewer Test Plan

  1. Run unit tests: npx vitest run packages/core/src/utils/shellAstParser.test.ts packages/core/src/utils/shell-utils.test.ts packages/core/src/permissions/shell-semantics.test.ts packages/core/src/permissions/permission-manager.test.ts packages/core/src/tools/shell.test.ts
  2. Verify all 500+ related tests pass
  3. Run full test suite: npm run test
  4. Run typecheck: npx tsc --noEmit --project packages/core/tsconfig.json
  5. Test interactively: run npm start, execute shell commands, verify permission prompts work correctly for both read-only and mutating commands

Testing Matrix

🍏 🪟 🐧
npm run
npx
Docker
Podman - -
Seatbelt - -

Linked issues / bugs

N/A — proactive refactor to improve shell parsing robustness.

- Add sync AST functions to shellAstParser.ts (splitCommandsAST, getCommandRootAST,
  getCommandRootsAST, detectCommandSubstitutionAST, tokenizeCommandAST,
  isShellCommandReadOnlySync)
- Add ensureParserInitStarted() for fire-and-forget lazy initialization
- Migrate shell-utils.ts functions to AST-first with legacy fallback
- Migrate rule-parser.ts splitCompoundCommand to AST-first
- Migrate shell-semantics.ts extractShellOperations to AST-first
- Add BASIC_READ_ONLY_COMMANDS fallback set for isCommandNeedsPermission
- Delete deprecated shellReadOnlyChecker.ts and its tests
- Add comprehensive tests for all new sync AST functions
- Add AST-path integration tests for shell-semantics
# Conflicts:
#	packages/core/src/permissions/rule-parser.ts
#	packages/core/src/utils/shell-utils.ts
#	packages/core/src/utils/shellAstParser.test.ts
- shellAstParser.ts: remove import from deleted shellReadOnlyChecker.ts,
  replace fallback calls with conservative false (requires permission)
- coreToolScheduler.ts: migrate from deleted isShellCommandReadOnly to
  isShellCommandReadOnlySync from shellAstParser.ts (null → false)
- shellAstParser.test.ts: update fallback tests to reflect conservative
  behavior, remove consistency suite that depended on deleted file
- coreToolScheduler: use isCommandNeedsPermission (three-tier fallback)
  instead of isShellCommandReadOnlySync ?? false for isConcurrencySafe;
  this avoids treating all commands as sequential when WASM is still loading
- shell-utils: add GIT_READ_ONLY_SUBCOMMANDS set and git subcommand check
  to isCommandNeedsPermission lightweight fallback, restoring the behavior
  that shellReadOnlyChecker.ts previously provided for git log/status/diff etc.

Fixes: should run read-only shell commands concurrently test
@BingqingLyu BingqingLyu added conflicting-group-1 Conflicting PR group 1 — review as a batch conflicting-pr Shares at least one cross-PR dependency with other PRs labels May 7, 2026
@BingqingLyu

BingqingLyu commented May 7, 2026

Copy link
Copy Markdown
Owner Author

Conflict Group 1

This PR shares modified functions with 1 other PR(s): #96.

These PRs should be reviewed as a batch — merging one may affect the others.

Function File Also modified by
partitionToolCalls coreToolScheduler.ts #96
graph LR
    PR68["PR #68"]
    FpartitionToolCalls_1638["partitionToolCalls<br>coreToolScheduler.ts"]
    PR68 -->|modifies| FpartitionToolCalls_1638
    PR96["PR #96"]
    PR96 -->|modifies| FpartitionToolCalls_1638
Loading

Posted by codegraph-ai conflict detection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

conflicting-group-1 Conflicting PR group 1 — review as a batch conflicting-pr Shares at least one cross-PR dependency with other PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants