Add FP reduction (script-aware ZWJ/ZWNJ) and OCR confusion detection#45
Merged
push_segment() incorrectly treated VAR=VALUE as the command token. Now skips leading environment variable assignments to find the real command. Adds pub is_env_assignment() helper for use by engine bypass detection. Fixes: TIRITH=0 curl evil.com now correctly identifies curl as command. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
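The prefix-skipping behavior described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the PR: the body of `is_env_assignment` is an assumption (only its name appears in the commit message), and `command_and_args` is a hypothetical helper standing in for the relevant part of `push_segment()`.

```rust
/// Assumed shape of the helper: true if `word` looks like a shell
/// `NAME=value` assignment (NAME starts with a letter or `_`, then
/// letters, digits, or `_`).
pub fn is_env_assignment(word: &str) -> bool {
    match word.split_once('=') {
        Some((name, _value)) => {
            let mut chars = name.chars();
            match chars.next() {
                Some(c) if c.is_ascii_alphabetic() || c == '_' => {
                    chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
                }
                // Empty name or invalid first character: not an assignment.
                _ => false,
            }
        }
        None => false,
    }
}

/// Hypothetical helper: skip leading assignments and return the real
/// command plus its arguments, or `None` if only assignments are present.
pub fn command_and_args<'a>(words: &[&'a str]) -> Option<(&'a str, Vec<&'a str>)> {
    let start = words.iter().position(|w| !is_env_assignment(w))?;
    Some((words[start], words[start + 1..].to_vec()))
}
```

With this sketch, the words `TIRITH=0`, `curl`, `evil.com` yield `curl` as the command, while a flag such as `--flag=value` is not mistaken for an assignment because its name part does not start with a letter or underscore.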
Add command-aware output-flag skipping for curl (-o/--output) and wget (-O/-OFILE/--output-document). Extract URLs from command+args instead of raw segment text to avoid matching URLs in env-prefix values. Add conservative non-TLD file extensions (.png, .jpg, .mp4, etc.) to schemeless host exclusion list. Fixes issue #33. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
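A minimal sketch of command-aware output-flag skipping, assuming the arguments have already been split into words. The function name `args_without_output_target` is hypothetical and the PR's actual implementation may differ; only the flag forms listed in the commit message (plus wget's standard `--output-document=FILE` spelling) are handled.

```rust
/// Hypothetical helper: return the arguments of `cmd` with output-file
/// flags and their values removed, so that a file name such as `out.png`
/// is never considered as a schemeless host.
pub fn args_without_output_target<'a>(cmd: &str, args: &[&'a str]) -> Vec<&'a str> {
    let mut kept = Vec::new();
    let mut i = 0;
    while i < args.len() {
        let arg = args[i];
        // `separate`: the flag takes its value from the next word.
        // `attached`: the value is glued to the flag itself.
        let (separate, attached) = match cmd {
            "curl" => (arg == "-o" || arg == "--output", false),
            "wget" => (
                arg == "-O" || arg == "--output-document",
                (arg.starts_with("-O") && arg.len() > 2)
                    || arg.starts_with("--output-document="),
            ),
            _ => (false, false),
        };
        if separate {
            i += 2; // skip the flag and the file name that follows it
        } else if attached {
            i += 1; // skip the single word holding flag and file name
        } else {
            kept.push(arg);
            i += 1;
        }
    }
    kept
}
```

URL extraction would then run over the returned words rather than the raw segment text, which also keeps URLs inside env-prefix values out of the match.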
…paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
crates/tirith-core/src/engine.rs
Outdated
        return true;
    }
    // Path form: try canonicalize and compare to current_exe
    if cmd.contains('/') {
🟡 Medium src/engine.rs:281
On Windows, paths may use backslashes. Consider also checking for \ so paths like .\tirith or C:\path\to\tirith.exe are handled correctly.
Suggested change
-    if cmd.contains('/') {
+    if cmd.contains('/') || cmd.contains('\\') {
Evidence trail:
crates/tirith-core/src/engine.rs lines 276-293 (REVIEWED_COMMIT): `is_tirith_command` function checks `cmd.contains('/')` at line 281 to determine if cmd is a path. Windows backslash paths would not match this condition.
crates/tirith/src/cli/init.rs line 71 (REVIEWED_COMMIT): `#[cfg(windows)]` confirms the project explicitly supports Windows.
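The suggested check can be isolated as a small predicate. This is a sketch only; `looks_like_path` is a hypothetical name and does not appear in the PR.

```rust
/// Hypothetical predicate: treat `cmd` as a path if it contains either
/// a forward slash or a backslash.
pub fn looks_like_path(cmd: &str) -> bool {
    cmd.contains('/') || cmd.contains('\\')
}
```

One trade-off to keep in mind: on Unix a backslash is a legal file-name character, so a stricter variant might apply the backslash test only under `cfg!(windows)`.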
- Run cargo fmt --all
- Fix uninlined_format_args clippy lints in cli_integration tests
- Add .cargo/audit.toml ignoring RUSTSEC-2026-0009 (time crate DoS, not exploitable in our usage, fix requires Rust 1.88)
- Add same ignore to deny.toml

- Merge origin/main (glibc build fix)
- Fix single & segment boundary in split_raw_words (security)
- Use exact match == TIRITH=0 (prevents false bypass)
- Skip flags in resolve_command_wrapper
- Remove dead code in is_tirith_command
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
            i += 1;
        }
    }
    '"' => {
🟢 Low src/engine.rs:134
Quoted strings preserve quote characters (e.g., "TIRITH=0" becomes literal "TIRITH=0"), so the comparison == "TIRITH=0" won't match quoted input. Consider stripping outer quotes from each word before returning.
Evidence trail:
crates/tirith-core/src/engine.rs lines 134-149 (REVIEWED_COMMIT): `split_raw_words` function explicitly pushes opening quote at line 135 (`current.push(ch);`) and closing quote at line 148 (`current.push(chars[i]);`), preserving quote characters in the returned string.
crates/tirith-core/src/engine.rs line 46 (REVIEWED_COMMIT): comparison `if words[idx] == "TIRITH=0"` compares against literal string without quotes.
Result: Input `"TIRITH=0" cmd` produces word `"\"TIRITH=0\""` which fails equality test against `"TIRITH=0"`.
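One way to address this is to strip a single pair of matching outer quotes before the exact comparison. The sketch below is illustrative; `strip_outer_quotes` and `is_bypass_word` are hypothetical names, and the sketch deliberately ignores partial quoting such as `TIRITH="0"`, which a real shell would also unquote.

```rust
/// Hypothetical helper: remove one pair of matching outer quotes
/// (double or single) from `word`, if present.
pub fn strip_outer_quotes(word: &str) -> &str {
    for q in ['"', '\''] {
        // Require at least two bytes so a lone quote is left untouched.
        if word.len() >= 2 && word.starts_with(q) && word.ends_with(q) {
            // Both quote characters are one byte, so byte slicing is safe.
            return &word[1..word.len() - 1];
        }
    }
    word
}

/// Hypothetical wrapper: exact-match bypass check on the unquoted word.
pub fn is_bypass_word(word: &str) -> bool {
    strip_outer_quotes(word) == "TIRITH=0"
}
```

The comparison stays an exact match, so values such as `TIRITH=00` are still rejected after unquoting.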
Resolve 12 file conflicts by taking main's versions for shell hooks, deny.toml, doctor.rs, extract.rs, and tokenize.rs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Test plan
🤖 Generated with Claude Code
Note
Add inline bypass detection and self-invocation resolution in crates/tirith-core/src/engine.rs while introducing a duplicate `is_tirith_command` that breaks the build

Introduce `engine.find_inline_bypass`, `engine.split_raw_words`, and wrapper resolvers for `env`, `command`, and `time`, add `engine.is_self_invocation`, and accidentally duplicate `engine.is_tirith_command`, causing a compile error in engine.rs.

📍 Where to Start
Start with `find_inline_bypass` and the new tokenization path in engine.rs, then review the duplicate `is_tirith_command` definition at the bottom of the file.

Macroscope summarized 1e25ba1.