Add Pro-tier enrichment pipeline and verdict extensions by sheeki03 · Pull Request #41 · sheeki03/tirith

sheeki03 · 2026-02-24T15:39:16Z

Summary

Add dual-view enrichment for rendered content findings (human vs AI agent perspective)
Add paranoia-level filtering with tier-gated depth control
Extend verdict structs with human_view, agent_view, and enrichment fields
Improve URL handling in extract.rs for enrichment context

Test plan

All existing tests pass
Manual: verify enriched output on rendered content findings with Pro tier

🤖 Generated with Claude Code

Note

Add Pro-tier enrichment pipeline, paranoia filtering, and clipboard HTML detection with `rules::terminal::check_clipboard_html` and CLI `paste --html` (50+ char discrepancy threshold)

Introduce rendered-content and PDF checks, tier-gated enrichment and early-access filtering in engine::analyze, new MCP server/resources/tools, and a Unix fetch cloaking command; extend Finding with mitre_id/custom_rule_id, add clipboard HTML to AnalysisContext, and expose new RuleId variants.

📍Where to Start

Start with engine::analyze in engine.rs, then review rendered detection in rules::rendered::check in rendered.rs and clipboard handling in rules::terminal::check_clipboard_html in terminal.rs.

^{Macroscope summarized eb70868.}

Implements tirith as an MCP server over stdio, making all detection capabilities available to AI agents (Claude Code, Cursor, Windsurf). MCP module (crates/tirith-core/src/mcp/): - dispatcher.rs: State machine (AwaitingInit → Initialized → Ready), pre-init enforcement (-32002), version negotiation (2025-11-25 primary, 2025-06-18 fallback), parse error handling (-32700), 9 unit tests - types.rs: JSON-RPC 2.0 + MCP protocol structs, version negotiation - tools.rs: 7 tools (6 cross-platform + tirith_fetch_cloaking Unix-only stub), structured + text responses, directory scan capped at 50 files - resources.rs: tirith://project-safety resource with cwd scanning - CLI: tirith mcp-server subcommand (thin stdio wrapper) Community-only — no tier gating. License gates added in Part 7. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…validation Critical: notifications/initialized before initialize no longer transitions to Ready — only honored when state is already Initialized, preventing clients from bypassing pre-init enforcement. High: Two-phase parsing separates true parse errors (-32700, null id) from structurally invalid requests (-32600, preserves id when recoverable). Missing "method" field now correctly returns -32600 instead of -32700. Medium: Validates jsonrpc == "2.0" and rejects non-string/number/null id types (object, array, bool) with -32600. Removes unused JsonRpcRequest struct from types.rs. 9 new tests covering all three fixes plus string id, null id, and missing jsonrpc field edge cases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Introduces the paid tier infrastructure (Community/Pro/Team/Enterprise) with offline license key validation. Removes all file scan caps — detection is always free with full coverage per ADR-13. Key changes: - license.rs: Tier enum, base64 JSON key decoder, env/file resolution - rule_metadata.rs: Time-boxed early-access gates with Critical bypass - engine.rs: Enrichment stubs (enrich_pro/enrich_team) + early-access filter - verdict.rs: LicenseRequired RuleId variant - Removed 50-file caps from scan.rs, cli/scan.rs, mcp/tools.rs, mcp/resources.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Mark decode_tier() and current_tier() as unsigned bootstrap format, explicitly not a security boundary (finding #1) - Remove last remaining max_files: Some(50) in read_project_safety() (finding #2) - Extract filter_early_access_with() taking a metadata slice parameter, rewrite tests to exercise actual filter logic with injected rule meta: critical bypass, medium suppression, pro pass-through, no-metadata (finding #3) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ranoia tiers (Part 8) Phase 6: Rendered content scanning (CSS hiding, WCAG 2.0 color hiding, HTML hidden/aria-hidden attributes, markdown/HTML comments) with 8 new RuleIds. Detection runs for all tiers; dual-view enrichment (human_view/ agent_view) populated for Pro. Phase 10: Server-side cloaking detection (Unix-only, reqwest). Fetches URL with 6 user-agents, normalizes HTML, word-level diff with >10 char threshold. Pro gets full diff text; Free gets detection boolean. Phase 11: Clipboard rich-text analysis via --html flag on paste command. Detects CSS hiding and length discrepancy in clipboard HTML. Phase 13: Checkpoint/rollback system with auto-trigger on destructive commands (rm -rf, git reset --hard, etc). SHA-256 file hashing, configurable retention. Pro-gated in core per ADR-6. Phase 15: Paranoia tiers 3-4 (Low/Info findings) gated to Pro. Filter applied in core scan module and wired into CLI check/paste and MCP tools. Phase 19: PDF hidden text detection via lopdf. Flags sub-pixel scale transforms. Additional fixes from review: - UTF-8 safe truncation in rendered.rs and cloaking.rs (char-based, not byte-based) preventing panics on multibyte boundaries - Pro enrichment implemented for 7 finding types with dual-view output - Paranoia filtering applied in core scan_single_file/scan_stdin so all consumers (CLI, MCP, resources) benefit automatically - License gates in core checkpoint module, not CLI (ADR-6) - LicenseRequired finding emitted when scan truncated by tier cap - Scan file cap: Community 50 files, Pro unlimited (CLI + MCP + resources) - Doctor reports cloaking_available/webhooks_available from cfg(unix) - checkpoint list functional for all tiers with license hint for Free Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

push_segment() incorrectly treated VAR=VALUE as the command token. Now skips leading environment variable assignments to find the real command. Adds pub is_env_assignment() helper for use by engine bypass detection. Fixes: TIRITH=0 curl evil.com now correctly identifies curl as command. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Bypass detection now finds TIRITH=0 in inline env prefixes and env wrappers (env -i TIRITH=0, /usr/bin/env TIRITH=0), not just process env. Handles -u value-taking flag and -- option terminator. Self-invocation guard allows tirith's own commands (tirith diff, etc.) in single-segment inputs. Resolves through env/command/time wrappers. Uses canonicalized path comparison for path-form invocations with fallback to literal-only matching when canonicalization fails. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add command-aware output-flag skipping for curl (-o/--output) and wget (-O/-OFILE/--output-document). Extract URLs from command+args instead of raw segment text to avoid matching URLs in env-prefix values. Add conservative non-TLD file extensions (.png, .jpg, .mp4, etc.) to schemeless host exclusion list. Fixes issue #33. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add checkpoint ID validation, file count/size limits, and SSRF protection for cloaking URL fetches. Add 10 MiB line limit to MCP dispatcher. Replace silent error suppression with diagnostic messages in audit logging and CLI output writers.

- Add dual-view enrichment (human vs AI agent perspective) for rendered content findings - Add paranoia-level filtering with tier-gated depth control - Extend verdict structs with human_view, agent_view, and enrichment fields - Improve extract.rs URL handling for enrichment context Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

crates/tirith-core/src/checkpoint.rs

crates/tirith-core/src/mcp/tools.rs

macroscopeapp · 2026-02-24T15:46:44Z

crates/tirith-core/src/checkpoint.rs

+    for entry in &manifest {
+        if entry.is_dir {
+            continue;
+        }
+        if seen_paths.contains(&entry.original_path) {
+            continue;
+        }
+        let backup = files_dir.join(&entry.sha256);


🟢 Low src/checkpoint.rs:383

The second loop skips files already in seen_paths, so backup integrity isn't validated for deleted/modified files. Consider checking backup existence for all files, or at minimum documenting that backup corruption is only detected for unchanged files.

for entry in &manifest { if entry.is_dir { continue; } - if seen_paths.contains(&entry.original_path) { - continue; - } let backup = files_dir.join(&entry.sha256);

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/checkpoint.rs around lines 383-390: The second loop skips files already in `seen_paths`, so backup integrity isn't validated for deleted/modified files. Consider checking backup existence for all files, or at minimum documenting that backup corruption is only detected for unchanged files. Evidence trail: crates/tirith-core/src/checkpoint.rs lines 350-379 (first loop adds deleted/modified files to seen_paths), lines 383-398 (second loop skips seen_paths when checking backup integrity), lines 388-389 (explicit skip: `if seen_paths.contains(&entry.original_path) { continue; }`), lines 310-318 (restore function only prints to stderr and continues if backup missing).

crates/tirith-core/src/engine.rs

crates/tirith-core/src/checkpoint.rs

crates/tirith-core/src/rules/cloaking.rs

macroscopeapp · 2026-02-24T15:46:45Z

crates/tirith-core/src/rules/configfile.rs

+
+    // Raw IP address in URL
+    if let Some(host) = extract_host_from_url(url) {
+        if host.parse::<std::net::Ipv4Addr>().is_ok() || host.parse::<std::net::Ipv6Addr>().is_ok()


🟡 Medium rules/configfile.rs:500

Suggestion: extract_host_from_url mis-parses when userinfo or bracketed IPv6 is present, and keeping [] causes Ipv6Addr::parse to fail. Consider skipping past userinfo (@), treating […] as a single host and stripping the brackets, then bounding the host by /, :, or ?.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/rules/configfile.rs around line 500: Suggestion: `extract_host_from_url` mis-parses when userinfo or bracketed IPv6 is present, and keeping `[]` causes `Ipv6Addr::parse` to fail. Consider skipping past userinfo (`@`), treating `[`…`]` as a single host and stripping the brackets, then bounding the host by `/`, `:`, or `?`. Evidence trail: crates/tirith-core/src/rules/configfile.rs lines 520-526 (extract_host_from_url function), lines 500-501 (Ipv6Addr::parse call) at REVIEWED_COMMIT. The function uses `find(['/', ':', '?'])` which incorrectly matches colons inside IPv6 brackets and colons in userinfo passwords. No `@` handling for userinfo. Rust std::net::Ipv6Addr does not parse bracketed addresses.

crates/tirith-core/src/rules/cloaking.rs

crates/tirith-core/src/rules/terminal.rs

crates/tirith-core/src/extract.rs

crates/tirith-core/src/rules/cloaking.rs

crates/tirith-core/src/engine.rs

crates/tirith-core/src/rules/cloaking.rs

CI fixes: - Fix redundant closure clippy lint in golden_fixtures.rs - Fix approx_constant clippy lint in rendered.rs test (3.14 -> 3.25) - Fix uninlined_format_args clippy lints across test files - Pin time crate <0.3.47 in lockfile for MSRV 1.83 compat Review fixes (Critical/High/Medium): - checkpoint.rs: path traversal validation on manifest entries in restore - checkpoint.rs: reject "." as checkpoint ID - checkpoint.rs: propagate create_dir_all and sha256_file errors - engine.rs: env -C/-S options take args, skip 2 positions like -u - engine.rs: recalculate verdict action after paranoia filtering - rendered.rs: validate ASCII hex before slicing, clamp RGB 0-255 - terminal.rs: strip script/style block content before tag removal - cloaking.rs: skip comparison when baseline fetch returns empty body Review fixes (Low): - mcp/tools.rs: shell-quote URL in synthetic curl command - mcp/tools.rs: use is_file() instead of exists() for file inputs - extract.rs: handle trailing slash in schemeless host detection

crates/tirith-core/src/checkpoint.rs

macroscopeapp · 2026-02-24T16:26:01Z

crates/tirith-core/src/mcp/dispatcher.rs

+                    "tirith mcp-server: line exceeds {MAX_LINE_BYTES} byte limit, dropping"
+                );
+                // Drain remainder of this oversized line (up to next newline)
+                let mut discard = Vec::new();


🟠 High mcp/dispatcher.rs:40

When draining oversized lines, read_until has no size limit, so an attacker can exhaust memory with a newline-free stream. Consider using a bounded read loop (e.g., read fixed chunks until newline or limit reached) instead of unbounded read_until.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/mcp/dispatcher.rs around line 40: When draining oversized lines, `read_until` has no size limit, so an attacker can exhaust memory with a newline-free stream. Consider using a bounded read loop (e.g., read fixed chunks until newline or limit reached) instead of unbounded `read_until`. Evidence trail: crates/tirith-core/src/mcp/dispatcher.rs lines 22-44 at REVIEWED_COMMIT: Line 40-42 shows `let mut discard = Vec::new(); ... let _ = input.read_until(b'\n', &mut discard);` - this `read_until` reads directly from `input` with no size bound, while the main read path at lines 29-31 uses `.take(MAX_LINE_BYTES as u64 + 1)` to limit reads.

macroscopeapp · 2026-02-24T16:26:01Z

crates/tirith-core/src/rules/terminal.rs

+    static SCRIPT_STYLE: Lazy<Regex> =
+        Lazy::new(|| Regex::new(r"(?is)<(?:script|style)[^>]*>.*?</(?:script|style)>").unwrap());
+    static TAGS: Lazy<Regex> = Lazy::new(|| Regex::new(r"<[^>]*>").unwrap());
+    static ENTITIES: Lazy<Regex> = Lazy::new(|| Regex::new(r"&[a-zA-Z]+;|&#\d+;").unwrap());


🟢 Low rules/terminal.rs:408

The ENTITIES regex doesn't match hex entities like A. Consider adding &#x[a-fA-F0-9]+; to the pattern to avoid inflated visible_len calculations.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/rules/terminal.rs around line 408: The `ENTITIES` regex doesn't match hex entities like `A`. Consider adding `&#x[a-fA-F0-9]+;` to the pattern to avoid inflated `visible_len` calculations. Evidence trail: crates/tirith-core/src/rules/terminal.rs:408 (REVIEWED_COMMIT) - The ENTITIES regex is `r"&[a-zA-Z]+;|&#\d+;"` which matches named entities and decimal numeric entities but not hexadecimal entities (`&#x[0-9a-fA-F]+;`). HTML spec supports hex numeric character references (e.g., https://www.w3.org/TR/html52/syntax.html#character-references). The function doc comment states it extracts "approximate visible text content" and ENTITIES.replace_all is used to strip entities for length calculation.

macroscopeapp · 2026-02-24T16:26:01Z

crates/tirith-core/src/checkpoint.rs

+        let current_sha = sha256_file(current_path)
+            .map_err(|e| format!("diff hash {}: {e}", entry.original_path))?;
+        if current_sha != entry.sha256 {


🟢 Low src/checkpoint.rs:378

Race condition: file could be deleted between exists() check and sha256_file(). Consider catching NotFound errors from sha256_file and treating them as Deleted rather than failing the entire operation.

- let current_sha = sha256_file(current_path) - .map_err(|e| format!("diff hash {}: {e}", entry.original_path))?; - if current_sha != entry.sha256 { + let current_sha = match sha256_file(current_path) { + Ok(sha) => sha, + Err(e) if e.contains("No such file") => { + seen_paths.insert(entry.original_path.clone()); + diffs.push(DiffEntry { + path: entry.original_path.clone(), + status: DiffStatus::Deleted, + checkpoint_sha256: entry.sha256.clone(), + current_sha256: None, + }); + continue; + } + Err(e) => return Err(format!("diff hash {}: {e}", entry.original_path)), + }; + if current_sha != entry.sha256 {

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/checkpoint.rs around lines 378-380: Race condition: file could be deleted between `exists()` check and `sha256_file()`. Consider catching `NotFound` errors from `sha256_file` and treating them as `Deleted` rather than failing the entire operation. Evidence trail: crates/tirith-core/src/checkpoint.rs lines 367-378 (REVIEWED_COMMIT): shows the exists() check on line 367 followed by sha256_file() call on line 377-378 with error propagation via ?. crates/tirith-core/src/checkpoint.rs lines 593-605 (REVIEWED_COMMIT): sha256_file() implementation shows it uses fs::File::open(path) which returns NotFound error if file doesn't exist.

crates/tirith-core/src/engine.rs

crates/tirith/src/cli/scan.rs

macroscopeapp · 2026-02-24T16:36:00Z

crates/tirith-core/src/scan.rs

+    }
+
+    ScanResult {
+        scanned_count: file_results.len(),


🟢 Low src/scan.rs:111

scanned_count only counts files where scan_single_file returned Some. Files that fail or return None are neither in scanned_count nor skipped_count. Consider setting scanned_count to files.len() (attempted) or tracking failed files separately.

Suggested change

scanned_count: file_results.len(),

scanned_count: files.len(),

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/scan.rs around line 111: `scanned_count` only counts files where `scan_single_file` returned `Some`. Files that fail or return `None` are neither in `scanned_count` nor `skipped_count`. Consider setting `scanned_count` to `files.len()` (attempted) or tracking failed files separately. Evidence trail: crates/tirith-core/src/scan.rs lines 103-115 (REVIEWED_COMMIT): loop pushes to `file_results` only on `Some`, `scanned_count: file_results.len()`. Lines 93-99: `skipped_count` only set for truncation. Lines 118-128: `scan_single_file` returns `None` on metadata/read failures via `.ok()?`.

time 0.3.47 fixes the CVE but requires Rust 1.88. Pin to 0.3.46 and ignore the advisory — lopdf (our only consumer) doesn't parse user-provided RFC 2822 dates, so the DoS is not exploitable.

crates/tirith-core/src/engine.rs

A single & (background operator) should break command segments just like |, ;, &&, and newline. Without this, `safe && malicious` could slip the malicious part past analysis when only the first segment is checked.

macroscopeapp · 2026-02-24T17:05:57Z

crates/tirith-core/src/mcp/tools.rs

+        Some(c) => c,
+        None => return tool_error("Missing required parameter: command"),
+    };
+    let shell = match args


🟢 Low mcp/tools.rs:172

The shell match only handles "powershell" explicitly—"fish" and "pwsh" fall through to Posix. Consider using ShellType::from_str which already handles all variants.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/mcp/tools.rs around line 172: The `shell` match only handles `"powershell"` explicitly—`"fish"` and `"pwsh"` fall through to `Posix`. Consider using `ShellType::from_str` which already handles all variants. Evidence trail: crates/tirith-core/src/mcp/tools.rs lines 173-178 (REVIEWED_COMMIT) - shows manual match only handling "powershell" explicitly with catch-all to Posix crates/tirith-core/src/tokenize.rs lines 6-21 (REVIEWED_COMMIT) - shows ShellType enum with Fish variant and FromStr implementation that handles "fish" and "pwsh" correctly

crates/tirith-core/src/engine.rs

…ments

- Merge origin/main (glibc build fix) - Use exact match == TIRITH=0 (prevents false bypass) - Skip flags in resolve_command_wrapper - Remove dead code in is_tirith_command - Remove quote-stripping from is_env_assignment Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

crates/tirith-core/src/engine.rs

macroscopeapp · 2026-02-24T17:33:56Z

crates/tirith-core/src/mcp/dispatcher.rs

+                "tools/call" => {
+                    let result = handle_tools_call(&params);
+                    JsonRpcResponse::ok(id, serde_json::to_value(result).unwrap_or(json!({})))
+                }


🟢 Low mcp/dispatcher.rs:183

Serialization errors are being swallowed, causing invalid or missing JSON‑RPC responses. Suggest handling serde failures explicitly: log them and surface an -32603 error instead of returning {} or nothing.

- "tools/call" => { - let result = handle_tools_call(¶ms); - JsonRpcResponse::ok(id, serde_json::to_value(result).unwrap_or(json!({}))) - } + "tools/call" => { + let result = handle_tools_call(¶ms); + match serde_json::to_value(result) { + Ok(v) => JsonRpcResponse::ok(id, v), + Err(e) => JsonRpcResponse::err(id, JsonRpcError { + code: -32603, + message: format!("Internal error: {e}"), + data: None, + }), + } + }

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/mcp/dispatcher.rs around lines 183-186: Serialization errors are being swallowed, causing invalid or missing JSON‑RPC responses. Suggest handling serde failures explicitly: log them and surface an `-32603` error instead of returning `{}` or nothing. Evidence trail: crates/tirith-core/src/mcp/dispatcher.rs lines 183-186 (REVIEWED_COMMIT): The `tools/call` handler contains `JsonRpcResponse::ok(id, serde_json::to_value(result).unwrap_or(json!({})))` which uses `.unwrap_or(json!({}))` to silently swallow serialization errors and return an empty object as a success response.

crates/tirith-core/src/license.rs

macroscopeapp · 2026-02-24T17:33:56Z

crates/tirith-core/src/engine.rs

+    // If we truncated (didn't consume all chars), add marker
+    if needs_truncation {
+        // Trim any incomplete bracket marker at the end
+        if let Some(last_open) = out.rfind('[') {


🟢 Low src/engine.rs:761

The incomplete bracket trimming logic (lines 761-764) appears to be dead code. Since escape sequences like [LBRACK] are only appended when they fit completely (line 752-755), and verbatim characters never include [, there's no scenario where out ends with an unmatched [. Consider removing this block or documenting why it's kept as defensive code.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/engine.rs around line 761: The incomplete bracket trimming logic (lines 761-764) appears to be dead code. Since escape sequences like `[LBRACK]` are only appended when they fit completely (line 752-755), and verbatim characters never include `[`, there's no scenario where `out` ends with an unmatched `[`. Consider removing this block or documenting why it's kept as defensive code. Evidence trail: crates/tirith-core/src/engine.rs lines 726, 741-748, 752-755, 761-764 at REVIEWED_COMMIT. Line 726 shows `[` is explicitly matched and escaped to `[LBRACK]`. Lines 752-755 show escape sequences are only added if they fit completely (`if out.len() + escaped.len() > content_cap { break; }`). Lines 741-748 show the verbatim branch excludes `[` since it's handled earlier. Lines 761-764 show the dead code that trims incomplete brackets - a condition that can never be true given the implementation.

- Use JSON-RPC -32603 (Internal error) for resources/read failures instead of -32602 (Invalid params) — uri validation is separate - Skip duplicate audit log_verdict in check.rs and paste.rs when bypass was honored (analyze() already logged it) - Fix "chars" → "bytes" in audit redact message (Rust .len() returns byte count) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ogical test - engine.rs: Use sanitized length (full_len) in truncation marker instead of raw input bytes (total), and document defensive bracket-trimming code - license.rs: Reject tokens with non-string exp values (e.g. numeric timestamps) that previously bypassed expiry validation entirely - license.rs: Replace tautological assertion in test_current_tier_defaults_community with a simple no-panic check - license.rs: Add test_decode_numeric_exp_rejected regression test Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

crates/tirith-core/src/engine.rs

- Quoted TIRITH='0' and TIRITH="0" were not detected as bypass because split_raw_words preserves quotes but comparison was literal "TIRITH=0". Extract is_tirith_bypass_assignment() helper that strips quotes from the value portion before comparing. - Increase MARKER_RESERVE from 40 to 50 to accommodate the longer "bytes sanitized" marker text for very large inputs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve all conflicts by taking main's improved code as baseline, then surgically adding PR 41's unique features on top: - Pro/Team enrichment (enrich_pro, enrich_team, evidence_summary) - Paranoia filtering (filter_findings_by_paranoia, filter_findings_by_paranoia_vec) - Rendered content rules in FileScan pipeline - Clipboard HTML analysis (clipboard_html field + check_clipboard_html) - Finding struct: mitre_id + custom_rule_id fields - sanitize_view for enrichment text safety - ConfigSuspiciousIndicator added to all exhaustive RuleId matches All 443 tests pass. cargo fmt + clippy clean. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

macroscopeapp · 2026-02-24T19:55:05Z

crates/tirith-core/src/rules/terminal.rs

+        assert_eq!(strip_html_tags("<p>Hello</p>"), "Hello");
+        assert_eq!(strip_html_tags("<div><span>A</span> <b>B</b></div>"), "A B");
+        assert_eq!(strip_html_tags("No tags here"), "No tags here");
+        assert_eq!(strip_html_tags("&amp; &lt;"), "");


🟢 Low rules/terminal.rs:517

The expected output "" seems incorrect. & and < are HTML entities, not tags. Consider updating the assertion to expect "& <" (if decoding) or "& <" (if preserving), or rename the function if it intentionally strips entities too.

Suggested change

assert_eq!(strip_html_tags("& <"), "");

assert_eq!(strip_html_tags("& <"), "& <");

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/rules/terminal.rs around line 517: The expected output `""` seems incorrect. `&` and `<` are HTML entities, not tags. Consider updating the assertion to expect `"& <"` (if decoding) or `"& <"` (if preserving), or rename the function if it intentionally strips entities too. Evidence trail: crates/tirith-core/src/rules/terminal.rs:517 - test assertion `assert_eq!(strip_html_tags("& <"), "");` crates/tirith-core/src/rules/terminal.rs:422-438 - function implementation with doc comment "Strip HTML tags to extract approximate visible text content" but line 430 defines ENTITIES regex and line 435 strips entities crates/tirith-core/src/rules/terminal.rs:392-395 - usage context showing function is used for length comparison in clipboard hidden content detection

macroscopeapp · 2026-02-24T19:55:05Z

crates/tirith-core/src/license.rs

+    fn test_decode_case_insensitive() {
+        let key = make_key("PRO", "2099-12-31");
+        assert_eq!(decode_tier(&key), Some(Tier::Pro));


🟢 Low src/license.rs:204

Test name suggests case insensitivity is verified, but only uppercase "PRO" is tested. Consider adding assertions for "pro" or "Pro" to actually validate case-insensitive behavior.

Suggested change

fn test_decode_case_insensitive() {

let key = make_key("PRO", "2099-12-31");

assert_eq!(decode_tier(&key), Some(Tier::Pro));

fn test_decode_case_insensitive() {

let key = make_key("PRO", "2099-12-31");

assert_eq!(decode_tier(&key), Some(Tier::Pro));

let key_lower = make_key("pro", "2099-12-31");

assert_eq!(decode_tier(&key_lower), Some(Tier::Pro));

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/license.rs around lines 204-206: Test name suggests case insensitivity is verified, but only uppercase `"PRO"` is tested. Consider adding assertions for `"pro"` or `"Pro"` to actually validate case-insensitive behavior. Evidence trail: crates/tirith-core/src/license.rs lines 204-207 (test `test_decode_case_insensitive` at REVIEWED_COMMIT) - test only uses `make_key("PRO", ...)` with uppercase. crates/tirith-core/src/license.rs lines 113-119 (decode_tier implementation at REVIEWED_COMMIT) - shows `.to_lowercase()` is used, confirming case-insensitivity exists but test doesn't verify multiple case variants.

macroscopeapp · 2026-02-24T19:55:05Z

crates/tirith-core/src/rules/rendered.rs

+                        let d = pdf_operand_to_f64(&op.operands[3]).unwrap_or(1.0);
+                        let scale = a.abs().min(d.abs());
+                        if (0.0..1.0).contains(&scale) {
+                            current_scale = scale;


🟡 Medium rules/rendered.rs:531

The cm operator concatenates matrices, so scales should multiply with the existing CTM rather than replace it. Consider changing line 531 to current_scale *= scale; to properly accumulate transformations.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/rules/rendered.rs around line 531: The `cm` operator concatenates matrices, so scales should multiply with the existing CTM rather than replace it. Consider changing line 531 to `current_scale *= scale;` to properly accumulate transformations. Evidence trail: crates/tirith-core/src/rules/rendered.rs lines 525-535 (REVIEWED_COMMIT) - shows `cm` handler setting `current_scale = scale;` instead of multiplying. PDF specification via pdfjs-docs (https://github.com/MeiKatz/pdfjs-docs): 'Modify the current transformation matrix (CTM) by concatenating the specified matrix' confirms cm concatenates (multiplies) matrices. libharu issue #169 (https://github.com/libharu/libharu/issues/169): 'cm operator is an operator that multiplies LEFT to the current matrix' further confirms matrix multiplication semantics.

macroscopeapp · 2026-02-24T19:55:05Z

crates/tirith-core/src/mcp/dispatcher.rs

            id,
            JsonRpcError {
-                code: -32602,
+                code: -32603, // Internal error (not invalid params — uri validated above)


🟢 Low mcp/dispatcher.rs:306

The error code -32603 is incorrect for unknown resources. The validation above only checks that uri exists and is a string—not that it refers to a valid resource. When read_content returns "Unknown resource", that's still invalid params (-32602), not an internal error.

Suggested change

code: -32603, // Internal error (not invalid params — uri validated above)

code: -32602,

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/mcp/dispatcher.rs around line 306: The error code `-32603` is incorrect for unknown resources. The validation above only checks that `uri` exists and is a string—not that it refers to a valid resource. When `read_content` returns "Unknown resource", that's still invalid params (`-32602`), not an internal error. Evidence trail: crates/tirith-core/src/mcp/dispatcher.rs lines 280-310 (REVIEWED_COMMIT): shows validation only checks uri exists/is a string (lines 281-297), then uses -32603 for read_content errors (line 306) with misleading comment. crates/tirith-core/src/mcp/resources.rs line 75: shows read_content returns Err("Unknown resource: {uri}") for unrecognized URIs. JSON-RPC 2.0 specification defines -32602 as 'Invalid params' for invalid method parameters and -32603 as 'Internal error' for server-side errors.

macroscopeapp · 2026-02-24T19:55:05Z

crates/tirith-core/src/rules/rendered.rs

+                "clip:rect(0...)",
+            ),
+            (
+                Regex::new(r#"(?i)position\s*:\s*(?:absolute|fixed)[^;]*(?:left|top)\s*:\s*-9999"#)


🟡 Medium rules/rendered.rs:66

The [^;]* between position:absolute/fixed and left/top:-9999 won't match across semicolons, but valid CSS separates these with ; (e.g., position:absolute;left:-9999px). Consider changing [^;]* to [^}]* or similar to span multiple declarations.

Suggested change

Regex::new(r#"(?i)position\s*:\s*(?:absolute|fixed)[^;]*(?:left|top)\s*:\s*-9999"#)

Regex::new(r#"(?i)position\s*:\s*(?:absolute|fixed)[^}]*(?:left|top)\s*:\s*-9999"#)

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:

In file crates/tirith-core/src/rules/rendered.rs around line 66: The `[^;]*` between `position:absolute/fixed` and `left/top:-9999` won't match across semicolons, but valid CSS separates these with `;` (e.g., `position:absolute;left:-9999px`). Consider changing `[^;]*` to `[^}]*` or similar to span multiple declarations. Evidence trail: crates/tirith-core/src/rules/rendered.rs line 66: `Regex::new(r#"(?i)position\s*:\s*(?:absolute|fixed)[^;]*(?:left|top)\s*:\s*-9999"#)` - The regex uses `[^;]*` which matches any character except semicolons, but valid CSS like `position:absolute;left:-9999px` uses semicolons between declarations, so the pattern fails to match properly formatted CSS.

sheeki03 and others added 14 commits February 9, 2026 23:32

Rename misleading test: null id is treated as request, not notification

2184d9c

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add --interactive flag to shell hooks for correct bypass policy

53879d9

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix doctor to check fish conf.d and all profile candidates

7cdf2ed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Fix schemeless detection false-negative for TLD-overlap domains with …

d5bfd6c

…paths Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

crates/tirith-core/src/rules/cloaking.rs Show resolved Hide resolved

crates/tirith-core/src/engine.rs Outdated Show resolved Hide resolved

crates/tirith-core/src/engine.rs Show resolved Hide resolved

crates/tirith-core/src/rules/cloaking.rs Show resolved Hide resolved

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

sheeki03 force-pushed the feat/part8-pro-enrichments branch from 19a5029 to 40c0b2d Compare February 24, 2026 16:39

Pin time crate for MSRV 1.83, ignore RUSTSEC-2026-0009

3fefd7d

time 0.3.47 fixes the CVE but requires Rust 1.88. Pin to 0.3.46 and ignore the advisory — lopdf (our only consumer) doesn't parse user-provided RFC 2822 dates, so the DoS is not exploitable.

sheeki03 force-pushed the feat/part8-pro-enrichments branch from 40c0b2d to 3fefd7d Compare February 24, 2026 16:46

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

crates/tirith-core/src/engine.rs Outdated Show resolved Hide resolved

Treat single & as segment boundary in split_raw_words

76567fe

A single & (background operator) should break command segments just like |, ;, &&, and newline. Without this, `safe && malicious` could slip the malicious part past analysis when only the first segment is checked.

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

sheeki03 and others added 2 commits February 24, 2026 22:55

Merge remote-tracking branch 'origin/main' into feat/part8-pro-enrich…

8df9cef

…ments

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

sheeki03 and others added 2 commits February 24, 2026 23:46

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

crates/tirith-core/src/engine.rs Outdated Show resolved Hide resolved

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

crates/tirith-core/src/engine.rs Outdated Show resolved Hide resolved

sheeki03 and others added 2 commits February 24, 2026 23:56

sheeki03 merged commit 849f625 into main Feb 24, 2026
10 checks passed

sheeki03 deleted the feat/part8-pro-enrichments branch February 24, 2026 19:48

macroscopeapp bot reviewed Feb 24, 2026

View reviewed changes

	scanned_count: file_results.len(),
	scanned_count: files.len(),

	assert_eq!(strip_html_tags("& <"), "");
	assert_eq!(strip_html_tags("& <"), "& <");

	code: -32603, // Internal error (not invalid params — uri validated above)
	code: -32602,

	Regex::new(r#"(?i)position\s:\s(?:absolute\|fixed)[^;](?:left\|top)\s:\s*-9999"#)
	Regex::new(r#"(?i)position\s:\s(?:absolute\|fixed)[^}](?:left\|top)\s:\s*-9999"#)

Conversation

sheeki03 commented Feb 24, 2026 • edited by macroscopeapp bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Add Pro-tier enrichment pipeline, paranoia filtering, and clipboard HTML detection with rules::terminal::check_clipboard_html and CLI paste --html (50+ char discrepancy threshold)

📍Where to Start

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

macroscopeapp bot Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

sheeki03 commented Feb 24, 2026 •

edited by macroscopeapp bot

Loading

Add Pro-tier enrichment pipeline, paranoia filtering, and clipboard HTML detection with `rules::terminal::check_clipboard_html` and CLI `paste --html` (50+ char discrepancy threshold)