fix: strip ANSI escape codes from session persistence hooks#684
Conversation
Windows terminals emit control sequences (cursor movement, screen clearing) that leaked into session.tmp files and were injected verbatim into Claude's context on the next session start. Add a comprehensive stripAnsi() to utils.js that handles CSI, OSC, charset selection, and bare ESC sequences. Apply it in session-end.js (when extracting user messages from the transcript) and in session-start.js (safety net before injecting session content).
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds a new Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Analysis CompleteGenerated ECC bundle from 500 commits | Confidence: 100% View Pull Request #685Repository Profile
Detected Workflows (8)
Generated Instincts (16)
After merging, import with: Files
|
Greptile SummaryThis PR fixes a bug (#642) where Windows terminal control sequences ( Changes:
The fix is well-structured with a defence-in-depth approach (strip at write and at read). The regex is correct for the stated scope, with a minor gap: bare-ESC + lowercase letter sequences (e.g. Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Terminal as Windows Terminal
participant Hook as session-end.js
participant Utils as utils.js (stripAnsi)
participant File as session.tmp
participant Start as session-start.js
participant Claude as Claude Context
Terminal->>Hook: JSONL transcript (may contain ANSI in user messages)
Hook->>Utils: stripAnsi(text)
Utils-->>Hook: cleaned text (ANSI removed)
Hook->>File: write cleaned session summary
Note over File: Previously: ANSI codes persisted here
File->>Start: readFile(latest.path)
Start->>Utils: stripAnsi(content) [safety net]
Utils-->>Start: cleaned content
Start->>Claude: output("Previous session summary:\n" + cleanedContent)
Note over Claude: Context is now free of ANSI escape sequences
|
| function stripAnsi(str) { | ||
| if (typeof str !== 'string') return ''; | ||
| // eslint-disable-next-line no-control-regex | ||
| return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Z]|[A-Z])/g, ''); |
There was a problem hiding this comment.
Bare ESC + lowercase letter sequences not stripped
The last alternation [A-Z] only covers uppercase bare-ESC sequences (e.g. \x1bM reverse index, \x1bD index). Lowercase variants such as \x1bc (RIS — Reset to Initial State, a commonly emitted sequence on some Windows terminal emulators) would not be matched and would remain in the cleaned output. Extending the character class to [A-Za-z] would close this gap:
| return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Z]|[A-Z])/g, ''); | |
| return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Za-z]|[A-Za-z])/g, ''); |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dfd2860c74
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| function stripAnsi(str) { | ||
| if (typeof str !== 'string') return ''; | ||
| // eslint-disable-next-line no-control-regex | ||
| return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Z]|[A-Z])/g, ''); |
There was a problem hiding this comment.
Match non-letter CSI terminators in stripAnsi
stripAnsi() only removes CSI sequences whose final byte is [A-Za-z], so valid control sequences like bracketed-paste markers (\x1b[200~ / \x1b[201~) are left intact. In terminals that emit those markers around pasted input, the pasted prompt will still be written into session.tmp and reinjected on the next session-start, which means this fix does not actually eliminate a common class of leaked terminal escape codes.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
1 issue found across 4 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="scripts/lib/utils.js">
<violation number="1" location="scripts/lib/utils.js:482">
P2: The ANSI regex is too narrow and misses valid escape sequences (e.g. CSI ending in `~` and charset selectors like `ESC(0`), so control codes can still persist.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| function stripAnsi(str) { | ||
| if (typeof str !== 'string') return ''; | ||
| // eslint-disable-next-line no-control-regex | ||
| return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Z]|[A-Z])/g, ''); |
There was a problem hiding this comment.
P2: The ANSI regex is too narrow and misses valid escape sequences (e.g. CSI ending in ~ and charset selectors like ESC(0), so control codes can still persist.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/lib/utils.js, line 482:
<comment>The ANSI regex is too narrow and misses valid escape sequences (e.g. CSI ending in `~` and charset selectors like `ESC(0`), so control codes can still persist.</comment>
<file context>
@@ -464,6 +464,24 @@ function countInFile(filePath, pattern) {
+function stripAnsi(str) {
+ if (typeof str !== 'string') return '';
+ // eslint-disable-next-line no-control-regex
+ return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Z]|[A-Z])/g, '');
+}
+
</file context>
| return str.replace(/\x1b(?:\[[0-9;?]*[A-Za-z]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([A-Z]|[A-Z])/g, ''); | |
| return str.replace(/\x1b(?:\[[0-?]*[ -/]*[@-~]|\][^\x07\x1b]*(?:\x07|\x1b\\)|\([0-~]|[@-Z\\-_])/g, ''); |
…#642) (affaan-m#684) Windows terminals emit control sequences (cursor movement, screen clearing) that leaked into session.tmp files and were injected verbatim into Claude's context on the next session start. Add a comprehensive stripAnsi() to utils.js that handles CSI, OSC, charset selection, and bare ESC sequences. Apply it in session-end.js (when extracting user messages from the transcript) and in session-start.js (safety net before injecting session content).
Summary
Closes #642
Windows terminals emit control sequences (
\x1b[H,\x1b[2J,\x1b[3J) that leaked intosession.tmpfiles and were injected verbatim into Claude's context on the next session start, making content unreadable.scripts/lib/utils.js: AddedstripAnsi()utility that handles all ANSI escape families — CSI sequences (cursor movement, erase, colors), OSC sequences (window titles, hyperlinks), charset selection, and bare ESC sequencesscripts/hooks/session-end.js: ApplystripAnsi()when extracting user message text from the JSONL transcript (primary fix — prevents ANSI from being persisted)scripts/hooks/session-start.js: ApplystripAnsi()before injecting session content into Claude's context (safety net for existing contaminated session files)tests/lib/utils.test.js: 11 new tests covering SGR colors, cursor movement, screen clear, erase line, OSC, charset selection, DEC private modes, mixed sequences, non-string input, and clean passthroughTest plan
node tests/lib/utils.test.js— 169/169 pass (11 new stripAnsi tests)node tests/run-all.js— 1430/1432 pass (2 pre-existing failures unrelated to this PR)Summary by cubic
Strip ANSI escape codes from session persistence to stop terminal sequences from polluting saved summaries and Claude context. Fixes garbled text on Windows.
stripAnsi()inscripts/lib/utils.js(handles CSI, OSC, charset, bare ESC).scripts/hooks/session-end.jswhen extracting user messages.scripts/hooks/session-start.jsbefore injecting previous session content.Written for commit dfd2860. Summary will update on new commits.
Summary by CodeRabbit
Release Notes
Bug Fixes
Tests