fix(security): add input encoding detection and obfuscation decoder #5923
dan-redcupit wants to merge 1 commit into openclaw:main
New security modules to detect and decode obfuscated attack payloads:
- src/security/obfuscation-decoder.ts: decodes Base64, ROT13, leetspeak, pig latin, syllable splitting, and Unicode homoglyphs
- src/security/input-preprocessing.ts: applies detection to user input
- src/security/input-preprocessing.test.ts: regression tests with ZeroLeaks attack payloads

Techniques decoded:
- Base64 encoded instructions
- ROT13 (vtaber -> ignore)
- Leetspeak (5y5t3m -> system)
- Pig Latin (omptpray -> prompt)
- Syllable splitting (ig-nore -> ignore)
- Unicode homoglyphs (Cyrillic/Greek lookalikes)

Part of Operation CLAW FORTRESS security hardening (ZeroLeaks remediation).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
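As a rough illustration of two of the techniques listed above, the sketch below reproduces the ROT13 and leetspeak examples from the commit message. The names `rot13Sketch`, `deleetSketch`, and `LEET_MAP`, as well as the digit-to-letter table, are assumptions made for illustration; they are not taken from `obfuscation-decoder.ts`.

```typescript
// Illustrative sketch only: names and the leet table are hypothetical,
// not the PR's obfuscation-decoder.ts implementation.

// ROT13 rotates each ASCII letter by 13 places; applying it twice is the identity.
function rot13Sketch(text: string): string {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // 65 = "A", 97 = "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// A small, assumed digit-to-letter table for leetspeak.
const LEET_MAP: Record<string, string> = {
  "0": "o",
  "1": "i",
  "3": "e",
  "4": "a",
  "5": "s",
  "7": "t",
};

function deleetSketch(text: string): string {
  // Fall back to the original character if the table has no entry.
  return text.replace(/[013457]/g, (d) => LEET_MAP[d] || d);
}

const rot13Decoded = rot13Sketch("vtaber cerivbhf"); // "ignore previous"
const leetDecoded = deleetSketch("5y5t3m pr0mpt"); // "system prompt"
```

A naive table like this also rewrites digits in ordinary text (for example version numbers), so a real decoder would likely want to treat the result as a candidate to scan for keywords rather than as a replacement for the input.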
    // Check for ROT13 encoded keywords
    if (ROT13_KEYWORDS.test(content)) {
      encodingTypes.push("rot13");
      suspiciousKeywords.push(...extractMatches(content, ROT13_KEYWORDS).map((w) => `[ROT13:${w}]`));
    }
[P0] `RegExp.test` with a global regex mutates `lastIndex`, which can make detections non-deterministic across calls and even within this function.
Here `ROT13_KEYWORDS`/`REVERSED_KEYWORDS` are `/.../gi` and you call `.test(content)` and later `extractMatches(content, ROT13_KEYWORDS)`. Because `.test` advances `lastIndex`, a previous call can cause `.test` to return `false` for content that actually matches, and can also affect subsequent `.test` calls on the same regex instance.
This can be fixed by removing the `g` flag from the `.test` check, or by using a fresh `RegExp` instance for `.test` (or resetting `lastIndex = 0` before each `.test`).
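The failure mode described in this comment can be reproduced in a few lines. The sketch below uses a made-up two-word pattern (not the PR's actual `ROT13_KEYWORDS`) and hypothetical function names, and contrasts a shared global regex with a non-global one used only for boolean checks.

```typescript
// Hypothetical patterns for illustration; not the PR's ROT13_KEYWORDS list.
const STATEFUL_PATTERN = /\b(vtaber|cebzcg)\b/gi; // "g" makes .test() stateful
const STATELESS_PATTERN = /\b(vtaber|cebzcg)\b/i; // no "g": lastIndex is ignored

function hasKeywordStateful(content: string): boolean {
  // Starts searching at STATEFUL_PATTERN.lastIndex, then updates it.
  return STATEFUL_PATTERN.test(content);
}

function hasKeywordStateless(content: string): boolean {
  // Always searches from index 0.
  return STATELESS_PATTERN.test(content);
}

const sample = "vtaber";

// First call matches at 0..6 and leaves lastIndex at 6.
const statefulFirst = hasKeywordStateful(sample);
// Second call starts at index 6, finds nothing, and resets lastIndex to 0.
const statefulSecond = hasKeywordStateful(sample);

const statelessFirst = hasKeywordStateless(sample);
const statelessSecond = hasKeywordStateless(sample);
```

With the global pattern, identical input yields `true` and then `false`; with the non-global pattern both calls agree, which is the behavior a detector needs.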
Path: src/security/input-preprocessing.ts
Line: 95:99
    // Keywords that indicate prompt injection when found in decoded content
    const INSTRUCTION_KEYWORDS =
      /\b(system|prompt|instructions?|ignore|previous|reveal|secrets?|confidential|bypass|override|admin|root|sudo|passwords?|tokens?|keys?|credentials?)\b/gi;
[P1] `INSTRUCTION_KEYWORDS` is global (`/gi`) and is used with `.test(...)` in multiple places.
Even though `extractMatches` clones the regex, the direct `.test(decoded)` / `.test(result.decoded)` calls still mutate `lastIndex`. If `detectEncodedContent` is called multiple times, `.test` can start returning `false` intermittently for the same input. Consider dropping the `g` flag for `INSTRUCTION_KEYWORDS` (since you already have `extractMatches` for collecting matches) or cloning/resetting before `.test`.
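One way to follow this suggestion is to derive the boolean from the collected matches, so the shared regex is never passed to `.test` at all. The sketch below assumes a shortened keyword list and hypothetical helpers `findKeywords` and `inspectDecoded`; the PR's own `extractMatches` may be implemented differently.

```typescript
// Hypothetical, shortened keyword list; the PR's INSTRUCTION_KEYWORDS is longer.
const KEYWORDS_SKETCH = /\b(system|prompt|ignore|previous|reveal)\b/gi;

// Collect matches with a private copy of the regex so the shared
// instance's lastIndex is never read or written.
function findKeywords(text: string, pattern: RegExp): string[] {
  const flags = pattern.flags.indexOf("g") >= 0 ? pattern.flags : pattern.flags + "g";
  const fresh = new RegExp(pattern.source, flags); // lastIndex starts at 0
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = fresh.exec(text)) !== null) {
    if (m[0] === "") {
      fresh.lastIndex++; // guard against zero-length matches looping forever
      continue;
    }
    found.push(m[0].toLowerCase());
  }
  return found;
}

// One pass yields both the boolean and the keyword list.
function inspectDecoded(decoded: string): { suspicious: boolean; keywords: string[] } {
  const keywords = findKeywords(decoded, KEYWORDS_SKETCH);
  return { suspicious: keywords.length > 0, keywords };
}

// Simulate stale state left behind by an earlier .test() call.
KEYWORDS_SKETCH.lastIndex = 7;
const run1 = inspectDecoded("ignore previous instructions");
const run2 = inspectDecoded("ignore previous instructions");
```

Because the copy is built from `pattern.source` and `pattern.flags`, a stale `lastIndex` on the shared constant has no effect, and repeated calls return the same result.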
Path: src/security/input-preprocessing.ts
Line: 33:36
    export function reverseText(text: string): string {
      return text.split("").toReversed().join("");
    }
[P1] `reverseText` uses `toReversed()`, which requires a relatively new JS runtime and will throw on older Node versions.
If OpenClaw supports Node versions without `Array.prototype.toReversed` (or if tests run in such an environment), this will be a runtime error. Safer is `text.split("").reverse().join("")` (mutates the temp array anyway) or feature-detecting `toReversed`.
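A minimal sketch of the first option, under the assumption that nothing else depends on `toReversed()`. The name `reverseTextCompat` is hypothetical and only used here to avoid implying it is the PR's code.

```typescript
// Hypothetical drop-in replacement; reverseTextCompat is not a name from the PR.
function reverseTextCompat(text: string): string {
  // split("") creates a fresh array, so the in-place reverse() cannot
  // affect anything the caller holds.
  return text.split("").reverse().join("");
}

const reversedSample = reverseTextCompat("suoiverp erongi"); // "ignore previous"
```

Like the original, this splits on UTF-16 code units, so characters outside the Basic Multilingual Plane (for example most emoji) would be broken apart when reversed; that behavior is unchanged by the fix.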
Path: src/security/obfuscation-decoder.ts
Line: 167:169
This pull request has been automatically marked as stale due to inactivity.
Force-pushed from bfc1ccb to f92900f
This pull request has been automatically marked as stale due to inactivity.
Summary
Adds modules to detect and decode obfuscated prompt injection attacks.
Part 2 of 3 from Operation CLAW FORTRESS security hardening (split from #5863 for easier review).
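New Files
- src/security/obfuscation-decoder.ts
- src/security/input-preprocessing.ts
- src/security/input-preprocessing.test.ts

Obfuscation Techniques Decoded

| Technique | Example |
| --- | --- |
| Base64 | aWdub3JlIHByZXZpb3Vz |
| ROT13 | vtaber cerivbhf |
| Leetspeak | 5y5t3m pr0mpt |
| Pig Latin | omptpray eviouspray |
| Syllable splitting | ig-nore pre-vi-ous |
| Unicode homoglyphs | sуstеm (Cyrillic) |

ZeroLeaks Findings Addressed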
Test Plan
🔒 Generated with Claude Code
Greptile Overview
Greptile Summary
This PR adds a small security-focused preprocessing layer under `src/security/` to detect and decode obfuscated prompt-injection content (Base64/ROT13/reversed, plus deobfuscation helpers for leetspeak, syllable splitting, and homoglyph normalization). Unit tests cover the new decoders and some "ZeroLeaks" regression payloads.

Main things to double-check before merging: the use of global regexes with `.test()` in `detectEncodedContent` can make detection flaky due to `lastIndex` mutation, and `reverseText` relies on `Array.prototype.toReversed()`, which may not exist in all supported runtimes.

Confidence Score: 3/5
`RegExp.test` is used on `/.../g` regexes (which mutates `lastIndex` and can lead to intermittent false negatives), and `reverseText` uses `toReversed()`, which may not exist depending on the Node/runtime version used in CI or production.