fix(security): add input encoding detection and obfuscation decoder #5923

Open
dan-redcupit wants to merge 1 commit into openclaw:main from dan-redcupit:fix/input-encoding-detection
Conversation

@dan-redcupit dan-redcupit commented Feb 1, 2026

Summary

Adds modules to detect and decode obfuscated prompt injection attacks.

Part 2 of 3 from Operation CLAW FORTRESS security hardening (split from #5863 for easier review).

New Files

| File | Purpose |
| --- | --- |
| src/security/obfuscation-decoder.ts | Core decoding functions |
| src/security/input-preprocessing.ts | Encoding detection API |
| src/security/input-preprocessing.test.ts | Regression tests |

Obfuscation Techniques Decoded

| Technique | Example | Decoded |
| --- | --- | --- |
| Base64 | aWdub3JlIHByZXZpb3Vz | "ignore previous" |
| ROT13 | vtaber cerivbhf | "ignore previous" |
| Leetspeak | 5y5t3m pr0mpt | "system prompt" |
| Pig Latin | omptpray eviouspray | "prompt previous" |
| Syllables | ig-nore pre-vi-ous | "ignore previous" |
| Homoglyphs | sуstеm (Cyrillic) | "system" |
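
To make the first two rows of the table concrete, here is a minimal sketch of a ROT13 decoder and a guarded Base64 decoder. The function names `rot13` and `tryDecodeBase64` are hypothetical and are not taken from src/security/obfuscation-decoder.ts. The length threshold and the printable-ASCII filter are assumptions chosen to keep the example small.

```typescript
// Hypothetical helpers for illustration; not the PR's actual implementation.

// ROT13 is its own inverse: rotating each ASCII letter by 13 decodes it.
function rot13(text: string): string {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const base = ch <= "Z" ? 65 : 97; // 65 = "A", 97 = "a"
    return String.fromCharCode(((ch.charCodeAt(0) - base + 13) % 26) + base);
  });
}

// Returns the decoded text, or null when the candidate does not look like
// Base64 or does not decode to printable ASCII (assumed filter).
function tryDecodeBase64(candidate: string): string | null {
  // Assumption: ignore very short strings to limit false positives.
  if (candidate.length < 8 || candidate.length % 4 !== 0) return null;
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(candidate)) return null;
  try {
    const decoded = atob(candidate);
    return /^[\x20-\x7E]+$/.test(decoded) ? decoded : null;
  } catch {
    return null;
  }
}

console.log(rot13("vtaber cerivbhf"));
console.log(tryDecodeBase64("aWdub3JlIHByZXZpb3Vz"));
```

Both calls should print the decoded phrase from the table. A real detector would likely also need to handle UTF-8 payloads and Base64 embedded inside longer text, which this sketch does not attempt.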

ZeroLeaks Findings Addressed

  • Encoding bypass attacks (Base64, ROT13)
  • Leetspeak obfuscation
  • Unicode homoglyph substitution
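
The leetspeak and homoglyph findings can both be handled by character-level substitution. The sketch below assumes small, hand-picked lookup tables. `decodeLeetspeak`, `normalizeHomoglyphs`, and both tables are hypothetical and far less complete than a production mapping would need to be.

```typescript
// Hypothetical substitution tables; the PR's real tables may differ.
const LEET_MAP: Record<string, string> = {
  "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s",
};

// A few Cyrillic letters that resemble Latin ones, written as escapes
// so the lookalike characters cannot be confused in source code.
const HOMOGLYPH_MAP: Record<string, string> = {
  "\u0430": "a", // Cyrillic a
  "\u0435": "e", // Cyrillic ie
  "\u043e": "o", // Cyrillic o
  "\u0440": "p", // Cyrillic er
  "\u0441": "c", // Cyrillic es
  "\u0443": "y", // Cyrillic u
  "\u0445": "x", // Cyrillic ha
};

// Replace leetspeak digits and symbols with the letters they stand for.
function decodeLeetspeak(text: string): string {
  return text.replace(/[0-9@$]/g, (ch) => LEET_MAP[ch] ?? ch);
}

// Replace known Cyrillic lookalikes with their Latin counterparts.
function normalizeHomoglyphs(text: string): string {
  return text.replace(/[\u0400-\u04FF]/g, (ch) => HOMOGLYPH_MAP[ch] ?? ch);
}

console.log(decodeLeetspeak("5y5t3m pr0mpt"));
console.log(normalizeHomoglyphs("s\u0443st\u0435m"));
```

One design caveat: leetspeak decoding also rewrites ordinary numbers, so it is probably safer to run keyword detection on the decoded copy while leaving the original input untouched.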

Test Plan

  • Unit tests for all decoders
  • Regression tests with ZeroLeaks payloads
  • Integration testing

🔒 Generated with Claude Code

Greptile Overview

Greptile Summary

This PR adds a small security-focused preprocessing layer under src/security/ to detect and decode obfuscated prompt-injection content (Base64/ROT13/reversed, plus deobfuscation helpers for leetspeak, syllable splitting, and homoglyph normalization). Unit tests cover the new decoders and some “ZeroLeaks” regression payloads.

Main things to double-check before merging: the use of global regexes with .test() in detectEncodedContent can make detection flaky due to lastIndex mutation, and reverseText relies on Array.prototype.toReversed() which may not exist in all supported runtimes.

Confidence Score: 3/5

  • This PR is close to safe to merge, but it has a couple of correctness/runtime hazards that could make detection flaky or crash in some environments.
  • The overall change is isolated and test-covered, but RegExp.test is used on /.../g regexes (which mutates lastIndex and can lead to intermittent false negatives), and reverseText uses toReversed() which may not exist depending on the Node/runtime version used in CI or production.
  • src/security/input-preprocessing.ts and src/security/obfuscation-decoder.ts

New security modules to detect and decode obfuscated attack payloads:

- src/security/obfuscation-decoder.ts: decodes Base64, ROT13, leetspeak,
  pig latin, syllable splitting, and Unicode homoglyphs
- src/security/input-preprocessing.ts: applies detection to user input
- src/security/input-preprocessing.test.ts: regression tests with
  ZeroLeaks attack payloads

Techniques decoded:
- Base64 encoded instructions
- ROT13 (vtaber -> ignore)
- Leetspeak (5y5t3m -> system)
- Pig Latin (omptpray -> prompt)
- Syllable splitting (ig-nore -> ignore)
- Unicode homoglyphs (Cyrillic/Greek lookalikes)

Part of Operation CLAW FORTRESS security hardening (ZeroLeaks remediation).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@greptile-apps greptile-apps bot left a comment

2 files reviewed, 3 comments

Comment on lines +95 to +99
// Check for ROT13 encoded keywords
if (ROT13_KEYWORDS.test(content)) {
  encodingTypes.push("rot13");
  suspiciousKeywords.push(...extractMatches(content, ROT13_KEYWORDS).map((w) => `[ROT13:${w}]`));
}
[P0] `RegExp.test` with a global regex mutates `lastIndex`, which can make detections non-deterministic across calls and even within this function.

Here `ROT13_KEYWORDS`/`REVERSED_KEYWORDS` are `/.../gi` and you call `.test(content)` and later `extractMatches(content, ROT13_KEYWORDS)`. Because `.test` advances `lastIndex`, a previous call can cause `.test` to return `false` for content that actually matches, and can also affect subsequent `.test` calls on the same regex instance.

This can be fixed by removing the `g` flag from the `.test` check, or by using a fresh `RegExp` instance for `.test` (or resetting `lastIndex = 0` before each `.test`).
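
The behavior described in this comment can be reproduced in a few lines. The pattern `DEMO_ROT13_KEYWORDS` and the helper `testFromStart` are hypothetical stand-ins, not the PR's `ROT13_KEYWORDS` or any existing function. The helper illustrates the "reset `lastIndex`" option only.

```typescript
// Hypothetical global pattern, standing in for a /.../gi keyword regex.
const DEMO_ROT13_KEYWORDS = /\bvtaber\b/gi;
const demoInput = "please vtaber this";

// Repeated test() calls on the same global regex are stateful:
// after a match, the next search starts at lastIndex instead of 0.
const flakyResults: boolean[] = [
  DEMO_ROT13_KEYWORDS.test(demoInput), // matches, lastIndex moves past the word
  DEMO_ROT13_KEYWORDS.test(demoInput), // searches the tail only, then resets
  DEMO_ROT13_KEYWORDS.test(demoInput), // starts from 0 again
];

// One possible fix: always reset lastIndex around the test.
function testFromStart(pattern: RegExp, input: string): boolean {
  pattern.lastIndex = 0;
  const matched = pattern.test(input);
  pattern.lastIndex = 0; // leave the shared regex clean for other callers
  return matched;
}

const stableResults: boolean[] = [
  testFromStart(DEMO_ROT13_KEYWORDS, demoInput),
  testFromStart(DEMO_ROT13_KEYWORDS, demoInput),
  testFromStart(DEMO_ROT13_KEYWORDS, demoInput),
];

console.log(flakyResults, stableResults);
```

The first array alternates between `true` and `false` for identical input, which is the intermittent false negative the review warns about. The second array is consistently `true`.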

Path: src/security/input-preprocessing.ts
Line: 95:99

Comment on lines +33 to +36
// Keywords that indicate prompt injection when found in decoded content
const INSTRUCTION_KEYWORDS =
  /\b(system|prompt|instructions?|ignore|previous|reveal|secrets?|confidential|bypass|override|admin|root|sudo|passwords?|tokens?|keys?|credentials?)\b/gi;

[P1] `INSTRUCTION_KEYWORDS` is global (`/gi`) and is used with `.test(...)` in multiple places.

Even though `extractMatches` clones the regex, the direct `.test(decoded)` / `.test(result.decoded)` calls still mutate `lastIndex`. If `detectEncodedContent` is called multiple times, `.test` can start returning `false` intermittently for the same input. Consider dropping the `g` flag for `INSTRUCTION_KEYWORDS` (since you already have `extractMatches` for collecting matches) or cloning/resetting before `.test`.
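
A sketch of the "drop the `g` flag" option: keep one shared pattern source, use a non-global regex for yes/no checks, and build a fresh global regex only when collecting matches. The keyword list is shortened, and `hasInstructionKeyword` and `extractKeywordMatches` are hypothetical names that only approximate what the PR's `extractMatches` is described as doing.

```typescript
// Shortened, hypothetical keyword list; the PR's list is longer.
const KEYWORD_SOURCE = "\\b(system|prompt|ignore|previous|reveal)\\b";

// No g flag: test() on this regex never touches lastIndex.
const HAS_KEYWORD = new RegExp(KEYWORD_SOURCE, "i");

function hasInstructionKeyword(text: string): boolean {
  return HAS_KEYWORD.test(text);
}

// A fresh global regex per call, so no state leaks between calls.
function extractKeywordMatches(text: string): string[] {
  return text.match(new RegExp(KEYWORD_SOURCE, "gi")) ?? [];
}

const decodedSample = "Ignore previous prompt";
console.log(hasInstructionKeyword(decodedSample), hasInstructionKeyword(decodedSample));
console.log(extractKeywordMatches(decodedSample));
```

Separating the two regexes keeps detection deterministic without requiring every call site to remember to reset `lastIndex`.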

Path: src/security/input-preprocessing.ts
Line: 33:36

Comment on lines +167 to +169
export function reverseText(text: string): string {
  return text.split("").toReversed().join("");
}
[P1] `reverseText` uses `toReversed()`, which requires a relatively new JS runtime and will throw on older Node versions.

If OpenClaw supports Node versions without `Array.prototype.toReversed` (or if tests run in such an environment), this will be a runtime error. Safer is `text.split("").reverse().join("")` (mutates the temp array anyway) or feature-detecting `toReversed`.
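
A minimal sketch of the suggested replacement. `reverseTextCompat` is a hypothetical name used to avoid implying this is the PR's final code, and `hasToReversed` only illustrates what a feature check could look like.

```typescript
// Hypothetical drop-in for reverseText that avoids Array.prototype.toReversed.
function reverseTextCompat(text: string): string {
  // split("") returns a new array, so the in-place reverse() only
  // mutates that temporary array and never the caller's data.
  return text.split("").reverse().join("");
}

// Feature check, shown for illustration; the cast avoids depending on
// newer TypeScript lib typings.
const hasToReversed: boolean =
  typeof (Array.prototype as unknown as { toReversed?: unknown }).toReversed === "function";

console.log(reverseTextCompat("suoiverp erongi"), hasToReversed);
```

Like the original, this reverses UTF-16 code units, so characters outside the Basic Multilingual Plane (for example most emoji) would be split. That is likely acceptable for matching ASCII keywords but worth noting if the reversed text is ever displayed.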

Path: src/security/obfuscation-decoder.ts
Line: 167:169

@openclaw-barnacle
This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added the stale label Feb 15, 2026
@openclaw-barnacle openclaw-barnacle bot removed the stale label Feb 16, 2026
@mudrii

This comment was marked as spam.

@openclaw-barnacle
This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle bot added the stale label Mar 7, 2026