fix: allow ZWJ inside emoji grapheme clusters in context/memory/skills scanners#12673
Open
witt3rd wants to merge 1 commit into
Friendly nudge: this PR ships from a fork, so workflows are gated on maintainer approval. If a maintainer could click "Approve and run workflows" when convenient, the green ticks will land and reviewers will have CI signal to lean on. Happy to rebase or address review comments any time. Thanks! ⚒️
…s scanners

SOUL.md, memory entries, and skill files containing emoji ZWJ sequences (e.g. 🧙♂️ = 🧙 + ZWJ + ♂ + VS16) were being silently blocked as prompt-injection attempts. ZWJ (U+200D) is in the invisible-char blocklist for good reason (it can hide text inside benign-looking strings), but it is also required inside emoji sequences and has no way to hide anything harmful there.

Upstream PR NousResearch#28589 ("fix(cron): allow emoji ZWJ sequences in prompts", a salvage of NousResearch#28164) established the precedent for this fix, but only applied it to the cron prompt scanner via a cronjob_tools-local helper (_strip_legitimate_emoji_zwj). The identical false positive still affects the other three scanners that share the same invisible-char blocklist. This PR completes the job for those three, factoring the context check into a single shared helper instead of adding a fourth copy of the logic.

Added shared utils.find_unsafe_invisibles() that context-checks ZWJ: allowed between two pictographic codepoints (skipping variation selectors), flagged everywhere else. All other invisibles in the blocklist remain unconditionally flagged.

Callers updated:
- agent/prompt_builder.py (_scan_context_content, blocks SOUL.md et al.)
- tools/memory_tool.py (_scan_memory_content, blocks memory add/update)
- tools/skills_guard.py (scan_file, blocks skill install)

tools/cronjob_tools.py is intentionally left untouched: PR NousResearch#28589 already fixes _scan_cron_prompt.

Adds 6 tests covering:
- ZWJ inside 🧙♂️ (gendered emoji): allowed
- Multi-ZWJ family emoji 👨👩👧: allowed
- ZWJ between letters (classic injection shape): still blocked
- Mixed legit emoji + injection ZWJ: blocked (at least one unsafe ZWJ)
- ZWSP adjacent to emoji: still blocked (only ZWJ is context-whitelisted)

221/221 tests pass across the affected test modules.

Motivation: a user SOUL.md containing 🧙♂️ was being silently blocked from loading, with a [BLOCKED: ... invisible unicode U+200D] marker leaking into the system prompt in place of the actual identity content. The scan was eating its own foot on a legitimate, widely-used emoji sequence.
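To make the false positive concrete, here is a minimal sketch of how the man-mage emoji decomposes into codepoints and why a context-free blocklist check trips on it. The names `INVISIBLE_BLOCKLIST`, `describe`, and `naive_scan` are hypothetical stand-ins for illustration, not the project's actual identifiers, and the blocklist contents are an assumption.

```python
import unicodedata

ZWJ = "\u200d"    # ZERO WIDTH JOINER
VS16 = "\ufe0f"   # VARIATION SELECTOR-16

# Man mage = mage + ZWJ + male sign + VS16, written with escapes so the
# invisible characters are explicit in the source.
MAN_MAGE = "\U0001F9D9" + ZWJ + "\u2642" + VS16

# Hypothetical stand-in for the shared invisible-char blocklist (assumed contents).
INVISIBLE_BLOCKLIST = {"\u200b", "\u200c", ZWJ, "\u2060", "\ufeff"}


def describe(text):
    """Return (codepoint, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def naive_scan(text):
    """Flag every blocklisted character, with no regard for its neighbours."""
    return [f"U+{ord(ch):04X}" for ch in text if ch in INVISIBLE_BLOCKLIST]


if __name__ == "__main__":
    for codepoint, name in describe(MAN_MAGE):
        print(codepoint, name)
    print("naive scan flags:", naive_scan(MAN_MAGE))
```

A context-free membership test like `naive_scan` cannot tell this emoji apart from a ZWJ hidden between letters, which is the gap the shared helper is meant to close.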
Problem
The invisible-unicode blocklist in _scan_context_content (and siblings in memory_tool and skills_guard) treats ZWJ (U+200D) as categorically malicious. But ZWJ is a required component of emoji grapheme clusters: any gendered emoji (🧙♂️, 👩⚕️, 🏃♀️), family emoji (👨👩👧), rainbow flag (🏳️🌈), or other multi-pictograph emoji is char + ZWJ + char [+ VS16].

Any context file (SOUL.md, AGENTS.md, .hermes.md), memory entry, or skill file containing such emoji is silently replaced with a [BLOCKED: ... invisible unicode U+200D] marker.

Relationship to #28589
Upstream PR #28589, "fix(cron): allow emoji ZWJ sequences in prompts" (a salvage of the original #28164 by @outsourc-e), fixed exactly this false positive, but only for the cron prompt scanner (_scan_cron_prompt), via a cronjob_tools-local helper, _strip_legitimate_emoji_zwj.

That set the precedent, but three other scanners share the identical invisible-char blocklist and still reject legitimate emoji. This PR completes the job for those three, and factors the context check into a single shared helper rather than adding a fourth copy of the logic.

tools/cronjob_tools.py is intentionally left untouched; #28589 already covers it.

Fix
New shared helper utils.find_unsafe_invisibles(content, blocklist):
- ZWJ between two pictographic codepoints (skipping variation selectors FE0E/FE0F): allowed.
- ZWJ anywhere else (e.g. hello + ZWJ + world): flagged.

Pictographic ranges checked: 1F000–1FFFF, 2600–27BF, 2300–23FF, 2B00–2BFF, plus ©, ®, ™, ℹ, 〰, 〽, ㊗, ㊙. Narrow by design: the check prefers a false positive (flagging a legitimate emoji ZWJ) over a false negative (letting a text-hiding ZWJ past).

Callers updated
- agent/prompt_builder.py::_scan_context_content
- tools/memory_tool.py::_scan_memory_content
- tools/skills_guard.py::scan_file

Tests
6 new tests in tests/agent/test_prompt_builder.py::TestScanContextContent, including:
- ZWJ inside 🧙♂️: allowed
- ZWJ inside 👨👩👧: allowed
- ZWJ between letters (hello + ZWJ + world): still blocked
- test_invisible_unicode_blocked: still passes

221/221 tests pass across the affected modules (test_prompt_builder, test_memory_tool, test_skills_guard).

Repro
Before:
After: returns content unchanged.
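The context check described under Fix can be sketched roughly as follows. This is an illustrative reconstruction from the description only, not the code in the PR: the signature `find_unsafe_invisibles(content, blocklist)` and the pictographic ranges come from the text above, while the return shape (a list of `(index, char)` pairs) and the private helper names are assumptions.

```python
ZWJ = "\u200d"
VARIATION_SELECTORS = {"\ufe0e", "\ufe0f"}

# Ranges and singletons as listed in the PR description.
_PICTO_RANGES = ((0x1F000, 0x1FFFF), (0x2600, 0x27BF), (0x2300, 0x23FF), (0x2B00, 0x2BFF))
_PICTO_SINGLES = {0xA9, 0xAE, 0x2122, 0x2139, 0x3030, 0x303D, 0x3297, 0x3299}


def _is_pictographic(ch):
    """True if ch falls inside the narrow pictographic allowlist."""
    cp = ord(ch)
    return cp in _PICTO_SINGLES or any(lo <= cp <= hi for lo, hi in _PICTO_RANGES)


def _neighbor(content, index, step):
    """Nearest character in direction step, skipping variation selectors (assumed helper)."""
    index += step
    while 0 <= index < len(content) and content[index] in VARIATION_SELECTORS:
        index += step
    return content[index] if 0 <= index < len(content) else None


def find_unsafe_invisibles(content, blocklist):
    """Return (index, char) for each blocklisted char that is not an emoji-internal ZWJ.

    The return shape is an assumption made for this sketch.
    """
    flagged = []
    for i, ch in enumerate(content):
        if ch not in blocklist:
            continue
        if ch == ZWJ:
            left = _neighbor(content, i, -1)
            right = _neighbor(content, i, +1)
            if left and right and _is_pictographic(left) and _is_pictographic(right):
                continue  # ZWJ joins two pictographs: legitimate emoji sequence
        flagged.append((i, ch))  # every other blocklisted char stays flagged
    return flagged


if __name__ == "__main__":
    blocklist = {"\u200b", ZWJ}  # assumed subset of the real blocklist
    man_mage = "\U0001F9D9\u200d\u2642\ufe0f"
    print(find_unsafe_invisibles(man_mage, blocklist))
    print(find_unsafe_invisibles("hello\u200dworld", blocklist))
```

Under these assumptions a caller would treat an empty result as "content is safe to load" and any non-empty result as grounds for the existing block path, so only the ZWJ handling changes and the other invisibles behave as before.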