fix(cron): allow emoji ZWJ sequences in prompts (#28164)#28589
Merged
Contributor
witt3rd added a commit to witt3rd/hermes-agent that referenced this pull request May 29, 2026
…s scanners

SOUL.md, memory entries, and skill files containing emoji ZWJ sequences (e.g. 🧙♂️ = 🧙 + ZWJ + ♂ + VS16) were being silently blocked as prompt-injection attempts. ZWJ (U+200D) is in the invisible-char blocklist for good reason (it can hide text inside benign-looking strings), but it is also required inside emoji sequences and has no way to hide anything harmful there.

Upstream PR NousResearch#28589 ("fix(cron): allow emoji ZWJ sequences in prompts", a salvage of NousResearch#28164) established the precedent for this fix, but only applied it to the cron prompt scanner via a cronjob_tools-local helper (_strip_legitimate_emoji_zwj). The identical false positive still affects the other three scanners that share the same invisible-char blocklist. This PR completes the job for those three, factoring the context check into a single shared helper instead of adding a fourth copy of the logic.

Added shared utils.find_unsafe_invisibles() that context-checks ZWJ: allowed between two pictographic codepoints (skipping variation selectors), flagged everywhere else. All other invisibles in the blocklist remain unconditionally flagged.

Callers updated:
- agent/prompt_builder.py (_scan_context_content; blocks SOUL.md et al.)
- tools/memory_tool.py (_scan_memory_content; blocks memory add/update)
- tools/skills_guard.py (scan_file; blocks skill install)

tools/cronjob_tools.py is intentionally left untouched: PR NousResearch#28589 already fixes _scan_cron_prompt.

Adds 6 tests covering:
- ZWJ inside 🧙♂️ (gendered emoji): allowed
- Multi-ZWJ family emoji 👨👩👧: allowed
- ZWJ between letters (classic injection shape): still blocked
- Mixed legit emoji + injection ZWJ: blocked (at least one unsafe ZWJ)
- ZWSP adjacent to emoji: still blocked (only ZWJ is context-whitelisted)

221/221 tests pass across the affected test modules.

Motivation: a user SOUL.md containing 🧙♂️ was being silently blocked from loading, with a [BLOCKED: ... invisible unicode U+200D] marker leaking into the system prompt in place of the actual identity content. The scan was eating its own foot on a legitimate, widely-used emoji sequence.
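The context check described in this commit message could look roughly like the sketch below. It is not the code from the commit: the function name `find_unsafe_invisibles_sketch`, the contents of `INVISIBLES`, and the block ranges used to approximate "pictographic" are all assumptions made for illustration, and the real `utils.find_unsafe_invisibles()` may have a different signature and return type.

```python
# Hypothetical sketch of a context-aware invisible-character scan.
# Names, blocklist contents, and ranges are assumptions, not the PR's code.
ZWJ = "\u200d"
VARIATION_SELECTORS = {"\ufe0e", "\ufe0f"}

# Assumed blocklist; the real one may contain more codepoints.
INVISIBLES = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Rough block-based approximation of "pictographic" codepoints.
_PICTO_RANGES = (
    (0x2600, 0x27BF),    # Miscellaneous Symbols, Dingbats
    (0x1F1E6, 0x1F1FF),  # regional indicators
    (0x1F300, 0x1FAFF),  # pictograph, emoticon, and related blocks
)


def _is_pictographic(ch):
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in _PICTO_RANGES)


def _neighbor(text, i, step):
    """Nearest character from index i in direction step, skipping
    variation selectors; returns '' if the string ends first."""
    j = i + step
    while 0 <= j < len(text) and text[j] in VARIATION_SELECTORS:
        j += step
    return text[j] if 0 <= j < len(text) else ""


def find_unsafe_invisibles_sketch(text):
    """Return (index, char) for every blocklisted invisible that is
    not a ZWJ sitting between two pictographic codepoints."""
    found = []
    for i, ch in enumerate(text):
        if ch not in INVISIBLES:
            continue
        if ch == ZWJ:
            left = _neighbor(text, i, -1)
            right = _neighbor(text, i, 1)
            if left and right and _is_pictographic(left) and _is_pictographic(right):
                continue  # ZWJ inside an emoji sequence: allowed
        found.append((i, ch))
    return found
```

With this sketch, a caller would block the content whenever the returned list is non-empty, which keeps the "at least one unsafe ZWJ" behaviour for mixed inputs.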
Salvage of #28164 by @outsourc-e.
What: _scan_cron_prompt blocked U+200D (Zero-Width Joiner) as a hidden-character deception attack. ZWJ is legitimately required to form many emoji sequences (👨👩👧, 🏳️🌈, ❤️🩹, 🧑💻), so cron prompts containing those emoji tripped the security guard.

How: Allow ZWJ when its neighbors are emoji codepoints (Misc Symbols, Pictographs, Dingbats, regional indicators, variation selectors) or another ZWJ within the same emoji cluster; still block ZWJ when both neighbors are plain text. New unit tests cover legitimate emoji clusters and continue to block plain-text ZWJ smuggling.
Original PR: #28164
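
As a rough illustration of the "How" above, one way to apply the rule is to strip ZWJs that sit between emoji-like neighbors and then run the existing hidden-character check on what remains. The sketch below is not the merged code: `strip_emoji_zwj_sketch` and `has_hidden_zwj` are hypothetical names, the codepoint ranges are approximations, and it assumes a ZWJ with only one emoji neighbor is still blocked, which the description does not spell out.

```python
# Hypothetical sketch of "strip legitimate ZWJ, then scan".
# Names and ranges are assumptions, not the helper from the PR.
ZWJ = "\u200d"


def _is_emoji_like(ch):
    """True for codepoints treated as valid ZWJ neighbors."""
    cp = ord(ch)
    return (
        ch == ZWJ                        # another ZWJ in the cluster
        or 0xFE00 <= cp <= 0xFE0F        # variation selectors
        or 0x2600 <= cp <= 0x27BF        # Misc Symbols, Dingbats
        or 0x1F1E6 <= cp <= 0x1F1FF      # regional indicators
        or 0x1F300 <= cp <= 0x1FAFF      # pictograph blocks
    )


def strip_emoji_zwj_sketch(prompt):
    """Drop each ZWJ whose two immediate neighbors are emoji-like."""
    out = []
    last = len(prompt) - 1
    for i, ch in enumerate(prompt):
        if ch == ZWJ and 0 < i < last:
            if _is_emoji_like(prompt[i - 1]) and _is_emoji_like(prompt[i + 1]):
                continue  # legitimate joiner inside an emoji cluster
        out.append(ch)
    return "".join(out)


def has_hidden_zwj(prompt):
    """True if any ZWJ survives stripping, i.e. should be blocked."""
    return ZWJ in strip_emoji_zwj_sketch(prompt)
```

Counting variation selectors and other ZWJs as valid neighbors follows the description above, but it is more permissive than skipping over them to find a real pictograph, so a production check would likely need a stricter notion of "same emoji cluster".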