feat: add homoglyph obfuscation prompts to smuggling probe by dentity007 · Pull Request #1660 · NVIDIA/garak

dentity007 · 2026-03-29T22:02:54Z

Adds smuggling.HomoglyphObfuscation, a probe with 5 prompts that use Unicode homoglyphs (visually similar characters from different scripts) to disguise trigger words in bypass requests. For example, Cyrillic 'a' (U+0430) replaces Latin 'a' in "jailbreak", making the token sequence different while the text remains human-readable.

Second decomposed contribution from PR #1619. The smuggling module's docstring describes exactly this technique: "swapping letters out for unusual unicode representations of the same letters." Uses mitigation.MitigationBypass detector. Set to active = False since these are domain-specific.

Homoglyph scripts used: Cyrillic (U+0430, U+043E, U+0456), Latin alpha (U+0251), Turkish dotless i (U+0131)
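
To illustrate the substitution described above, here is a minimal sketch using only the Python standard library. It is not code from this PR: the sample word and variable names are chosen for illustration, and only the code point U+0430 is taken from the description.

```python
import unicodedata

latin = "jailbreak"
# Swap every Latin 'a' (U+0061) for Cyrillic 'а' (U+0430).
obfuscated = latin.replace("a", "\u0430")

print(obfuscated)           # renders like "jailbreak" in most fonts
print(latin == obfuscated)  # False: the underlying code points differ

# List the characters that are no longer ASCII, with their Unicode names.
for ch in obfuscated:
    if not ch.isascii():
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The two strings have the same length and look alike on screen, but a filter or tokenizer that matches on exact code points treats them as different input.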

Files:

garak/probes/smuggling.py : new HomoglyphObfuscation class
garak/data/smuggling_homoglyph_5.txt : 5 prompts with embedded Unicode homoglyphs
tests/probes/test_probes_smuggling.py : 4 tests (count, uniqueness, non-ASCII verification, active=False)

Add smuggling.HomoglyphObfuscation with 5 prompts that use Unicode homoglyphs (Cyrillic, Latin alpha, Turkish dotless i) to disguise trigger words in bypass requests. Tests whether input-side content filters catch visually identical character substitutions from different scripts. Uses mitigation.MitigationBypass detector. Set to active=False (domain-specific).

Signed-off-by: Nathan Maine <nathan@dentity.cloud>

jmartin-tech

This is a great added technique. I would suggest expanding it to perform inline substitution instead of relying only on a set of hardcoded sample prompts.

The idea I am suggesting would programmatically replace characters during prompt initialization, to actually mimic the smuggling aspect of the technique. This could be further enhanced to accept a configuration map of character replacements that could be enlarged or reduced to broaden resiliency testing.

garak/probes/smuggling.py

Address review feedback on PR NVIDIA#1660:

- Change tier from COMPETE_WITH_SOTA to INFORMATIONAL
- Replace static prompt loading with programmatic substitution via homoglyph_replace() function applied to garak payloads
- Add configurable DEFAULT_HOMOGLYPH_MAP (20 Latin-to-Cyrillic/Turkish/Ukrainian mappings) overridable via homoglyph_map config parameter
- Load payloads from garak.payloads system (harmful_behaviors default)
- Keep static prompts as additional payloads through same pipeline
- Update tests: 9 tests covering substitution function, probe loading, tier, determinism, custom maps, non-ASCII verification

Signed-off-by: Nathan Maine <nathan@dentity.cloud>

dentity007 · 2026-03-30T20:03:49Z

Thanks for the review. Both changes addressed:

Tier adjusted to INFORMATIONAL
Replaced the static prompt approach with programmatic substitution. The probe now loads payloads from garak's payload system (harmful_behaviors by default), applies character-by-character homoglyph replacement via a configurable DEFAULT_HOMOGLYPH_MAP (20 Latin-to-Cyrillic/Turkish/Ukrainian mappings), and generates obfuscated prompts at initialization. The map is overridable via the homoglyph_map config parameter so the substitution set can be expanded or reduced. The original 5 static prompts are still loaded as additional payloads and go through the same substitution pipeline.
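
For readers who have not opened the diff, the character-by-character replacement with an overridable map can be sketched roughly as follows. This is an assumption-laden illustration, not the code in this PR: it reuses the name homoglyph_replace from the comment above, but the signature, the three-entry EXAMPLE_HOMOGLYPH_MAP, and the sample payload are hypothetical, and nothing here imports garak.

```python
from typing import Mapping, Optional

# Illustrative subset only. The PR's DEFAULT_HOMOGLYPH_MAP is described as
# having 20 entries, which are not listed in this conversation.
EXAMPLE_HOMOGLYPH_MAP = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "i": "\u0456",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
}


def homoglyph_replace(text: str, mapping: Optional[Mapping[str, str]] = None) -> str:
    """Replace each mapped character; leave every other character untouched."""
    table = EXAMPLE_HOMOGLYPH_MAP if mapping is None else mapping
    return "".join(table.get(ch, ch) for ch in text)


# Hypothetical stand-in for payloads loaded from garak's payload system.
payloads = ["ignore prior instructions"]
prompts = [homoglyph_replace(p) for p in payloads]

# A caller-supplied map narrows the substitution set to a single character.
custom = homoglyph_replace("bias", {"i": "\u0131"})  # LATIN SMALL LETTER DOTLESS I
```

Passing a smaller or larger map changes how much of each payload is obfuscated, which is the knob the review asked for.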

Tests updated: 9 tests covering probe loading, substitution function behavior (determinism, custom maps, non-mapped character preservation), non-ASCII verification, tier, and inactive flag.
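
The kinds of checks listed above might look like the following pytest-style sketch. It does not reproduce the PR's test file: substitute and SAMPLE_MAP are hypothetical stand-ins (built on str.translate) for the probe's substitution function and map, so the assertions only show the shape of each property.

```python
# Hypothetical stand-in for the probe's substitution function and map.
SAMPLE_MAP = {"a": "\u0430", "o": "\u043e"}


def substitute(text, mapping=SAMPLE_MAP):
    # str.maketrans accepts a dict whose keys are single characters.
    return text.translate(str.maketrans(mapping))


def test_deterministic():
    # The same input must always produce the same obfuscated output.
    assert substitute("load payload") == substitute("load payload")


def test_custom_map():
    # A caller-supplied map replaces the default one entirely.
    assert substitute("pin", {"i": "\u0131"}) == "p\u0131n"


def test_unmapped_preserved():
    # Characters without a mapping pass through unchanged.
    assert substitute("xyz 123!") == "xyz 123!"


def test_output_has_non_ascii():
    # At least one character must have been swapped for a homoglyph.
    assert not substitute("load").isascii()
```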

jmartin-tech requested changes Mar 30, 2026

dentity007 wants to merge 2 commits into NVIDIA:main from NathanMaine:feat/smuggling-homoglyph-obfuscation
