hardenUnicodeText must strip U+200E, U+200F, U+00AD, and U+034F to close residual `@mention` neutralization bypass #23727

## Summary

The `hardenUnicodeText` function in `actions/setup/js/sanitize_content_core.cjs` (v0.64.2–v0.64.4) correctly strips the explicit BiDi embedding and override characters (U+202A–U+202E) and the zero-width characters (U+200B–U+200D), but leaves four invisible characters unhandled: **U+200E** (left-to-right mark), **U+200F** (right-to-left mark), **U+00AD** (soft hyphen), and **U+034F** (combining grapheme joiner). Inserting any of these between `@` and a username causes the `neutralizeAllMentions` regex (``/(^|[^A-Za-z0-9\`])@([A-Za-z0-9]...)/g``) to fail to match, because the character immediately after `@` is no longer in `[A-Za-z0-9]`, so the mention is never backtick-wrapped. The same gap exists in `sanitizeLabelContent`. Runtime testing against the deployed v0.64.2 scripts confirms all four bypasses.
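
The regex failure described above can be illustrated in isolation. The following is a minimal sketch, not the project's code: `MENTION_RE` is a simplified stand-in (the username portion of the real pattern is elided in this report), and `neutralizeMentionsSketch` is a hypothetical helper that only mimics the backtick-wrapping behaviour.

```javascript
// Simplified stand-in for the mention regex. ASSUMPTION: the username part
// of the real pattern is elided in this report, so a plain
// [A-Za-z0-9][A-Za-z0-9-]* is used here for illustration only.
const MENTION_RE = /(^|[^A-Za-z0-9`])@([A-Za-z0-9][A-Za-z0-9-]*)/g;

// Hypothetical helper: wraps each matched mention in backticks.
function neutralizeMentionsSketch(text) {
  return text.replace(MENTION_RE, (_match, prefix, name) => `${prefix}\`@${name}\``);
}

// The four code points named in this report.
const INVISIBLES = ["\u200E", "\u200F", "\u00AD", "\u034F"];

// A plain mention is wrapped as expected.
console.log(neutralizeMentionsSketch("ping @admin")); // ping `@admin`

// With an invisible character after "@", the regex no longer matches,
// so the text comes back unchanged.
for (const ch of INVISIBLES) {
  const input = `ping @${ch}admin`;
  const label = "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  console.log(label, "left unneutralized:", neutralizeMentionsSketch(input) === input);
}
```

Under these assumptions every loop iteration reports `true`, which matches the bypass described in this section.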

## Affected Area

Safe-outputs content sanitization boundary — `@mention` neutralization in `sanitize_content_core.cjs` and `sanitize_label_content.cjs`. This is a documented security control; bypassing it contradicts the stated guarantee of the gh-aw safe-outputs architecture.

## Reproduction Outline

1. Deploy a gh-aw workflow with `safe-outputs: add-comment` or `create-issue` permission on a GitHub Actions runner.
2. Inject a prompt payload (e.g., via cross-prompt injection in an issue body) that causes the AI to emit `@\u200Fadmin please review` in its output.
3. Observe that `sanitizeContentCore` passes the text through with U+200F preserved and the mention NOT backtick-wrapped.
4. The GitHub API receives the unsanitized mention string; GitHub's server-side markdown parser may notify the targeted user.

To verify the code path directly:
```bash
node -e "
// \$RUNNER_TEMP would reach node as a literal string; let the shell expand it instead.
const { hardenUnicodeText } = require('$RUNNER_TEMP/gh-aw/safeoutputs/sanitize_content_core.cjs');
const input = '@\u200Fadmin please review';
const output = hardenUnicodeText(input);
console.log('U+200F still present:', output.includes('\u200F'));  // true (should be false)
"
```
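
To extend the check above to all four code points, a small harness along these lines may help. It is a sketch under stated assumptions: `survivingCodePoints` and `currentBehaviourSketch` are hypothetical names, and the stand-in strips only the ranges this report says are already handled. Pass the real `hardenUnicodeText` instead to test a deployment.

```javascript
// The four code points under test, keyed by a readable label.
const CANDIDATES = {
  "U+200E": "\u200E",
  "U+200F": "\u200F",
  "U+00AD": "\u00AD",
  "U+034F": "\u034F",
};

// Hypothetical harness: returns the labels of code points that are still
// present after running the given sanitizer over a sample mention.
function survivingCodePoints(sanitize) {
  return Object.entries(CANDIDATES)
    .filter(([, ch]) => sanitize(`@${ch}admin please review`).includes(ch))
    .map(([label]) => label);
}

// ASSUMPTION: stand-in for the current stripping step, covering only the
// ranges the report lists as handled (U+200B–U+200D and U+202A–U+202E).
const currentBehaviourSketch = (s) => s.replace(/[\u200B-\u200D\u202A-\u202E]/g, "");

console.log(survivingCodePoints(currentBehaviourSketch));
// [ 'U+200E', 'U+200F', 'U+00AD', 'U+034F' ]
```

An empty array from the real function would indicate that all four characters are being stripped.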

## Observed Behavior

`@\u200Fadmin` → `@\u200Fadmin` (output is unchanged and renders as `@admin`: the invisible character is preserved and the mention is NOT neutralized, so no backticks are added)

## Expected Behavior

`@\u200Fadmin` → `` `@admin` `` (invisible char stripped, mention neutralized with backticks)

## Security Relevance

The exploit chain (prompt injection → AI includes an invisible-character mention in its output → safe-output posts to GitHub → victim notified) requires no speculative steps. U+200E and U+200F are common implicit directional marks covered by Unicode TR#36, and U+00AD (soft hyphen) appears in ordinary Latin text, so all of these are easy to inject. The fix is to extend Step 3 of `hardenUnicodeText` to also strip `\u200E\u200F\u00AD\u034F`, apply the same extension to `sanitize_label_content.cjs`, and add regression tests.

**gh-aw version**: v0.64.4 (researcher-confirmed)

Original finding: https://github.com/githubnext/gh-aw-security/issues/1611




> Generated by [File Issue](https://github.com/githubnext/gh-aw-security/actions/runs/23799833966/agentic_workflow) · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+githubnext%2Fgh-aw-security%2Ffile-issue%22&type=issues)
