Fix: underscore stripping regex in gateway helpers consumes identifiers#15076

Closed
Super-Yu wants to merge 1 commit into NousResearch:main from Super-Yu:fix/underscore-markdown-stripping

@Super-Yu Super-Yu commented Apr 24, 2026

Problem

gateway/platforms/helpers.py contains the same over-broad Markdown underscore regex that was already fixed in cli.py. The TextBatchAggregator class uses _RE_ITALIC_UNDER = re.compile(r"_(.+?)_", re.DOTALL), which treats any two underscores as italic delimiters and therefore strips them from identifiers such as send_as_bot, user_id, and my_variable.

Root Cause

Lines 160-161 in gateway/platforms/helpers.py:

# Old (buggy): matches any pair of underscores, even inside identifiers
_RE_BOLD_UNDER = re.compile(r"__(.+?)__", re.DOTALL)
_RE_ITALIC_UNDER = re.compile(r"_(.+?)_", re.DOTALL)

This is identical to the bug that was fixed in cli.py's _strip_markdown_syntax() — but the helpers.py copy was missed.
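
A minimal reproduction of the behavior described above, using the two patterns exactly as quoted. The helper name strip_underscores_old and the order of the substitutions (bold first, then italic) are assumptions for illustration; the actual call site in helpers.py is not shown in this PR.

```python
import re

# Patterns as quoted in the PR description (pre-fix).
_RE_BOLD_UNDER = re.compile(r"__(.+?)__", re.DOTALL)
_RE_ITALIC_UNDER = re.compile(r"_(.+?)_", re.DOTALL)


def strip_underscores_old(text):
    """Hypothetical helper: drop bold delimiters, then italic delimiters."""
    text = _RE_BOLD_UNDER.sub(r"\1", text)
    return _RE_ITALIC_UNDER.sub(r"\1", text)


# An identifier that contains its own pair of underscores is corrupted.
old_identifier = strip_underscores_old("call send_as_bot now")
# Two unrelated identifiers are paired with each other.
old_pair = strip_underscores_old("user_id and my_variable")
# Genuine italic markup is stripped as intended.
old_italic = strip_underscores_old("this is _important_ text")

print(old_identifier)  # call sendasbot now
print(old_pair)        # userid and myvariable
print(old_italic)      # this is important text
```

Note that the lazy quantifier in (.+?) only keeps each match short; it does not stop the pattern from pairing underscores that sit inside words.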

Fix

Replace both patterns with versions guarded by lookbehind/lookahead assertions, so that they match only underscores used as actual Markdown bold or italic delimiters:

# New (fixed) — requires non-alphanumeric boundaries on both sides
_RE_BOLD_UNDER = re.compile(r'(?<![a-zA-Z0-9])__(?=[^\s])(.+?)(?<=[^_])__(?![a-zA-Z0-9])', re.DOTALL)
_RE_ITALIC_UNDER = re.compile(r'(?<![a-zA-Z0-9])_(?=[^\s])(.+?)(?<=[^_])_(?![a-zA-Z0-9])', re.DOTALL)
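
To make the role of each assertion easier to review, here is a sketch that spells out the italic pattern with re.VERBOSE and checks that it behaves like the one-line version on a few inputs. The name _RE_ITALIC_UNDER_ANNOTATED and the sample strings are illustrative assumptions, not part of the change.

```python
import re

# One-line italic pattern, copied from the PR.
_RE_ITALIC_UNDER = re.compile(
    r"(?<![a-zA-Z0-9])_(?=[^\s])(.+?)(?<=[^_])_(?![a-zA-Z0-9])", re.DOTALL
)

# The same pattern, annotated piece by piece (hypothetical name).
_RE_ITALIC_UNDER_ANNOTATED = re.compile(
    r"""
    (?<![a-zA-Z0-9])   # opening underscore must not follow an ASCII letter or digit
    _                  # opening delimiter
    (?=[^\s])          # content must start with a non-whitespace character
    (.+?)              # content, matched lazily and captured as group 1
    (?<=[^_])          # content must not end with an underscore
    _                  # closing delimiter
    (?![a-zA-Z0-9])    # closing underscore must not precede an ASCII letter or digit
    """,
    re.DOTALL | re.VERBOSE,
)

samples = ["_text_", "send_as_bot", "user_id and my_variable", "_ spaced_"]
results = {s: _RE_ITALIC_UNDER.sub(r"\1", s) for s in samples}
annotated = {s: _RE_ITALIC_UNDER_ANNOTATED.sub(r"\1", s) for s in samples}

for s in samples:
    print(repr(s), "->", repr(results[s]))
```

Only "_text_" changes (to "text"); the identifiers are left alone because each underscore touches a letter, and "_ spaced_" is left alone because the content would start with a space.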

Impact

  • Identifiers like send_as_bot, user_id, my_variable are no longer corrupted when text is stripped via TextBatchAggregator
  • Actual Markdown italic _text_ is still stripped correctly
  • Consistent with the fix already applied to cli.py

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Changes Made

  • gateway/platforms/helpers.py: Fixed _RE_BOLD_UNDER and _RE_ITALIC_UNDER regex patterns (2 lines changed)

How to Test

  1. Send a message containing send_as_bot or user_id through any gateway platform
  2. Verify the identifiers are preserved in the output text
  3. Verify actual Markdown italic _text_ is still correctly stripped
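
The manual steps above could also be captured as a small table-driven check that runs without a gateway. This is a sketch under assumptions: strip_underscore_markup is a hypothetical stand-in for whatever function in helpers.py applies the two patterns, and the bold-then-italic order is assumed.

```python
import re

# Fixed patterns, copied from the PR.
_RE_BOLD_UNDER = re.compile(
    r"(?<![a-zA-Z0-9])__(?=[^\s])(.+?)(?<=[^_])__(?![a-zA-Z0-9])", re.DOTALL
)
_RE_ITALIC_UNDER = re.compile(
    r"(?<![a-zA-Z0-9])_(?=[^\s])(.+?)(?<=[^_])_(?![a-zA-Z0-9])", re.DOTALL
)


def strip_underscore_markup(text):
    """Hypothetical stand-in for the stripping step in helpers.py."""
    text = _RE_BOLD_UNDER.sub(r"\1", text)
    return _RE_ITALIC_UNDER.sub(r"\1", text)


# (input, expected output) pairs covering steps 1 to 3.
CASES = [
    ("call send_as_bot with user_id", "call send_as_bot with user_id"),
    ("my_variable = 1", "my_variable = 1"),
    ("this is _text_ here", "this is text here"),
    ("__bold__ move", "bold move"),
    ("_see_ user_id", "see user_id"),
]


def run_checks():
    """Return a description of every case whose output differs from expected."""
    failures = []
    for raw, expected in CASES:
        got = strip_underscore_markup(raw)
        if got != expected:
            failures.append(f"{raw!r}: expected {expected!r}, got {got!r}")
    return failures


failures = run_checks()
print("all checks passed" if not failures else "\n".join(failures))
```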

…down

The gateway/platforms/helpers.py strip_markdown() function had the same
greedy underscore regex bug as cli.py. Apply identical lookbehind/lookahead
fixes to _RE_ITALIC_UNDER and _RE_BOLD_UNDER patterns.
@Super-Yu Super-Yu force-pushed the fix/underscore-markdown-stripping branch from d37bc4f to 3057850 Compare April 24, 2026 10:50
@Super-Yu Super-Yu changed the title from "Fix/underscore markdown stripping" to "Fix: underscore stripping regex in gateway helpers consumes identifiers" Apr 24, 2026
@alt-glitch alt-glitch added the labels type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), comp/cli (CLI entry point, hermes_cli/, setup wizard), and comp/gateway (Gateway runner, session dispatch, delivery) Apr 24, 2026
@teknium1 (Contributor) commented

Closing as already fixed on main.

Triage notes (high confidence):
gateway/platforms/helpers.py:171-172 on main already restricts the underscore italic/bold regexes with word boundaries (\b__(?![\s_])...(?<![\s_])__\b), which addresses the identifier-consuming bug with a different but equivalent fix.

If you still see this on the latest version, please reopen with reproduction steps.

(Bulk-closed during a CLI triage sweep.)
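
For comparison, a word-boundary variant in the spirit of the pattern quoted in the triage notes can be sketched as follows. The middle of that pattern is elided in the comment, so the (.+?) group, the single-underscore italic form, and the helper name are all assumptions here; this is not the code on main.

```python
import re

# Hypothetical reconstruction: only the \b anchors and the (?![\s_]) /
# (?<![\s_]) guards come from the triage comment; (.+?) is an assumption.
_RE_BOLD_UNDER_WB = re.compile(r"\b__(?![\s_])(.+?)(?<![\s_])__\b", re.DOTALL)
_RE_ITALIC_UNDER_WB = re.compile(r"\b_(?![\s_])(.+?)(?<![\s_])_\b", re.DOTALL)


def strip_underscore_markup_wb(text):
    """Hypothetical helper: drop bold delimiters, then italic delimiters."""
    text = _RE_BOLD_UNDER_WB.sub(r"\1", text)
    return _RE_ITALIC_UNDER_WB.sub(r"\1", text)


# "_" counts as a word character, so \b cannot match between a letter and
# an underscore; underscores inside identifiers are never delimiters.
wb_identifier = strip_underscore_markup_wb("call send_as_bot now")
wb_italic = strip_underscore_markup_wb("this is _text_ here")
wb_bold = strip_underscore_markup_wb("__bold__ move")
wb_mixed = strip_underscore_markup_wb("_see_ user_id")

print(wb_identifier)  # call send_as_bot now
print(wb_italic)      # this is text here
print(wb_bold)        # bold move
print(wb_mixed)       # see user_id
```

One difference worth keeping in mind: for str patterns in Python 3, \b treats non-ASCII letters as word characters, whereas the [a-zA-Z0-9] lookarounds proposed in this PR are ASCII-only, so the two approaches may disagree on text where an underscore is adjacent to a non-ASCII letter.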

@teknium1 teknium1 closed this May 24, 2026