Skip to content

fix(google_chat): restore nested format_message placeholders in reverse insertion order (#24567)#24582

Closed
briandevans wants to merge 1 commit into
NousResearch:mainfrom
briandevans:fix/google-chat-nested-placeholder-restore-24567
Closed

fix(google_chat): restore nested format_message placeholders in reverse insertion order (#24567)#24582
briandevans wants to merge 1 commit into
NousResearch:mainfrom
briandevans:fix/google-chat-nested-placeholder-restore-24567

Conversation

@briandevans

@briandevans briandevans commented May 12, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Reverses the format_message placeholder-restore iteration so outer placeholders whose values embed earlier (inner) placeholder keys are substituted first, re-introducing the inner key into text while the inner key is still pending in the loop.

GoogleChatAdapter.format_message protects Markdown constructs from cross-conversion by substituting them with \x00GC<n>\x00 sentinels via the _ph helper, then restoring values at the end. Each _ph callback runs left-to-right inside a re.sub pass, so a later placeholder's value can contain earlier keys — e.g. **Use \active` mode**first becomes\x00GC0\x00 = `active`(inline-code pass) then\x00GC1\x00 = Use \x00GC0\x00 mode(bold pass). The restore loop iteratedplaceholders.items()in forward insertion order, so when the loop reached\x00GC0\x00that marker was not yet intext— the iteration was a no-op. Then\x00GC1\x00was substituted, writingUse \x00GC0\x00 modeback intotext— but inner key's iteration already passed. JSON serialization / the Google Chat REST API silently strips NUL bytes downstream, surfacing the bareGC` token in the rendered message (direct quote from #24567 reporter's session: "Status moved from Abandoned to GC6").

One-line semantic change: iterate reversed(placeholders.items()). The _ph callbacks can only produce values that embed keys with lower counter numbers, so substituting outermost first re-introduces inner keys while every inner key is still pending in the loop.

Related Issue

Fixes #24567

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • plugins/platforms/google_chat/adapter.pyfor key, value in reversed(placeholders.items()): in format_message's restore loop.
  • tests/gateway/test_google_chat.py — 8 new cases in TestFormatMessage covering bold-around-inline-code, bold-italic-around-inline-code, link-around-bold, link-around-inline-code, header-with-inline-code, three-deep link→bold→inline, and a realistic sales-pipeline body. Each asserts both expected Chat-dialect output and that neither a raw \x00 nor a bare GC<digit> token survives.

How to Test

  1. uv run --with pytest --with pytest-xdist --with pytest-asyncio python3 -m pytest tests/gateway/test_google_chat.py::TestFormatMessage -v
  2. Expected: 19 passed (11 pre-existing + 8 new).
  3. Regression guard before fix: **Use \active` mode**'Use \x00GC0\x00 mode'(sentinel leak). After fix:'Use `active` mode'`. Reproduced for all 6 nesting patterns and the realistic scenario.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run focused tests for the touched code and all pass (19/19)
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 15.x

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guidereversed(dict.items()) supported since Python 3.8 (project requires ≥3.11)
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A

Related / Positioning

  • Fixes [Bug]: Google chat placeholders (GC0, GC1, ...) show up in output #24567.
  • The 6 pre-existing failures elsewhere in tests/gateway/test_google_chat.py (TestPlatformRegistration::test_enum_value, TestEnvConfigLoading::*, TestAuthorizationEmailMatch::*) reproduce on clean origin/main under the local .[all,dev] install matrix and are unrelated to this change.

Audited siblings: format_message is the single placeholder-restore site; no other adapter has the same nested-sentinel pattern. No widening needed.

Copilot AI review requested due to automatic review settings May 12, 2026 22:16

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Google Chat outbound formatting bug where nested Markdown conversions could leak placeholder tokens (GC<n>) into the final rendered message by restoring placeholders in the wrong order.

Changes:

  • Restore format_message placeholders in reverse insertion order to correctly resolve nested placeholder values.
  • Add regression tests covering multiple nesting shapes (bold/italic/link/header around inline code/bold) and a realistic multi-line sales update scenario.
  • Add targeted assertions that neither NUL sentinels nor bare GC<n> tokens survive in formatted output.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
plugins/platforms/google_chat/adapter.py Reverse placeholder restoration order in format_message to prevent nested placeholder leaks.
tests/gateway/test_google_chat.py Add parameterized + scenario regression tests that reproduce and prevent GC<n> placeholder leakage.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@alt-glitch alt-glitch added type/bug Something isn't working comp/gateway Gateway runner, session dispatch, delivery comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have labels May 12, 2026
@briandevans briandevans force-pushed the fix/google-chat-nested-placeholder-restore-24567 branch from 3b5dcd8 to 2f0cb36 Compare May 15, 2026 19:14
@briandevans

Copy link
Copy Markdown
Contributor Author

CI audit — all 8 failing checks are pre-existing baselines on clean origin/main (931caf2b2). Zero failures are in touched code (plugins/platforms/google_chat/adapter.py, tests/gateway/test_google_chat.py).

Check Failure Status on main
test tests/run_agent/test_provider_parity.py::TestDeveloperRoleSwap::test_developer_role_via_nous_portal TestBuildApiKwargsNousPortal::test_uses_chat_completions_format ::test_includes_nous_product_tagsValueError: Model has a context window of 15,000 tokens, which is below the minimum 64,000. The Nous portal /v1/models endpoint is hit at AIAgent.__init__ (live network call, not mocked); the test helper passes model=None so the single-iter heuristic matches the first portal entry, which currently returns 15K. Reproduces on clean 931caf2b2. Fix in flight: #26310 / #26312 (both pass model="gpt-5" explicitly). unmerged
e2e 4 × tests/e2e/test_discord_adapter.py::*AttributeError: 'types.SimpleNamespace' object has no attribute 'history' and TypeError: catching classes that do not inherit from BaseException is not allowed. The recent default-on history backfill (gateway/platforms/discord.py::_fetch_channel_context) calls channel.history(...) on the test fixture's SimpleNamespace, then except discord.Forbidden: can't catch the resulting AttributeError because mocked discord.Forbidden isn't a real exception class. Reproduces on clean 931caf2b2. Fix in flight: #26301 (mine) / #26310 / #26312. unmerged
Scan PR for critical supply chain risks Install-hook file added or modified: hermes_cli/setup.py. False positive — this PR doesn't touch hermes_cli/setup.py. The scanner uses two-dot git diff BASE..HEAD which includes every commit upstream merged since the PR's last push; hermes_cli/setup.py was modified by upstream ~20 times in the last 14 days. Three-dot BASE...HEAD returns empty here (correct). Fix in flight: #13411. unmerged

Nothing in plugins/platforms/google_chat/adapter.py or tests/gateway/test_google_chat.py touches any of these paths. Happy to rebase once any of the three above lands.

@briandevans briandevans force-pushed the fix/google-chat-nested-placeholder-restore-24567 branch from 2f0cb36 to 274e109 Compare May 19, 2026 04:36
briandevans added a commit to briandevans/hermes-agent that referenced this pull request May 19, 2026
…'t false-positive

GitHub re-resolves `pull_request.base.sha` to the current upstream HEAD
at workflow-trigger time, not at branch-off time. The supply-chain
scanner's `git diff "$BASE".."$HEAD"` (two-dot) therefore includes
every commit upstream merged after the PR contributor's last push.

Concrete failure mode the scanner is currently hitting: if upstream
modifies `hermes_cli/setup.py` (which happens often — ~20 commits in
the last 14 days alone), every open PR re-scanned after that point
will be flagged with `### 🚨 CRITICAL: Install-hook file added or
modified`, even though the PR itself doesn't touch `setup.py`. The
"Fail on critical findings" step then blocks the PR until the next
scanner run.

Switching to three-dot (`A...B`, which git rewrites to
`git diff $(merge-base A B)..B`) restricts the diff to changes the
PR branch actually introduced. The two file-list scans (`.pth` and
install-hook) and the line-level DIFF all get the same fix, so
B64_EXEC_HITS / PROC_HITS see consistent input.

Evidence on a live PR (NousResearch#24582):
  2-dot: hermes_cli/setup.py  (false positive — touched by upstream)
  3-dot: (empty)              (correct — PR doesn't touch any install-hook)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@briandevans briandevans force-pushed the fix/google-chat-nested-placeholder-restore-24567 branch 3 times, most recently from 4d9260d to cc7972b Compare May 28, 2026 01:16
…se insertion order (NousResearch#24567)

Real-world agent output mixes Markdown constructs — bold spans around inline
code, links that wrap bold or inline code, headers containing inline code —
and Google Chat users started seeing the raw sentinel token (`GC0`, `GC6`, …)
in their rendered messages, e.g.:

> * Reactivated Copper Opportunity XXXX (Status moved from Abandoned to GC6).

Root cause is in `GoogleChatAdapter.format_message`. Each `_ph` callback
runs left-to-right inside a `re.sub` pass and stores a placeholder whose
value can embed the keys of previously-inserted placeholders (e.g. a header
whose body contained inline code that was already protected). The restore
loop iterates `placeholders.items()` in forward insertion order, so when
the outer key is substituted into `text`, it re-introduces the inner
`\x00GC<n>\x00` marker *after* the inner-key iteration already passed —
leaving the sentinel in the returned string. JSON / Google Chat API
stripping of the NUL boundary bytes downstream then surfaces the bare
`GC<n>` token to the user.

Iterating in reverse insertion order is sufficient: each `_ph` callback
can only embed keys with *lower* counter numbers (later regex passes see
text that already contains earlier placeholder keys, never vice versa).
Substituting outer (highest counter) first re-introduces each inner key
into `text` while the inner key is still pending in the loop.

Reproduces with:
- `**Use \`active\` mode**`          → leaked `*Use \x00GC0\x00 mode*`
- `***Use \`mode\` here***`          → leaked `*_Use \x00GC0\x00 here_*`
- `[See **important** docs](url)`    → leaked `<url|See \x00GC0\x00 docs>`
- `[Run \`cmd\`](url)`               → leaked `<url|Run \x00GC0\x00>`
- `## Status: \`active\``            → leaked `*Status: \x00GC0\x00*`

With the fix, every one of those plus the three-deep
`[See **\`cmd\` mode**](url)` case round-trip cleanly.

Tests: parametrized regression coverage for each nesting shape plus a
realistic sales-update message body that mirrors the reporter's symptom,
both asserting (a) the expected Chat-dialect output and (b) that neither
`\x00` nor a bare `GC<digit>` token survives in the rendered output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@briandevans briandevans force-pushed the fix/google-chat-nested-placeholder-restore-24567 branch from cc7972b to 054ecf9 Compare May 29, 2026 22:10
@briandevans

Copy link
Copy Markdown
Contributor Author

Closing this to keep the contribution queue manageable while it awaits maintainer review. The fix still applies cleanly and I'm happy to reopen if a maintainer wants to pick it up — just give it a nudge and I'll refresh it against main.

@donbowman

Copy link
Copy Markdown

is it going to get found that way? I'll keep my subscribe on it, but, i doubt any maintainer is looking at closed issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Google chat placeholders (GC0, GC1, ...) show up in output

4 participants