Skip to content

fix(gateway): preserve thread routing for process completions#24222

Open
web3blind wants to merge 1 commit into
NousResearch:mainfrom
web3blind:fix/process-completion-thread-routing
Open

fix(gateway): preserve thread routing for process completions#24222
web3blind wants to merge 1 commit into
NousResearch:mainfrom
web3blind:fix/process-completion-thread-routing

Conversation

@web3blind

Copy link
Copy Markdown
Contributor

Summary

  • carry session/platform/chat/user/thread metadata on background process completion queue events
  • let the gateway drain completion events through the same synthetic-message routing path as watch notifications
  • mark completions consumed after queue/watcher delivery to avoid duplicate notify_on_complete messages
  • document investigation in plan.md

Tests

  • scripts/run_tests.sh tests/tools/test_notify_on_complete.py tests/gateway/test_background_process_notifications.py

Why

Delayed background-process notifications must route to the original Telegram topic/thread where the process was launched, not whichever topic is active when the notification is drained.

@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery tool/terminal Terminal execution and process management labels May 12, 2026
@web3blind

Copy link
Copy Markdown
Contributor Author

Additional routing note from user report:

The observed misdelivery may correlate with cron outputs delivered to a Telegram topic (e.g. a farm/cron topic) while a background process from another topic completes later. This PR addresses that failure mode for process completions by making queued completion events self-routable: the event now carries platform/chat_id/thread_id/user metadata from the original process-starting context, and gateway injection reconstructs the synthetic MessageEvent from the event itself rather than any currently active foreground topic.

Cron delivery itself already resolves thread_id from job origin or explicit deliver targets, and cron agent runs intentionally clear HERMES_SESSION_* context so cron ticks do not masquerade as live inbound gateway messages. The added regression test test_completion_event_routes_to_original_telegram_thread covers the key invariant: queued completion notifications route to the original Telegram thread, not whichever thread is active when the queue is drained.

@liuhao1024 liuhao1024 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan.md file looks like a development planning document that was accidentally included in this PR. It should be removed before merging — it doesn't belong in the repo and exposes internal design rationale that's better suited to the PR description.

@Bartok9

Bartok9 commented May 12, 2026

Copy link
Copy Markdown
Contributor

I rechecked this against current origin/main (99ad2d137).

tests/gateway/test_background_process_notifications.py and tests/tools/test_notify_on_complete.py both pass cleanly on current main (49 tests, no failures). The cross-topic routing gap this PR fixes is still present in main — the queue drain at run.py:7651 explicitly skips completion events with the comment 'completion events are handled by the watcher task', but the watcher task doesn't carry thread/session metadata through to the notification call. This PR's approach of routing completions through the same _inject_watch_notification path as watch events is the right fix.

No baseline CI conflicts seen — this PR should merge cleanly.

@web3blind web3blind force-pushed the fix/process-completion-thread-routing branch from b0c6965 to 7fb0c50 Compare May 23, 2026 11:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/gateway Gateway runner, session dispatch, delivery P2 Medium — degraded but workaround exists tool/terminal Terminal execution and process management type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants