fix(gateway/webhook): don't pop delivery_info on send by jescalan · Pull Request #5864 · NousResearch/hermes-agent

jescalan · 2026-04-07T15:18:09Z

Summary

The webhook adapter silently downgrades the configured deliver type to "log" whenever the agent emits any interim status message before the final response. The most common trigger is fallback-model activation:

[webhook] POST event=unknown route=agentmail-email delivery=...
[webhook] Response for ...: 🔄 Primary model failed — switching to fallback: claude-opus-4.6 via anthropic
gateway.run: response ready: platform=webhook ... response=912 chars
[Webhook] Sending response (912 chars) to webhook:agentmail-email:...
[webhook] Response for ...: <agent's actual response>     ← logged, never delivered

The user configured --deliver telegram --deliver-chat-id <id>, but the response lands in the gateway log instead of Telegram. No warning, no error.

Root cause

gateway/platforms/webhook.py::send() consumed _delivery_info[chat_id] via .pop():

delivery = self._delivery_info.pop(chat_id, {})
deliver_type = delivery.get("deliver", "log")

That works when the agent run produces exactly one outbound message — the final response. But interim status messages flow through the same send() path. When run_agent.py::_emit_status() fires (e.g. on fallback activation, context-pressure warnings, or any other lifecycle event routed through status_callback), the gateway calls adapter.send(chat_id, message, ...) for the same chat_id. That first call pops the entry. The subsequent final-response send() call sees an empty dict, falls back to deliver_type = "log", and writes the response to the gateway log instead of crossing over to the configured platform.

This is easy to hit in practice. Anyone with fallback_providers configured runs into it the first time their primary provider hiccups on a webhook-triggered run. Routes that work perfectly in dev silently drop responses in prod.

The bug is not specific to fallback notifications — it affects any code path that emits an interim status message before the final response.

Fix

Read _delivery_info with .get() instead of .pop() so multiple send() calls for the same chat_id all see the same delivery config.
Add a parallel _delivery_info_created: Dict[str, float] timestamp dict.
Add _prune_delivery_info(now) that drops entries older than _idempotency_ttl (1h, the same window already used by _seen_deliveries).
Call _prune_delivery_info() on each POST, mirroring the existing cleanup pattern for _seen_deliveries.

Worst-case bound on _delivery_info size is now rate_limit * TTL = 30/min * 60min = 1800 entries, each ~1 KB → under 2 MB. In practice it'll be much smaller because most webhooks complete in seconds.

Tests

test_delivery_info_cleaned_after_send is replaced with test_delivery_info_survives_multiple_sends — the regression test for this bug. It asserts that two consecutive send() calls for the same chat_id both see the delivery config.
New test_delivery_info_pruned_via_ttl covers the TTL cleanup behavior.
Two integration tests in test_webhook_integration.py that previously asserted chat_id not in adapter._delivery_info after send() now assert the opposite, with a comment explaining why.

All 40 tests in tests/gateway/test_webhook_adapter.py and tests/gateway/test_webhook_integration.py pass:

$ pytest tests/gateway/test_webhook_adapter.py tests/gateway/test_webhook_integration.py -q
........................................                                 [100%]
40 passed in 0.89s

End-to-end verification

Reproduced and verified locally with a dynamic subscription:

hermes webhook subscribe agentmail-email \
  --prompt "{text}\n\nReview this email and decide what to do..." \
  --deliver telegram \
  --deliver-chat-id <user-id> \
  --secret INSECURE_NO_AUTH

With gpt-5.4 as primary (currently flaky in my environment) and claude-opus-4.6 as fallback:

Before the fix: every POST hit the fallback path, the fallback notification consumed _delivery_info, and the final response landed in the gateway log. Zero Telegram messages delivered.
After the fix: same conditions, fallback still fires, and the final response is delivered to Telegram as expected.

Notes

Behavior change is intentional: _delivery_info entries now live until TTL expiry instead of being consumed on send. This is consistent with the existing _seen_deliveries lifecycle and is what the cross-platform delivery contract actually requires.
No public API changes.
No config changes.

The webhook adapter stored per-request `deliver`/`deliver_extra` config in `_delivery_info[chat_id]` during POST handling and consumed it via `.pop()` inside `send()`. That worked for routes whose agent run produced exactly one outbound message — the final response — but it broke whenever the agent emitted any interim status message before the final response. Status messages flow through the same `send(chat_id, ...)` path as the final response (see `gateway/run.py::_status_callback_sync` → `adapter.send(...)`). Common triggers include: - "🔄 Primary model failed — switching to fallback: ..." (run_agent.py::_emit_status when `fallback_providers` activates) - context-pressure / compression notices - any other lifecycle event routed through `status_callback` When any of those fired, the first `send()` call popped the entry, so the subsequent final-response `send()` saw an empty dict and silently downgraded `deliver_type` from `"telegram"` (or `discord`/`slack`/etc.) to the default `"log"`. The agent's response was logged to the gateway log instead of being delivered to the configured cross-platform target — no warning, no error, just a missing message. This was easy to hit in practice. Any user with `fallback_providers` configured saw it the first time their primary provider hiccuped on a webhook-triggered run. Routes that worked perfectly in dev (where the primary stays healthy) silently dropped responses in prod. Fix: read `_delivery_info` with `.get()` so multiple `send()` calls for the same `chat_id` all see the same delivery config. To keep the dict bounded without relying on per-send cleanup, add a parallel `_delivery_info_created` timestamp dict and a `_prune_delivery_info()` helper that drops entries older than `_idempotency_ttl` (1h, same window already used by `_seen_deliveries`). Pruning runs on each POST, mirroring the existing `_seen_deliveries` cleanup pattern. Worst-case memory footprint is now `rate_limit * TTL = 30/min * 60min = 1800` entries, each ~1KB → under 2 MB. In practice it'll be far smaller because most webhooks complete in seconds, not the full hour. Test changes: - `test_delivery_info_cleaned_after_send` is replaced with `test_delivery_info_survives_multiple_sends`, which is now the regression test for this bug — it asserts that two consecutive `send()` calls both see the delivery config. - A new `test_delivery_info_pruned_via_ttl` covers the TTL cleanup behavior. - The two integration tests that asserted `chat_id not in adapter._delivery_info` after `send()` now assert the opposite, with a comment explaining why. All 40 tests in `tests/gateway/test_webhook_adapter.py` and `tests/gateway/test_webhook_integration.py` pass. Verified end-to-end locally against a dynamic `hermes webhook subscribe` route configured with `--deliver telegram --deliver-chat-id <user>`: with `gpt-5.4` as the primary (currently flaky) and `claude-opus-4.6` as the fallback, the fallback notification fires, the agent finishes, and the final response is delivered to Telegram as expected.

teknium1 · 2026-04-08T00:27:27Z

Merged via PR #5942. Your commit was cherry-picked onto current main with authorship preserved in git log. Thanks for the fix!

teknium1 mentioned this pull request Apr 7, 2026

fix(gateway): webhook delivery_info persistence + full session id in /status #5942

Merged

teknium1 closed this in #5942 Apr 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(gateway/webhook): don't pop delivery_info on send#5864

fix(gateway/webhook): don't pop delivery_info on send#5864
jescalan wants to merge 1 commit into
NousResearch:mainfrom
jescalan:fix/webhook-delivery-info-pop

jescalan commented Apr 7, 2026

Uh oh!

teknium1 commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jescalan commented Apr 7, 2026

Summary

Root cause

Fix

Tests

End-to-end verification

Notes

Uh oh!

teknium1 commented Apr 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants