fix: salvage 9 community bugfixes — gateway, cron, deps, macOS launchd by teknium1 · Pull Request #5288 · NousResearch/hermes-agent

teknium1 · 2026-04-05T18:49:17Z

Summary

Salvages 9 community PRs into a single consolidated bugfix PR. All contributor commits are cherry-picked with original authorship preserved.

Included PRs

PR	Author	Fix
#5115	@memosr	Remove full traceback from cron error delivery output (security)
#4976	@bugkill3r	Pass fallback_model to AIAgent in api_server platform
#5224	@nibzard	Cap memory flush retries at 3 to prevent infinite retry loop
#5080	@teyrebaz33	/status command bypasses active-session guard during agent run
#4961	@bg-l2norm	Include Telegram webhook extras in messaging dependency
#4892	@tmchow	Replace deprecated launchctl load/unload/start/stop with bootstrap/bootout/kickstart/kill (macOS)
#5201	@kshitijk4poor	Harden Telegram polling handoffs + flood-control send retries
#5149	@damianpdr	Resolve listed messaging targets consistently (copy-paste from list works)
#5205	@el-analista	Resolve Telegram underscored /commands to skill/plugin keys + unknown command warning

Follow-up fixes (our commit)

Fix GatewayApp → GatewayRunner import in api_server.py (#4976 used the wrong class name)
Update launchd test assertions for new bootstrap/bootout/kickstart commands (#4892)
Add nonlocal message in run_sync() to fix pre-existing UnboundLocalError scoping bug
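
The third item is a common Python closure pitfall. A minimal sketch, assuming run_sync() is a nested function that rebinds a variable named message from its enclosing scope; the factory functions and the "(empty)" default are hypothetical and not taken from the repository:

```python
def make_runner_buggy(message):
    """Hypothetical stand-in for the pre-fix shape of the code."""
    def run_sync():
        # The assignment below makes `message` local to run_sync(), so this
        # read happens before any local value exists: UnboundLocalError.
        if not message:
            message = "(empty)"
        return message
    return run_sync


def make_runner_fixed(message):
    """Same logic, with the enclosing variable declared nonlocal."""
    def run_sync():
        nonlocal message  # rebind the enclosing variable instead of shadowing it
        if not message:
            message = "(empty)"
        return message
    return run_sync
```

Calling the function returned by make_runner_buggy raises UnboundLocalError regardless of the argument, while the fixed variant returns the (possibly defaulted) message.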

Test results

3081 passed, 16 skipped. All 16 are pre-existing (missing optional deps: nio.crypto, davey, faster_whisper; plus unrelated signal/tools_config issues). Zero regressions from this PR.

The API server platform never passed fallback_model to AIAgent(), so the fallback provider chain was always empty for requests through the OpenAI-compatible endpoint. Load it via GatewayApp._load_fallback_model() to match the behavior of Telegram/Discord/Slack platforms.

The _session_expiry_watcher retried failed memory flushes forever because exceptions were caught at debug level without setting memory_flushed=True. Expired sessions with transient failures (rate limits, network errors) would retry every 5 minutes indefinitely, burning API quota and blocking gateway message processing via 429 rate limit cascades. Observed case: a March 19 session retried 28+ times over ~17 days, causing repeated 429 errors that made Telegram unresponsive. Add a per-session failure counter (_flush_failures) that gives up after 3 consecutive attempts and marks the session as flushed to break the loop.
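
The retry cap described above can be sketched as follows. This is an illustration under stated assumptions, not the gateway's code: the Session and ExpiryWatcher classes, the tick() method, and the flush_fn callback are hypothetical; only the _flush_failures counter, the memory_flushed flag, and the limit of 3 come from the commit message.

```python
from dataclasses import dataclass

MAX_FLUSH_FAILURES = 3  # limit stated in the commit message


@dataclass
class Session:  # hypothetical minimal session record
    session_id: str
    memory_flushed: bool = False


class ExpiryWatcher:  # hypothetical stand-in for _session_expiry_watcher
    def __init__(self, flush_fn):
        self._flush_fn = flush_fn
        self._flush_failures = {}  # session_id -> consecutive failure count

    def tick(self, expired_sessions):
        """One watcher pass; the real watcher is said to run every 5 minutes."""
        for session in expired_sessions:
            if session.memory_flushed:
                continue
            try:
                self._flush_fn(session)
            except Exception:
                count = self._flush_failures.get(session.session_id, 0) + 1
                self._flush_failures[session.session_id] = count
                if count >= MAX_FLUSH_FAILURES:
                    # Give up: mark as flushed so the loop cannot retry forever.
                    session.memory_flushed = True
                    self._flush_failures.pop(session.session_id, None)
                continue
            session.memory_flushed = True
            self._flush_failures.pop(session.session_id, None)
```

With a flush function that always fails, the session is attempted three times and then skipped on every later pass.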

…5046)

When an agent was actively processing a message, /status sent via Telegram (or any gateway) was queued as a pending interrupt instead of being dispatched immediately. The base platform adapter's handle_message() only had special-case bypass logic for /approve and /deny, so /status fell through to the default interrupt path and was never processed as a system command.

Apply the same bypass pattern used by /approve and /deny: detect cmd == 'status' inside the active-session guard, dispatch directly to the message handler, and send the response without touching session lifecycle or interrupt state.

Adds a regression test that verifies /status is dispatched and responded to immediately even when _active_sessions contains an entry for the session.
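
A simplified, synchronous sketch of the bypass pattern described in this commit. Only handle_message(), _active_sessions, and the three command names come from the text; the class name, the _pending_messages dict, the sent list, and the handler signature are assumptions made for illustration, and the real adapter is likely asynchronous.

```python
class BaseAdapter:  # hypothetical stand-in for the base platform adapter
    # Commands dispatched immediately even while an agent run is active.
    BYPASS_COMMANDS = {"approve", "deny", "status"}

    def __init__(self, message_handler):
        self._message_handler = message_handler
        self._active_sessions = {}
        self._pending_messages = {}  # assumed storage for queued interrupts
        self.sent = []               # stands in for the platform send call

    def handle_message(self, session_key, text):
        if session_key in self._active_sessions:
            parts = text[1:].split() if text.startswith("/") else []
            cmd = parts[0].lower() if parts else ""
            if cmd in self.BYPASS_COMMANDS:
                # Dispatch directly; leave session and interrupt state alone.
                self.sent.append(self._message_handler(session_key, text))
                return
            # Default path: queue the message as a pending interrupt.
            self._pending_messages[session_key] = text
            return
        self._active_sessions[session_key] = True
        try:
            self.sent.append(self._message_handler(session_key, text))
        finally:
            self._active_sessions.pop(session_key, None)
```

The regression test mentioned above corresponds to seeding _active_sessions with an entry, sending /status, and checking that a reply was produced while nothing was queued.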

…kill

launchctl load/unload/start/stop are deprecated on macOS since 10.10 and fail silently on modern versions. This replaces them with the current equivalents:

- load -> bootstrap gui/<uid> <plist>
- unload -> bootout gui/<uid>/<label>
- start -> kickstart gui/<uid>/<label>
- stop -> kill SIGTERM gui/<uid>/<label>

Adds _launchd_domain() helper returning the gui/<uid> target domain. Updates test assertions to match the new command signatures.

Fixes #4820

Replace the two-step stop/start restart with a single launchctl kickstart -k call. When the gateway triggers a restart from inside its own process tree, the old stop command kills the shell before the start half is reached. kickstart -k lets launchd handle the kill+restart atomically.
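
The command mapping from the two launchd commits can be sketched as a function that builds argument lists without executing anything. The function names, the uid parameter, and the action keywords are hypothetical; the launchctl subcommands and target formats are the ones listed in the commit messages.

```python
import os


def launchd_domain(uid=None):
    """Return the per-user GUI domain target, e.g. 'gui/501'."""
    return f"gui/{os.getuid() if uid is None else uid}"


def launchctl_argv(action, label, plist_path=None, uid=None):
    """Build the launchctl argument list for an action (nothing is run)."""
    domain = launchd_domain(uid)
    service = f"{domain}/{label}"
    if action == "load":
        if plist_path is None:
            raise ValueError("load requires plist_path")
        return ["launchctl", "bootstrap", domain, str(plist_path)]
    if action == "unload":
        return ["launchctl", "bootout", service]
    if action == "start":
        return ["launchctl", "kickstart", service]
    if action == "stop":
        return ["launchctl", "kill", "SIGTERM", service]
    if action == "restart":
        # Single call: launchd kills and restarts the service itself.
        return ["launchctl", "kickstart", "-k", service]
    raise ValueError(f"unknown action: {action}")
```

The resulting list could be passed to subprocess.run() on macOS. Building restart as one kickstart -k call reflects the point made above: the process requesting the restart does not need to survive long enough to issue a second command.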

Telegram polling can inherit a stale webhook registration when a deployment switches transport modes, which leaves getUpdates idle even though the gateway starts cleanly. Outbound send also treats Telegram retry_after responses as terminal errors, so brief flood control can drop tool progress and replies.

Constraint: Keep the PR narrowly scoped to upstream/main Telegram adapter behavior
Rejected: Port OpenClaw's broader polling supervisor and offset persistence | too broad for an isolated fix PR
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Polling mode should clear webhook state before starting getUpdates, and send-path retry logic must distinguish flood control from timeouts
Tested: uv run --extra dev pytest tests/gateway/test_telegram_* -q
Not-tested: Live Telegram webhook-to-polling migration and real Bot API 429 behavior
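
One way to distinguish flood control from timeouts on the send path is sketched below. The exception classes are local stand-ins rather than imports from a Telegram library, and send_with_retry, max_attempts, and the injectable sleep are hypothetical. Not retrying timeouts is an assumption of this sketch (a timed-out send may already have been delivered); the commit only states that the two cases must be handled differently.

```python
import time


class RetryAfter(Exception):
    """Stand-in for a flood-control error carrying a retry_after delay."""
    def __init__(self, retry_after):
        super().__init__(f"flood control, retry after {retry_after}s")
        self.retry_after = retry_after


class TimedOut(Exception):
    """Stand-in for a network timeout on send."""


def send_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send(); wait and retry on flood control, let timeouts propagate."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except RetryAfter as exc:
            if attempt == max_attempts:
                raise  # still flood-limited after the last attempt
            sleep(exc.retry_after)  # honor the server-provided delay
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper testable without real delays.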

…n keys

Telegram's Bot API disallows hyphens in command names, so _build_telegram_menu registers /claude-code as /claude_code. When the user taps it from autocomplete, the gateway dispatch did a direct lookup against skill_cmds (keyed on the hyphenated form) and missed, silently falling through to the LLM as plain text. The model would then typically call delegate_task, spawning a Hermes subagent instead of invoking the intended skill.

Normalize underscores to hyphens in skill and plugin command lookup, matching the existing pattern in _check_unavailable_skill.
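
The lookup normalization can be illustrated with a small resolver. skill_cmds is named in the commit; plugin_cmds, the resolve_command function, and its return shape are assumptions for illustration only.

```python
def resolve_command(name, skill_cmds, plugin_cmds):
    """Look up a command by its exact name, then with '_' mapped to '-'.

    Returns a (kind, value) tuple, or None when nothing matches.
    """
    candidates = [name]
    hyphenated = name.replace("_", "-")
    if hyphenated != name:
        candidates.append(hyphenated)
    for candidate in candidates:
        if candidate in skill_cmds:
            return ("skill", skill_cmds[candidate])
        if candidate in plugin_cmds:
            return ("plugin", plugin_cmds[candidate])
    return None
```

With skill_cmds keyed on "claude-code", a Telegram-delivered "claude_code" now resolves to the skill instead of missing.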

…e LLM

Previously, typing a /command that isn't a built-in, plugin, or skill would silently fall through to the LLM as plain text. The model often interprets it as a loose instruction and invents unrelated tool calls: for example, a stray /claude_code slipped through and the model fabricated a delegate_task invocation that got stuck in an OAuth loop.

Now we check GATEWAY_KNOWN_COMMANDS after the skill / plugin / unavailable-skill lookups and return an actionable message pointing the user at /commands. The user gets feedback, and the agent doesn't waste a round-trip guessing what /foo-bar was supposed to mean.
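
A sketch of the unknown-command check, under assumptions: GATEWAY_KNOWN_COMMANDS is named in the commit, but its contents here are an illustrative subset, and the function name, parameters, and reply wording are hypothetical.

```python
# Illustrative subset only; the real set lives in the gateway code.
GATEWAY_KNOWN_COMMANDS = frozenset({"status", "approve", "deny", "commands"})


def unknown_command_reply(text, skill_cmds=(), plugin_cmds=()):
    """Return a user-facing reply for an unrecognized /command, else None."""
    if not text.startswith("/"):
        return None  # plain text goes to the agent as before
    parts = text[1:].split()
    if not parts:
        return None
    name = parts[0].lower()
    hyphenated = name.replace("_", "-")
    for candidate in (name, hyphenated):
        if candidate in skill_cmds or candidate in plugin_cmds:
            return None  # handled by the skill / plugin lookups
    if name in GATEWAY_KNOWN_COMMANDS:
        return None  # built-in gateway command
    return f"Unknown command /{name}. Send /commands to see what is available."
```

Returning a reply instead of forwarding the text means the user gets immediate feedback and the model never sees the stray command.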

- Fix GatewayApp → GatewayRunner import in api_server.py (PR #4976)
- Update launchd test assertions for new bootstrap/bootout/kickstart commands (PR #4892)
- Add nonlocal message declaration in run_sync() to fix UnboundLocalError (pre-existing scoping bug)

github-actions · 2026-04-05T18:49:32Z

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

memosr and others added 14 commits April 5, 2026 11:32

fix(security): remove full traceback from cron error output to prevent info leakage

7a2c328

fix: use logger.exception to preserve traceback in logs and drop unused import

0a85547

fix(deps): include telegram webhook extra in messaging installs (#4915)

94894bd

fix: resolve listed messaging targets consistently

a8a858c

fix(gateway): correct misleading log text for unknown /commands

0282221

The warning said 'forwarding as plain text' but the code returns a user-facing error reply instead of forwarding. Describe what actually happens.

teknium1 merged commit 0c95e91 into main Apr 5, 2026
5 of 6 checks passed

teknium1 mentioned this pull request Apr 27, 2026

fix(api-server): pass fallback config into AIAgent #4958

Closed

teknium1 merged 14 commits into main from hermes/hermes-7fbab92d

8 participants
