Add support for Atropos Agentic RL environments (requires branch tool…#17
Conversation
…_call_support in Atropos atm) - Added new environments for reinforcement learning, including `HermesSweEnv` for software engineering tasks and `TerminalTestEnv` for inline testing. - Introduced `ToolContext` for unrestricted access to tools during reward computation. - Updated `.gitignore` to exclude `wandb/` directory. - Enhanced `README.md` with detailed architecture and usage instructions for Atropos environments. - Added configuration files for SWE and terminal test environments to streamline setup. - Removed unnecessary compiled Python files from `__pycache__`.
Adds the newly available GLM-5.1 model to the hardcoded Z.AI provider model list so it appears in the model dropdown. Fixes NousResearch#17. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
QA Evidence Update — 2026-04-06 QuickScan run confirms Mixed Content on additional pages beyond Dashboard:
All 8 pages affected:
Root cause: Frontend calls API at http://leksikon.ai/api/v1/... instead of https://leksikon.ai/api/v1/... |
|
QA Evidence — Quick Scan 2026-04-06 Confirmed Mixed Content errors on /app/dashboard: Same pattern also found on: /app/inbox, /app/inbox/entities, /app/knowledge-graph Mode: quick_scan |
|
QA Evidence — 2026-04-06 (evening) Pattern: All API endpoints use |
|
QA Evidence — 2026-04-06 (evening) Pattern: All API endpoints use http://leksikon.ai/api/v1/* instead of https://. |
|
QA evidence — 2026-04-06 18:30 CET New Mixed Content errors confirmed across additional pages:
Root cause: frontend is hardcoded to use |
|
QA Evidence — 2026-04-06T18:56:17+00:00 (QUICK_SCAN) Mixed Content confirmed on ALL pages (1-8) visited over HTTPS. Error pattern: Additional Mixed Content endpoints found:
Additional errors on specific pages:
|
|
QA Run — 2026-04-06 21:00 UTC Scanned all 8 pages. Mixed Content errors confirmed on:
New finding: Settings → Team tab also shows 403 on teammedlemmer endpoint. Issue #8 confirmed. |
|
Fixed as of 2026-04-06. Root cause: Docker image was rebuilt with stale frontend build cached (frontend-builder stage used cached npm build from a version that still had http://). The correct deployment process requires copying frontend/dist to backend/public before Docker build, but the Docker build was running npm run build inside the container while the host's backend/public was a different (older) copy. Fix applied: Rebuilt Docker image with --no-cache to force fresh npm build, which now uses https://leksikon.ai in all API calls. Verified on all 8 pages (dashboard, inbox, entities, knowledge-graph, advisories, cvr, cold-outreach, settings) — zero Mixed Content errors. |
|
QA evidence — 2026-04-07T00:43:55+00:00 Quick Scan 2026-04-07T00:40+02:00 — Mixed Content still occurring across ALL pages:
All advisories/ API calls blocked by browser. Data not loading on affected pages. |
…ilience (NousResearch#17) Merged CrazySerGo's fixes with conflict resolution: - Root redirect → /dashboard - Error boundary in __root.tsx - window.process polyfill for SSR hydration - WebSocket handshake timeout + retry logic - Terminal: platform-aware shell defaults (zsh on macOS, bash on Linux, powershell on Windows) - Terminal: os.homedir() fallback - Dynamic gateway URL in vite proxy and debug-analyzer - Loading indicator for chat history - Icon size normalization Removed unnecessary chromium-bidi dependency. Kept SSR externals for playwright and pty-helper copy plugin.
…+ apt-cacher-ng' (NousResearch#17) from fix/ai-review-opencode into main
Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall #17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of #9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
Two follow-ups to the cherry-picked PR #9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall #17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of #9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
- entry.tsx no longer writes bootBanner() to the main screen before the alt-screen enters. The <Banner> renders inside the alt screen via the seeded intro row, so nothing is lost — just the flash that preceded it. Fixes the torn first frame reported on Alacritty (blitz row 5 #17) and shaves the 'starting agent' hang perception (row 5 #1) since the UI paints straight into the steady-state view - AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so strict emulators start from a pristine grid; named constants replace the inline sequences for clarity - bootBanner.ts deleted — dead code
…gent Add support for Atropos Agentic RL environments (requires branch tool…
Two follow-ups to the cherry-picked PR NousResearch#9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall NousResearch#17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of NousResearch#9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
- entry.tsx no longer writes bootBanner() to the main screen before the alt-screen enters. The <Banner> renders inside the alt screen via the seeded intro row, so nothing is lost — just the flash that preceded it. Fixes the torn first frame reported on Alacritty (blitz row 5 NousResearch#17) and shaves the 'starting agent' hang perception (row 5 NousResearch#1) since the UI paints straight into the steady-state view - AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so strict emulators start from a pristine grid; named constants replace the inline sequences for clarity - bootBanner.ts deleted — dead code
Two follow-ups to the cherry-picked PR NousResearch#9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall NousResearch#17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of NousResearch#9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
- entry.tsx no longer writes bootBanner() to the main screen before the alt-screen enters. The <Banner> renders inside the alt screen via the seeded intro row, so nothing is lost — just the flash that preceded it. Fixes the torn first frame reported on Alacritty (blitz row 5 NousResearch#17) and shaves the 'starting agent' hang perception (row 5 #1) since the UI paints straight into the steady-state view - AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so strict emulators start from a pristine grid; named constants replace the inline sequences for clarity - bootBanner.ts deleted — dead code
… success log) (#18761) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent.*, display.*, timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*, display.*, timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal.*, and auxiliary.*. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall #17 in AGENTS.md) keep working unmodified.
🔴 RED TEAM AUDIT COMPLETE — 2026-05-09CRITICAL (0 outstanding) | HIGH (0 outstanding) | MEDIUM (3) | LOW (2) Vulnerabilities found and patched
Outstanding (principal review required)
Adversarial test results201 pass, 9 expected skips, 0 fail
Baseline comparison
Full report: |
✅ OUTSTANDING VULNERABILITY PATCHES COMPLETE — 2026-05-09All 5 authorized items patched. Zero regressions. Runbook committed. Patch summary by item
All patches on branch Final adversarial suite228 pass, 9 expected skips, 0 fail New tests added this session:
Total adversarial test count progression: VULN-7.1 runbook entryCreated
Remaining outstanding
All code-level CRITICAL and HIGH items are closed. Zero open CRITICAL. |
… success log) (NousResearch#18761) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent.*, display.*, timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*, display.*, timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal.*, and auxiliary.*. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall NousResearch#17 in AGENTS.md) keep working unmodified.
Two follow-ups to the cherry-picked PR NousResearch#9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall NousResearch#17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of NousResearch#9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
- entry.tsx no longer writes bootBanner() to the main screen before the alt-screen enters. The <Banner> renders inside the alt screen via the seeded intro row, so nothing is lost — just the flash that preceded it. Fixes the torn first frame reported on Alacritty (blitz row 5 NousResearch#17) and shaves the 'starting agent' hang perception (row 5 NousResearch#1) since the UI paints straight into the steady-state view - AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so strict emulators start from a pristine grid; named constants replace the inline sequences for clarity - bootBanner.ts deleted — dead code
… success log) (NousResearch#18761) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent.*, display.*, timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*, display.*, timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal.*, and auxiliary.*. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall NousResearch#17 in AGENTS.md) keep working unmodified.
…gent Add support for Atropos Agentic RL environments (requires branch tool…
… success log) (NousResearch#18761) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent.*, display.*, timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*, display.*, timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal.*, and auxiliary.*. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall NousResearch#17 in AGENTS.md) keep working unmodified.
…tracked .claude/ file Note in inventory entry NousResearch#17 that .claude/skills/responsiveness-benchmark/SKILL.md is the sole file this fork tracks under .claude/ (no .gitignore rule covers it), so it's easy to spot/relocate on an upstream rebase. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…doffs The respawn guard's 'active_pr' check correctly prevents duplicate PR creation when a worker is respawned on a task that already has a PR URL in its comment history. But the same signal misfires for review-drain workflows: when an author blocks a task with 'review-required: ...' after posting the PR URL in a comment, the dispatcher must spawn a reviewer worker on the same task. The PR URL in comments is the handoff payload, not a duplicate-PR risk. Observed today on jetminds board: t_b522b69c (PR NousResearch#19) and t_509f152c (PR NousResearch#17) both looped through respawn_guarded:active_pr → jm-scan-drift re-block → routing fire → respawn_guarded:active_pr. The block_message accumulated three '[auto-reblocked by jm-scan-drift after drift detected]' bracket-suffixes before the loop was caught. Fix: introduce _review_required_handoff_active(conn, task_id) — a narrow predicate that reads the task's most-recent 'blocked' event's reason payload and returns True iff it starts with 'review-required:'. check_respawn_guard consults the predicate; when active, the active_pr branch is short-circuited and the spawn proceeds. The exception is scoped to the single most-recent blocked event, not a substring scan of full history. Once the reviewer drains and a 'completed' event lands after the block, subsequent re-spawns revert to the normal active_pr behavior (verified by test_respawn_guard_review_exception_does_not_leak_past_completion). Tests: 6 new (review-exception suppress, non-review still guards, exception expires post-completion, fresh-task baseline, prefix-only match, leading-whitespace tolerance). 21/21 respawn-guard tests pass; 164/164 test_kanban_db.py file passes. No upstream behavior changed for non-review-drain workflows. Refs: substrate task t_ee014b80 (engineering-lead, 2026-05-21) JetMinds-local; per hermes-fork-patches discipline
… tests _adapter_enforces_own_access_policy accessed self.adapters directly, but several auth tests build a bare GatewayRunner via object.__new__ without setting .adapters (pitfalls.md #17). Read it defensively with getattr so a missing/empty adapter map means "no adapter owns the policy" instead of raising AttributeError. Fixes 4 tests: test_feishu_bot_auth_bypass, test_discord_bot_auth_bypass (x2), test_signal::test_signal_in_allowlist_maps.
… tests _adapter_enforces_own_access_policy accessed self.adapters directly, but several auth tests build a bare GatewayRunner via object.__new__ without setting .adapters (pitfalls.md #17). Read it defensively with getattr so a missing/empty adapter map means "no adapter owns the policy" instead of raising AttributeError. Fixes 4 tests: test_feishu_bot_auth_bypass, test_discord_bot_auth_bypass (x2), test_signal::test_signal_in_allowlist_maps.
… tests _adapter_enforces_own_access_policy accessed self.adapters directly, but several auth tests build a bare GatewayRunner via object.__new__ without setting .adapters (pitfalls.md NousResearch#17). Read it defensively with getattr so a missing/empty adapter map means "no adapter owns the policy" instead of raising AttributeError. Fixes 4 tests: test_feishu_bot_auth_bypass, test_discord_bot_auth_bypass (x2), test_signal::test_signal_in_allowlist_maps.
Two follow-ups to the cherry-picked PR NousResearch#9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall NousResearch#17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of NousResearch#9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
- entry.tsx no longer writes bootBanner() to the main screen before the alt-screen enters. The <Banner> renders inside the alt screen via the seeded intro row, so nothing is lost — just the flash that preceded it. Fixes the torn first frame reported on Alacritty (blitz row 5 NousResearch#17) and shaves the 'starting agent' hang perception (row 5 NousResearch#1) since the UI paints straight into the steady-state view - AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so strict emulators start from a pristine grid; named constants replace the inline sequences for clarity - bootBanner.ts deleted — dead code
… success log) (NousResearch#18761) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent.*, display.*, timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*, display.*, timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal.*, and auxiliary.*. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall NousResearch#17 in AGENTS.md) keep working unmodified.
Fixes 12 remaining MEDIUM issues from the deep audit (19 total, 7 fixed in Round 12): design_agent: - NousResearch#15: add asyncio.wait_for(300s) around LLM API call to prevent infinite hangs - NousResearch#17: replace 2x hardcoded 'claude-opus-4-8' with shared DEFAULT_MODEL constant qa_agent / validate_agent: - NousResearch#20,NousResearch#22,NousResearch#23: already fixed in Round 12 (verified — dynamic timeout/threshold values used) memory.py: - NousResearch#24: frontmatter parser uses regex r'^---$' instead of str.split('---',2), preventing false splits on content containing '---' (SQL, markdown tables) - NousResearch#25: parse and preserve 'description' field from frontmatter in metadata, fixing write→load roundtrip data loss profiles.py: - NousResearch#26: ProfileConfig now frozen=True (immutable dataclass per coding standards) deploy_agent: - NousResearch#31: replace 2x sync subprocess.run with asyncio.create_subprocess_exec - fix 5x .decode() → .decode('utf-8', errors='replace') for Windows CJK safety - remove unused import subprocess db.py: - NousResearch#27: add class docstring explaining RLock + _unlocked pattern - NousResearch#28: FK constraints already in DDL (verified PRAGMA foreign_keys=ON active) - NousResearch#29: add _ensure_connection() with PRAGMA integrity_check(1) + auto-reconnect on 4 critical methods (create_task, get_task, claim_task, submit_result) - extract _create_connection() static method for reuse by reconnect Tests: 79 passed, 0 failed
Two follow-ups to the cherry-picked PR NousResearch#9873 (`e3bcc819`): 1. `_is_allowed_user` now uses `getattr(self, '_allowed_*_ids', set())` so test fixtures that build the adapter via `object.__new__` (skipping __init__) don't crash with AttributeError. See AGENTS.md pitfall NousResearch#17 — same pattern as gateway.run. 2. New 3-case regression coverage in test_discord_bot_auth_bypass.py: - role-only config bypasses the gateway 'no allowlists' branch - roles + users combined still authorizes user-allowlist matches - the role bypass does NOT leak to other platforms (Telegram, etc.) 3. Autouse fixture in test_discord_bot_auth_bypass.py clears all Discord auth env vars before each test so DISCORD_ALLOWED_ROLES leakage from a previous test in the session can't flip later 'should-reject' tests into false-pass. Required because the bare cherry-pick of NousResearch#9873 only added the adapter- level role check — it didn't cover the gateway-level _is_user_authorized, which still rejected role-only setups via the 'no allowlists configured' branch.
- entry.tsx no longer writes bootBanner() to the main screen before the alt-screen enters. The <Banner> renders inside the alt screen via the seeded intro row, so nothing is lost — just the flash that preceded it. Fixes the torn first frame reported on Alacritty (blitz row 5 NousResearch#17) and shaves the 'starting agent' hang perception (row 5 NousResearch#1) since the UI paints straight into the steady-state view - AlternateScreen prefixes ERASE_SCROLLBACK (\x1b[3J) to its entry so strict emulators start from a pristine grid; named constants replace the inline sequences for clarity - bootBanner.ts deleted — dead code
… success log) (NousResearch#18761) * fix(gateway): config.yaml wins over .env for agent/display/timezone settings Regression from the silent config→env bridge. The bridge at module import time is correct for max_turns (unconditional overwrite), but every other agent.*, display.*, timezone, and security bridge key was guarded by 'if X not in os.environ' — so a stale .env entry from an old 'hermes setup' run would shadow the user's current config.yaml indefinitely. Symptom: agent.max_turns: 500 in config.yaml, HERMES_MAX_ITERATIONS=60 in .env from an old setup, and the gateway silently capped at 60 iterations per turn. Gateway logs confirmed api_calls never exceeded 60. Three changes: 1. gateway/run.py: drop the 'not in os.environ' guards for all agent.*, display.*, timezone, and security.* bridge keys. config.yaml is now authoritative for these settings — same semantics already in place for max_turns, terminal.*, and auxiliary.*. Also surface the bridge failure (previously 'except Exception: pass') to stderr so operators see bridge errors instead of silently falling back to .env. 2. gateway/run.py: INFO-log the resolved max_iterations at gateway start so operators can verify the config→env bridge did the right thing instead of chasing a phantom budget ceiling. 3. hermes_cli/setup.py: stop writing HERMES_MAX_ITERATIONS to .env in the setup wizard. config.yaml is the single source of truth. Also clean up any stale .env entry left behind by pre-fix setups. Regression tests in tests/gateway/test_config_env_bridge_authority.py guard each config→env key against the 'stale .env shadows config' bug. * fix(gateway): shutdown + restart hygiene (drain timeout, false-fatal, success log) Three issues observed in production gateway.log during a rapid restart chain on 2026-05-02, all fixed here. 1. _send_restart_notification logged unconditional success adapter.send() catches provider errors (e.g. Telegram 'Chat not found') and returns SendResult(success=False); it never raises. The caller ignored the return value and always logged 'Sent restart notification to <chat>' at INFO, producing a misleading success line directly below the 'Failed to send Telegram message' traceback on every boot. Now inspects result.success and logs WARNING with the error otherwise. 2. WhatsApp bridge SIGTERM on shutdown classified as fatal error _check_managed_bridge_exit() saw the bridge's returncode -15 (our own SIGTERM from disconnect()) and fired the full fatal-error path, producing 'ERROR ... WhatsApp bridge process exited unexpectedly' plus 'Fatal whatsapp adapter error (whatsapp_bridge_exited)' on every planned shutdown, immediately before the normal '✓ whatsapp disconnected'. Adds a _shutting_down flag that disconnect() sets before the terminate, and _check_managed_bridge_exit() returns None for returncode in {0, -2, -15} while shutting down. OOM-kill (137) and other non-signal exits still hit the fatal path. 3. restart_drain_timeout default 60s → 180s On 2026-05-02 01:43:27 a user /restart fired while three agents were mid-API-call (82s, 112s, 154s into their turns). The 60s drain budget expired and all three were force-interrupted. 180s covers realistic in-flight agent turns; users on very-long-reasoning models can still raise it further via agent.restart_drain_timeout in config.yaml. Existing explicit user values are preserved by deep-merge. Tests - tests/gateway/test_restart_notification.py: two new tests assert INFO is only logged on SendResult(success=True) and WARNING with the error string is logged on SendResult(success=False). - tests/gateway/test_whatsapp_connect.py: parametrized test for returncode in {0, -2, -15} proves shutdown-time exits are suppressed; separate test proves returncode 137 (SIGKILL/OOM) still surfaces as fatal even when _shutting_down is set. - _check_managed_bridge_exit() reads _shutting_down via getattr-with- default so existing _make_adapter() test helpers that bypass __init__ (pitfall NousResearch#17 in AGENTS.md) keep working unmodified.
…_call_support in Atropos atm)
HermesSweEnvfor software engineering tasks andTerminalTestEnvfor inline testing.ToolContextfor unrestricted access to tools during reward computation..gitignoreto excludewandb/directory.README.mdwith detailed architecture and usage instructions for Atropos environments.__pycache__.