Skip to content

feat(gateway): user permission tiers — role-based access control for messaging platforms#4443

Closed
ReqX wants to merge 11 commits into
NousResearch:mainfrom
ReqX:user-permission-tiers-v2
Closed

feat(gateway): user permission tiers — role-based access control for messaging platforms#4443
ReqX wants to merge 11 commits into
NousResearch:mainfrom
ReqX:user-permission-tiers-v2

Conversation

@ReqX

@ReqX ReqX commented Apr 1, 2026

Copy link
Copy Markdown
Contributor

What does this PR do?

Introduces opt-in permission tiers for the Hermes messaging gateway. Operators can define named tiers (Owner → Admin → User → Guest) with granular control over tool access, slash commands, exec approvals, rate limiting, time windows, platform role mapping, audit logging, and persistent usage tracking.

When no permission_tiers config exists, behavior is unchanged — fully backward compatible.

Replaces #3156 (v1 — that PR lacked security hardening, auto-tier, composite keys, and documentation; this is a complete rewrite).

Related Issue

Fixes #527

Type of Change

  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix

Changes Made

New files (5):

  • gateway/permissions.pyPermissionManager, runtime user store (SQLite), rate limiting, i18n, platform role mapping, promote request store (+1,400 lines)
  • gateway/audit.pyAuditLog + NullAuditLog (append-only SQLite with rotation)
  • gateway/usage.pyUsageStore + NullUsageStore (persistent token tracking with windowed aggregation)
  • docs/feature-user-permissions.md — full feature documentation
  • tests/gateway/test_permission_tiers.py448 tests

Modified files (13):

  • gateway/config.pyPermissionTiersConfig, TierDefinition, UserTierConfig, built-in presets, @group tool expansion, per-command allowlists, per-user tool overrides
  • gateway/run.py — tier resolution, tool filtering, command gating, time restrictions, context injection, /promote + /audit handlers, owner escalation guard, security fixes
  • gateway/session.pyuser_roles field on SessionSource
  • gateway/platforms/telegram.pygetChatMember() with TTL cache for role detection
  • gateway/platforms/discord.py — role + permission flag extraction at all build_source sites
  • gateway/platforms/base.pybuild_source() accepts user_roles
  • run_agent.pyallowed_tool_names enforcement at dispatch
  • tools/delegate_tool.py — child agents inherit allowed_tool_names from parent
  • model_tools.py — MCP tool filtering in tool definitions
  • hermes_cli/commands.pyadmin_only and owner_only flags, promote and audit command defs
  • cli-config.yaml.example — permission tiers config example
  • tests/gateway/test_session_race_guard.py — adapted for permission-tiered approval keys

How to Test

  1. Add permission tiers config:
cat >> ~/.hermes/config.yaml << 'EOF'
permission_tiers:
  default_tier: guest
  tiers:
    admin:
      allowed_toolsets: ["*"]
      allow_exec: true
      allow_admin_commands: true
    guest:
      allowed_tools: ["@safe"]
      allow_exec: false
      allow_admin_commands: false
      requests_per_hour: 10
  users:
    "telegram:YOUR_USER_ID": admin
  audit:
    enabled: true
EOF
  1. Run the test suite:
pytest tests/gateway/test_permission_tiers.py -v   # 448 tests
  1. Test manually with a live gateway:
hermes gateway / hermes gateway start
# Then: /whoami, try restricted commands as guest user
# Try: /promote admin as a non-admin → blocked
# Try: /users set <guest_id> admin as admin → works

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/gateway/test_permission_tiers.py -q — 448/448 pass
  • I've run pytest tests/gateway/ -q — 2232 passed, 10 failed (all pre-existing on main — see Pre-existing test failures), 38 skipped (optional deps)
  • I've added tests for my changes (448 tests in test_permission_tiers.py)
  • I've tested on my platform: Ubuntu (via CI and local)

Documentation & Housekeeping

  • I've updated relevant documentation (docs/feature-user-permissions.md, docstrings)
  • I've updated cli-config.yaml.example — added permission_tiers config example
  • I've updated AGENTS.md — added permission tier architecture, tool registration patterns, testing notes
  • I've considered cross-platform impact — SQLite-backed stores work on all platforms, no OS-specific code
  • I've updated tool descriptions/schemas — new tools registered with proper schemas in registry

Security

Five rounds of security audit (including a multi-LLM council review with Claude Sonnet/Opus-4.6, Gemini-Pro-3.1-pro, GPT-5.4, GLM-5.1). All findings resolved:

Finding Severity Status
/promote lacked admin gate — any user could self-approve HIGH ✅ Fixed — admin gate inside handler
MCP tool stripping failed open on exception MEDIUM ✅ Fixed — fails closed with frozenset()
/audit N accepted unbounded query limit MEDIUM ✅ Fixed — clamped to 100
Rate counter dict grew without cleanup LOW ✅ Fixed — periodic cleanup scheduled
Admins could grant owner tier MEDIUM ✅ Fixed — owner escalation guard
/promote approve/deny TypeError at runtime HIGH ✅ Fixed — corrected parameter names
Denial messages unhelpful LOW ✅ Fixed — suggest alternatives, /whoami, next steps

Key enforcement layers: tool filtering at agent creation, dispatch-level rejection, delegate propagation to child agents, composite keys for cross-platform isolation, owner-only command gate, owner-only tier grants. All enforcement fails closed.

Issue #527 Coverage

Requirement Status
PermissionManager + tier resolution
Config format (tiers, users, default_tier)
Tool filtering at agent creation
Command gating (admin_only + owner_only)
Per-command allowlists
Per-user tool overrides
Session context injection
Backward compatible (no config = no change)
Default tier for unassigned users
Owner auto-detected from ALLOWED_USERS
/users command family
Per-tier rate limiting
Persistent rate limiting across restarts
Persistent user-role storage
Per-user usage tracking
Telegram group admin → tier mapping
Discord role → tier mapping
Audit logging
Guest tier "useful not crippled" preset
MCP tool default-deny below admin
/promote tier request flow
Smart denial messages
Owner escalation guard (only owners grant owner)
Mode system integration (#476) ⬌ Future

Deviations from Issue #527 spec + opinionated answers

Spec / Opinionated Answer Implementation Rationale
User tier: @web + @read only Broader preset with @media + @skills + @memory + @planning + @clarify Issue spec was a minimum baseline; operators can match it exactly by defining a custom tier — see docs/feature-user-permissions.md
Admin tier: @web + @read + @code Nearly everything except @all wildcard Trusted role; operators can define a stricter custom tier — see docs/feature-user-permissions.md
Memory: read-only for User tier Full read+write via @memory group Splitting memory into read/write needs a larger refactor. Workaround: list memory_search without memory_write in allowed_tools
Tool-call interception at dispatch Schema stripping before agent runs Strictly more secure — agent physically cannot call restricted tools. Friendly guidance via system prompt instead


PS: Pre-existing test failures

The gateway test suite has 10 failures and 7 collection errors that exist on main — none are introduced by this PR. Verified by running the same tests against main — all 10 failures reproduce identically.10 runtime failures (pre-existing):

Test Area Root cause
test_no_env_vars_no_injection Auto-tier env vars Test isolation — env leakage
test_pairing_no_store_no_error Pairing store Mock setup mismatch
test_pairing_store_exception_handled Pairing store Mock setup mismatch
test_session_hygiene_messages_stay_in_originating_topic Session hygiene Topic routing
test_signal_not_loaded_without_both_vars Signal config Env var leakage between tests
test_rate_limited_user_gets_no_response Unauthorized DM Rate limit timing
test_rejection_message_records_rate_limit Unauthorized DM Rate limit timing
test_managed_install_returns_package_manager_guidance Update command Package manager detection
test_spawns_setsid Update command setsid availability
test_fallback_when_no_setsid Update command setsid availability

7 collection errors (pre-existing):

  • tests/acp/ (5 files) — missing acp package (optional dependency)
  • tests/tools/test_mcp_tool.py — missing mcp package (optional dependency)
  • tests/hermes_cli/test_commands.py — stale import (_TG_NAME_LIMIT removed in upstream commit not yet in this branch)

38 skipped: All optional dependency guards — lark-oapi (Feishu), PyNaCl (voice), matrix-nio. Normal and expected.

This PR's results: tests/gateway/test_permission_tiers.py448/448 pass, 0 failures, 0 skipped.

@ReqX ReqX mentioned this pull request Apr 1, 2026
10 tasks
@ReqX ReqX force-pushed the user-permission-tiers-v2 branch from a29ccf5 to 4b12cd1 Compare April 1, 2026 12:06
ReqX added 11 commits April 1, 2026 12:14
Add TOOL_GROUPS dictionary with 16 groups (@web, @READ, @code, @safe, etc.)
and recursive _expand_tool_groups() for @-prefixed group references in config.

Add BUILTIN_TIER_PRESETS (owner, admin, user, guest) with tool lists, exec
flags, rate/time limits, and admin command gating.

Add PermissionTiersConfig dataclass with full config schema validation,
auto-tier support (env_auto_tier), and safe defaults.

Add admin_only and owner_only fields to CommandDef. Mark sethome, provider,
personality as admin_only; update, shutdown, restart as owner_only.

Reformat CommandDef fields and COMMAND_REGISTRY entries for readability.
…ores

New gateway/permissions.py with:

- RuntimeUserStore: SQLite-backed runtime user database (composite keys,
  last_seen tracking, platform-agnostic user_id)
- PromoteRequestStore: SQLite-backed promotion request tracking with expiry
- PermissionManager: central permission resolution engine
  - Tier resolution: config → env overrides → auto-tier fallback
  - Tool filtering: tier allowed_tools + per-user overrides + exec stripping
  - Rate limiting: sliding window with per-tier requests_per_hour
  - Time restrictions: allowed_hours with timezone support (fail-closed)
  - MCP default-deny: MCP tools blocked unless explicitly in allowed_tools
  - Command gating: admin_only / owner_only / allowed_commands checks
  - Usage tracking integration for quota enforcement

All security decisions are fail-closed: missing config = no access.
Wire PermissionManager into the gateway message loop:

- gateway/run.py: gate every incoming message through tier resolution,
  time restriction checks, rate limiting, and command allowlist before
  agent invocation. Filter tool schemas per-user before passing to AIAgent.
  Add /promote, /demote, /users, /usage permission slash commands.
  Add admin broadcast support.

- model_tools.py: pass platform source context through to delegate tool
  for permission-aware subagent execution.

- run_agent.py: use @functools.cached_property for _permissions to avoid
  repeated PermissionManager instantiation. Wire permission context into
  agent initialization.

- tools/delegate_tool.py: inherit parent permission context in subagents.
  Strip disallowed tools from child agent toolsets.
Comprehensive test suite covering:

- Config schema: presets, tool groups, PermissionTiersConfig validation,
  env_auto_tier parsing, safe defaults
- Tier resolution: config lookup, env overrides, auto-tier fallback,
  runtime overlays, owner-tier hierarchy
- Tool filtering: group expansion, per-user overrides, exec stripping,
  MCP default-deny, composite group refs
- Enforcement: time restrictions, rate limiting (sliding window),
  command gating (admin/owner/allowlist)
- Commands: /promote, /demote, /users, /usage message formatting
- Stores: RuntimeUserStore, PromoteRequestStore SQLite operations

Also fix formatting in test_session_race_guard.py (trailing commas,
context manager syntax).
Add feature-user-permissions.md with architecture overview, configuration
reference, tier presets, tool groups, security model, and per-platform
setup guides (Telegram, Discord, Slack, Signal, WhatsApp, Home Assistant).

Add permission_tiers example to cli-config.yaml.example.

Add .audit/ and .archive/ to .gitignore (local-only tracking dirs).
…promote flow, and MCP default-deny

Phase 3 enhancements to permission tiers:

- Telegram group admin/creator → tier mapping via getChatMember() with 5-min TTL cache
- Discord role name + permission flags → tier mapping
- Append-only audit log (SQLite) with rotation and /audit admin command
- Persistent usage tracking (SQLite) with windowed aggregation and token counting
- /promote request flow: users request tier promotion, admins approve/deny
- MCP tool default-deny below admin tier (schema stripping defense-in-depth)
- Guest preset upgraded from @clarify to @safe tools
- Smart denial messages with helpful alternatives and /whoami references
- functools.cached_property fix for _permissions (was creating new manager per access)
- Unused import cleanup (json, Set, Platform, get_hermes_home)
…e, and MCP deny

- TestAuditLog: 17 tests (append, rotation, cleanup, null object)
- TestNullAuditLog: 4 tests
- TestAuditLogIntegration: 6 tests (runner wiring)
- TestUsageStore: 15 tests (record, query, cleanup, windows)
- TestNullUsageStore: 5 tests
- TestUsageStoreIntegration: 5 tests (runner wiring)
- TestPlatformRoleMapping: 14 tests (Telegram/Discord role resolution)
- TestSmartDenialMessages: 8 tests (fallback messages, behavioral guidance)
- TestPromoteRequestStore: 15 tests (create, approve, deny, expire)
- TestIsElevatedTier: 15 tests (MCP access detection)
- Updated 3 existing tests for guest preset and MCP tool group changes
- Total: 304 → 408 tests, all passing
…nfig model)

T1 — Per-user tool overrides:
- Add allowed_tools + resolved_tools_override to UserTierConfig
- @group expansion in from_dict() via _expand_tool_groups()
- whoami() returns user_tool_override when per-user tools configured

T2 — Per-command allowlists:
- Add allowed_commands (Optional[FrozenSet[str]]) to TierDefinition
- Parsing in from_dict(), serialization in to_dict()
- None = no restriction (backward compatible)
Audit fixes:
- H-1: Add admin gate in /promote handler for list/approve/deny subcommands
  (was: any user could self-approve privilege escalation requests)
- M-1: MCP tool stripping now fails closed with frozenset() instead of pass
  (was: exception during MCP stripping left all tools available)
- M-2: Clamp /audit query limit to 100 max
  (was: unbounded integer accepted from user input)
- L-3: Schedule rate counter cleanup every 10 cron ticks (~10 min)
  (was: cleanup_rate_counters() existed but was never called)

Feature wiring (T1/T2):
- T1: Per-user tool override in _run_agent() — intersects tier tools with
  user-level override (restrictive only, never expands access)
- T1: Display user tool override in /whoami output
- T2: Per-command allowlist dispatch gate after admin-only gate, before
  owner-only gate — when allowed_commands is set, only listed commands
  are permitted; None preserves backward-compatible binary gating
Security audit fix tests:
- TestH1PromoteAdminGate (3): admin gate blocks list/approve/deny for
  non-admin, allows /promote <tier> request creation
- TestPromoteAdminGateExtended (2): cross-tier escalation prevention
- TestM2AuditLimitClamp (3): /audit N clamped to 100 for all variants

T1 — Per-user tool override tests:
- TestUserTierConfigToolOverride (6): config parsing, @group expansion,
  invalid input handling, to_dict serialization
- TestPerUserToolOverrideResolution (6): tool resolution intersection,
  wildcard tier behavior, user override cannot expand access

T2 — Per-command allowlist tests:
- TestTierDefinitionCommandAllowlist (5): config parsing, normalization,
  to_dict serialization, invalid input handling
- TestCommandAllowlistGating (6): dispatch gate logic, non-admin commands
  gated, audit logging, backward compatibility when unset
…d promote param fix

- Only owners can grant the owner tier via /users set or /promote approve.
  Non-owners (including admins) are rejected with a clear message.
  Config.yaml remains the bootstrap mechanism for the first owner.
- Improved all permission denial fallback messages to suggest alternatives
  (timezone in time messages, reset countdown in rate messages, /whoami refs).
- Fixed TypeError in /promote approve|deny: parameter names were
  approved_by/denied_by but PromoteRequestStore expects resolved_by.
- 9 new tests in TestOwnerEscalationGuard (448 total).
@ReqX ReqX force-pushed the user-permission-tiers-v2 branch from 4b12cd1 to 8fa7a3c Compare April 1, 2026 12:52
@alt-glitch alt-glitch added type/feature New feature or request comp/gateway Gateway runner, session dispatch, delivery platform/telegram Telegram bot adapter platform/discord Discord bot adapter type/security Security vulnerability or hardening P3 Low — cosmetic, nice to have area/auth Authentication, OAuth, credential pools and removed type/security Security vulnerability or hardening labels May 1, 2026
teknium1 added a commit that referenced this pull request May 10, 2026
…age of #4443) (#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR #4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
@teknium1

Copy link
Copy Markdown
Contributor

Hey @ReqX — closing this in favor of #23373 which just merged into main.

We built off your design but scoped down to slash commands only for v1. The slimmer landing point keeps this surface minimal so we can re-expand into the rest of your tier work (tool filtering, audit log, usage tracking, rate limiting, /promote flow, MCP default-deny) as separate PRs when concrete need emerges.

Your authorship is preserved via Co-authored-by: trailer on the merge commit, and you're added to the AUTHOR_MAP in scripts/release.py.

The shape that landed:

  • 2 lists per scope (allow_admin_from + user_allowed_commands) instead of 4 tiers
  • Gate at the gateway dispatch site uses is_gateway_known_command() so plugin-registered slash commands are covered through the live registry
  • Backward compat: gating only kicks in when allow_admin_from is set
  • DM and group are independent scopes (mirroring existing allow_from / group_allow_from)
  • /whoami shows tier + runnable commands

Six weeks of work and a multi-LLM security audit went into the original design; appreciated. Thanks for the contribution.

Merged: #23373

@teknium1 teknium1 closed this May 10, 2026
@ReqX

ReqX commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

Honestly happy you found parts useful, keep it up big <3

JZKK720 pushed a commit to JZKK720/hermes-agent that referenced this pull request May 11, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
rmulligan pushed a commit to rmulligan/hermes-agent that referenced this pull request May 11, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
JinyuID pushed a commit to JinyuID/hermes-agent that referenced this pull request May 11, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
dusterbloom pushed a commit to dusterbloom/hermes-agent that referenced this pull request May 12, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
bot-ted added a commit to bot-ted/hermes-agent that referenced this pull request May 13, 2026
* feat(stream-retry): add upstream + timing diagnostics to drop log (#23005)

The previous PR (#22993) gave us a structured WARNING per stream drop
but the only diagnostic was 'error_type=APIError error=Network
connection lost.' — same nothing the user started with. To actually
diagnose why subagents drop streams disproportionately we need to know
WHERE the drop happened.

Adds three breadcrumbs to the agent.log WARNING:

1. Inner exception chain. openai SDK wraps httpx errors as
   APIConnectionError / APIError so the catch site only sees the
   wrapper. _flatten_exception_chain walks __cause__/__context__ up to
   4 levels deep and renders 'Outer(msg) <- Inner(msg)' so we can
   tell ConnectError vs RemoteProtocolError vs ReadError vs
   ProxyError without enabling verbose mode.

2. Upstream HTTP headers. Snapshots cf-ray, x-openrouter-provider,
   x-openrouter-model, x-openrouter-id, x-request-id, server, via,
   etc. from stream.response immediately after open (so they survive
   even when the stream dies before the first chunk). These answer
   'is one CF edge / one downstream provider responsible, or random?'

3. Per-attempt counters. bytes streamed, chunk count, elapsed time on
   the dying attempt, and time-to-first-byte. Distinguishes 'couldn't
   connect at all' (0s, 0 bytes) from 'died after 30s mid-stream'
   (very different root causes — first is auth/routing, second is
   upstream idle-kill or proxy timeout).

Plumbing:

- _stream_diag_init / _stream_diag_capture_response live on AIAgent
  and produce a per-attempt dict held on request_client_holder['diag']
  for closure access from the retry block.
- _call_chat_completions and _call_anthropic both initialize the diag
  and increment counters per chunk/event (best-effort, never raises in
  the streaming hot path).
- _log_stream_retry / _emit_stream_drop accept an optional diag and
  render the new fields. Final-exhaustion log goes through the same
  helper so it gets the same diagnostic dump.
- UI status line gains a brief 'after Xs' suffix when timing is
  available — distinguishes 'connect failed' from 'died mid-stream'
  at a glance without grepping logs.

Sample WARNING after this change:

  Stream drop mid tool-call on attempt 2/3 — retrying.
    subagent_id=sa-2-cafef00d depth=1 provider=openrouter
    base_url=https://openrouter.ai/api/v1
    error_type=APIError error=Connection error.
    chain=APIError(Connection error.) <- RemoteProtocolError(peer
      closed connection without sending complete message body)
    http_status=200 bytes=12400 chunks=47 elapsed=12.00s ttfb=0.83s
    upstream=[cf-ray=8f1a2b3c4d5e6f7g-LAX
      x-openrouter-provider=Anthropic
      x-openrouter-id=gen-abc123 server=cloudflare]

Tests: 10 covering diag init, header capture (whitelist enforced for
PII), exception-chain walking + depth cap, log content with full diag,
log content without diag (placeholders), UI elapsed-suffix on/off.

* fix(review): tell background reviewer not to capture transient env failures as skills (#23004)

Closes #6051.

Reported failure mode: agent migrated to WSL2, browser launch failed
because Playwright wasn't installed yet. Background reviewer captured
the failure as a durable skill (`browser-tool-launch-issue`) and the
agent kept refusing the browser tool for weeks after Playwright was
installed and verified working. Negative claims also propagated into
unrelated skills ("browser tools do not work", "cannot use Y from
execute_code").

Root cause: `_SKILL_REVIEW_PROMPT` and `_COMBINED_REVIEW_PROMPT` both
lean hard on "be active, save things, a pass that does nothing is a
missed learning opportunity." Neither distinguished durable knowledge
from transient environment state. The reviewer was doing what it was
told.

Fix at the write site — both prompts now carry a "Do NOT capture"
section calling out:
  • Environment-dependent failures (missing binaries, fresh-install
    errors, post-migration path mismatches, 'command not found',
    unconfigured credentials, uninstalled packages)
  • Negative claims about tools or features ("X does not work")
    that harden into self-cited refusals
  • Session-specific transient errors that resolved before the
    conversation ended
  • One-off task narratives ("summarize today's market", "analyze
    this PR") — also addresses the #12812 / #4538 family

Plus a positive-reframing line: when a tool fails because of setup
state, capture the FIX (install command, config step, env var)
under an existing setup/troubleshooting skill — never "this tool
doesn't work" as a standalone constraint.

Targeted tests: 24/24 passing in tests/run_agent/test_review_prompt_class_first.py
(2 new + all existing review-prompt assertions). Substring-based
checks so future prompt edits don't false-fail.

* feat(codex): add gpt-5.3-codex-spark model

* fix(model-metadata): restore gpt-5.3-codex-spark fallback context

* fix(model-metadata): set codex-spark fallback context to 128k

* fix: surface Codex CLI-only models

* chore: add codex-spark salvage contributors to AUTHOR_MAP

Maps olegwn@gmail.com → nederev (PR #18286) and vesper@askclaw.dev →
askclaw-vesper (PR #19530) so the contributor attribution check passes
when their commits land via this salvage.

* docs(codex-spark): document ChatGPT Pro entitlement gating

PR #12994 stripped gpt-5.3-codex-spark on the assumption that it was
unsupported. It's actually research-preview, ChatGPT-Pro-only, exposed
via the Codex OAuth backend at chatgpt.com/backend-api/codex/models —
not via the public OpenAI API.

Add explanatory comments in:
  - DEFAULT_CODEX_MODELS / _FORWARD_COMPAT_TEMPLATE_MODELS (codex_models.py)
  - _CODEX_OAUTH_CONTEXT_FALLBACK (model_metadata.py)
  - list_authenticated_providers' live-discovery branch (model_switch.py)

so future maintainers don't strip the entry again. Also documents the
intentional asymmetry that Spark stays out of the "openai" provider
catalog (it isn't on the public API) and why the supported_in_api
filter is *not* applied for the openai-codex route.

* test(codex-spark): add live-API regression and make picker test deterministic

Two follow-ups from self-review:

1. Add unit test for _fetch_models_from_api covering the live HTTP path.
   The salvaged PR #19530 dropped the supported_in_api:false filter in
   both _fetch_models_from_api and _read_cache_models, but only the
   cache path had a regression test. This adds the symmetric live-fetch
   test (mocked httpx) so a future drive-by change to the HTTP path
   can't silently re-introduce the filter.

2. Pin test_codex_picker_uses_live_codex_catalog to the cache fallback.
   The test wrote a fake JWT and a CODEX_HOME cache, but provider_model_ids
   ('openai-codex') still issued a real 10s HTTP probe to
   chatgpt.com/backend-api/codex/models before falling back to the cache.
   That made the test slow and non-deterministic in restricted/CI
   networks. Patch _fetch_models_from_api to return [] so we go straight
   to the cache path the test actually means to exercise.

* fix(codex-spark): defensive 128k entry in DEFAULT_CONTEXT_LENGTHS + clarify validation test docstring

Two follow-ups from self-review:

1. Add gpt-5.3-codex-spark to DEFAULT_CONTEXT_LENGTHS at 128k. The
   primary resolution path for Spark goes through provider='openai-codex'
   → _CODEX_OAUTH_CONTEXT_FALLBACK (already correct). But if any future
   code path resolves Spark's context with a different provider (custom
   proxy, generic fallthrough), the longest-substring-first lookup in
   step 8 would match 'gpt-5' and report 400k, which is wrong by ~3x.
   Adding the explicit override is a cheap defensive correctness fix
   matching how gpt-5.4-mini and gpt-5.4-nano already shadow the generic
   gpt-5 entry.

2. Update test_openai_codex_model_validation_fallback.py docstring. The
   bug it was originally written for (gpt-5.3-codex-spark missing from
   listing) is now resolved by this PR's catalog restoration. The test
   still validly exercises the soft-accept code path for any future
   entitlement-gated Codex slug that ships before Hermes catalogs it,
   but the framing was stale — clarified.

* feat(kanban): add orchestrator board tools

* fix(kanban): parse include_archived explicitly

* fix(kanban): parse triage flag explicitly

* fix(kanban): restrict board routing tools to orchestrators

Adapted from PR #20568 commit ce3518578 (Eric Litovsky / @kallidean).
Adds two-tier gating for the kanban tool surface so dispatcher-spawned
workers see only task-lifecycle tools (show/complete/block/heartbeat/
comment/create/link) while orchestrator profiles with `toolsets: [kanban]`
also see board-routing tools (kanban_list, kanban_unblock).

Workers shouldn't be enumerating or unblocking the board — they should
close their own task via the lifecycle tools. Hiding board-routing tools
from worker schemas keeps the worker focused and the toolset-isolation
contract honest.

Plus inherited from the same upstream commit:
- 50/200 row bound on kanban_list with `truncated` + `next_limit` metadata.
- Belt-and-suspenders runtime guard `_require_orchestrator_tool()` inside
  the orchestrator handlers in case a stale registration ever routes a
  worker to one of them.
- Tests for the new gate, the stricter bound, and the fact that even a
  worker with `toolsets: [kanban]` in config still doesn't see board
  routing.

Co-authored-by: Eric Litovsky <elitovsky@zenproject.net>

* chore: AUTHOR_MAP entry for kallidean (#20568)

* docs(user-stories): add 4 entries from @emmagine79 thread (#23204)

Captain Awesome's May 10 thread on hermes + Discord with GPT-5.5 / DeepSeek v4:
- life-changing umbrella tweet
- Google-me -> SSH-deploy landing page to VPS
- cron jobs triaging tech news into Discord channels by urgency
- PM paperclip agent running morning + evening standups for ADHD

* docs(web-search): explain auxiliary-model summarization for web_extract (#23211)

web_extract runs returned page content through the web_extract auxiliary
model when pages exceed 5 000 chars (single-pass up to 500k, chunked up
to 2M, refused above that). The user-guide page didn't mention this —
users were surprised that long-page extracts produced summaries instead
of raw markdown, and that those summaries cost main-model tokens by
default.

Adds:
- size-driven behavior table (under 5k / 5k–500k / 500k–2M / over 2M)
- which auxiliary task does the work (auxiliary.web_extract)
- how to route summaries to a cheap model regardless of main
- escape hatch: browser_navigate when you need raw content
- troubleshooting entry for summarization timeouts

* feat(gateway): add LINE Messaging API platform plugin (#23197)

* feat(gateway): add LINE Messaging API platform plugin

Adds LINE as a bundled platform plugin under `plugins/platforms/line/`,
synthesized from the strongest pieces of seven open community PRs. The
adapter requires zero core edits — `Platform("line")` is auto-discovered
via the bundled-plugin scan in `gateway/config.py`, and all hooks
(setup, env-enablement, cron delivery, standalone send) are wired
through `register_platform()` kwargs the way IRC and Teams do it.

Highlights merged into one plugin:

- **Reply token preferred, Push fallback.** Try the free reply token
  first (single-use, ~60s TTL); fall back to metered Push when the
  token is absent, expired, or rejected. (PR #21023)
- **Slow-LLM Template Buttons postback.** When the LLM is still running
  past `LINE_SLOW_RESPONSE_THRESHOLD` (default 45s), the adapter burns
  the original reply token to send a "Get answer" button bubble. The
  user taps it to fetch the cached answer via a fresh reply token —
  also free. State machine: PENDING → READY → DELIVERED, ERROR for
  cancelled runs (orphan resolves to `LINE_INTERRUPTED_TEXT` after
  /stop). Set threshold to 0 to disable. (PR #18153)
- **Three-allowlist gating** — separate user / group / room allowlists
  with `LINE_ALLOW_ALL_USERS=true` dev-only escape hatch. (PR #18153)
- **Markdown URL preservation.** Strip bold/italic/code-fence/heading
  markers (LINE renders them literally) but keep `[label](url)` →
  `label (url)` so URLs stay tappable. (PR #18153)
- **System-message bypass** for `⚡ Interrupting`, `⏳ Queued`, etc. —
  busy-acks reach the user as visible bubbles instead of being
  swallowed into the postback cache. (PR #18153)
- **Media via public HTTPS URLs.** LINE doesn't accept binary uploads;
  images/audio/video must be HTTPS-reachable. The adapter serves
  registered tempfiles under `/line/media/<token>/<filename>` from the
  same aiohttp app. Allowed-roots traversal guard covers
  `tempfile.gettempdir()`, `/tmp` (→ `/private/tmp` on macOS), and
  `HERMES_HOME`. `LINE_PUBLIC_URL` overrides URL construction for
  setups behind tunnels/proxies. (PR #8398)
- **5-message-per-call batching.** LINE rejects >5 messages per
  Reply/Push; smart-chunker caps text at 4500 chars per bubble.
- **Inbound dedup** via `webhookEventId` LRU. (PR #21023)
- **Self-message filter** via `/v2/bot/info` userId lookup. (PR #21023)
- **Loading-animation indicator** wired to LINE's `chat/loading/start`
  endpoint, DM-only (LINE rejects it for groups/rooms). (PR #21023)
- **Out-of-process cron delivery** via `_standalone_send`, so
  `deliver: line` cron jobs work even when cron runs detached from
  the gateway.
- **Webhook hardening** — 1 MiB body cap, constant-time HMAC-SHA256
  signature verification, dedup, scoped lock so two profiles can't
  bind the same channel.

Validation
----------

- `scripts/run_tests.sh tests/gateway/test_line_plugin.py` →
  73 passed in 1.05s
- `scripts/run_tests.sh tests/gateway/test_line_plugin.py
  tests/gateway/test_irc_adapter.py
  tests/gateway/test_plugin_platform_interface.py
  tests/gateway/test_platform_registry.py
  tests/gateway/test_config.py` → 193 passed, 7 skipped
- E2E import + register + signature roundtrip + `Platform("line")`
  bundled-plugin discovery verified against current `origin/main`.

Closes the seven open LINE PRs (#18153, #16832, #6676, #21023, #14942,
#14988, #8398) by superseding them with a single plugin-form
implementation that takes the best idea from each.

Co-authored-by: pwlee <32443648+leepoweii@users.noreply.github.com>
Co-authored-by: Jetha Chan <jetha@google.com>
Co-authored-by: Cattia <openclaw@liyangchen.me>
Co-authored-by: perng <charles@perng.com>
Co-authored-by: Soichiro Yoshimura <soichiro0111.dev@gmail.com>
Co-authored-by: David Zhou <77736378+David-0x221Eight@users.noreply.github.com>
Co-authored-by: Yu-ga <74749461+yuga-hashimoto@users.noreply.github.com>

* docs(platforms): document platform-specific slow-LLM UX pattern

Add a 'Platform-Specific Slow-LLM UX' section to the platform-adapter
developer guide covering the _keep_typing override pattern that LINE
uses for its Template Buttons postback flow.

Three subsections:
- Pattern: subclass _keep_typing to layer mid-flight UX (with code)
- Pattern: subclass send to route through a cache instead of sending
- When this pattern is appropriate (vs. always-Push fallback)

Plus a short pointer in gateway/platforms/ADDING_A_PLATFORM.md so
tree-readers find the prose walkthrough on the docsite.

Filed because the LINE plugin (PR #23197) was the first bundled
adapter to need this pattern — every prior plugin (irc, teams,
google_chat) handles slow responses with the default typing-loop and
a regular send_text. Documenting now while the rationale is fresh.

---------

Co-authored-by: pwlee <32443648+leepoweii@users.noreply.github.com>
Co-authored-by: Jetha Chan <jetha@google.com>
Co-authored-by: Cattia <openclaw@liyangchen.me>
Co-authored-by: perng <charles@perng.com>
Co-authored-by: Soichiro Yoshimura <soichiro0111.dev@gmail.com>
Co-authored-by: David Zhou <77736378+David-0x221Eight@users.noreply.github.com>
Co-authored-by: Yu-ga <74749461+yuga-hashimoto@users.noreply.github.com>

* feat(curator): hint at `hermes curator pin` in the rename block (#23212)

Surfaces the pin command at the moment users care about it: when a
consolidation just landed against their skill library and they're
looking at the umbrella name in the curator output. Previously `hermes
curator pin` existed but had no discovery surface — users only learned
it existed by reading docs or stumbling onto `hermes curator --help`.

The hint:

    archived 3 skill(s):
      • docx-extraction → document-tools
      • pdf-extraction → document-tools
      • old-stale — pruned (stale)
    full report: hermes curator status
    keep an umbrella stable: hermes curator pin document-tools

Gated on having at least one consolidation that produced an umbrella.
Pruned-only runs (nothing surviving to pin) skip the hint. When
multiple umbrellas were produced, picks alphabetically first as a
concrete example rather than listing them all.

3 new tests in tests/agent/test_curator_classification.py covering:
consolidation produces hint with real umbrella name, pruned-only run
omits it, multi-umbrella picks one example.

* fix(security): require dashboard auth for plugin API routes

Remove the blanket /api/plugins/* exemption from auth_middleware so
plugin API routes (e.g. Kanban dashboard) require the same session
token as all other /api/ endpoints.

Fixes #19533

* test(security): broaden plugin API auth coverage + correct stale docstring

Follow-up to the previous commit's middleware fix.

- plugins/kanban/dashboard/plugin_api.py: rewrite the "Security note"
  docstring. The previous text said "/api/plugins/ is unauthenticated by
  design" — that's now actively wrong and dangerously misleading. New
  text explains that plugin routes flow through the same session-token
  middleware as core API routes and that --host 0.0.0.0 is safe to use
  on a LAN as a result.

- tests/hermes_cli/test_web_server.py: extend TestPluginAPIAuth to cover
  the surfaces the original PR didn't pin:
  * test_plugin_route_allows_auth now exercises a real plugin path
    (/api/plugins/example/hello) instead of accepting 200 OR 404 from
    a maybe-loaded kanban plugin — the assertion was effectively vacuous.
  * test_plugin_patch_requires_auth + test_plugin_delete_requires_auth
    cover non-GET mutation methods in case a future regression
    whitelists them by accident.
  * test_non_kanban_plugin_route_requires_auth proves the fix is
    plugin-agnostic, not kanban-specific (hits hermes-achievements +
    a non-existent plugin namespace; both 401 before route resolution).
  * test_plugin_websocket_unaffected_by_http_middleware locks in that
    the HTTP middleware change didn't accidentally start gating WS
    upgrades — kanban /events still uses its own ?token= check.
  Plus a cosmetic blank-line cleanup.

* feat(plugins): run any LLM call from inside a plugin via ctx.llm (#23194)

* feat(plugins): host-owned LLM access via ctx.llm

Plugins can now ask the host to run a one-shot chat or structured
completion against the user's active model and auth, without ever
seeing an OAuth token or API key. Closes the gap where plugins that
needed bounded structured inference (receipts, CRM extraction,
support classification) had to either bring their own provider keys
or register a tool the agent had to call.

New surface on PluginContext:
- ctx.llm.complete(messages, ...)
- ctx.llm.complete_structured(instructions, input, json_schema, ...)
- async siblings ctx.llm.acomplete / acomplete_structured

Backed by the existing auxiliary_client.call_llm pipeline — every
provider, fallback chain, vision routing, and timeout policy Hermes
already supports applies automatically.

Trust gate (fail-closed by default):
- plugins.entries.<id>.llm.allow_model_override
- plugins.entries.<id>.llm.allowed_models (allowlist; '*' = any)
- plugins.entries.<id>.llm.allow_agent_id_override
- plugins.entries.<id>.llm.allow_profile_override

Embedded model@profile shorthand goes through the same gate as
explicit profile=, so it can't bypass the auth-profile policy.
Conflicting explicit and embedded profiles fail closed.

Also lands:
- plugins/plugin-llm-example/ — reference plugin that registers
  /receipt-extract, demonstrating image+text structured input,
  jsonschema validation, and the trust-gate config.
- website/docs/developer-guide/plugin-llm-access.md — full API docs.
- 45 unit tests covering trust gates, JSON parsing, schema
  validation, image encoding, async surface, and config loading.

Validation:
- 2628 tests pass in tests/agent/
- E2E: bundled plugin loaded with isolated HERMES_HOME, slash
  command produced parsed JSON via stubbed call_llm
- response_format extra_body wired correctly for both json_object
  and json_schema modes

* docs(plugin-llm): rewrite quickstart and framing

The quickstart now uses a meeting-notes-to-tasks example instead of
a receipt extractor, and the page leads with hook-time / gateway
pre-filter / scheduled-job framing rather than the OpenClaw
KB/support/CRM/finance/migration enumeration that the original
upstream PR used. Receipt example moved to a separate worked
example link so the docs page itself doesn't echo any of the
upstream framing.

Also clarifies where ctx.llm fits in the broader plugin surface
(table comparing register_tool / register_platform / register_hook
/ etc.) and what makes this lane different from auxiliary_client
internals.

No code change.

* docs(plugin-llm): reframe as any LLM call, not just structured output

The original draft leaned heavily on complete_structured() and made
the chat lane (complete() / acomplete()) feel like a footnote.
Restructure so:

- The page title and description say 'any LLM call.'
- The lead shows BOTH a plain chat call (error rewriter) AND a
  structured call (triage scorer) up top.
- Quick start has two complete plugin examples — /tldr (chat) and
  /paste-to-tasks (structured).
- New 'When to use which' table for choosing complete() vs
  complete_structured() vs the async siblings.
- Trust-gate sections explicitly note 'all four methods,' and the
  request-shaping list calls out chat-only fields (messages) and
  structured-only fields (instructions, input, json_schema)
  alongside each other.
- The 'Where this fits' section now says 'for any reason,
  structured or not.'

The receipt-extractor reference plugin still exists under
plugins/plugin-llm-example/ — but the docs page no longer treats
it as the canonical surface example. It's now described as 'a third
worked example, this time with image input.'

No code change.

* feat(plugin-llm): split provider/model into independent explicit kwargs

The first cut accepted a single 'provider/model' slug on every method
and split it internally. That looked clean but broke under live test:
the model-override path tried to use the slug's vendor prefix as a
literal Hermes provider id, which silently switched the user off
their aggregator (e.g. plugin asks for 'openai/gpt-4o-mini' on a user
who routes through OpenRouter — host attempted to call the 'openai'
provider directly, failed because OPENAI_API_KEY wasn't set).

New shape mirrors the host's main config:

  ctx.llm.complete(
      messages=[...],
      provider='openrouter',         # gated, optional
      model='openai/gpt-4o-mini',    # gated, optional
      profile='work',                # gated, optional
      ...
  )

Each is independently gated by its own allow_*_override flag.
Granting model-override does NOT auto-grant provider-override.
Allowlists are now per-axis (allowed_providers, allowed_models)
matched literally against whatever string the plugin sends.

Dropped 'model@profile' embedded-suffix shorthand entirely. Hermes
doesn't use that pattern anywhere else; profile= is its own kwarg.

Live E2E (against real OpenRouter via Teknium's config) confirms:
- zero-config call works
- default-deny blocks each override with a helpful error
- model-only override stays on user's active provider (the bug)
- provider+model override switches cleanly
- allowlist refuses non-listed entries
- structured output round-trip parses + schema-validates

Tests: 49 cases (up from 45); all green. Docs updated to match the
new shape, including a 'most plugins never need this section' callout
on the trust-gate config block.

* fix+cleanup(plugin-llm): real attribution, hook-mode coverage, move example out of core

Three integration fixes for the ctx.llm surface:

1. Attribution bug — result.provider and result.model now reflect
   what call_llm actually used, not placeholder fallbacks ('auto',
   'default'). New _resolve_attribution() helper:

     - explicit overrides win (what the call targeted)
     - response.model wins for the recorded model (provider
       canonicalisation: 'gpt-4o' → 'gpt-4o-2024-08-06' etc.)
     - falls back to _read_main_provider() / _read_main_model()
       when no override is set, so audit logs reflect the user's
       active main provider/model
     - 'auto' / 'default' only when EVERYTHING is empty

   Live verified: zero-config call now records
   provider='openrouter', model='anthropic/claude-4.7-opus-20260416'
   instead of provider='auto', model='default'.

2. Hook-mode coverage — TestHookMode confirms ctx.llm.complete
   works from inside a registered post_tool_call callback. The
   docs page promised hook integration; now there's a test that
   exercises the lazy-import path through the real invoke_hook
   machinery. Two cases: traceback-rewrite hook with conditional
   ctx.llm.complete, and minimal hook regression for the
   sync-hook + sync-llm path.

3. Reference plugin moved out of core. plugins/plugin-llm-example/
   is gone from hermes-agent — it now lives in the new
   NousResearch/hermes-example-plugins companion repo. The docs
   page links there. Hermes' bundled plugins should be plugins
   users actually run; reference / docs-companion plugins live
   externally.

Test count: 56 (up from 49). Wider sweep on tests/hermes_cli/
+ tests/gateway/ + tests/tools/ + tests/agent/ shows 16770
passing; the 12 failures are all pre-existing on origin/main
(verified by stashing this branch's changes and re-running) —
kanban-boards, delegate-task, gateway-restart, tts-routing —
none touch the plugin_llm surface.

* chore(plugins): move all example plugins to companion repo

Reference / docs-companion plugins now live exclusively in
NousResearch/hermes-example-plugins, not bundled with the core repo:

- example-dashboard
- strike-freedom-cockpit

A new fourth example, plugin-llm-async-example, was added to that
repo demonstrating ctx.llm's async surface (acomplete()) with
asyncio.gather() — registers /translate <lang>: <text> which fires
forward translation + sentiment classifier in parallel, then a
back-translation for QA. Live-tested at 2.5s for three real
provider round-trips (would be ~5-6s sequential).

Docs updated:
- developer-guide/plugin-llm-access.md links both sync and async
  examples in the Reference section
- user-guide/features/extending-the-dashboard.md repoints both demo
  sections to the companion repo with corrected install paths
- user-guide/features/built-in-plugins.md drops the two demo rows
- AGENTS.md notes that example plugins live in the companion repo

Net: hermes-agent's plugins/ directory now contains only plugins
users actually run (memory providers, dashboard tabs that ship real
features, the disk-cleanup hook, platform adapters). All four
demo / reference plugins live externally where they can be cloned
on demand instead of inflating the core install.

* fix(kanban): use sys.executable -m hermes for dispatcher spawn

In NixOS container mode, hermes is installed at a store path with no
symlink on PATH (e.g. /data/current-package/bin/hermes). The kanban
dispatcher spawns workers via _default_spawn() using a bare 'hermes'
subprocess call, which fails with 'hermes executable not found on PATH'
in container mode.

Fix by calling sys.executable -m hermes instead, which is guaranteed
to resolve to the same Python interpreter running the dispatcher.

* fix(kanban): correct dispatcher spawn module name + PATH-first lookup

Follow-up to the previous commit's contributor cherry-pick.

The cherry-picked change replaced the bare ``["hermes", ...]`` spawn with
``[sys.executable, "-m", "hermes", ...]``. The intent was right (avoid
PATH dependence — cron, systemd User= services, launchd jobs, and other
detached dispatcher invocations routinely run with a stripped $PATH that
doesn't include the venv's bin/, breaking the bare-shim spawn) but the
module name is wrong: there is no top-level ``hermes`` package. The
console-script entry point in pyproject.toml is
``hermes = "hermes_cli.main:main"``, and ``python -m hermes`` fails with
``No module named hermes``. The cherry-picked form would have replaced a
sometimes-broken spawn with an always-broken one.

This commit:

- Adds ``_resolve_hermes_argv()``, mirroring ``gateway.run._resolve_hermes_bin``.
  Tries ``shutil.which("hermes")`` first (preferred — keeps existing ``ps``
  output and log lines familiar in the common case) and falls back to
  ``[sys.executable, "-m", "hermes_cli.main"]`` when the shim is not on
  PATH. The fallback goes through the running interpreter so it's
  PATH-independent. Kept as a local helper rather than imported from
  gateway because ``hermes_cli`` sits below ``gateway`` in the dependency
  order.
- Switches the dispatcher's ``cmd`` list to use ``*_resolve_hermes_argv()``.
- Adds three regression tests:
  * ``test_resolve_hermes_argv_prefers_path_shim`` — pins the PATH-first
    branch so a future refactor doesn't silently flip the order.
  * ``test_resolve_hermes_argv_falls_back_to_module_form_when_no_path_shim`` —
    pins the correct module name (``hermes_cli.main``, NOT ``hermes``).
    Direct regression guard for the form that shipped in the original PR.
  * ``test_resolve_hermes_argv_module_actually_runs`` — runs the fallback
    invocation as a real subprocess and asserts ``--version`` works, so
    losing ``hermes_cli.main``'s ``__main__`` handling can't slip past the
    string-match test.

Verified end-to-end: with the shim on PATH the resolver returns
``[/.../hermes]`` and ``--version`` works; with the shim removed the
resolver returns ``[python, -m, hermes_cli.main]`` and ``--version``
still works; the original PR's ``python -m hermes`` invocation fails as
expected (``No module named hermes``).

* feat(i18n): localize all gateway commands + web dashboard, add 8 new locales (16 total) (#22914)

* feat(i18n): localize /model command output

Reported by @tianma8888: when Chinese users run /model, the labels
("Provider:", "Context:", "_session only_", etc.) are still English.
This routes the static prose through the existing i18n catalog so it
follows display.language / HERMES_LANGUAGE.

Changes:
- locales/{en,zh,ja,de,es,fr,tr,uk}.yaml: add 17 keys under
  gateway.model.* covering switched/provider/context/max_output/cost/
  capabilities/prompt_caching/warning/saved_global/session_only_hint/
  current_label/current_tag/more_models_suffix/usage_*.
- gateway/run.py _handle_model_command: replace hardcoded f-strings in
  the picker callback, the text-list fallback, and the direct-switch
  confirmation block with t("gateway.model.<key>", ...).

What stays English:
- model IDs, provider slugs, capability strings, cost figures, and the
  "[Note: model was just switched...]" prepended to the model's next
  prompt (LLM-facing, not user-facing).
- The two slightly-different session-only hints unify on a single key
  with the em-dash phrasing.

Validation: tests/agent/test_i18n.py 27/27 passing (parity contract
holds), tests/gateway/ -k 'model or i18n' 74/74 passing.

* feat(i18n): localize all gateway slash command outputs

Expands the i18n catalog from 7 strings to 234 keys across 35 gateway
slash command handlers, so non-English users see localized output for
\`/profile\`, \`/status\`, \`/help\`, \`/personality\`, \`/voice\`, \`/reset\`,
\`/agents\`, \`/restart\`, \`/commands\`, \`/goal\`, \`/retry\`, \`/undo\`,
\`/sethome\`, \`/title\`, \`/yolo\`, \`/background\`, \`/approve\`, \`/deny\`,
\`/insights\`, \`/debug\`, \`/rollback\`, \`/reasoning\`, \`/fast\`,
\`/verbose\`, \`/footer\`, \`/compress\`, \`/topic\`, \`/kanban\`,
\`/resume\`, \`/branch\`, \`/usage\`, \`/reload-mcp\`, \`/reload-skills\`,
\`/update\`, \`/stop\` (plus the \`/model\` block already added in the
previous commit).

Reported by @tianma8888 — Chinese users want command output prose in
their language, not just the labels we already had.

Translations are hand-written for all 8 supported locales (en, zh, ja,
de, es, fr, tr, uk), matching each catalog's existing style: full-width
punctuation in zh, em-dashes in zh/ja/uk, French spaced colons,
German noun capitalization, etc.

What stays English (unchanged):
- Identifiers/values: model IDs, file paths, profile names, session IDs,
  command flag names like --global, URLs, config keys.
- Backtick code spans: \`/foo\`, \`config.yaml\`.
- Log messages (logger.info/warning/error).
- LLM-facing system notes prepended to next prompt (e.g. [Note: model
  was just switched...]).
- Strings produced by external modules (gateway_help_lines,
  format_gateway, manual_compression_feedback) — those have their
  own surfaces.

New shared keys for cross-handler boilerplate:
- gateway.shared.session_db_unavailable (5 call sites: branch, title,
  resume, topic, _disable_telegram_topic_mode_for_chat)
- gateway.shared.session_not_found (1 site)
- gateway.shared.warn_passthrough (2 sites in /title's f"⚠️ {e}" pattern)

YAML gotcha fixed: \`yolo.on\` and \`yolo.off\` were originally written
unquoted, which YAML 1.1 parses as boolean True/False keys. Renamed to
\`yolo.enabled\` / \`yolo.disabled\` for both safety and clarity.

Test fix: tests/agent/test_i18n.py::test_t_missing_key_in_non_english_falls_back_to_english
now resets the catalog cache on teardown, so the fake "foo: English Foo"
locale doesn't poison the module-level cache for subsequent tests in
the same xdist worker. (Without this, every gateway slash command test
that shares a worker with the i18n suite would see the fake catalog.)

Validation:
- tests/agent/test_i18n.py: 27/27 (parity contract — every key in every
  locale, matching placeholder tokens).
- tests/gateway/: 5077 passed, 0 failed (full gateway suite).
- 180 t() call sites added across 35 handlers; 1872 catalog entries
  total (234 keys × 8 locales).

* feat(i18n): add 8 new locales — af, ko, it, ga, zh-hant, pt, ru, hu

Expands the static-message catalog from 8 → 16 languages, each with full
270-key parity against the English source-of-truth.  Every locale now
covers the same surface PR #22914 added: approval prompts plus all 35
gateway slash command outputs.

New locales:
- af  Afrikaans      (community ask in #21961 by @GodsBoy; PRs #21962, #21970)
- ko  Korean         (PRs #20297 by @tmdgusya, #22285 by @project820)
- it  Italian        (PR #20371 by @leprincep35700)
- ga  Irish/Gaeilge  (PR #20962 by @ryanmcc09-dot)
- zh-hant Traditional Chinese (PRs #20523 by @jackey8616, #13140 by @anomixer)
- pt  Portuguese     (PRs #20443 by @pedroborges, #15737 by @carloshenriquecarniatto, #22063 by @Magaav)
- ru  Russian        (PR #22770 by @DrMaks22)
- hu  Hungarian      (PR #22336 by @lunasec007)

Each locale uses native-quality translations matching the existing tone
and conventions of the older 8 locales:
- zh-hant uses 繁體 characters with TW/HK technical vocabulary (軟體
  not 软件, 連線 not 连接, 設定 not 设置, 訊息 not 消息, 工作階段 not 会话, 程式
  not 程序, 預設 not 默认, 伺服器 not 服务器), full-width punctuation 「:()」.
- ko uses formal 합니다체 (습니다/합니다) register throughout.
- pt uses European Portuguese as baseline with neutral PT/BR vocabulary
  where possible.
- ga uses standard An Caighdeán Oifigiúil; English loanwords retained
  for tech terms without good Irish equivalents (gateway, API, JSON).
- All preserve {placeholder} tokens, backtick code spans, slash commands,
  brand names (Hermes, MCP, TTS, YOLO, OpenAI, Telegram, etc.), and emoji.

Aliases added in agent/i18n.py:
- af-za, Afrikaans → af
- ko-kr, Korean, 한국어 → ko
- it-it, italiano → it
- ga-ie, Irish, Gaeilge → ga
- zh-tw, zh-hk, zh-mo, traditional-chinese → zh-hant (note: zh-tw used to
  alias to zh; now aliases to its own zh-hant catalog)
- zh-cn, zh-hans, zh-sg → zh (unchanged from before)
- pt-pt, pt-br, brazilian, portuguese → pt
- ru-ru, Russian, русский → ru
- hu-hu, Magyar → hu

The zh-tw alias re-routing is intentional: previously typing 'zh-TW' got
the Simplified Chinese catalog (wrong vocabulary for Taiwan/HK users).
Now those users get the proper Traditional Chinese catalog.

Validation:
- tests/agent/test_i18n.py: 43/43 (parity contract holds for all 16
  languages × 270 keys = 4320 catalog entries, with matching placeholder
  tokens).
- E2E alias resolution verified for all 19 alias inputs (Afrikaans, ko-KR,
  한국어, italiano, Gaeilge, zh-TW, zh-HK, traditional-chinese, pt-BR,
  brazilian, Magyar, etc.).
- tests/gateway/: 5198 passed (3 pre-existing TTS routing failures
  unrelated to i18n).

Credit to all contributors whose PRs surfaced these language requests.
Their original PRs may now be closed as superseded with credit.

* feat(dashboard-i18n): add 14 web dashboard locales matching the static catalog

Brings the React dashboard (web/src/) up to the same 16-language
coverage the static catalog already has after the previous commits in
this PR. The Translations interface is TypeScript-typed, so every new
locale must provide every key — tsc -b is the parity guard.

Languages added (each is a complete 429-line locale file):
- af  Afrikaans
- ja  Japanese        (PR #22513 by @snuffxxx surfaced this)
- de  German          (PR #21749 by @mag1art)
- es  Spanish         (PR #21749)
- fr  French          (PRs #21749, #10310 by @foXaCe)
- tr  Turkish
- uk  Ukrainian
- ko  Korean          (PRs #21749, #18894 by @ovstng, #22285 by @project820)
- it  Italian
- ga  Irish (Gaeilge)
- zh-hant Traditional Chinese (PR #13140 by @anomixer)
- pt  Portuguese      (PRs #22063 by @Magaav, #22182 by @wesleysimplicio, #15737 by @carloshenriquecarniatto)
- ru  Russian         (PRs #21749, #22770 by @DrMaks22)
- hu  Hungarian       (PR #22336 by @lunasec007)

Each translation covers all 15 namespaces with full key parity vs en.ts,
preserves every {placeholder} token verbatim, keeps identifiers
untranslated (brand names, file paths, cron expressions, code spans),
translates the language.switchTo tooltip into the target language, and
matches existing tone conventions (zh-hant uses TW/HK vocab; ja uses
formal desu/masu; ko uses formal seumnida register; ga uses An
Caighdean Oifigiuil with English loanwords for tech vocab without good
Irish equivalents).

Plumbing:
- web/src/i18n/types.ts: Locale union expanded to all 16 codes.
- web/src/i18n/context.tsx: imports all 16 catalogs; exports
  LOCALE_META (endonym + flag per locale); isLocale() type guard.
- web/src/i18n/index.ts: re-export LOCALE_META.
- web/src/components/LanguageSwitcher.tsx: replaced two-state EN-ZH
  toggle with a click-to-open dropdown listing all 16 languages.

Note: zh-hant.ts exports zhHant (camelCase) since hyphen is invalid in
a JS identifier; the canonical 'zh-hant' string keys it in TRANSLATIONS.

Validation:
- npx tsc -b: 0 errors. Every locale satisfies Translations.
- npm run build (tsc + vite production): green, 2062 modules.
- Each locale file is exactly 429 lines.

Out of scope: plugin dashboards (kanban/achievements ship as prebuilt
bundles with no source in repo); Docusaurus docs (separate surface);
TUI (no i18n yet).

* feat(plugin-i18n): localize achievements + kanban plugin dashboards across all 16 locales

Brings the two shipped plugin dashboards (hermes-achievements, kanban)
under the same i18n umbrella as the core dashboard PR #22914 just
established.  Both bundles now read user-facing strings from the host's
i18n catalog via SDK.useI18n() instead of hardcoded English.

## Approach

Plugin dashboards ship as prebuilt IIFE bundles in
plugins/<name>/dashboard/dist/index.js — no build step, no source in
repo (upstream-authored, vendored as compiled JS).  Earlier contributor
PRs (#22594, #22595, #18747) tried direct edits but didn't actually
wire the bundles to read translations.

This change does the wiring properly:

1.  Each bundle gets a useI18n shim at IIFE scope:
        const useI18n = SDK.useI18n
          || function () { return { t: { kanban: null }, locale: "en" }; };
    Older host SDKs without useI18n still load the bundle and render
    English fallbacks.

2.  A small tx(t, path, fallback, vars) helper resolves dotted keys
    under the plugin's namespace (t.kanban.* or t.achievements.*) and
    interpolates {placeholder} tokens.

3.  Every React component starts with const { t } = useI18n() and
    each user-visible string is wrapped in tx(t, "key", "English fallback").
    Helpers called outside React components (window.prompt callers,
    constants used during init) take t as a parameter.

4.  Top-level constants that were English dictionaries (COLUMN_LABEL,
    COLUMN_HELP, DESTRUCTIVE_TRANSITIONS, DIAGNOSTIC_EVENT_LABELS in
    kanban) become getColumnLabel(t, status)-style functions backed by
    FALLBACK_* dictionaries.

## Translations added

Two new top-level namespaces added to the dashboard's TypeScript-typed
Translations interface:

- achievements: ~70 keys covering the hero, scan banner, achievement
  card, share dialog, stats, filters, and empty states.
- kanban: ~145 keys covering the board, columns (with nested
  columnLabels and columnHelp sub-dicts), card detail panel,
  bulk-actions toolbar, dependency editor, board switcher, and
  diagnostic callouts.

Each key is provided across all 16 supported locales:
en, zh, zh-hant, ja, de, es, fr, tr, uk, af, ko, it, ga, pt, ru, hu.

Total new translation entries: ~3,440 (215 keys × 16 locales).

## What stays English (deliberate)

- API paths, CSS class names, data-* attributes, JSON keys, regex
  strings, URLs, file paths (~/.hermes/kanban.db, boards/_archived/).
- State identifier strings used as lookup keys (triage / todo / ready /
  running / blocked / done / archived) — labels translate, key strings
  don't.
- The PNG share-card text rendered to canvas in the achievements
  ShareDialog (HERMES AGENT watermark, UNLOCKED stamp, tier names) —
  these become part of a globally-shared image and stay English.
- localStorage keys (hermes.kanban.selectedBoard).
- Brand names (Kanban, Hermes, WebSocket, Nous Research).

## Contributor credit

PR #22594 by @02356abc and PR #22595 by @02356abc supplied the
en + zh kanban namespace skeleton (145 keys); used as the en source-
of-truth in this commit and translated to the other 14 locales.

PR #18747 by @laolaoshiren first surfaced the achievements
localization request.

## Validation

- npx tsc -b: 0 errors. All 16 locale .ts files satisfy the
  Translations type with full key parity.
- npm run build (tsc + vite production build): green, 2062 modules,
  1.56MB JS / 95KB CSS, ~2.5s build.
- node --check on both plugin bundles: parse cleanly.
- 126 tx() call sites in kanban, 46 in achievements.

## Out of scope

- TUI (ui-tui/) has no i18n infrastructure yet.
- Docusaurus docs (website/i18n/) — already had zh-Hans; expanding
  is a separate translation workstream (Thai / Korean / Hindi PRs).

* fix(kanban): guard task_age against corrupt created_at values like '%s'

task_age() crashed with ValueError when created_at contained the
literal format string '%s' instead of a Unix timestamp, taking down
the entire GET /board endpoint with a 500.

- Add _safe_int() helper that returns None on non-numeric values
- Refactor task_age() to use _safe_int instead of bare int() casts
- Wrap task_age() call in _task_dict with try/except fallback so one
  corrupt row never kills the whole board endpoint

* test(kanban): cover task_age safe-int guards + AUTHOR_MAP entry

Follow-up to the previous commit's safe-int task_age fix.

The original PR shipped without test coverage. This commit adds:

- test_safe_int_accepts_int_and_int_string — sanity for the well-typed
  path so the helper itself can't quietly start swallowing valid values.
- test_safe_int_returns_none_on_corrupt_inputs — the failure modes
  (None, '%s', 'abc', '', '1.5', random objects). Covers both the
  ValueError and TypeError catch branches.
- test_task_age_handles_corrupt_created_at — the headline regression:
  a task with created_at='%s' used to raise ValueError and turn
  GET /api/plugins/kanban/board into a 500.
- test_task_age_handles_corrupt_started_and_completed — confirms the
  safe-int treatment is consistent across all three timestamp fields.
- test_task_age_well_formed_task — regression that the safe path
  doesn't change observable output for normal data.
- test_task_dict_survives_corrupt_created_at — defense in depth.
  Writes a corrupt row directly via SQL, reads it back through the
  ORM, and confirms task_age + the surrounding plugin_api guard
  degrade gracefully instead of crashing.

Also adds the AUTHOR_MAP entry for the contributor's GitHub-noreply
email so release notes credit @baocin (the commit was authored locally
as `aoi <aoi@hino.local>` — re-attributed during salvage to the
github noreply form).

* fix(kanban): preserve assignee casing in dashboard

* test(kanban-dashboard): pin assignee-casing static-asset regressions + AUTHOR_MAP

Follow-up to the previous commit's casing fix.

The original PR shipped the dist edits without test coverage. The
contributor's reasoning (UI-only attributes in a pre-built JS bundle,
nothing meaningful to unit-test) is fair, but a static-asset assertion
catches the most likely regression vector — a future rebuild of the
dist bundle that loses the attributes — at near-zero cost.

Adds two regression tests in tests/plugins/test_kanban_dashboard_plugin.py:

- test_dashboard_assignee_inputs_preserve_casing — reads dist/index.js
  and asserts autoCapitalize="none", autoCorrect="off", spellCheck=false,
  and textTransform="none" each appear at least twice (one per assignee
  input — inline triage/lane create + task-edit panel).
- test_dashboard_lane_head_preserves_assignee_casing — reads dist/style.css
  and asserts the .hermes-kanban-lane-head rule body does NOT contain
  text-transform: uppercase. Locates the rule by marker so unrelated CSS
  churn nearby doesn't flake the test.

Both follow the same shape as the existing test_dashboard_requests_default_board_explicitly
static-asset guard from PR #22940's salvage.

Also adds the AUTHOR_MAP entry for princepal9120's GitHub-noreply email
so release notes credit the right account.

* perf(browser): route browser_console eval through supervisor's persistent CDP WS (180x faster) (#23226)

Adds CDPSupervisor.evaluate_runtime() and wires it into _browser_eval as a
fast path when a supervisor is alive for the current task_id. Replaces the
~180ms agent-browser subprocess fork+exec+Node-startup hop with a ~1ms
Runtime.evaluate over the supervisor's already-connected WebSocket.

Falls through to the existing agent-browser CLI path when no supervisor is
running (e.g. backends without CDP, or before the first browser_navigate
attaches one), so behaviour is unchanged where it can't apply.

JS-side exceptions surface directly without falling through to the
subprocess (the subprocess would just re-raise the same error, slower);
supervisor-side failures (loop down, no session) fall through cleanly.

Benchmark — 30 iterations of `1 + 1` against headless Chrome:
  supervisor WS              mean=  0.96ms  median=  0.91ms
  agent-browser subprocess   mean=179.35ms  median=167.73ms
  → 187x speedup mean

Tests: 14 unit tests (mocked supervisor + response-shape coverage), 5
real-Chrome e2e tests in test_browser_supervisor.py (gated on Chrome
being installed). Browser test suite: 355 passed, 1 skipped.

* fix(kanban-dashboard): tone down completed-run metadata panel (#19548)

Hand-rebased onto current main from PR #19980; the original branch was stale
against main (~6 unrelated dashboard fixes had landed since), so applying
the PR's dist files directly would have silently reverted them.

The run-history panel in the task drawer rendered each completed run's
`metadata` field as a `<code class="hermes-kanban-run-meta">` containing
`JSON.stringify(r.metadata)` — a single unindented monoline. With
`white-space: pre-wrap` and a monospace font, a writer task's metadata
(changed_files paths, source URLs, generated-artifact details) wrapped
into a tall block of code-ish text that filled the parent run row. The
container's faint `--color-foreground 3%` background then made the whole
thing read like a crash dump even though the run completed normally.

Restyle and label, no interactivity changes:

- Wrap the meta payload in a `.hermes-kanban-run-meta-block` sub-block
  with an explicit `Metadata` label (small, uppercase, muted) so the
  panel reads as auxiliary detail at a glance.
- Pretty-print the JSON (`indent=2`) so the structure is scannable
  instead of a wall of monoline text.
- Cap `.hermes-kanban-run-meta` at `max-height: 8.5rem; overflow: auto`
  so a verbose blob scrolls inside its own pane rather than swamping
  the run row.
- Sub-block uses a thin `border-left` rule and `background: transparent`
  — distinct from the destructive-tinted treatment used by crashed /
  timed_out / blocked / spawn_failed runs higher in the same file.

Tests: two new static-asset assertions in
`tests/plugins/test_kanban_dashboard_plugin.py` lock in the rendered
shape (the plugin ships built-only, no src/).

* feat(kanban-dashboard): native <details> collapse + skip empty metadata

Two follow-up improvements to Tranquil-Flow's metadata-panel restyle.
Both stay within the parent PR's "tone down the panel" scope.

1. Native <details>/<summary> collapse for verbose metadata.

   The parent PR consciously deferred this ("adding native expand/collapse
   would be the next step but requires UX agreement"). The default they
   asked for is straightforward: collapsed when the rendered JSON exceeds
   300 chars (the threshold where the max-height: 8.5rem cap actually
   starts mattering), expanded otherwise. <details>/<summary> is the right
   primitive — zero JS, browser-handled state, accessible by default
   (keyboard-navigable, screen-reader announces the disclosure state),
   and survives any react-state churn for free.

   The OS-default disclosure marker is suppressed (list-style: none +
   ::-webkit-details-marker hidden) and replaced with a CSS ::before
   chevron that rotates 90deg on the [open] attribute, so the look is
   consistent across Firefox/WebKit/Blink without the double-marker
   that would otherwise appear on the platforms that still render the
   default triangle.

2. Skip rendering when metadata is an empty object.

   `r.metadata && ...` truthy-checks, but `{}` is truthy in JS — so a
   completed task with no actual metadata would render a "Metadata"
   labeled disclosure block containing literal `{}`. Adds an
   Object.keys(r.metadata).length > 0 guard so empty payloads render
   nothing instead of an empty disclosure stub.

Tests: three new static-asset assertions covering the <details> shape,
the empty-object skip, and the suppress-default-marker + animated-chevron
CSS — all in `tests/plugins/test_kanban_dashboard_plugin.py`.

* fix(kanban): reject toolset names in task skills

* feat(kanban): aggregate all toolset-name typos in skills before raising

Follow-up to the previous commit's toolset-vs-skill validation.

The contributor's fix raises ValueError on the first toolset name found
in the skills list. That works for one mistake, but agents that confuse
skills with toolsets usually pass several at once
(`skills=["web", "browser", "terminal"]`) — and serial-correcting one
per failure round-trip wastes tokens. Collect all toolset-shaped
entries first, then raise once with the full list.

The error message is also slightly clearer:

    'web', 'browser', 'terminal' are toolset names, not skill name(s).
    Put toolsets in the assignee profile's `toolsets:` config instead of
    per-task skills. Skills are named skill bundles (e.g. `kanban-worker`,
    `blogwatcher`); toolsets are runtime capabilities (e.g. `web`,
    `browser`, `terminal`).

vs. the previous "the assignee profile's toolsets" — explicitly naming
the YAML key (`toolsets:`) and giving concrete examples in both
categories closes the conceptual gap that produced the bug to begin
with.

Adds one regression test (test_create_task_skills_lists_all_toolset_typos)
covering the multi-name aggregation path. The single-typo test from
the original PR still passes (the loose `match="toolset name"` matches
both singular and plural forms).

* feat(gateway): shutdown forensics — non-blocking diag, per-phase timing, stale-unit warning (#23285)

When the gateway received SIGTERM, the shutdown_signal_handler ran a
synchronous 'ps aux' (3s timeout) inside the asyncio event loop, then
asyncio.create_task(runner.stop()).  On a busy host that ate 1-3s of
the teardown budget before draining could even start, and the resulting
log line was a multi-line ps dump that didn't tell us who sent the
signal.  The shutdown path itself logged 'Stopping gateway...' and then
nothing until 'Gateway stopped' — when systemd SIGKILLed mid-drain,
there was no way to see which phase wedged.

Changes:
- New gateway/shutdown_forensics.py:
  * snapshot_shutdown_context(sig) — sub-millisecond /proc-only capture
    of signal name, parent pid+name+cmdline, INVOCATION_ID (systemd
    marker), loadavg_1m, TracerPid, takeover/planned-stop marker
    presence + whether-it-names-self.  Pure stdlib, never raises.
  * spawn_async_diagnostic(log_path, sig) — detached subprocess with
    its own 'timeout 5s', start_new_session=True, writes ps auxf +
    pstree + dmesg to ~/.hermes/logs/gateway-shutdown-diag.log.
    Returns immediately, can't block the event loop or the cgroup
    teardown.
  * check_systemd_timing_alignment(drain_timeout) — reads
    /proc/self/cgroup for our unit, asks systemctl show for
    TimeoutStopUSec, returns mismatch info when the unit's stop
    timeout is smaller than restart_drain_timeout + 30s headroom
    (the case where systemd SIGKILLs mid-drain).
  * _parse_systemd_duration_to_us — covers '90s', '1min 30s',
    '500ms', '1h' style values from systemctl show.
  * format_context_for_log — single scannable key=value line, parent
    cmdline last.
- gateway/run.py shutdown_signal_handler:
  * Replaces synchronous ps aux + ad-hoc 'hermes-related lines' filter
    with snapshot + detached spawn.
  * Always logs 'Shutdown context: signal=... parent_pid=...
    parent_cmdline=...' regardless of planned/unexpected so we can
    correlate signal source even on planned restarts.
- gateway/run.py _stop_impl:
  * Per-phase '+X.XXs' timing for notify_active_sessions, drain
    (with drain_seconds, active_at_start, active_now, timed_out),
    post-interrupt tool kill, each adapter disconnect (Xs),
    all adapters disconnected, final-cleanup tool kill, SessionDB
    close, total teardown.
- gateway/run.py start():
  * Stale-unit warning at startup when the running systemd unit's
    TimeoutStopSec is smaller than the configured drain timeout.
    Points the user at 'hermes gateway service install --replace'
    to regenerate, or at shortening agent.restart_drain_timeout.

Tests: 30 new in tests/gateway/test_shutdown_forensics.py — snapshot
speed bound, signal name resolution, marker detection self-vs-other,
async diag spawn doesn't block caller, systemd duration parser, and
alignment check returns None outside systemd.  Wider tests/gateway/
suite: 5258 passing, 3 pre-existing TTS-routing failures unchanged
on main.

* fix(kanban): cap dispatch by running workers

* docs(kanban): document max_spawn as live concurrency cap (not per-tick budget)

Follow-up to the previous commit's behavior fix.

Adds a paragraph to dispatch_once's docstring making the concurrency-cap
semantic explicit, and an inline comment near the running_count query
explaining why we do the count (so a future reader doesn't refactor it
back to per-tick semantics thinking it's redundant). Both call out the
unbounded-accumulation failure mode that motivated the fix, since
nothing in the codebase or skills currently documents what max_spawn
is supposed to mean.

The semantic is per-board: each kanban board has its own SQLite file,
so the running-count COUNT(*) is naturally scoped to the board the
dispatcher tick is processing.

* chore: AUTHOR_MAP entry for guglielmofonda (#21505)

* fix(xai): drop models being retired May 15, 2026 from pickers (#23291)

xAI is retiring grok-4, grok-4-0709, grok-4-fast{,-reasoning,-non-reasoning},
grok-4-1-fast{,-reasoning,-non-reasoning}, and grok-code-fast-1 on
May 15, 2026 at 12:00 PT. Remove them from the static fallbacks so the
`hermes model` picker, gateway /model picker, and setup wizard stop
auto-suggesting models that will be dead in days.

- _XAI_STATIC_FALLBACK in hermes_cli/models.py now lists only grok-4.20-*
  and grok-4.3 (the live replacements).
- copilot lists in hermes_cli/models.py and hermes_cli/setup.py drop
  grok-code-fast-1 (Copilot proxies it through xAI, so the upstream
  retirement breaks it there too).

Old configs that already reference retired IDs keep working until xAI
flips the switch — context-length lookups in agent/model_metadata.py and
the cache-affinity-header logic in provider_profiles still recognise the
old names. The cleanup here is purely about not advertising them to new
users.

Closes #23278.

Source: https://docs.x.ai/developers/migration/may-15-retirement

* feat(gateway): per-platform admin/user split for slash commands (salvage of #4443) (#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR #4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>

* refactor(kanban-orchestrator): drop hardcoded specialist roster, add Step-0 profile discovery

The skill enumerated 8 specialist profile names (researcher, analyst,
writer, reviewer, backend-eng, frontend-eng, ops, pm) as "the standard
roster" and told orchestrators to "assume these exist." Almost no real
Hermes setup matches that fleet — single-profile setups, Docker-worker
setups, and curated-team setups all violate it — so following the skill
literally produced cards assigned to non-existent profiles, which the
dispatcher silently failed to spawn (no autocorrect, no fallback, just
sits in `ready` forever).

Changes:

- Drop the standard-specialist-roster table.
- Add a "Profiles are user-configured — not a fixed roster" section at
  the top with a Step 0 that prescribes `hermes profile list` (or asking
  the user) before fanning out. Cache the result in working memory.
- Rewrite the worked task-graph example with placeholder names
  (<profile-A>, <profile-B>, <profile-C>) so the structure is still
  teachable but doesn't invite copy-paste of role names that may not
  exist.
- Reframe the "If no specialist fits" anti-temptation rule: don't
  invent profile names; ask the user.
- Add a "Inventing profile names that doesn't exist" entry to Pitfalls.
- Bump skill version 2.0.0 → 3.0.0 (semantic break: previous behavior
  promised a roster the skill no longer enumerates).
- Update website/docs/user-guide/features/kanban.md to drop the
  matching "(researcher, writer, analyst, backend-eng, reviewer, ops)"
  line and explain the discovery prompt instead.
- Re-run website/scripts/generate-skill-docs.py to refresh the
  auto-generated skill page + catalog.

Closes #21131 in spirit — addresses the same hardcoded-names footgun
@yehuosi flagged, with a different shape than their PR (delete the
roster rather than replace each name with placeholder, since the
roster table was the load-bearing footgun and the worked example is
salvageable with placeholder profile names).

Co-authored-by: yehuosi <yehuosi@users.noreply.github.com>

* feat(session): add /handoff command for cross-platform session transfer

Adds /handoff <platform> CLI command that queues the current session for
resume on the configured home channel of any messaging platform.

CLI side:
- /handoff telegram — marks session in shared DB, sends summary to
  the Telegram home channel via send_message
- /handoff discord — same for Discord
- Supports telegram, discord, slack, whatsapp, signal, matrix

Gateway side:
- On new session creation, checks for pending handoffs for the
  incoming message's platform
- If found, loads the CLI session's full conversation history and
  injects it into the context prompt as a handoff transcript
- Agent continues the conversation seamlessly

Files:
- hermes_state.py: handoff_pending, handoff_platform columns + helpers
- cli.py: _handle_handoff_command dispatch + handler
- hermes_cli/commands.py: CommandDef entry
- gateway/run.py: handoff detection in _handle_message_with_agent
- tests/hermes_cli/test_session_handoff.py: 8 tests

* feat(session): make /handoff actually transfer the session live

Builds on @kshitijk4poor's CLI handoff stub. The original PR's flow
deferred everything to whenever a real user happened to message the
target platform; this rewrites it so the gateway picks up handoffs
immediately and the destination chat just starts working.

State machine on sessions table replaces the boolean flag:
  None -> 'pending' -> 'running' -> ('completed' | 'failed')
plus handoff_error for failure reasons. CLI request_handoff /
get_handoff_state / list_pending_handoffs / claim_handoff /
complete_handoff / fail_handoff helpers wrap the transitions.

CLI side (cli.py): /handoff <platform> validates the platform's home
channel via load_gateway_config, refuses if the agent is mid-turn,
flips the row to 'pending', and poll-blocks (60s) on terminal state.
On 'completed' it prints the /resume hint and exits the CLI like
/quit. On 'failed' or timeout it surfaces the reason and the CLI
session stays intact.

Gateway side (gateway/run.py): new _handoff_watcher background task
scans state.db every 2s, atomically claims pending rows, and runs
_process_handoff for each. _process_handoff:

  1. Resolves the platform's home channel.
  2. Asks the adapter for a fresh thread via the new
     create_handoff_thread(parent_chat_id, name) capability so the
     handed-off conversation gets its own scrollback. Adapters that
     don't support threads (or fail) return None and the watcher
     falls back to the home channel directly.
  3. Constructs a SessionSource keyed as 'thread' when a thread was
     created, 'dm' otherwise, then session_store.switch_session
     re-binds the destination key to the CLI session_id. The full
     role-aware transcript replays via load_transcript on the next
     turn (no flat-text injection into context_prompt).
  4. For…
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
AlexFoxD pushed a commit to AlexFoxD/hermes-agent that referenced this pull request May 21, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…age of NousResearch#4443) (NousResearch#23373)

* feat(gateway): per-platform admin/user split for slash commands

Adds an opt-in two-list access control on top of the existing per-platform
`allow_from` allowlists, scoped to slash commands only:

  - allow_admin_from         — full slash command access
  - user_allowed_commands    — what non-admins may run
  - group_allow_admin_from   — same, group/channel scope
  - group_user_allowed_commands

When `allow_admin_from` is unset for a scope, gating is disabled and every
allowed user keeps full access (backward compat). Plain chat is unaffected.
`/help` and `/whoami` are always reachable so users can see what they
can run.

Gate runs at the slash command dispatch site in gateway/run.py and uses
`is_gateway_known_command()`, so it covers built-in AND plugin-registered
commands through the live registry without per-feature wiring.

Adds `/whoami` showing platform, scope, tier, and runnable commands.

Salvage of PR NousResearch#4443's permission tier work, scoped down. The full tier
system, tool filtering, audit log, usage tracking, rate limiting,
`/promote` flow, and persistent SQLite stores are not included here —
those can be re-expanded later if needed.

Co-authored-by: ReqX <mike@grossmann.at>

* fix(gateway): close running-agent fast-path bypass + add coverage and central docs

The slash command access gate was only applied at the cold dispatch site
(line ~5921). When an agent was already running, the running-agent
fast-path block (line ~5574) dispatched /restart, /stop, /new, /steer,
/model, /approve, /deny, /agents, /background, /kanban, /goal, /yolo,
/verbose, /footer, /help, /commands, /profile, /update directly
without going through the gate — letting non-admins bypass gating just
because an agent happens to be busy.

Refactored the gate into _check_slash_access() and called from BOTH
paths. /status remains intentionally pre-gate so users can always see
session state.

Also added 18 more dispatch tests covering:
  - Running-agent fast-path: blocks non-admin, allows admin, /status
    always works
  - Alias canonicalization (gate uses canonical name, not user alias)
  - Unknown / unregistered commands pass through (don't false-positive)
  - DM admin scope-locked when group has its own admin list
  - Multi-platform isolation (Discord gated, Telegram unrestricted)

Docs: added Slash Command Access Control section to the central
messaging index page + /whoami row in the chat commands table.

Co-authored-by: ReqX <mike@grossmann.at>

---------

Co-authored-by: ReqX <mike@grossmann.at>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/auth Authentication, OAuth, credential pools comp/gateway Gateway runner, session dispatch, delivery P3 Low — cosmetic, nice to have platform/discord Discord bot adapter platform/telegram Telegram bot adapter type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature: Gateway Permission Tiers — Role-Based Access Control (Owner/Admin/User/Guest) for Messenger Platforms

3 participants