Feature: Gateway Permission Tiers — Role-Based Access Control (Owner/Admin/User/Guest) for Messenger Platforms #527

@teknium1

Overview

Hermes Agent currently has a binary authorization model for gateway platforms: users are either fully authorized (all commands, all tools, full terminal access) or completely blocked. There is no concept of permission levels, roles, or per-user capability restrictions.

This proposal introduces a tiered permission system (Owner → Admin → User → Guest) for the gateway/messenger platforms (Telegram, Discord, WhatsApp, Slack). Each tier gets a defined set of allowed commands, tools, and capabilities. This is essential for anyone running Hermes Agent as a shared service — e.g., a team lead who wants colleagues to use the agent for research/search but not have full terminal access to the host machine.

The idea is straightforward: the person who deployed Hermes Agent should be able to share it with others while controlling what those others can do. Right now, sharing means giving full access to everything — including terminal, write_file, schedule_cronjob, process management, and configuration commands like /model and /update. This is a significant security and trust barrier to multi-user adoption.


Research Findings

Current Authorization Architecture

The existing system (gateway/run.py:532-595, _is_user_authorized()) checks in this order:

  1. Per-platform allow-all flag (TELEGRAM_ALLOW_ALL_USERS=true)
  2. DM pairing approval (pairing_store.is_approved())
  3. Per-platform allowlist env var (TELEGRAM_ALLOWED_USERS=id1,id2)
  4. Global allowlist (GATEWAY_ALLOWED_USERS=id1,id2)
  5. Global allow-all (GATEWAY_ALLOW_ALL_USERS=true)
  6. Default: deny
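
The lookup order above reads as a short-circuit chain. A minimal sketch, assuming the environment is passed in as a mapping and the pairing store is abstracted behind a hypothetical `is_paired` callback (the real `_is_user_authorized()` signature is not shown in this issue):

```python
from typing import Callable, Mapping


def _csv(value: str) -> set:
    """Split a comma-separated env var into a set of trimmed IDs."""
    return {item.strip() for item in value.split(",") if item.strip()}


def is_user_authorized(
    platform: str,
    user_id: str,
    env: Mapping[str, str],
    is_paired: Callable[[str, str], bool],  # hypothetical stand-in for pairing_store.is_approved()
) -> bool:
    prefix = platform.upper()
    # 1. Per-platform allow-all flag
    if env.get(f"{prefix}_ALLOW_ALL_USERS", "").lower() == "true":
        return True
    # 2. DM pairing approval
    if is_paired(platform, user_id):
        return True
    # 3. Per-platform allowlist
    if user_id in _csv(env.get(f"{prefix}_ALLOWED_USERS", "")):
        return True
    # 4. Global allowlist
    if user_id in _csv(env.get("GATEWAY_ALLOWED_USERS", "")):
        return True
    # 5. Global allow-all
    if env.get("GATEWAY_ALLOW_ALL_USERS", "").lower() == "true":
        return True
    # 6. Default: deny
    return False
```

Every branch returns the same `True`, which is the all-or-nothing limitation: the result carries no information about what the user may do.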

Key limitations:

  • All-or-nothing: Every authorized user gets identical capabilities — same tools, same commands, same terminal access
  • No per-command gating: All slash commands (/model, /update, /reload-mcp, /sethome, /personality) are available to every authorized user — meaning any user can change the model, update the agent, or reconfigure it
  • Toolset is uniform: _HERMES_CORE_TOOLS (defined in toolsets.py:31-67) is identical across all platform toolsets — every authorized user gets terminal, write_file, schedule_cronjob, process, etc.
  • Session sharing: Non-WhatsApp DMs share a single session per platform, meaning all authorized users see the same conversation history and memory
  • Global safety only: HERMES_EXEC_ASK=1 applies uniformly — no way to auto-approve for admins but require approval for regular users
  • Config mutations are unprotected: /model and /personality modify ~/.hermes/config.yaml, meaning any authorized user can change settings for all users

How Other Platforms Handle This

Telegram bot ecosystem: The dominant pattern for AI bots (chatgpt-telegram-bot, TeleChat) uses tiered env var lists:

  • ADMIN_USER_IDS=12345,67890 — no budget limits, admin commands
  • ALLOWED_TELEGRAM_USER_IDS=12345,67890,11111 — standard access, per-user budgets
  • Guest budget pool for unlisted users in groups

Telegram's native BotCommandScope API supports command visibility scoping (show different command menus to admins vs regular users).

Discord bot ecosystem: Rich built-in permission infrastructure — @commands.has_permissions(), @commands.has_role(), @commands.is_owner() decorators. Permission bitfields, role-based checks, channel-level permission overwrites. Discord bots commonly map server roles to bot permission tiers.

AI agent platforms: MCP servers and agent frameworks are converging on role-to-tool mapping — each role defines which tools are accessible. Enforcement happens at the tool dispatch layer, not just the UI layer.

Universal pattern: Define tiers in config, enforce via middleware at boundaries, default to least privilege, support both platform-native and custom role mapping.
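
Enforcement at the tool dispatch layer could look roughly like the sketch below. The `ROLE_TOOLS` table, `dispatch_tool` function, and `ToolPermissionError` are hypothetical names for illustration, not existing Hermes Agent APIs:

```python
from typing import Any, Callable, Dict, Optional, Set


class ToolPermissionError(PermissionError):
    """Raised when a role tries to call a tool outside its allowed set."""


# Hypothetical role-to-tool mapping; None means "no restriction".
ROLE_TOOLS: Dict[str, Optional[Set[str]]] = {
    "admin": None,
    "user": {"web_search", "read_file"},
    "guest": set(),
}


def dispatch_tool(
    role: str,
    tool_name: str,
    registry: Dict[str, Callable[..., Any]],
    **kwargs: Any,
) -> Any:
    # Unknown roles fall back to an empty set (least privilege).
    allowed = ROLE_TOOLS.get(role, set())
    if allowed is not None and tool_name not in allowed:
        raise ToolPermissionError(f"role {role!r} may not call {tool_name!r}")
    return registry[tool_name](**kwargs)
```

Checking at dispatch means a tool call is rejected even if the tool somehow reached the model's tool list, which is the point of not relying on the UI layer alone.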

Relationship to Issue #476 (Agent Mode System)

Issue #476 proposes a mode system where tool access is scoped by mode (Code, Ask, Architect, etc.). Modes are self-selected by the user — they choose which mode to operate in. The permission system proposed here is admin-imposed — the operator decides what each user can do. These are complementary:

  • Permissions define what a user is allowed to do (ceiling)
  • Modes define what a user chooses to do within their permissions (self-restriction)

A natural integration: admins can force certain user tiers into specific modes. E.g., "user" tier users are locked to "ask" mode (read-only tools), while admins can switch freely.
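
The ceiling/self-restriction relationship is a set intersection. A minimal sketch, assuming hypothetical per-mode tool sets (Issue #476 defines the real ones) and a hypothetical `FORCED_MODE` table for the tier-locked case:

```python
# Tools the "user" tier may use (subset of the config proposed in this issue).
USER_TIER_TOOLS = {"web_search", "web_extract", "read_file", "search_files"}

# Hypothetical mode definitions, for illustration only.
MODE_TOOLS = {
    "ask": {"web_search", "read_file"},
    "code": {"read_file", "write_file", "terminal"},
}

# Hypothetical: tiers listed here cannot leave the given mode.
FORCED_MODE = {"user": "ask"}


def resolve_mode(tier: str, requested_mode: str) -> str:
    """Return the forced mode for the tier, else the mode the user asked for."""
    return FORCED_MODE.get(tier, requested_mode)


def effective_tools(tier_tools: set, mode: str) -> set:
    """Permissions are the ceiling; the mode can only narrow it."""
    return tier_tools & MODE_TOOLS[mode]
```

With these sets, a user-tier member who somehow ran in "code" mode would still only get `read_file`, because `write_file` and `terminal` are above the tier ceiling.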


Current State in Hermes Agent

What we have:

  • Binary allow/deny authorization (gateway/run.py:532-595)
  • Per-platform and global allowlist env vars
  • DM pairing system with cryptographic codes (gateway/pairing.py)
  • Platform-specific user identification (SessionSource in session.py:30-103)
  • Hook system that emits command:* events (gateway/hooks.py)
  • Toolset system with per-platform toolset names (but all map to the same tool list)

What we don't have:

  • Any concept of user roles or permission tiers
  • Per-command authorization checks
  • Per-user tool filtering
  • Per-tier rate limiting or budget controls
  • Admin commands for user management
  • Persistent user-role storage

Relevant files that would need changes:

  • gateway/run.py — Authorization check, command dispatch, agent creation
  • gateway/config.py — Permission tier definitions
  • gateway/session.py — User role injection into session context
  • gateway/pairing.py — Pairing with role assignment
  • toolsets.py — Per-user tool filtering
  • tools/registry.py — Tool dispatch with permission checks

Implementation Plan

Core Architecture: Classification

This is a core codebase change, not a skill or tool. Per CONTRIBUTING.md criteria:

  • It requires end-to-end integration with the gateway authorization system
  • It needs deterministic enforcement logic (not LLM-interpreted)
  • It touches the agent harness itself (tool filtering, command dispatch, session management)
  • It spans multiple core modules (gateway, tools, config)

Permission Tier Design

OWNER  →  ADMIN  →  USER  →  GUEST  →  BLOCKED
  │         │         │         │         │
  │         │         │         │         └── No access (explicit deny)
  │         │         │         └── Chat only, no tools, rate-limited
  │         │         └── Standard tools (search, read, web), no terminal/write/system
  │         └── All tools, all commands, user management
  └── System config, add/remove admins, unrestricted, update agent
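
Because the tiers form a strict ordering, one way to model them is an ordered enum. This is a sketch, not the proposed implementation; the numeric values and the `at_least` helper are assumptions:

```python
from enum import IntEnum


class PermissionTier(IntEnum):
    """Tiers from the diagram above, ordered from least to most privileged."""
    BLOCKED = 0
    GUEST = 1
    USER = 2
    ADMIN = 3
    OWNER = 4


def at_least(tier: PermissionTier, required: PermissionTier) -> bool:
    """True if `tier` is at or above `required` in the hierarchy."""
    return tier >= required
```

An ordering alone only answers "is this user at least an admin?"; the per-tier tool and command lists in the configuration remain the source of truth for what each tier can do.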

Default tier permissions (configurable via config.yaml):

Capability | Owner | Admin | User | Guest
Chat with agent
Web search/extract
Read files
Write/patch files
Terminal access
Code execution
Process management
Schedule cronjobs
Delegate tasks
Image generation
Vision analysis
Memory read/write ✓ (read)
/model command
/personality
/update
/reload-mcp
/sethome
/usage
/new, /reset
/help
User management
Rate limit | none | none | configurable | strict

Configuration Format

# In ~/.hermes/config.yaml
permissions:
  # Tier definitions (override defaults)
  tiers:
    user:
      tools:
        - web_search
        - web_extract
        - read_file
        - search_files
        - vision_analyze
        - image_generate
        - clarify
        - text_to_speech
        - skills_list
        - skill_view
      commands:
        - help
        - new
        - reset
        - usage
        - stop
        - compress
      rate_limit:
        requests_per_hour: 60
    guest:
      tools: []   # chat only, no tools
      commands:
        - help
      rate_limit:
        requests_per_hour: 10

  # User role assignments
  users:
    # Owner is auto-detected from ALLOWED_USERS env vars (first entry)
    # or explicitly set:
    "telegram:12345": owner
    "discord:98765432": admin
    "telegram:67890": user

  # Default tier for authorized-but-unassigned users
  default_tier: user

  # Default tier for allow-all / pairing-approved users
  pairing_tier: user

  # Platform-native role mapping
  platform_roles:
    telegram:
      # Map Telegram group admin status to a tier
      group_admin: admin
    discord:
      # Map Discord roles to tiers
      roles:
        "Admin": admin
        "Moderator": admin
        "Member": user
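
To make the lookup concrete, here is a sketch of resolving a user to a tier from the parsed config, shown as a plain dict rather than loaded YAML. The precedence (explicit assignment, then Discord role mapping, then `pairing_tier` or `default_tier`) is an assumption; the issue does not fix an order. `resolve_tier` and its parameters are hypothetical:

```python
# Parsed form of the `permissions:` section above (tier definitions omitted).
PERMISSIONS = {
    "users": {
        "telegram:12345": "owner",
        "discord:98765432": "admin",
        "telegram:67890": "user",
    },
    "default_tier": "user",
    "pairing_tier": "user",
    "platform_roles": {
        "telegram": {"group_admin": "admin"},
        "discord": {"roles": {"Admin": "admin", "Moderator": "admin", "Member": "user"}},
    },
}

TIER_ORDER = ["guest", "user", "admin", "owner"]  # least to most privileged


def resolve_tier(config, platform, user_id, discord_roles=(), via_pairing=False):
    # 1. An explicit assignment wins (assumed precedence).
    explicit = config.get("users", {}).get(f"{platform}:{user_id}")
    if explicit:
        return explicit
    # 2. Platform-native roles: take the highest tier among mapped Discord roles.
    if platform == "discord":
        role_map = config.get("platform_roles", {}).get("discord", {}).get("roles", {})
        mapped = [role_map[r] for r in discord_roles if r in role_map]
        if mapped:
            return max(mapped, key=TIER_ORDER.index)
    # 3. Fall back to the configured default for this authorization path.
    return config.get("pairing_tier" if via_pairing else "default_tier", "user")
```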

What We'd Need

  1. PermissionManager class (gateway/permissions.py) — loads tier config, resolves user→tier, checks tool/command access, caches lookups
  2. Replace _is_user_authorized() with _get_user_permission(source) -> PermissionTier, which returns the tier instead of a bool (tier=BLOCKED means unauthorized)
  3. Command dispatch gating — before each command handler, check permission_manager.can_execute_command(source, command)
  4. Tool filtering at agent creation — filter enabled_toolsets through permission_manager.get_allowed_tools(source) before passing to AIAgent()
  5. Session context injection — tell the agent what tier the user is, so it can explain "I can't do that for you" instead of silently failing
  6. User management commands: /users list, /users set <user_id> <tier>, /users remove <user_id>
  7. Rate limiting middleware — per-tier request counting with configurable windows
  8. Platform-native role detection — query Telegram getChatMember() or Discord member roles at authorization time
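
Items 1, 3 and 4 could fit together roughly as follows. This is a sketch under stated assumptions: users are identified by a "platform:user_id" string instead of the real SessionSource object, a tier policy value of `None` means "unrestricted", and a tier with no policy (including "blocked") gets nothing:

```python
from typing import Dict, Iterable, List, Optional, Set


class PermissionManager:
    """Hypothetical sketch: resolves users to tiers and answers access checks."""

    def __init__(self, tiers: Dict[str, dict], users: Dict[str, str], default_tier: str = "user"):
        self._tiers = tiers            # tier name -> {"tools": [...] | None, "commands": [...] | None}
        self._users = users            # "platform:user_id" -> tier name
        self._default_tier = default_tier
        self._cache: Dict[str, str] = {}

    def get_tier(self, user_key: str) -> str:
        # Cache lookups so repeated messages do not re-resolve the tier.
        if user_key not in self._cache:
            self._cache[user_key] = self._users.get(user_key, self._default_tier)
        return self._cache[user_key]

    def _allowed(self, user_key: str, kind: str) -> Optional[Set[str]]:
        policy = self._tiers.get(self.get_tier(user_key))
        if policy is None:
            return set()               # unknown or blocked tier: least privilege
        entries = policy.get(kind, [])
        return None if entries is None else set(entries)

    def can_execute_command(self, user_key: str, command: str) -> bool:
        allowed = self._allowed(user_key, "commands")
        return allowed is None or command.lstrip("/") in allowed

    def get_allowed_tools(self, user_key: str, available: Iterable[str]) -> List[str]:
        allowed = self._allowed(user_key, "tools")
        return [tool for tool in available if allowed is None or tool in allowed]
```

The gateway would call `can_execute_command` before each slash-command handler and pass the result of `get_allowed_tools` to agent creation in place of the full tool list.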

Phased Rollout

Phase 1: Core Permission Engine + Tool Filtering

  • PermissionManager class with tier resolution
  • Config format in config.yaml (tiers, user assignments, default_tier)
  • Tool filtering: strip disallowed tools from the tool list before passing to AIAgent()
  • Command gating: check tier before executing slash commands
  • Session context: inject tier info so the agent knows the user's access level
  • Backward compatible: if no permissions config exists, behave exactly as today (all authorized users = owner)
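
Two Phase 1 details lend themselves to a short sketch: the backward-compatible fallback and the tier note injected into session context. Function names and the note wording are hypothetical:

```python
from typing import Iterable


def effective_tier(config: dict, user_key: str) -> str:
    """Tier for an already-authorized user, given the parsed config.yaml as a dict."""
    permissions = config.get("permissions")
    if not permissions:
        # No permissions section: behave as today, every authorized user is an owner.
        return "owner"
    default = permissions.get("default_tier", "user")
    return permissions.get("users", {}).get(user_key, default)


def tier_context_note(tier: str, denied_tools: Iterable[str]) -> str:
    """Text to add to the session context so the agent can explain refusals."""
    denied = ", ".join(sorted(denied_tools)) or "none"
    return (
        f"The current user has the '{tier}' permission tier. "
        f"Tools unavailable to this user: {denied}. "
        "If asked for something that needs one of them, explain that the "
        "user's tier does not allow it."
    )
```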

Phase 2: User Management + Rate Limiting

  • /users command family for runtime user management (list, set tier, remove)
  • Per-tier rate limiting with configurable windows
  • Persistent user-role storage (extend pairing SQLite or new permissions.json)
  • Usage tracking per user (extend existing /usage command)
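
Per-tier rate limiting with a configurable window can be done with a sliding window of timestamps per user. A minimal in-memory sketch; the class name, the use of `None` for "no limit", and denying tiers missing from the table are assumptions:

```python
import time
from collections import defaultdict, deque
from typing import Callable, Dict, Optional


class TierRateLimiter:
    """Sliding-window request counter keyed by user, with limits set per tier."""

    def __init__(
        self,
        limits: Dict[str, Optional[int]],   # tier -> max requests per window, None = unlimited
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._limits = limits
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)     # user key -> timestamps of accepted requests

    def allow(self, user_key: str, tier: str) -> bool:
        if tier not in self._limits:
            return False                    # unknown tier: deny (least privilege)
        limit = self._limits[tier]
        if limit is None:
            return True
        now = self._clock()
        hits = self._hits[user_key]
        # Drop timestamps that have left the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= limit:
            return False
        hits.append(now)
        return True
```

For the defaults proposed in the config, this would be constructed with `{"owner": None, "admin": None, "user": 60, "guest": 10}`. State is lost on restart, so a persistent store would be needed if limits must survive one.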

Phase 3: Platform-Native Roles + Advanced Features

  • Telegram group admin → tier mapping (query getChatMember API)
  • Discord role → tier mapping (query member roles)
  • Custom tier definitions (beyond the 4 built-in tiers)
  • Per-user tool overrides (grant specific user access to a tool beyond their tier)
  • Audit logging (who did what, when)
  • Integration with the Issue #476 Mode System (admin-imposed mode restrictions per tier)
  • Per-user session isolation option (separate sessions per user instead of shared)
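
Looking up platform roles on every message would mean a network round trip each time, so the Telegram and Discord mappings would likely sit behind a small time-based cache. A sketch, where `fetch` is a hypothetical callable standing in for the actual platform query:

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class TTLRoleCache:
    """Caches role lookups for `ttl_seconds` before calling `fetch` again."""

    def __init__(
        self,
        fetch: Callable[[Hashable], str],   # hypothetical: wraps the platform API call
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, str]] = {}

    def get(self, key: Hashable) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]                 # still fresh: skip the API call
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value
```

The TTL is also the upper bound on how long a demoted user keeps their old tier, so it trades latency against how quickly role changes take effect.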

Pros & Cons

Pros

  • Unlocks multi-user sharing: The #1 barrier to sharing Hermes Agent with teammates, family, or community is that authorization = full access. This removes that barrier entirely.
  • Security posture: Currently any authorized Telegram user can run rm -rf / through the terminal tool or read sensitive files. Tiered permissions add defense in depth beyond the binary allowlist.
  • Cost control: Guests and regular users can be rate-limited and budget-capped, preventing runaway API costs from shared access.
  • Backward compatible: No permissions config = today's behavior. Zero migration friction.
  • Leverages existing infrastructure: The allowlist env vars, pairing system, toolset system, and hook system all provide natural integration points.
  • Platform-native integration: Can leverage Telegram admin status and Discord roles instead of requiring manual user ID configuration.
  • Composable with the Mode System (Issue #476): Permissions set the ceiling, modes let users self-restrict within that ceiling.

Cons / Risks

  • Complexity: Adds a new subsystem to the gateway. The permission resolution logic (config tiers + env vars + platform roles + pairing + defaults) could become hard to reason about.
  • Configuration burden: Users need to assign roles to user IDs. Mitigated by sensible defaults (first allowlist entry = owner, pairing = user tier).
  • False sense of security: Tool filtering happens at the agent level, but the LLM could potentially instruct users to bypass restrictions (e.g., "run this command in your terminal"). This is access control, not a security sandbox. Should be documented clearly.
  • Session sharing complications: If user A (admin) and user B (user) share a DM session, whose permissions apply? Need clear policy (most restrictive? per-message? separate sessions?). This is solvable but needs design decisions.
  • Platform role detection latency: Querying Telegram/Discord APIs for role info on every message adds latency. Needs caching with reasonable TTL.
  • Testing surface: Every command and tool needs permission checks. Easy to miss one and create a bypass.
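
The shared-session question comes down to which tier is applied to a given message. A sketch of two candidate policies, with hypothetical policy names and function:

```python
from typing import Iterable

TIER_RANK = {"blocked": 0, "guest": 1, "user": 2, "admin": 3, "owner": 4}


def session_tier(policy: str, sender_tier: str, participant_tiers: Iterable[str]) -> str:
    """Pick the tier to enforce for one message in a shared session."""
    if policy == "per_message":
        # Only the sender matters; each message is checked on its own.
        return sender_tier
    if policy == "most_restrictive":
        # Everyone in the session is held to the lowest tier present.
        return min([sender_tier, *participant_tiers], key=TIER_RANK.__getitem__)
    raise ValueError(f"unknown session policy: {policy!r}")
```

Per-message enforcement restricts what each sender can trigger, but in a shared session a lower-tier user can still read output that an admin's request produced, so it does not isolate history or memory.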

Open Questions

  1. Session sharing policy: When users of different tiers share a group session, whose tier applies? Options: (a) most restrictive tier in the chat, (b) per-message tier based on who sent it, (c) configurable. Option (b) is most flexible but most complex.

  2. Backward compatibility for env vars: Should the existing TELEGRAM_ALLOWED_USERS env var entries default to owner tier (preserving current behavior) or admin tier? Recommendation: owner for first entry, admin for the rest, configurable via default_tier.

  3. Guest access model: Should guests (unauthenticated users in groups with ALLOW_ALL_USERS=true) get any tool access by default, or chat-only? Recommendation: chat-only (no tools) by default, configurable.

  4. Tool-level vs toolset-level filtering: Filter individual tools, or entire toolsets? Individual tools gives finer granularity but more config. Recommendation: individual tools, with toolset shorthand (e.g., tools: ["@web"] expands to web_search + web_extract).

  5. Interaction with MCP tools: MCP tools are dynamically discovered. Should they be allowed/denied per tier? Recommendation: MCP tools follow the same tier rules, with a config option for default MCP tool policy per tier.

  6. Integration with Issue #476 modes: Should this be built to compose with modes from day 1, or independently first? Recommendation: build independently in Phase 1, add mode integration in Phase 3.
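
The recommendations for questions 2 and 4 are small enough to sketch. Both function names are hypothetical, and only the `@web` expansion is taken from this issue; other shorthands would need to be defined:

```python
from typing import Dict, Iterable, List

# "@web" expansion as stated in question 4; further entries are left undefined here.
TOOLSET_SHORTHANDS: Dict[str, List[str]] = {"@web": ["web_search", "web_extract"]}


def expand_tools(entries: Iterable[str], shorthands: Dict[str, List[str]] = TOOLSET_SHORTHANDS) -> List[str]:
    """Expand "@toolset" entries into individual tool names, keeping order, dropping duplicates."""
    expanded: List[str] = []
    for entry in entries:
        if entry.startswith("@"):
            if entry not in shorthands:
                raise KeyError(f"unknown toolset shorthand: {entry}")
            names = shorthands[entry]
        else:
            names = [entry]
        for name in names:
            if name not in expanded:
                expanded.append(name)
    return expanded


def tiers_from_allowlist(raw: str, first: str = "owner", rest: str = "admin") -> Dict[str, str]:
    """Question 2 recommendation: first allowlist entry becomes owner, the rest admin."""
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    return {user_id: (first if index == 0 else rest) for index, user_id in enumerate(ids)}
```

Raising on an unknown shorthand surfaces config typos at load time instead of silently granting fewer tools than intended.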

