Feature: Autonomous Skill Templates — Tool Allowlists, Requirement Declarations, Scheduled Execution & Lifecycle (inspired by OpenFang Hands) #492

@teknium1

Overview

OpenFang (RightNow-AI/openfang), a Rust-based Agent Operating System, introduces the concept of "Hands" — pre-configured autonomous capability packages that combine domain knowledge, tool restrictions, scheduling, requirement declarations, and lifecycle management into a single deployable unit. A Hand is activated with openfang hand activate researcher and runs autonomously on a schedule with guardrails.

Hermes Agent has Skills (instruction-based SKILL.md files with YAML frontmatter) and Cron Jobs (scheduled agent runs with a prompt), but these are separate, disconnected systems. A skill tells the agent how to do something; a cron job tells it when to run. There's no unified concept of "activate this autonomous capability with these tool restrictions, these requirements, and this schedule."

This proposal bridges that gap by enhancing the skill format with optional autonomous execution metadata — tool allowlists, requirement declarations, agent configuration, and scheduled execution — creating what could be called "Autonomous Skill Templates." This doesn't replace skills or cron; it layers autonomous packaging on top of both.


Research Findings

How OpenFang's "Hands" Work

Despite the ambitious framing, a Hand is architecturally simple — it's a config file + a system prompt, not compiled autonomous code. The "autonomy" comes from the LLM following detailed instructions with tool access. Each Hand consists of:

  1. HAND.toml — Configuration manifest:

    [hand]
    id = "researcher"
    name = "Deep Researcher"
    description = "CRAAP-criteria research with APA citations"
    category = "research"
    
    [tools]
    allowed = ["web_search", "web_extract", "read_file", "write_file", "memory_store"]
    # Only these tools are available — everything else is blocked
    
    [requirements]
    binaries = ["curl"]           # Must be on PATH
    env_vars = ["OPENAI_API_KEY"] # Must be set
    api_keys = ["serper"]         # Must be configured
    
    [agent]
    provider = "openai"
    model = "gpt-4o"
    temperature = 0.3
    system_prompt_file = "SKILL.md"
    
    [schedule]
    cron = "0 */6 * * *"  # Every 6 hours
    
    [dashboard]
    metrics = ["reports_generated", "sources_evaluated", "avg_craap_score"]

  2. SKILL.md — Multi-phase system prompt with domain expertise (the actual "brain" of the Hand)

  3. HandRegistry (registry.rs) — Manages lifecycle:

    • activate(hand_id) → Start autonomous execution
    • deactivate(hand_id) → Stop execution
    • pause(hand_id) / resume(hand_id) → Suspend/resume
    • status(hand_id) → Check state and metrics
    • check_requirements(hand) → Verify all prerequisites before activation
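
The lifecycle operations listed above amount to a small state machine. The sketch below is a Python illustration of that idea only, not a port of registry.rs: the names HandState, TRANSITIONS, and LifecycleRegistry are hypothetical, and the set of legal transitions is an assumption inferred from the operation names.

```python
from dataclasses import dataclass, field
from enum import Enum


class HandState(Enum):
    INACTIVE = "inactive"
    ACTIVE = "active"
    PAUSED = "paused"


# Assumed transition table: operation -> {state before: state after}.
TRANSITIONS = {
    "activate": {HandState.INACTIVE: HandState.ACTIVE},
    "pause": {HandState.ACTIVE: HandState.PAUSED},
    "resume": {HandState.PAUSED: HandState.ACTIVE},
    "deactivate": {
        HandState.ACTIVE: HandState.INACTIVE,
        HandState.PAUSED: HandState.INACTIVE,
    },
}


@dataclass
class LifecycleRegistry:
    """Hypothetical in-memory registry tracking one state per hand id."""

    states: dict = field(default_factory=dict)

    def status(self, hand_id: str) -> HandState:
        # Hands that were never activated are reported as inactive.
        return self.states.get(hand_id, HandState.INACTIVE)

    def apply(self, operation: str, hand_id: str) -> HandState:
        current = self.status(hand_id)
        allowed = TRANSITIONS[operation]
        if current not in allowed:
            raise ValueError(f"cannot {operation} {hand_id!r} while {current.value}")
        self.states[hand_id] = allowed[current]
        return self.states[hand_id]
```

Rejecting illegal transitions (for example, resuming a hand that was never paused) is what separates a first-class lifecycle from simply toggling a cron entry.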

Key Design Decisions

  • Config-driven, not code-driven: Hands are defined entirely through configuration + prompts, not custom code. This makes them safe and portable.
  • Tool allowlists: The most impactful security feature. An autonomous researcher can call web_search and write_file but cannot call terminal or process. This is defense-in-depth for autonomous operation.
  • Requirement checking: Before activation, the system verifies that all required binaries, env vars, and API keys are present. Fail-fast instead of failing mid-execution.
  • Unified lifecycle: Spawn/pause/resume/stop as first-class operations, not just "start a cron job and hope."
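
The fail-fast requirement check can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: RequirementReport and this check_requirements signature are hypothetical, the input is assumed to be an already-parsed requirements mapping, and api_keys are left out because how they are configured is not described here.

```python
import os
import shutil
from dataclasses import dataclass, field


@dataclass
class RequirementReport:
    """Hypothetical result type listing what is missing."""

    missing_binaries: list = field(default_factory=list)
    missing_env_vars: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not (self.missing_binaries or self.missing_env_vars)


def check_requirements(requirements: dict, env=None) -> RequirementReport:
    """Collect every missing prerequisite instead of stopping at the first."""
    env = os.environ if env is None else env
    report = RequirementReport()
    for binary in requirements.get("binaries", []):
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(binary) is None:
            report.missing_binaries.append(binary)
    for name in requirements.get("env_vars", []):
        # An empty value is treated the same as an unset variable.
        if not env.get(name):
            report.missing_env_vars.append(name)
    return report
```

Reporting all missing items at once lets the user fix everything in one pass instead of discovering prerequisites one failed activation at a time.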

What's Actually Novel (vs. Just Repackaging)

The HAND.toml format bundles 5 things that Hermes currently treats as separate concerns:

  1. Domain knowledge (SKILL.md) — Hermes has this
  2. Tool restrictions (tools.allowed) — Hermes does NOT have per-skill tool restrictions
  3. Requirement declarations (requirements) — Hermes does NOT have this
  4. Agent configuration (agent.model, temperature) — Hermes does NOT have per-skill model config
  5. Scheduled execution (schedule.cron) — Hermes has cron, but it's disconnected from skills

The real value is #2 (tool restrictions) and #3 (requirement declarations) — these are the features that make autonomous execution safe and reliable.


Current State in Hermes Agent

Skills system (tools/skill_tools.py, skills/):

  • SKILL.md files with YAML frontmatter (name, description, tags)
  • Skills are loaded into the system prompt as instructions
  • The agent reads skills and follows their guidance
  • No tool restrictions, no requirements, no scheduling, no lifecycle
  • Skills Hub for community skills with security scanning (tools/skills_guard.py)

Cron system (cron/, tools/cronjob_tools.py):

  • Schedule prompts to run on cron expressions or intervals
  • Each cron job runs in an isolated session
  • Results delivered to messaging platforms or files
  • No connection to skills — the prompt is freeform text
  • No tool restrictions on cron jobs (full tool access)
  • No requirement checking before execution
  • No pause/resume lifecycle

Toolsets (toolsets.py):

  • Tool groupings (web, terminal, file, browser, vision, etc.)
  • Per-platform toolset configuration (CLI gets more tools than Telegram)
  • enabled_toolsets / disabled_toolsets at session level
  • No per-skill or per-task tool restrictions

Existing related issues:


Implementation Plan

Classification: Core Codebase Change + Skill Format Extension

This should be a core codebase change that extends the existing skill system, not a new standalone system. Reasons:

  • Tool allowlists require harness-level enforcement (the LLM can't be trusted to self-restrict)
  • Requirement checking needs to inspect the environment (binaries, env vars) deterministically
  • Scheduling integration needs to bridge skills and cron at the code level
  • Per-skill model/temperature config needs to be applied by the agent harness
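
To make the first point concrete, here is a rough sketch of harness-level enforcement: the check lives in the dispatch path, so a blocked call fails no matter what the model was told or decides to try. GuardedDispatcher and ToolNotPermitted are hypothetical names; the actual dispatch code in run_agent.py is not shown in this issue and may be structured differently.

```python
from typing import Any, Callable, Dict, Iterable, Optional


class ToolNotPermitted(Exception):
    """Raised when a tool call falls outside the session's allowlist."""


class GuardedDispatcher:
    """Hypothetical wrapper around a name -> callable tool registry."""

    def __init__(
        self,
        registry: Dict[str, Callable[..., Any]],
        allowed: Optional[Iterable[str]] = None,
    ) -> None:
        self._registry = dict(registry)
        # None means "no restriction", which keeps existing sessions unchanged.
        self._allowed = None if allowed is None else frozenset(allowed)

    def dispatch(self, tool_name: str, **kwargs: Any) -> Any:
        if tool_name not in self._registry:
            raise KeyError(f"unknown tool: {tool_name}")
        if self._allowed is not None and tool_name not in self._allowed:
            raise ToolNotPermitted(f"tool {tool_name!r} is blocked for this session")
        return self._registry[tool_name](**kwargs)
```

Hiding blocked tools from the schema sent to the model is still worthwhile, but the dispatch-time check is the one that does not depend on model behavior.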

What We'd Need

  • Extended SKILL.md YAML frontmatter schema (new optional fields)
  • Tool restriction enforcement in the tool dispatch path of run_agent.py
  • Requirement checker utility in agent/ or tools/
  • Bridge between skills and cron: "activate this skill on this schedule"
  • New agent-facing tool or command: skill_activate / skill_deactivate

Phased Rollout

Phase 1: Tool Allowlists & Requirement Declarations

  • Extend SKILL.md YAML frontmatter with optional fields:
    ---
    name: deep-researcher
    description: CRAAP-criteria research with APA citations
    tags: [research, autonomous]
    tools_allowed: [web_search, web_extract, read_file, write_file, search_files]
    tools_denied: [terminal, process]  # Alternative: deny-list approach
    requirements:
      binaries: [curl]
      env_vars: [OPENAI_API_KEY]
      python_packages: [beautifulsoup4]
    ---
  • When a skill is loaded (via skill_view or cron), enforce tool restrictions by filtering the tool registry for that session
  • check_requirements() function that verifies prerequisites and reports missing items
  • Integrate requirement checking into skills_list output (show ✅/❌ per skill)
  • Integrate requirement checking into cron execution (fail-fast with clear error if requirements unmet)
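
A possible shape for the Phase 1 filtering step, assuming the frontmatter has already been parsed into a dict and the tool registry can be enumerated by name. resolve_tools and format_skill_line are hypothetical helpers. Letting tools_allowed override tools_denied is only one of the options raised under Open Questions, chosen here for illustration.

```python
from typing import Iterable, List


def resolve_tools(frontmatter: dict, registry: Iterable[str]) -> List[str]:
    """Return the tool names a session should expose for this skill."""
    known = list(registry)
    allowed = frontmatter.get("tools_allowed")
    denied = set(frontmatter.get("tools_denied") or [])

    # Reject typos early: a misspelled name in a denylist would otherwise
    # silently leave the intended tool enabled.
    unknown = (set(allowed or []) | denied) - set(known)
    if unknown:
        raise ValueError(f"skill references unknown tools: {sorted(unknown)}")

    if allowed is not None:
        # Assumed policy: an allowlist, when present, is the whole answer.
        allow = set(allowed)
        return [name for name in known if name in allow]
    # No allowlist: keep everything except the denied names.
    return [name for name in known if name not in denied]


def format_skill_line(name: str, missing: List[str]) -> str:
    """Render one skills_list row with a requirement status marker."""
    if not missing:
        return f"✅ {name}"
    return f"❌ {name} (missing: {', '.join(missing)})"
```

A skill with neither field gets the full registry back, which is what keeps existing skills working unchanged.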

Phase 2: Skill-Cron Bridge & Per-Skill Agent Config

  • New YAML fields:
    agent_config:
      model: gpt-4o
      temperature: 0.3
      max_iterations: 30
    schedule:
      cron: "0 */6 * * *"
      deliver: telegram
  • New CLI command: hermes skill activate <name> — creates a cron job from the skill's schedule config, with the skill's system prompt, tool restrictions, and agent config applied
  • hermes skill deactivate <name> — removes the associated cron job
  • hermes skill status <name> — shows whether active, last run, next run
  • When a skill-based cron job fires, the agent session is configured with the skill's tool restrictions and model settings
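
The bridge could be as thin as a function that turns a skill's frontmatter into a job record. The sketch below assumes the frontmatter is already a dict; CronJobSpec, job_from_skill, and SkillActivations are hypothetical, and the in-memory store stands in for whatever the existing cron system persists.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CronJobSpec:
    """Hypothetical job record; the real cron schema may differ."""

    job_id: str
    cron: str
    prompt: str
    deliver: Optional[str] = None
    tools_allowed: Optional[List[str]] = None
    model: Optional[str] = None
    temperature: Optional[float] = None
    max_iterations: Optional[int] = None


def job_from_skill(frontmatter: dict, skill_body: str) -> CronJobSpec:
    schedule = frontmatter.get("schedule") or {}
    if "cron" not in schedule:
        raise ValueError(f"skill {frontmatter.get('name')!r} declares no schedule.cron")
    agent_config = frontmatter.get("agent_config") or {}
    return CronJobSpec(
        # Deriving the id from the skill name makes activation idempotent.
        job_id=f"skill:{frontmatter['name']}",
        cron=schedule["cron"],
        prompt=skill_body,
        deliver=schedule.get("deliver"),
        tools_allowed=frontmatter.get("tools_allowed"),
        model=agent_config.get("model"),
        temperature=agent_config.get("temperature"),
        max_iterations=agent_config.get("max_iterations"),
    )


class SkillActivations:
    """In-memory stand-in for the cron job store."""

    def __init__(self) -> None:
        self._jobs: Dict[str, CronJobSpec] = {}

    def activate(self, frontmatter: dict, skill_body: str) -> CronJobSpec:
        job = job_from_skill(frontmatter, skill_body)
        self._jobs[job.job_id] = job  # re-activating replaces the old job
        return job

    def deactivate(self, name: str) -> bool:
        return self._jobs.pop(f"skill:{name}", None) is not None

    def is_active(self, name: str) -> bool:
        return f"skill:{name}" in self._jobs
```

Carrying tools_allowed and the agent settings on the job record means the restrictions travel with the job, so the runner does not need to re-read the skill file when the job fires.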

Phase 3: Lifecycle & Dashboard


Pros & Cons

Pros

  • Safety for autonomous operation: Tool allowlists are the single most impactful feature. An autonomous researcher that can't run terminal is dramatically safer than one with full access.
  • Fail-fast reliability: Requirement checking prevents frustrating mid-execution failures ("curl not found" at step 47 of a 60-step workflow).
  • Unified mental model: "Activate this skill" is simpler than "create a cron job with this prompt and remember to restrict tools and set the right model."
  • Backward compatible: All new fields are optional. Existing skills work unchanged. This is purely additive.
  • Natural extension: Builds on existing systems (skills, cron, toolsets) rather than creating parallel infrastructure.
  • Composable: A skill can be used manually (as today) OR activated for autonomous execution. Same SKILL.md serves both modes.

Cons / Risks

  • Scope creep risk: This touches skills, cron, toolsets, and the agent loop. Careful scoping per phase is essential.
  • Tool restriction bypass: The LLM might try to work around restrictions (e.g., using execute_code to call subprocess). Need to ensure restrictions apply transitively to code execution tools.
  • Configuration complexity: More YAML fields = more things to get wrong. Mitigated by making everything optional and providing good defaults.
  • Testing burden: Tool restriction enforcement needs thorough testing across all tools and edge cases.

Open Questions

  • Allowlist vs. denylist for tools? Allowlist is safer (explicit grant), denylist is more convenient (block a few, keep the rest). Could support both with tools_allowed taking precedence over tools_denied.
  • Should tool restrictions apply to execute_code? If a skill restricts terminal but allows execute_code, the agent can call terminal() from inside execute_code. Need to decide: restrict execute_code entirely, or filter the hermes_tools available inside it?
  • Per-skill model config vs. global config? Some skills might work best with specific models (e.g., a coding skill with Claude, a research skill with GPT-4o). Is per-skill model override worth the complexity?
  • How does this interact with sub-agents? If a restricted skill uses delegate_task, should the sub-agent inherit the tool restrictions?
  • Naming: "Autonomous Skill Templates," "Skill Profiles," "Skill Packages," or just extend "Skills" with new fields? The naming matters for documentation and mental models.
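
For the execute_code and sub-agent questions, one conservative answer is that restrictions can only narrow as they propagate. The sketch below illustrates that policy; it is not a decision. inherit_restrictions and sandbox_tool_namespace are hypothetical names, and None is assumed to mean "unrestricted".

```python
from typing import Any, Callable, Dict, FrozenSet, Iterable, Optional


def inherit_restrictions(
    parent_allowed: Optional[Iterable[str]],
    child_requested: Optional[Iterable[str]],
) -> Optional[FrozenSet[str]]:
    """Compute a sub-agent's allowlist so it never exceeds the parent's."""
    if parent_allowed is None:
        # Unrestricted parent: the child gets whatever it asked for.
        return None if child_requested is None else frozenset(child_requested)
    if child_requested is None:
        return frozenset(parent_allowed)
    # Both restricted: the child may use only tools present in both sets.
    return frozenset(parent_allowed) & frozenset(child_requested)


def sandbox_tool_namespace(
    tools: Dict[str, Callable[..., Any]],
    allowed: Optional[Iterable[str]],
) -> Dict[str, Callable[..., Any]]:
    """Select the tool functions to expose inside execute_code."""
    if allowed is None:
        return dict(tools)
    allow = frozenset(allowed)
    return {name: fn for name, fn in tools.items() if name in allow}
```

Filtering the namespace only removes the tool wrappers. It does not stop executed code from spawning processes directly, so a skill that denies terminal may still need execute_code disabled or sandboxed.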
