Feature: Blackbox CLI Skill — Multi-Model Agent Delegation with Built-in Judge (inspired by Blackbox AI) #475

@teknium1

Overview

Blackbox AI is a multi-agent coding platform that unifies Claude Code, Codex, Gemini, and Blackbox's own models into a single workflow. Its standout architectural feature is the "Chairman" pattern — tasks are dispatched to multiple models simultaneously, and a specialized judge LLM evaluates all outputs to select the best implementation.

The Blackbox CLI is open-source (GPL-3.0, TypeScript, forked from Gemini CLI) and provides a terminal-based interface for multi-agent code generation with a built-in judge mechanism. It supports non-interactive mode, MCP, checkpointing/resume, and vision model switching — making it viable as another delegatable agent alongside our existing `claude-code` and `codex` skills.

This issue proposes a Blackbox CLI skill following the same pattern as the existing autonomous agent skills, enabling Hermes users who subscribe to Blackbox AI to delegate coding tasks through their multi-model pipeline.


Research Findings

How the Blackbox CLI Works

Architecture: The CLI is a Node.js (v20+) application built on the Google Gemini CLI codebase (`gemini.tsx` core), extended with:

  • SubagentManager (`packages/core/src/subagents/`) — Orchestrates multiple sub-agents with events, hooks, statistics tracking, and validation
  • Built-in agents (`builtin-agents.ts`) — Pre-configured agent definitions for Claude, Codex, Gemini, and Blackbox models
  • Non-interactive mode (`nonInteractiveCli.ts`) — Full headless execution with tool calls, checkpointing, and structured output formatting. Separates thoughts (stderr) from results (stdout)
  • Judge mechanism — Evaluates competing agent outputs and selects the best implementation
  • Checkpoint system — Save/resume sessions via `.blackboxcli/checkpoint-*.json` files
  • MCP support — Model Context Protocol integration for external tool connectivity
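As a rough illustration of the checkpoint layout described above, the following sketch lists checkpoint files in a project directory, newest first. It assumes only the `.blackboxcli/checkpoint-*.json` naming given in the list. The helper name `list_checkpoints` is hypothetical, and how a file name maps to the ID passed to the CLI when resuming is not covered by this issue.

```python
from pathlib import Path


def list_checkpoints(project_dir="."):
    """Return checkpoint files under <project_dir>/.blackboxcli, newest first.

    Hypothetical helper: relies only on the file naming pattern quoted in
    this issue, not on any documented Blackbox API.
    """
    checkpoint_dir = Path(project_dir) / ".blackboxcli"
    if not checkpoint_dir.is_dir():
        return []
    files = checkpoint_dir.glob("checkpoint-*.json")
    # Sort by modification time so the most recent session comes first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)
```

A skill could use something like this to show the user which sessions exist before attempting a resume.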

Installation:
```bash
# From npm (prebuilt)
npm install -g @blackboxai/cli

# From source
git clone https://github.com/blackboxaicode/cli.git
cd cli && npm install && npm install -g .

# Configure
blackbox configure  # Enter API key from app.blackbox.ai/dashboard
```

Non-interactive usage (key for Hermes delegation):
```bash
# One-shot task execution
blackbox --prompt "Implement JWT authentication for the Express API"

# Resume from checkpoint
blackbox --resume-checkpoint "task-abc123" --prompt "Add refresh token support"
```
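For scripted delegation, the two invocations above could be wrapped roughly as follows. This is a minimal sketch, not the skill's actual implementation: it uses only the `--prompt` and `--resume-checkpoint` flags shown in this issue, and it relies on the stated behavior that results go to stdout and thoughts go to stderr. The names `build_command`, `run_blackbox`, and `BlackboxResult` are hypothetical.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class BlackboxResult:
    returncode: int
    result: str    # stdout: the final output, per the issue's description
    thoughts: str  # stderr: the model's intermediate reasoning


def build_command(prompt, resume_checkpoint=None, executable="blackbox"):
    """Assemble the argv list using only flags quoted in this issue."""
    cmd = [executable]
    if resume_checkpoint is not None:
        cmd += ["--resume-checkpoint", resume_checkpoint]
    cmd += ["--prompt", prompt]
    return cmd


def run_blackbox(prompt, resume_checkpoint=None, timeout=None):
    """Run one non-interactive task and keep stdout and stderr apart."""
    proc = subprocess.run(
        build_command(prompt, resume_checkpoint),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return BlackboxResult(proc.returncode, proc.stdout, proc.stderr)
```

Passing the prompt as a separate list element avoids shell quoting problems when the task text contains quotes or newlines.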

Session commands: `/compress` (shrink history), `/clear` (reset), `/stats` (token usage)

Key Design Decisions

  1. Gemini CLI as base — Rather than building from scratch, Blackbox forked Google's Gemini CLI for its tool execution engine, event handling, and terminal UI (React/Ink). This gave them a production-quality foundation.
  2. Multi-model by default — The CLI can run the same prompt through multiple models (Blackbox Pro, Claude Sonnet 4.5, GPT-5.2 Codex, Gemini 2.5 Pro) and use a judge to pick the best.
  3. Credit-based pricing — Free tier has basic access; Pro ($10/mo) adds $30 credits for premium models; Pro Plus ($20/mo) unlocks multi-agent execution.
  4. GPL-3.0 license — The CLI is copyleft, which is important to note for any integration approach.
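To make the multi-model-plus-judge idea concrete, here is a small sketch of the general pattern: run the same prompt through several agents concurrently, then let a judge pick one output. It illustrates the concept only and is not Blackbox's implementation. The `agents` and `judge` callables are stand-ins, and the function name `run_with_judge` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_judge(prompt, agents, judge):
    """Dispatch `prompt` to every agent in parallel and return the judged winner.

    agents: dict mapping a name to a callable(prompt) -> str (stand-ins for models)
    judge:  callable(prompt, candidates) -> winning name, where candidates
            maps each agent name to its output
    """
    if not agents:
        raise ValueError("at least one agent is required")
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = {name: pool.submit(fn, prompt) for name, fn in agents.items()}
        # Wait for every agent so the judge sees all candidates.
        candidates = {name: future.result() for name, future in futures.items()}
    winner = judge(prompt, candidates)
    return winner, candidates[winner]
```

In a real pipeline the judge would itself be a model call. Here it can be any function, which keeps the selection logic easy to test in isolation.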

Platform Capabilities Beyond the CLI

The Blackbox platform also offers:

  • Multi-Agent Task API (`POST cloud.blackbox.ai/api/tasks`) — Programmatic dispatch to 2-5 agents with automated judge selection. Requires `bb_xxxxxxxx` API key.
  • Semantic Knowledge Graph — Indexes entire repos, commits, and URLs for cross-file reasoning
  • Isolated sandboxes — Long-running tasks in background environments
  • IDE support — VS Code extension with 4.2M+ installs, JetBrains, Blackbox IDE

Current State in Hermes Agent

Existing agent delegation skills:

| Skill | Agent | Installation | Non-interactive | Auth Required |
|---|---|---|---|---|
| `claude-code` | Claude Code CLI | `npm install -g @anthropic-ai/claude-code` | Yes (`--dangerously-skip-permissions`) | Anthropic API key |
| `codex` | OpenAI Codex CLI | `npm install -g @openai/codex` | Yes (`exec` subcommand) | OpenAI API key |
| `hermes-agent` | Hermes Agent | Already installed | Yes (`-q` flag) | Configured LLM provider |
| `blackbox` (proposed) | Blackbox CLI | `npm install -g @blackboxai/cli` | Yes (`--prompt`) | Blackbox API key |

No existing Blackbox integration. Zero mentions of "blackbox" in the hermes-agent codebase.

Related issues:


Implementation Plan

Skill vs. Tool Classification

This should be a skill because:

  • It wraps an external CLI (`blackbox`) callable via `terminal(pty=true)`
  • Follows the exact same pattern as existing `claude-code` and `codex` skills
  • No custom Python integration needed — all interaction is through the terminal
  • API key management is handled by the user running `blackbox configure`

Not bundled (Skills Hub). Blackbox AI requires a paid subscription, is GPL-3.0 licensed, and has modest adoption (191 stars), so it is specialized enough to ship as a Skills Hub skill rather than a bundled one.

What We'd Need

  1. Skill file (`SKILL.md`) following the `claude-code`/`codex` pattern
  2. Prerequisites section — Node.js 20+, npm, Blackbox API key
  3. One-shot mode — `blackbox --prompt "task"` via `terminal(pty=true)`
  4. Background mode — Long-running sessions with `process(poll/log)` monitoring
  5. Checkpoint/resume — Leverage Blackbox's built-in checkpoint system
  6. Multi-model hints — Document how to configure which models run (agent selection)
  7. PR review pattern — Adapt the existing PR review workflow to Blackbox's capabilities
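The background mode in item 4 amounts to "start the process, poll it, read its log". The sketch below shows that loop with Python's standard `subprocess` module as a stand-in for Hermes's `terminal` and `process(poll/log)` tools, whose exact signatures are not given in this issue. The function name `run_in_background` and its parameters are hypothetical.

```python
import subprocess
import time


def run_in_background(cmd, log_path, poll_interval=0.5, timeout=600.0):
    """Start `cmd`, poll until it exits or times out, return (returncode, log).

    stdout and stderr are merged into one log file here for simplicity; a
    real skill may want to keep them apart, since the CLI is described as
    writing thoughts to stderr and results to stdout.
    """
    deadline = time.monotonic() + timeout
    with open(log_path, "w") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        while proc.poll() is None:
            if time.monotonic() > deadline:
                # Give up on a stuck task instead of polling forever.
                proc.kill()
                proc.wait()
                break
            time.sleep(poll_interval)
    with open(log_path) as log:
        return proc.returncode, log.read()
```

For a Blackbox task, `cmd` would be something like `["blackbox", "--prompt", "..."]`, using the flag shown earlier in this issue.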

Phased Rollout

Phase 1: Basic Delegation Skill

  • SKILL.md with one-shot and background delegation patterns
  • Prerequisites, configuration, and usage documentation
  • Standard PR review workflow adaptation
  • Metadata: tags, related_skills (claude-code, codex, hermes-agent)
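The prerequisites named in this plan (Node.js 20+, npm, and the `blackbox` binary) could be verified before delegating with a check along these lines. This is a hedged sketch: the function names are hypothetical, and it does not verify that an API key has been configured, because this issue does not say where `blackbox configure` stores it.

```python
import re
import shutil
import subprocess


def parse_node_major(version_text):
    """Extract the major version from `node --version` output such as 'v20.11.1'."""
    match = re.match(r"v?(\d+)\.", version_text.strip())
    return int(match.group(1)) if match else None


def check_prerequisites(min_node_major=20):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for tool in ("node", "npm", "blackbox"):
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    if shutil.which("node") is not None:
        out = subprocess.run(
            ["node", "--version"], capture_output=True, text=True
        ).stdout
        major = parse_node_major(out)
        if major is None or major < min_node_major:
            problems.append(
                f"Node.js {min_node_major}+ required, found {out.strip() or 'unknown'}"
            )
    return problems
```

Returning a list of problems rather than raising lets the skill report every missing prerequisite in one pass.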

Phase 2: Multi-Model Features

  • Document Blackbox's built-in multi-agent/judge mode for skill users
  • Add guidance for when to use Blackbox's multi-model mode vs. Hermes's native delegation
  • Checkpoint/resume workflow for long-running tasks

Phase 3: Cross-CLI Integration


Pros & Cons

Pros

  • Consistent pattern — Follows the established claude-code/codex skill architecture exactly
  • Multi-model value — Users get access to Blackbox's built-in judge/multi-model pipeline, which is unique among CLI agents
  • Low effort — ~200 lines of SKILL.md, no code changes needed
  • MCP compatibility — Blackbox CLI speaks MCP, potentially enabling tool sharing with Hermes
  • Checkpoint/resume — Built-in session persistence for long-running tasks

Cons / Risks

  • GPL-3.0 license — Copyleft license on the CLI; doesn't affect a skill (we're just calling it), but worth noting
  • Paid service — Requires Blackbox subscription ($10-40/mo) with credit-based consumption
  • Modest adoption — 191 GitHub stars, forked from Gemini CLI; uncertain long-term maintenance
  • Wrapper-of-wrappers concern — Blackbox CLI itself wraps Claude Code, Codex, and Gemini. If Hermes already delegates to those directly, the added value is mainly the judge layer
  • Last commit Dec 2025 — ~3 months since last meaningful update; maintenance risk
  • TrustPilot 1.9/5: a low user rating, with complaints centered on billing transparency and credit consumption

Open Questions


References
