feat(workbench): add browser workspace, kanban feedback, and executor CLI foundations by gu87 · Pull Request #2 · gu87/hermes-agent

gu87 · 2026-06-06T05:05:16Z

Summary

This branch delivers three foundation layers for the Hermes Agent desktop workbench. First, a read-only Browser Workspace: an Electron host that renders external URLs and exposes a chat message insertion contract, with no browser automation capability. Second, kanban review/QA feedback events: structured defect tracking wired into the CLI kanban surface. Third, an executors framework spanning a typed registry, a capability-based router, five executor adapters (DeepSeek TUI ships as an explicit unavailable stub), guarded worktree tools, context/inbox/prompt tooling, an IPC bridge, review chain primitives, and a top-level argparse CLI entry point. Two UI upgrades ship alongside: structured tool-result preview panels and embedded chat session routing. The executor modules are covered by 463 passing tests in tests/executors/, destructive worktree operations (merge and discard) require confirmation or --force, and import side-effect guards verify that importing the CLI does not load hermes_cli, call shutil.which, or spawn subprocesses.

What Changed

Browser Workspace

Electron host scaffold: main.ts (lifecycle, window management, IPC handlers), preload.ts (context bridge), renderer.html (shell), schema.ts (typed contracts)
Chat message insertion contract (browser-workspace:insert-chat-message) validated against a golden contract-snapshot.json via validate-contract.mjs
Static dashboard plugin assets under plugins/browser-workspace/dashboard/
The host renders external URLs in a BrowserView; it does not implement click, type, submit, or navigate automation

Kanban Review/QA Feedback

hermes_cli/kanban_feedback.py — structured review and QA defect event types with typed routing
hermes_cli/kanban.py — kanban surface integration glue
hermes_cli/web_server.py — lightweight HTTP server for dashboard plugin delivery
Unit coverage in tests/hermes_cli/test_kanban_cli.py

Structured Preview UI

web/src/components/ToolCall.tsx — structured preview panel that renders tool results with formatted sections instead of raw JSON blobs
web/src/components/ChatSidebar.tsx — expanded sidebar with session list, routing, and status indicators
web/src/pages/ChatPage.tsx / web/src/pages/ConfigPage.tsx — page-level wiring for preview and sidebar
web/src/App.tsx — top-level routing integration

Embedded Chat Routing Mode

Chat session routing embedded into the web app shell so the UI can deep-link into specific sessions without full page reloads

Executors Framework

A — Core Registry and Router

executors/registry.py — typed executor registry with binary discovery and structured health probes
executors/router.py — capability-based router that produces recommendations and accepts user overrides
executors/types.py — shared dataclass type system (capabilities, routing decisions, health status, executor info)
executors/health.py — per-executor health checks with table and JSON output formatters
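
The capability-based routing with user overrides described above can be sketched roughly as follows. This is an illustration only, not the code in executors/router.py: the names `ExecutorInfo`, `RoutingDecision`, and `recommend`, the field layout, and the capability strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ExecutorInfo:
    """Hypothetical registry entry; the real fields live in executors/types.py."""
    name: str
    capabilities: frozenset = field(default_factory=frozenset)
    available: bool = True


@dataclass(frozen=True)
class RoutingDecision:
    executor: Optional[str]
    reason: str


def recommend(required, executors, override=None):
    """Pick the first available executor covering `required`; a user override wins."""
    by_name = {e.name: e for e in executors}
    if override is not None:
        if override in by_name:
            return RoutingDecision(override, "user override")
        return RoutingDecision(None, f"unknown executor: {override}")
    needed = frozenset(required)
    for e in executors:
        # An executor qualifies only if it is healthy and covers every capability.
        if e.available and needed <= e.capabilities:
            return RoutingDecision(e.name, "capability match")
    return RoutingDecision(None, "no available executor matches")
```

Returning a decision object with a reason, rather than a bare name, keeps the recommendation explainable when it is shown to the user.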

B — Adapters

executors/claude_code_adapter.py — Claude Code CLI adapter
executors/codex_adapter.py — Codex adapter
executors/deepseek_tui_adapter.py — DeepSeek TUI adapter
executors/hermes_local_adapter.py — Hermes local adapter
executors/opencode_adapter.py — OpenCode adapter
All adapters conform to the shared ExecutorCapabilities contract defined in types.py

C1 — Context / Inbox / Prompt Tools

executors/context.py + executors/context_cli.py — workspace context injection CLI
executors/inbox.py + executors/inbox_cli.py — external inbox management CLI
executors/prompt_builder.py — structured prompt assembly from context + task spec

C2 — Guarded Worktree Tools

executors/worktree.py + executors/worktree_cli.py — worktree create, list, merge, and discard
Merge and discard operations require explicit user confirmation or --force; the gate is verified in tests
Root resolution walks up to the parent repository rather than assuming a fixed path
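
A confirmation gate of the kind described for merge and discard might look like the sketch below. The function names and the injected `ask` and `remove` callables are hypothetical; they are not taken from executors/worktree.py and exist here so the gate can be exercised without touching a real repository.

```python
def confirm_destructive(action, force=False, ask=input):
    """Return True only if --force was passed or the user typed 'yes'."""
    if force:
        return True
    answer = ask(f"Really {action}? Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"


def discard_worktree(path, force=False, ask=input, remove=None):
    """Discard a worktree only after the gate passes.

    `remove` is injected so a test can observe the call instead of deleting files.
    """
    if not confirm_destructive(f"discard {path}", force=force, ask=ask):
        return False
    if remove is not None:
        remove(path)
    return True
```

Passing the prompt function as a parameter is one way to let tests verify that an unconfirmed call is blocked and that `--force` bypasses the prompt.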

D1a — Bridge / IPC / Review Parsing Primitives

executors/ipc.py — typed IPC message protocol with structured envelopes
executors/bridge.py + executors/bridge_cli.py — run-event-log ↔ review bridge
executors/review_agent.py — review chain agent logic
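
As a rough idea of what a typed IPC message with a structured envelope can look like, here is a minimal dataclass sketch. The `Envelope` class, its fields, and the JSON wire format are assumptions for illustration; the actual protocol is defined in executors/ipc.py.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Envelope:
    """Hypothetical envelope: a message kind, a payload, and a version number."""
    kind: str
    payload: dict = field(default_factory=dict)
    version: int = 1

    def to_json(self):
        # sort_keys gives a stable encoding, which helps with snapshot-style tests.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        data = json.loads(text)
        if not isinstance(data, dict) or not isinstance(data.get("kind"), str):
            raise ValueError("envelope must be an object with a string 'kind'")
        return cls(
            kind=data["kind"],
            payload=data.get("payload", {}),
            version=data.get("version", 1),
        )
```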

D1b — Review Handler + Review CLI

executors/review_handler.py — review orchestration handler
executors/review_cli.py — review subcommand CLI

D2 — Top-Level CLI

executors/cli.py — argparse-based entry point wiring all subcommands: registry, health, routing, worktree, context, review, QA, inbox, and bridge
executors/__init__.py — package public API surface
Delegates destructive subcommands (worktree) to the guarded handler paths
Import side-effect guards verified in tests: importing the CLI module does not invoke shutil.which, load hermes_cli, or spawn subprocesses
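
One way to write an import side-effect guard is sketched below. It is not the test code from tests/executors/test_cli.py; the helper name `import_is_inert` is hypothetical, and the approach (patching `shutil.which` and `subprocess.Popen` to fail, then diffing `sys.modules`) is an assumption about how such a guard could be built.

```python
import importlib
import sys
from unittest import mock


def import_is_inert(module_name, forbidden_modules=()):
    """Import `module_name` fresh and report whether the import stayed inert.

    The import counts as inert if it neither calls shutil.which nor creates a
    subprocess.Popen, and does not newly load any module in `forbidden_modules`.
    A forbidden module that was already loaded beforehand is not detected.
    """
    def _fail(*args, **kwargs):
        raise AssertionError("side effect during import")

    sys.modules.pop(module_name, None)  # force a real re-import
    with mock.patch("shutil.which", _fail), mock.patch("subprocess.Popen", _fail):
        before = set(sys.modules)
        try:
            importlib.import_module(module_name)
        except AssertionError:
            return False
        newly_loaded = set(sys.modules) - before
    return not any(name in newly_loaded for name in forbidden_modules)
```

A guard test for the CLI could then assert something like `import_is_inert("executors.cli", forbidden_modules=("hermes_cli",))`, assuming the package is importable in the test environment.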

Safety Boundaries

Browser Workspace is strictly read-only. The Electron host renders external URLs in a BrowserView and exposes a chat message insertion IPC contract. No agent can click, type, submit, or navigate within the rendered page. No browser automation capability exists in this PR.
No Chrome extension compatibility layer. The Browser Workspace uses an Electron shell, not a Chrome extension harness. Chrome extension APIs (tabs, scripting, debugger) are not used or polyfilled.
Worktree destructive operations are gated. Merge and discard both require explicit user confirmation (interactive prompt) or --force. Tests verify the gate blocks unconfirmed destructive calls.
Executor tests never call real external CLIs or models. All adapter tests use binary stubs or mock registries. No network requests, no child-process side effects, no real model invocations.
Import side-effect isolation. Five boundary guard tests confirm that importing executors/cli.py does not trigger shutil.which, does not load hermes_cli, does not spawn subprocesses, and does not import worktree, kanban_feedback, or other uncommitted modules at the top level.
No force push, no destructive repo operations. All worktree tests operate on temporary directories under $TMPDIR. No repository-level destructive operations are performed by any test.

Validation

Check	Result
Browser host TypeScript compilation	✅ passes
Browser host contract validation (`validate-contract.mjs`)	✅ passes (snapshot matches schema)
Web build (`web/`)	✅ passes
Kanban CLI tests	✅ passes
Structured preview UI build	✅ passes
`py_compile` — all executor modules	✅ passes
`tests/executors/` full suite	✅ 463 passed, 0 failed
`tests/executors/test_cli.py`	✅ 48 passed
Import side-effect guards (5 checks)	✅ all pass
Missing-binary fallback resilience (3 checks)	✅ all pass
Worktree confirmation gate tests	✅ gate blocks unconfirmed, `--force` bypasses

What Did Not Ship

Browser action automation — no click, type, submit, navigate, or scroll APIs
Chrome extension compatibility — no chrome.tabs / chrome.scripting / chrome.debugger bridge
ChatSessionSidebar wiring — kept as a local excluded draft, not connected to the app shell in this PR
Production Electron packaging, code signing, or auto-update
Full security threat model for the Electron Browser Workspace host
External browser adapter implementation (browser_adapter.py)
End-to-end multi-step executor orchestration pipelines
Bridge ↔ kanban feedback path integration (the two remain architecturally separate by design)

Risks / Follow-ups

Native macOS visual checks are still needed for the Electron Browser Workspace. TypeScript compilation and contract validation pass, but the rendered shell has not been visually verified on macOS.
Executors review/QA path remains separate from the kanban_feedback event path by design. Future work may bridge them at the routing or bridge layer, but for now they are independent subsystems.
ChatSessionSidebar is kept as a local excluded draft. It is not wired into the app shell and is not part of this PR. Downstream work will need to integrate it.
Larger end-to-end executor orchestration (multi-step run pipelines with real adapters) still needs integration testing. The current tests validate each module in isolation.
hermes_cli/kanban_feedback.py introduces a new feedback event path. Its interaction with the existing kanban CLI surface should be documented before downstream consumers adopt it.
Browser Workspace contract is validated against a golden snapshot. If the IPC contract evolves, the snapshot must be updated and re-validated.

Commit Breakdown

SHA	Title	Purpose
`ce62fbc5c`	`feat(browser-workspace): Electron host lifecycle + chat insertion contract`	Scaffold the read-only Browser Workspace: Electron main process lifecycle, preload context bridge, renderer shell, typed schema, contract validation script, and golden snapshot
`312050bfe`	`feat(kanban): add review and QA feedback events`	Structured review/QA defect event types, routing, and kanban surface integration
`313b5d896`	`feat(ui): add structured preview panel for tool results`	Render structured tool-call results in a formatted preview panel instead of raw JSON
`86d2b64e7`	`feat(ui): add embedded chat routing mode`	Add embedded chat session routing to the web app shell
`51671f813`	`feat(executors): add core registry and router API`	Typed executor registry with binary discovery, health probes, and capability-based router
`b9b28ddf4`	`feat(executors): add CLI adapter implementations`	Five adapters (Claude Code, Codex, DeepSeek TUI, Hermes local, OpenCode) implementing the shared capabilities contract
`db8129cf0`	`feat(executors): add context inbox and prompt tools`	Workspace context injection CLI, external inbox management CLI, and structured prompt assembly
`a95211d9e`	`feat(executors): add guarded worktree tools`	Worktree create/list/merge/discard with confirmation gates on destructive operations and parent-repo root resolution
`50bb17679`	`feat(executors): add bridge IPC and review parsing primitives`	Typed IPC message protocol, run-event-log ↔ review bridge, and review chain agent logic
`5b150a6be`	`feat(executors): add review handler and CLI subcommands`	Review orchestration handler and review subcommand CLI
`7bbb0a3ed`	`feat(executors): add top-level CLI integration`	Argparse-based entry point wiring all subcommands with import side-effect isolation and delegation to guarded handlers

🤖 Generated with Claude Code

…tract

browser-host/: new Electron child-process (status/start/stop + snapshot/screenshot/context proxies) (Phase 2B + 2C)
plugins/browser-workspace/dashboard/: Codex-like sidebar tab (manifest + bundled dist/index.js) (Phase 4 entry point)
hermes_cli/web_server.py: expose /api/browser-host/{status,start,stop,snapshot,screenshot,context} and read-only HTTP proxy to the host on localhost (Phase 2B/2C)
web/src/pages/ChatPage.tsx: window.__HERMES_INSERT_CHAT_TEXT__ (term.paste path) + window.__HERMES_SET_BROWSER_CONTEXT_REF__ (metadata only; no DOM/selection/clipboard) (Phase 4A/4B)

Out of scope: Kanban review/qa CLI + diff events; Codex-like embedded-chat nav reshuffle; ToolCall selected ring; ChatSidebar structured preview parser; executors/ module; dashboard screenshots.

Add review/qa CLI subcommands, emit diff events after dispatch, and publish diff/review/qa task events to the dashboard WebSocket stream. Includes focused CLI tests for dispatch diff recording and review/QA result events.

Parse diff, review, QA, and artifact payloads from tool events and render them in a dedicated ChatSidebar detail panel. Add ToolCall onClick/selected props for activity-list selection and highlight state, with graceful fallback for unparseable results.

Redirect root and unknown routes to /chat when embedded chat mode is enabled, slim the sidebar nav for chat-first usage, and add Config diagnostics links for hidden dashboard pages.

Add the core executor types, registry, router, and health helpers behind a core-safe package API that does not import adapter implementations. Add router and registry tests covering keyword recommendations, fallback behavior, default manifests, and missing-binary health checks without invoking live models.

Add Hermes Local, Claude Code, Codex, OpenCode, and DeepSeek TUI executor adapters implementing the AgentExecutorAdapter protocol. CLI adapters use asyncio subprocess array calls without shell=True; Hermes Local restores cwd with try/finally; DeepSeek TUI remains an explicit unavailable stub. Add adapter registry tests covering imports, missing-binary health checks, inert imports, and no live CLI/model/worktree side effects.
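
The "subprocess array calls without shell=True" and missing-binary health check mentioned above can be illustrated with standard-library calls. The function names `check_binary` and `run_cli`, the return shapes, and the timeout default are assumptions for this sketch and do not reflect the adapters' real interface.

```python
import asyncio
import shutil


def check_binary(name):
    """Return a small health record without launching anything."""
    path = shutil.which(name)
    return {"binary": name, "available": path is not None, "path": path}


async def run_cli(argv, timeout=30.0):
    """Run `argv` as an argument array (no shell) and capture its output."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Do not leave an orphaned child behind if the CLI hangs.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, out.decode(), err.decode()
```

Because the arguments are passed as a list to `create_subprocess_exec`, no shell parses them, so prompt text containing quotes or semicolons is delivered to the CLI as a single literal argument.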

Add project-scoped workspace context, inbox persistence, and per-executor prompt building utilities. These tools write only to caller-supplied project .hermes files and do not invoke subprocesses, models, git, or external services. Add tests covering JSON round trips, default loading, prompt generation, inbox lifecycle, and side-effect boundaries.
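
A JSON round trip with default loading, limited to a caller-supplied project directory, could be sketched like this. The `WorkspaceContext` fields and the `context.json` file name are hypothetical; only the `.hermes` directory comes from the commit message.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class WorkspaceContext:
    """Hypothetical shape; the real fields are defined in executors/context.py."""
    project: str = ""
    notes: list = field(default_factory=list)


def save_context(project_root, ctx):
    """Write the context under <project_root>/.hermes and return the file path."""
    path = Path(project_root) / ".hermes" / "context.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(ctx), indent=2), encoding="utf-8")
    return path


def load_context(project_root):
    """Load the saved context, or return defaults when nothing was saved yet."""
    path = Path(project_root) / ".hermes" / "context.json"
    if not path.exists():
        return WorkspaceContext()
    return WorkspaceContext(**json.loads(path.read_text(encoding="utf-8")))
```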

Add worktree lifecycle utilities and CLI helpers with dirty-tree rejection and confirmation gates for destructive merge/discard operations. Fix the clean-working-tree preflight check so user dirty files reject worktree creation while .hermes infrastructure files are ignored as intended. Add worktree tests covering dirty rejection, infra filtering, merge/discard confirmation, and tmp-repo-only git operations.
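
The clean-working-tree preflight described above, where user changes block worktree creation but `.hermes` files are ignored, can be approximated by filtering `git status --porcelain` output. The helper name and the `infra_prefix` parameter are hypothetical, and the sketch does not handle paths that git quotes because of special characters.

```python
def user_dirty_paths(porcelain, infra_prefix=".hermes/"):
    """Return changed paths from `git status --porcelain` text, ignoring infra files."""
    dirty = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        # Porcelain v1 lines are two status characters, a space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; the new path is what exists now.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        if path == infra_prefix.rstrip("/") or path.startswith(infra_prefix):
            continue
        dirty.append(path)
    return dirty
```

Worktree creation would then be rejected whenever the returned list is non-empty.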

Add IPC dataclasses, review/QA prompt parsing primitives, and a fixture-driven bridge CLI simulation layer without importing the real review handler. Keep D1a read-only and subprocess-free; bridge CLI uses local stubs for review/QA simulation while D1b decides the real review backend boundary. Fix diff deletion parsing so file header lines are not counted as deletions.
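
The diff deletion fix mentioned above concerns file header lines such as `--- a/file`, which start with `-` but are not deletions. A minimal counter that avoids this mistake is sketched below; the function name is hypothetical, and the hunk tracking is one possible approach rather than the parser used in the executors package.

```python
def count_changes(diff_text):
    """Count added and deleted lines in a `git diff`, skipping file headers."""
    added = deleted = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file section begins; headers follow
        elif line.startswith("@@"):
            in_hunk = True  # hunk header; content lines follow
        elif not in_hunk:
            continue  # header lines such as "--- a/file" and "+++ b/file"
        elif line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            deleted += 1
    return added, deleted
```

Tracking hunk state instead of testing for a `---` prefix also keeps a deleted line whose own content begins with `--` counted correctly. The sketch assumes `git diff` output, where each file section starts with a `diff --git` line.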

Adds the executors/IPC rail for review and QA, parallel to the existing hermes_cli.kanban_feedback trigger_* path. The two rails target different consumers (Kanban SQLite vs. Electron IPC) and are NOT interchangeable.

- executors/review_handler.py: trigger_review_ipc / trigger_qa_ipc async wrappers around _launch_opencode, with shutil.which gate, OpencodeUnavailable exception, emit_diff_event side-channel, and stub_review_report / stub_qa_report fallback when opencode is absent. Module docstring documents the dual-rail architecture to prevent confusion with kanban_feedback.trigger_*.
- executors/review_cli.py: 6 subcommand handlers (build-prompt / parse / executor for both review and QA) + handle_review_command / handle_qa_command dispatchers. SEVERITY_ICONS map for human-readable output.
- tests/executors/test_review_handler.py: 582 lines, 9 test classes covering rename verification, IPC happy paths, fallback paths, subprocess invocation, side-channel events, stub reports, exception handling, and boundary guards (no kanban_feedback, no executors.cli, no executors.bridge, no sqlite3 imports, no real opencode invocation).
- tests/executors/test_review_cli.py: 443 lines, 9 test classes covering all 6 subcommand handlers, dispatchers, SEVERITY_ICONS lookup, and parallel boundary guards.

The rename trigger_* -> trigger_*_ipc was required to disambiguate from hermes_cli.kanban_feedback.trigger_* which writes to the Kanban SQLite task_events table. Both are live in parallel; callers pick the rail appropriate to their consumer (CLI vs. Electron IPC).

70 new tests; full executors suite: 415 passing (was 345).
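
The fallback behaviour described for the review handler (a `shutil.which` gate, an `OpencodeUnavailable` exception, and a stub report when opencode is absent) might be arranged as in this sketch. The names are taken from the commit message, but every signature, the injected `which` parameter, and the report shape are assumptions; the real launch logic is deliberately left out.

```python
import asyncio
import shutil


class OpencodeUnavailable(RuntimeError):
    """Raised when the opencode binary is not on PATH."""


def stub_review_report(diff_text):
    # Hypothetical report shape, used only for this sketch.
    return {"source": "stub", "findings": [], "diff_chars": len(diff_text)}


async def _launch_opencode(diff_text, which=shutil.which):
    """Gate on binary discovery before attempting any launch."""
    if which("opencode") is None:
        raise OpencodeUnavailable("opencode not found on PATH")
    # Launching the real CLI is out of scope for this sketch.
    raise NotImplementedError("real opencode launch not shown")


async def trigger_review_ipc(diff_text, which=shutil.which):
    """Return a review report, falling back to a stub when opencode is absent."""
    try:
        return await _launch_opencode(diff_text, which=which)
    except OpencodeUnavailable:
        return stub_review_report(diff_text)
```

Injecting `which` lets a test exercise the fallback path without depending on what is installed on the machine.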

Add the argparse-based executor CLI entry point wiring registry, health, routing, worktree, context, review, QA, inbox, and bridge subcommands. Preserve destructive operation gates by delegating worktree actions to the guarded C2 handlers, and keep imports free of hermes_cli or live external CLI side effects. Add CLI tests covering registry commands, delegation, help output, missing binaries, import side effects, and boundary guards.
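
An argparse entry point can stay free of import side effects by resolving handlers only when a subcommand actually runs. The sketch below shows that pattern with two of the subcommands named above. The `_lazy` helper, the handler names `handle_health` and `handle_worktree`, and the `--json` flag are hypothetical; only `--force` and the worktree actions come from the description.

```python
import argparse
import importlib


def _lazy(module_name, func_name):
    """Return a handler that imports its module only when the subcommand runs."""
    def handler(args):
        module = importlib.import_module(module_name)
        return getattr(module, func_name)(args)
    return handler


def build_parser():
    parser = argparse.ArgumentParser(prog="executors")
    sub = parser.add_subparsers(dest="command", required=True)

    health = sub.add_parser("health", help="show executor health")
    health.add_argument("--json", action="store_true")  # hypothetical flag
    health.set_defaults(handler=_lazy("executors.health", "handle_health"))

    worktree = sub.add_parser("worktree", help="guarded worktree operations")
    worktree.add_argument("action", choices=["create", "list", "merge", "discard"])
    worktree.add_argument("--force", action="store_true")
    # Destructive actions are delegated; the guarded handler owns the confirmation gate.
    worktree.set_defaults(handler=_lazy("executors.worktree_cli", "handle_worktree"))
    return parser
```

Building the parser and parsing arguments imports nothing beyond the standard library; the worktree module would be loaded only when `args.handler(args)` is called.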

github-actions · 2026-06-06T05:07:30Z

⚠️ npm lockfile hash out of date

Checked against commit e144574 (PR head at check time).

The hash = "sha256-..." line in these nix files no longer matches the committed package-lock.json:

Apply the fix

Apply lockfile fix — tick to push a commit with the correct hashes to this PR branch
Or run the Nix Lockfile Fix workflow manually (pass PR #2)
Or locally: nix run .#fix-lockfiles and commit the diff

Gu added 11 commits June 6, 2026 12:54

feat(kanban): add review and QA feedback events

0e5b837


feat(ui): add embedded chat routing mode

c4cee92


gu87 wants to merge 11 commits into upstream-main-clean from codex/workbench-foundations-clean
