feat(workbench): add browser workspace, kanban feedback, and executor CLI foundations #2

Open
gu87 wants to merge 11 commits into upstream-main-clean from codex/workbench-foundations-clean

gu87 (Owner) commented Jun 6, 2026

Summary

This branch delivers three foundation layers for the Hermes Agent desktop workbench. First, a read-only Browser Workspace: an Electron host that renders external URLs and exposes a chat message insertion contract, with no browser automation capability. Second, kanban review/QA feedback events: structured defect tracking wired into the CLI kanban surface. Third, an executors framework spanning a typed registry, a capability-based router, five adapters (DeepSeek TUI ships as an explicit unavailable stub), guarded worktree tools, context/inbox/prompt tooling, an IPC bridge, review chain primitives, and a top-level argparse CLI entry point. Two UI upgrades ship alongside: structured tool-result preview panels and embedded chat session routing. The tests/executors/ suite has 463 passing tests, destructive worktree operations are gated behind a confirmation prompt or --force, and import side-effect guards verify that importing the CLI module does not call shutil.which, load hermes_cli, or spawn subprocesses.

What Changed

Browser Workspace

  • Electron host scaffold: main.ts (lifecycle, window management, IPC handlers), preload.ts (context bridge), renderer.html (shell), schema.ts (typed contracts)
  • Chat message insertion contract (browser-workspace:insert-chat-message) validated against a golden contract-snapshot.json via validate-contract.mjs
  • Static dashboard plugin assets under plugins/browser-workspace/dashboard/
  • The host renders external URLs in a BrowserView; it does not implement click, type, submit, or navigate automation

Kanban Review/QA Feedback

  • hermes_cli/kanban_feedback.py — structured review and QA defect event types with typed routing
  • hermes_cli/kanban.py — kanban surface integration glue
  • hermes_cli/web_server.py — lightweight HTTP server for dashboard plugin delivery
  • Unit coverage in tests/hermes_cli/test_kanban_cli.py

Structured Preview UI

  • web/src/components/ToolCall.tsx — structured preview panel that renders tool results with formatted sections instead of raw JSON blobs
  • web/src/components/ChatSidebar.tsx — expanded sidebar with session list, routing, and status indicators
  • web/src/pages/ChatPage.tsx / web/src/pages/ConfigPage.tsx — page-level wiring for preview and sidebar
  • web/src/App.tsx — top-level routing integration

Embedded Chat Routing Mode

  • Chat session routing embedded into the web app shell so the UI can deep-link into specific sessions without full page reloads

Executors Framework

A — Core Registry and Router

  • executors/registry.py — typed executor registry with binary discovery and structured health probes
  • executors/router.py — capability-based router that produces recommendations and accepts user overrides
  • executors/types.py — shared dataclass type system (capabilities, routing decisions, health status, executor info)
  • executors/health.py — per-executor health checks with table and JSON output formatters
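
To illustrate what "capability-based routing with user overrides" could look like, here is a minimal sketch. It is not the code in executors/router.py: the names ExecutorInfo, RoutingDecision, and recommend, and the rule "first available executor that covers every required capability", are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ExecutorInfo:
    """Hypothetical registry entry: a name plus the capabilities it advertises."""
    name: str
    capabilities: frozenset
    available: bool = True


@dataclass(frozen=True)
class RoutingDecision:
    """Hypothetical routing result: the chosen executor and the reason."""
    executor: str
    reason: str


def recommend(
    executors: Iterable[ExecutorInfo],
    required: Iterable[str],
    override: Optional[str] = None,
) -> RoutingDecision:
    """Pick an executor for a task, letting an explicit user override win."""
    candidates = list(executors)
    if override is not None:
        # The override must still name a registered executor.
        if override not in {c.name for c in candidates}:
            raise KeyError(f"unknown executor: {override}")
        return RoutingDecision(override, "user override")
    needed = set(required)
    for candidate in candidates:
        # Skip executors whose health probe failed (assumed 'available' flag).
        if candidate.available and needed <= candidate.capabilities:
            return RoutingDecision(candidate.name, "covers required capabilities")
    raise LookupError(f"no available executor covers {sorted(needed)}")
```

Returning a decision object with a reason, rather than a bare name, keeps the recommendation explainable in table or JSON output.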

B — Adapters

  • executors/claude_code_adapter.py — Claude Code CLI adapter
  • executors/codex_adapter.py — Codex adapter
  • executors/deepseek_tui_adapter.py — DeepSeek TUI adapter
  • executors/hermes_local_adapter.py — Hermes local adapter
  • executors/opencode_adapter.py — OpenCode adapter
  • All adapters conform to the shared ExecutorCapabilities contract defined in types.py
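
The commit notes for the adapters mention missing-binary health checks and subprocess calls made with argument arrays rather than shell=True. A rough sketch of that pattern follows; the CliAdapter and HealthStatus names are hypothetical and do not reflect the actual adapter classes.

```python
import asyncio
import shutil
import sys
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class HealthStatus:
    """Hypothetical health result for one executor binary."""
    healthy: bool
    detail: str


class CliAdapter:
    """Hypothetical adapter wrapping a single external CLI binary."""

    def __init__(self, binary: str) -> None:
        self.binary = binary

    def health(self) -> HealthStatus:
        # Discovery happens on demand, so constructing the adapter stays inert.
        path = shutil.which(self.binary)
        if path is None:
            return HealthStatus(False, f"{self.binary} not found on PATH")
        return HealthStatus(True, path)

    async def run(self, args: List[str]) -> Tuple[int, str]:
        # Arguments are passed as an array; no shell is involved, so there is
        # no quoting or injection surface from task text.
        proc = await asyncio.create_subprocess_exec(
            self.binary,
            *args,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT,
        )
        out, _ = await proc.communicate()
        return proc.returncode, out.decode()


if __name__ == "__main__":
    # Uses the current Python interpreter as a stand-in for a real executor CLI.
    adapter = CliAdapter(sys.executable)
    print(adapter.health())
    print(asyncio.run(adapter.run(["-c", "print('ok')"])))
```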

C1 — Context / Inbox / Prompt Tools

  • executors/context.py + executors/context_cli.py — workspace context injection CLI
  • executors/inbox.py + executors/inbox_cli.py — external inbox management CLI
  • executors/prompt_builder.py — structured prompt assembly from context + task spec

C2 — Guarded Worktree Tools

  • executors/worktree.py + executors/worktree_cli.py — worktree create, list, merge, and discard
  • Merge and discard operations require explicit user confirmation or --force; the gate is verified in tests
  • Root resolution walks up to the parent repository, avoiding path assumptions
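
The confirmation gate described above can be sketched as a small wrapper around the destructive action. This is an illustration only: run_guarded, ConfirmationRequired, and the injectable confirm callback are assumed names, not the API of executors/worktree.py.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class ConfirmationRequired(Exception):
    """Raised when a destructive action is attempted without consent."""


def run_guarded(
    action: Callable[[], T],
    description: str,
    *,
    force: bool = False,
    confirm: Optional[Callable[[str], bool]] = None,
) -> T:
    """Run `action` only if --force was given or the user confirmed."""
    if not force:
        # With no confirm callback (for example a non-interactive caller),
        # the safe default is to refuse.
        if confirm is None or not confirm(f"{description}? [y/N] "):
            raise ConfirmationRequired(description)
    return action()


def ask_on_stdin(prompt: str) -> bool:
    """Interactive confirm callback; only an explicit yes counts."""
    return input(prompt).strip().lower() in {"y", "yes"}
```

Passing the confirm callback as a parameter lets tests substitute a stub, so the gate can be verified without reading from a terminal.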

D1a — Bridge / IPC / Review Parsing Primitives

  • executors/ipc.py — typed IPC message protocol with structured envelopes
  • executors/bridge.py + executors/bridge_cli.py — run-event-log ↔ review bridge
  • executors/review_agent.py — review chain agent logic
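
As a rough picture of a "typed IPC message protocol with structured envelopes", the sketch below shows a dataclass envelope with a JSON round trip and basic validation. The field names (kind, payload, id, version) are assumptions for illustration and may not match executors/ipc.py.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Envelope:
    """Hypothetical IPC envelope: a typed wrapper around a JSON payload."""
    kind: str
    payload: dict
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    version: int = 1

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "Envelope":
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise ValueError("envelope must be a JSON object")
        # Reject unknown protocol versions instead of guessing at their shape.
        if data.get("version") != 1:
            raise ValueError(f"unsupported envelope version: {data.get('version')!r}")
        if not isinstance(data.get("kind"), str):
            raise ValueError("envelope 'kind' must be a string")
        if not isinstance(data.get("payload"), dict):
            raise ValueError("envelope 'payload' must be an object")
        if not isinstance(data.get("id"), str):
            raise ValueError("envelope 'id' must be a string")
        return cls(kind=data["kind"], payload=data["payload"],
                   id=data["id"], version=data["version"])
```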

D1b — Review Handler + Review CLI

  • executors/review_handler.py — review orchestration handler
  • executors/review_cli.py — review subcommand CLI

D2 — Top-Level CLI

  • executors/cli.py — argparse-based entry point wiring all subcommands: registry, health, routing, worktree, context, review, QA, inbox, and bridge
  • executors/__init__.py — package public API surface
  • Delegates destructive subcommands (worktree) to the guarded handler paths
  • Import side-effect guards verified in tests: importing the CLI module does not invoke shutil.which, load hermes_cli, or spawn subprocesses

Safety Boundaries

  • Browser Workspace is strictly read-only. The Electron host renders external URLs in a BrowserView and exposes a chat message insertion IPC contract. No agent can click, type, submit, or navigate within the rendered page. No browser automation capability exists in this PR.
  • No Chrome extension compatibility layer. The Browser Workspace uses an Electron shell, not a Chrome extension harness. Chrome extension APIs (tabs, scripting, debugger) are not used or polyfilled.
  • Worktree destructive operations are gated. Merge and discard both require explicit user confirmation (interactive prompt) or --force. Tests verify the gate blocks unconfirmed destructive calls.
  • Executor tests never call real external CLIs or models. All adapter tests use binary stubs or mock registries. No network requests, no child-process side effects, no real model invocations.
  • Import side-effect isolation. Five boundary guard tests confirm that importing executors/cli.py does not trigger shutil.which, does not load hermes_cli, does not spawn subprocesses, and does not import worktree, kanban_feedback, or other uncommitted modules at the top level.
  • No force push, no destructive repo operations. All worktree tests operate on temporary directories under $TMPDIR. No repository-level destructive operations are performed by any test.

Validation

| Check | Result |
| --- | --- |
| Browser host TypeScript compilation | ✅ passes |
| Browser host contract validation (validate-contract.mjs) | ✅ passes (snapshot matches schema) |
| Web build (web/) | ✅ passes |
| Kanban CLI tests | ✅ passes |
| Structured preview UI build | ✅ passes |
| py_compile, all executor modules | ✅ passes |
| tests/executors/ full suite | 463 passed, 0 failed |
| tests/executors/test_cli.py | ✅ 48 passed |
| Import side-effect guards (5 checks) | ✅ all pass |
| Missing-binary fallback resilience (3 checks) | ✅ all pass |
| Worktree confirmation gate tests | ✅ gate blocks unconfirmed, --force bypasses |

What Did Not Ship

  • Browser action automation — no click, type, submit, navigate, or scroll APIs
  • Chrome extension compatibility — no chrome.tabs / chrome.scripting / chrome.debugger bridge
  • ChatSessionSidebar wiring — kept as a local excluded draft, not connected to the app shell in this PR
  • Production Electron packaging, code signing, or auto-update
  • Full security threat model for the Electron Browser Workspace host
  • External browser adapter implementation (browser_adapter.py)
  • End-to-end multi-step executor orchestration pipelines
  • Bridge ↔ kanban feedback path integration (the two remain architecturally separate by design)

Risks / Follow-ups

  • Native macOS visual checks are still needed for the Electron Browser Workspace. TypeScript compilation and contract validation pass, but the rendered shell has not been visually verified on macOS.
  • Executors review/QA path remains separate from the kanban_feedback event path by design. Future work may bridge them at the routing or bridge layer, but for now they are independent subsystems.
  • ChatSessionSidebar is kept as a local excluded draft. It is not wired into the app shell and is not part of this PR. Downstream work will need to integrate it.
  • Larger end-to-end executor orchestration (multi-step run pipelines with real adapters) still needs integration testing. The current tests validate each module in isolation.
  • hermes_cli/kanban_feedback.py introduces a new feedback event path. Its interaction with the existing kanban CLI surface should be documented before downstream consumers adopt it.
  • Browser Workspace contract is validated against a golden snapshot. If the IPC contract evolves, the snapshot must be updated and re-validated.

Commit Breakdown

| SHA | Title | Purpose |
| --- | --- | --- |
| ce62fbc5c | feat(browser-workspace): Electron host lifecycle + chat insertion contract | Scaffold the read-only Browser Workspace: Electron main process lifecycle, preload context bridge, renderer shell, typed schema, contract validation script, and golden snapshot |
| 312050bfe | feat(kanban): add review and QA feedback events | Structured review/QA defect event types, routing, and kanban surface integration |
| 313b5d896 | feat(ui): add structured preview panel for tool results | Render structured tool-call results in a formatted preview panel instead of raw JSON |
| 86d2b64e7 | feat(ui): add embedded chat routing mode | Add embedded chat session routing to the web app shell |
| 51671f813 | feat(executors): add core registry and router API | Typed executor registry with binary discovery, health probes, and capability-based router |
| b9b28ddf4 | feat(executors): add CLI adapter implementations | Five adapters (Claude Code, Codex, DeepSeek TUI, Hermes local, OpenCode) implementing the shared capabilities contract |
| db8129cf0 | feat(executors): add context inbox and prompt tools | Workspace context injection CLI, external inbox management CLI, and structured prompt assembly |
| a95211d9e | feat(executors): add guarded worktree tools | Worktree create/list/merge/discard with confirmation gates on destructive operations and parent-repo root resolution |
| 50bb17679 | feat(executors): add bridge IPC and review parsing primitives | Typed IPC message protocol, run-event-log ↔ review bridge, and review chain agent logic |
| 5b150a6be | feat(executors): add review handler and CLI subcommands | Review orchestration handler and review subcommand CLI |
| 7bbb0a3ed | feat(executors): add top-level CLI integration | Argparse-based entry point wiring all subcommands with import side-effect isolation and delegation to guarded handlers |

🤖 Generated with Claude Code

Gu added 11 commits June 6, 2026 12:54
…tract

browser-host/: new Electron child-process (status/start/stop + snapshot/screenshot/context proxies) — Phase 2B + 2C

plugins/browser-workspace/dashboard/: Codex-like sidebar tab (manifest + bundled dist/index.js) — Phase 4 entry point

hermes_cli/web_server.py: expose /api/browser-host/{status,start,stop,snapshot,screenshot,context} and read-only HTTP proxy to the host on localhost — Phase 2B/2C

web/src/pages/ChatPage.tsx: window.__HERMES_INSERT_CHAT_TEXT__ (term.paste path) + window.__HERMES_SET_BROWSER_CONTEXT_REF__ (metadata only — no DOM/selection/clipboard) — Phase 4A/4B

Out of scope: Kanban review/qa CLI + diff events; Codex-like embedded-chat nav reshuffle; ToolCall selected ring; ChatSidebar structured preview parser; executors/ module; dashboard screenshots.

Add review/qa CLI subcommands, emit diff events after dispatch, and publish diff/review/qa task events to the dashboard WebSocket stream.

Includes focused CLI tests for dispatch diff recording and review/QA result events.

Parse diff, review, QA, and artifact payloads from tool events and render them in a dedicated ChatSidebar detail panel.

Add ToolCall onClick/selected props for activity-list selection and highlight state, with graceful fallback for unparseable results.

Redirect root and unknown routes to /chat when embedded chat mode is enabled, slim the sidebar nav for chat-first usage, and add Config diagnostics links for hidden dashboard pages.

Add the core executor types, registry, router, and health helpers behind a core-safe package API that does not import adapter implementations.

Add router and registry tests covering keyword recommendations, fallback behavior, default manifests, and missing-binary health checks without invoking live models.

Add Hermes Local, Claude Code, Codex, OpenCode, and DeepSeek TUI executor adapters implementing the AgentExecutorAdapter protocol.

CLI adapters use asyncio subprocess array calls without shell=True; Hermes Local restores cwd with try/finally; DeepSeek TUI remains an explicit unavailable stub.

Add adapter registry tests covering imports, missing-binary health checks, inert imports, and no live CLI/model/worktree side effects.

Add project-scoped workspace context, inbox persistence, and per-executor prompt building utilities.

These tools write only to caller-supplied project .hermes files and do not invoke subprocesses, models, git, or external services.

Add tests covering JSON round trips, default loading, prompt generation, inbox lifecycle, and side-effect boundaries.

Add worktree lifecycle utilities and CLI helpers with dirty-tree rejection and confirmation gates for destructive merge/discard operations.

Fix the clean-working-tree preflight check so user dirty files reject worktree creation while .hermes infrastructure files are ignored as intended.
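
The preflight rule described above (user changes block worktree creation, .hermes infrastructure files do not) might be sketched as a filter over `git status --porcelain` output. Using the porcelain format is an assumption about how the check is done, and user_dirty_paths is a hypothetical helper name.

```python
from typing import List


def user_dirty_paths(porcelain: str, infra_dir: str = ".hermes") -> List[str]:
    """Return changed paths from `git status --porcelain` text, ignoring infra files.

    Each porcelain v1 line is two status characters, a space, then the path;
    renames are written as 'old -> new'.
    """
    prefix = infra_dir.rstrip("/") + "/"
    dirty = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue  # blank or malformed line
        path = line[3:]
        if " -> " in path:
            # For a rename, the destination is what exists in the tree now.
            path = path.split(" -> ", 1)[1]
        path = path.strip('"')  # git quotes paths with unusual characters
        if path == infra_dir or path.startswith(prefix):
            continue  # infrastructure files never block worktree creation
        dirty.append(path)
    return dirty


def is_clean_for_worktree(porcelain: str) -> bool:
    """True when no user-owned file is modified, staged, or untracked."""
    return not user_dirty_paths(porcelain)
```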

Add worktree tests covering dirty rejection, infra filtering, merge/discard confirmation, and tmp-repo-only git operations.

Add IPC dataclasses, review/QA prompt parsing primitives, and a fixture-driven bridge CLI simulation layer without importing the real review handler.

Keep D1a read-only and subprocess-free; bridge CLI uses local stubs for review/QA simulation while D1b decides the real review backend boundary.

Fix diff deletion parsing so file header lines are not counted as deletions.
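
The class of bug named here is easy to reproduce: a counter that treats every line starting with "-" as a deletion also counts the "--- a/file" header. The sketch below avoids that by counting only inside hunks. It assumes git-style diffs (each file introduced by a "diff --git" line) and uses a hypothetical count_changes helper, not the actual parser in the executors package.

```python
from typing import Tuple


def count_changes(diff_text: str) -> Tuple[int, int]:
    """Return (additions, deletions) for a git-style unified diff."""
    added = deleted = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git"):
            in_hunk = False  # a new file section starts; headers follow
        elif line.startswith("@@"):
            in_hunk = True  # hunk header; content lines follow
        elif not in_hunk:
            continue  # skips '--- a/file', '+++ b/file', 'index ...' headers
        elif line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            deleted += 1
    return added, deleted
```

Tracking hunk state, rather than filtering on the "--- " prefix, also keeps a genuinely deleted line whose content begins with "-- " from being mistaken for a header.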
Adds the executors/IPC rail for review and QA, parallel to the existing
hermes_cli.kanban_feedback trigger_* path. The two rails target different
consumers (Kanban SQLite vs. Electron IPC) and are NOT interchangeable.

- executors/review_handler.py: trigger_review_ipc / trigger_qa_ipc async
  wrappers around _launch_opencode, with shutil.which gate, OpencodeUnavailable
  exception, emit_diff_event side-channel, and stub_review_report / stub_qa_report
  fallback when opencode is absent. Module docstring documents the dual-rail
  architecture to prevent confusion with kanban_feedback.trigger_*.
- executors/review_cli.py: 6 subcommand handlers (build-prompt / parse /
  executor for both review and QA) + handle_review_command / handle_qa_command
  dispatchers. SEVERITY_ICONS map for human-readable output.
- tests/executors/test_review_handler.py: 582 lines, 9 test classes
  covering rename verification, IPC happy paths, fallback paths, subprocess
  invocation, side-channel events, stub reports, exception handling, and
  boundary guards (no kanban_feedback, no executors.cli, no executors.bridge,
  no sqlite3 imports, no real opencode invocation).
- tests/executors/test_review_cli.py: 443 lines, 9 test classes covering
  all 6 subcommand handlers, dispatchers, SEVERITY_ICONS lookup, and
  parallel boundary guards.

The rename trigger_* -> trigger_*_ipc was required to disambiguate from
hermes_cli.kanban_feedback.trigger_* which writes to the Kanban SQLite
task_events table. Both are live in parallel; callers pick the rail
appropriate to their consumer (CLI vs. Electron IPC).

70 new tests; full executors suite: 415 passing (was 345).

Add the argparse-based executor CLI entry point wiring registry, health, routing, worktree, context, review, QA, inbox, and bridge subcommands.

Preserve destructive operation gates by delegating worktree actions to the guarded C2 handlers, and keep imports free of hermes_cli or live external CLI side effects.

Add CLI tests covering registry commands, delegation, help output, missing binaries, import side effects, and boundary guards.

github-actions Bot commented Jun 6, 2026

⚠️ npm lockfile hash out of date

Checked against commit e144574 (PR head at check time).

The hash = "sha256-..." line in these nix files no longer matches the committed package-lock.json:

Apply the fix

  • Apply lockfile fix — tick to push a commit with the correct hashes to this PR branch
  • Or run the Nix Lockfile Fix workflow manually (pass PR #2)
  • Or locally: nix run .#fix-lockfiles and commit the diff
