docs: browser-harness integration spike + recommendation by bglusman · Pull Request #25 · bglusman/calciforge

bglusman · 2026-04-25T02:58:12Z

Summary

User asked whether `browser-use/browser-harness` could be wired
into Claude Code, openclaw, or zeroclawed for agentic browser
automation. 30-min investigation; this RFC captures findings + a
ranked recommendation.

Headline

Browser-harness is ~592 lines of Python connecting an agent to a
user-running Chrome via CDP. Native model: agent reads
`SKILL.md` and shells out to `browser-harness <<PY ... PY` with
helpers like `new_tab`, `click_at_xy`, `capture_screenshot`,
`js`, `http_get`, `cdp` pre-imported. Claude Code already
supports this pattern via skills.

Recommended path

(A) Install + drop SKILL.md into `~/.claude/skills/`. ~5 min,
zero code changes, immediate browser automation in any Claude Code
session. Use for two weeks, then triage whether (B)–(D) are needed.

Other options ranked

- (B) `zeroclawed-MCP` tool wrapper: half a day, only if non-Claude agents need browser access too
- (C) Rust port: don't; loses the harness's edit-helpers-on-the-fly property
- (D) Daemon-spawned persistent browser pool for async chat-driven web tasks: defer until that's a real use case

Security flags

- Profile mechanism claims "cookies-only login state"; verify before trusting for high-value accounts
- `BROWSER_USE_API_KEY` should be a `{{secret:...}}` ref once the substitution layer ships
- Categorical capability expansion (anything the user's logged into is scrapeable); worth flagging in deploy docs

What I did NOT do

- No code changes
- Did not install on the Mac (the `chrome://inspect` checkbox needs human eyeballs)
- Did not file a follow-up task; next move is human-driven

🤖 Generated with Claude Code

User asked whether browser-use/browser-harness could be wired into Claude Code, openclaw, or zeroclawed for agentic browser automation. 30-minute investigation; this RFC captures findings and recommends a path.

## Headline conclusion

Browser-harness is ~592 lines of Python that connects an agent to a user-running Chrome via CDP and ships pre-imported helpers (`new_tab`, `click_at_xy`, `capture_screenshot`, `js`, `http_get`, `cdp`). The native model is "agent reads SKILL.md and shells out to `browser-harness <<PY ... PY`"; Claude Code already supports this via skills.

Recommended first step: install browser-harness + drop SKILL.md into ~/.claude/skills/. Five minutes; zero code changes; immediate value. Don't build an MCP wrapper or port to Rust until the skill has been used in anger.

## Four integration options ranked

A. Claude Code skill: recommended first step (5 min, no code)
B. zeroclawed-MCP tool wrapper: half a day, only if multi-agent support becomes a real need
C. Rust port: DON'T (weeks of work; loses the harness's edit-helpers-on-the-fly property)
D. Daemon-spawned browser pool: defer until async web automation from chat channels is a real use case

## Security notes

- Profile mechanism claims cookies-only login state; verify before trusting for high-value accounts
- `BROWSER_USE_API_KEY` should be a `{{secret:...}}` ref once the substitution layer ships
- Agents with browser-harness can scrape anything the user's logged into; categorical capability expansion worth flagging in deploy docs

Doc explicitly notes what I did NOT do (install on the Mac; `chrome://inspect` needs human eyeballs anyway), and why no follow-up task is filed: the next move is "Brian tries A for two weeks then we triage".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds an RFC/spike write-up evaluating whether browser-use/browser-harness should be integrated into this repo’s agent tooling (Claude Code / OpenClaw / zeroclawed), and recommends starting with a Claude Code skill drop-in before building any native integration.

Changes:

Introduces a new RFC documenting what browser-harness is, its requirements, and integration options (A–D).
Recommends adopting option (A) first (Claude Code skill), with (B) as a follow-up only if needed.
Captures initial security considerations and operational tradeoffs.


Copilot · 2026-04-25T03:00:21Z

+Cost: ~half a day. Adds a `browse(action, params)` tool to
+`crates/zeroclawed-mcp` (the MCP server we just scaffolded for secret
+discovery in PR #23). The tool would shell out to `browser-harness`
+exactly as the skill does, but agents would discover it via MCP
+instead of reading a skill prompt.
+


This section references `crates/zeroclawed-mcp` and a “PR #23” scaffold, but there is no `zeroclawed-mcp` crate (and no other mention of that path) in the current workspace. To avoid sending readers on a dead-end, either point to the actual crate/module that would host an MCP server/tool wrapper, or phrase this as a hypothetical new crate without citing a non-existent path/PR.

Suggested change

-Cost: ~half a day. Adds a `browse(action, params)` tool to
-`crates/zeroclawed-mcp` (the MCP server we just scaffolded for secret
-discovery in PR #23). The tool would shell out to `browser-harness`
-exactly as the skill does, but agents would discover it via MCP
-instead of reading a skill prompt.
+Cost: ~half a day. Adds a `browse(action, params)` tool to a future
+MCP server crate/module for zeroclawed. The tool would shell out to
+`browser-harness` exactly as the skill does, but agents would
+discover it via MCP instead of reading a skill prompt.
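For illustration only, the shape of a `browse(action, params)` wrapper could be sketched as below. This is Python rather than the Rust crate the RFC has in mind, and it makes two assumptions the PR does not confirm: that each action maps to one of the helper names listed in the description, and that helpers accept keyword arguments. `render_browse_script` and `ALLOWED_ACTIONS` are hypothetical names.

```python
import json

# Helper names listed in the PR description; their real signatures are
# not shown there, so keyword-argument calling is an assumption.
ALLOWED_ACTIONS = frozenset(
    {"new_tab", "click_at_xy", "capture_screenshot", "js", "http_get", "cdp"}
)


def render_browse_script(action: str, params: dict) -> str:
    """Render one helper call as a line of Python for the harness."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Round-trip through JSON so only plain data reaches the script.
    safe_params = json.loads(json.dumps(params))
    return f"{action}(**{safe_params!r})"


script = render_browse_script("new_tab", {"url": "https://example.com"})
```

Restricting actions to a fixed allowlist keeps the tool surface small; a real wrapper would still need to decide how much arbitrary script text, if any, an MCP client may submit.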

Copilot · 2026-04-25T03:00:21Z

+- Exposes a heredoc API: `browser-harness <<'PY' ... PY` — the script
+  body runs in a Python REPL with the helpers pre-imported


The heredoc invocation is shown both as `browser-harness <<'PY' ... PY` and `browser-harness <<PY ... PY`. It’d be better to use one consistent form, and prefer a single-quoted heredoc delimiter in examples to prevent accidental shell interpolation/expansion in the embedded script.
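A caller that is not a shell can sidestep the quoting question by writing the script to the process's stdin directly, which is what a heredoc does under the hood. The sketch below assumes `browser-harness` reads its script from stdin as the heredoc form implies; `run_script` is a hypothetical helper, and the demo substitutes the Python interpreter as a stand-in so it runs without the harness installed.

```python
import subprocess
import sys
from typing import Sequence


def run_script(script: str, cmd: Sequence[str] = ("browser-harness",)) -> str:
    """Feed `script` to `cmd` on stdin and return its stdout.

    No shell is involved, so `$VAR` and backticks in the script body
    reach the interpreter untouched, the same guarantee a quoted
    heredoc delimiter (<<'PY') gives in a shell.
    """
    result = subprocess.run(
        list(cmd),
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Stand-in demo: `python -` reads its program from stdin.
literal = run_script('print("$HOME and `date`")', cmd=(sys.executable, "-"))
```

With an unquoted `<<PY` delimiter in a shell, the same body would have had `$HOME` and the backtick command expanded before the script ever ran.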

Copilot · 2026-04-25T03:00:21Z

+- `BROWSER_USE_API_KEY` (cloud-daemon feature) should be a
+  `{{secret:BROWSER_USE_API_KEY}}` reference per the substitution
+  RFC §3 once we wire it.


This references “the substitution RFC §3” and uses a `{{secret:...}}` syntax, but there’s no corresponding RFC/link or other usage in-repo to explain the convention. Consider linking to the actual document (path/URL) that defines secret substitution, or rewording this as a future TODO without an unresolvable reference/syntax.

Suggested change

-- `BROWSER_USE_API_KEY` (cloud-daemon feature) should be a
-  `{{secret:BROWSER_USE_API_KEY}}` reference per the substitution
-  RFC §3 once we wire it.
+- If we wire up the cloud-daemon feature, pass
+  `BROWSER_USE_API_KEY` through the project's standard secret
+  injection/configuration mechanism rather than hard-coding it.
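To make the `{{secret:NAME}}` convention concrete, here is a minimal sketch of what such a substitution pass might look like. It is not the project's substitution layer, whose grammar and lookup source are not described in this thread; `resolve_secret_refs`, the name pattern, and the in-memory mapping are all assumptions.

```python
import re
from typing import Mapping

# Matches the {{secret:NAME}} form quoted in the PR; the exact grammar
# of the real substitution layer is not specified here.
SECRET_REF = re.compile(r"\{\{secret:([A-Za-z_][A-Za-z0-9_]*)\}\}")


def resolve_secret_refs(text: str, secrets: Mapping[str, str]) -> str:
    """Replace each {{secret:NAME}} with its value, failing on unknown names."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in secrets:
            # Fail loudly rather than pass the sentinel through unresolved.
            raise KeyError(f"no secret named {name}")
        return secrets[name]

    return SECRET_REF.sub(_lookup, text)


resolved = resolve_secret_refs(
    "BROWSER_USE_API_KEY={{secret:BROWSER_USE_API_KEY}}",
    {"BROWSER_USE_API_KEY": "dummy-value"},
)
```

Raising on an unknown name, instead of leaving the reference in place, avoids handing a literal `{{secret:...}}` string to a downstream service as if it were a credential.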

Copilot · 2026-04-25T03:00:21Z

+
+Status: SPIKE — 30-minute investigation, not a build commitment.
+Filed because the user asked whether `browser-use/browser-harness`
+could be wired into us (Claude Code, openclaw, or zeroclawed) to give


“openclaw” here reads like the product name, which is capitalized as “OpenClaw” elsewhere in docs. If you’re referring to the project (not the adapter-kind string), consider capitalizing for consistency/clarity.

Suggested change

-could be wired into us (Claude Code, openclaw, or zeroclawed) to give
+could be wired into us (Claude Code, OpenClaw, or zeroclawed) to give

bglusman · 2026-04-25T04:44:01Z

Codex integration sweep note: I reviewed the inline comments on this PR. GitHub rejected direct inline replies for these older/outdated review comments with HTTP 422, so I am responding top-level instead: 3141257552, 3141257563, 3141257573, 3141257590.

I did not edit this branch. Items that overlap the secure/fnox/host-agent/digest integration work are addressed in draft PR #38 (codex-integration-code), including stdin-based fnox set, bounded fnox waits, whitespace-safe !secure parsing, identity-aware !secure audit logs, valid-input host-agent properties, real WhatsApp HMAC verification, loopback OneCLI default bind, and race-free digest temp paths. Remaining PR-specific findings stay actionable for this branch owner or a follow-up.

bglusman · 2026-04-25T11:13:13Z

Acted on the recommended action (Option A) per the user's directive to follow up on RFC PRs:

- `uv tool install -e .` for browser-use/browser-harness; installed at `/Users/admin/.local/bin/browser-harness`
- SKILL.md copied to `~/.claude/skills/browser-harness/SKILL.md`

Effective immediately in new Claude Code sessions. No work required on Option B (MCP wrapper) until the skill has been used in anger and we know what's missing.
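A quick way to confirm both halves of the Option A install is a check along these lines. It is a sketch: `check_skill_install` is a hypothetical helper, and it only verifies that the binary is on `PATH` and that the skill file exists at the location named above, not that Chrome is reachable over CDP.

```python
import shutil
from pathlib import Path


def check_skill_install(home: Path, binary: str = "browser-harness") -> dict:
    """Report whether the harness binary and the skill file are in place."""
    skill = home / ".claude" / "skills" / "browser-harness" / "SKILL.md"
    return {
        "binary_on_path": shutil.which(binary) is not None,
        "skill_file_present": skill.is_file(),
    }


status = check_skill_install(Path.home())
```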

bglusman · 2026-04-25T18:37:57Z

Subsumed by #44 (squashed to 9ed51fbc on main). All commits from this branch are present in the squash. Closing as redundant rather than merging again.


…de (#56)

* chore(.github): add copilot-instructions.md to tune PR-review behavior

GitHub Copilot supports per-repo review instructions at .github/copilot-instructions.md (≤2 pages, applied to every Copilot PR review automatically). Adds calciforge-specific guidance to improve signal-to-noise:

- Skip what pre-commit already gates (fmt, clippy, gitleaks)
- Prioritize HIGH-severity classes that bit us in past reviews: secret leakage in logs, substitution-boundary correctness, unwrap/expect outside tests, missing unsafe around set_var (edition 2024), blocking I/O in async, auth bypass paths
- Tell Copilot what's NOT a bug despite looking like one: {{secret:NAME}} sentinel syntax, post-history-scrub fake test values, FnoxClient subprocess-by-design, clashd/zeroclaw_* upstream references, mixed Rust edition (known)
- Past-mistake checklist (6 classes from real findings that landed and were caught later: substitution-after-bypass, None dest_host, bearer-in-info-log, fnox set argv leak, 0.0.0.0 default, hardcoded fallback URLs)
- Skip even-if-correct: 'consider adding tests' without specifics, rename suggestions vs. functional convention, feature-creep proposals

Cross-references AGENTS.md (host-agent coding standards) and CLAUDE.md (public-repo secret discipline) so Copilot follows both. 83 lines, well under the documented 2-page cap.

* chore(.github): tighten copilot-instructions + add path-scoped Rust file

V1 of `.github/copilot-instructions.md` was ~970 words and read more like documentation than reviewer guidance. Two issues that hurt signal:

1. **Length**: Copilot's per-repo instructions read window is ~4000 chars; v1 was over that, so the trailing past-mistake list and skip-list were getting truncated.
2. **Format**: long bulleted exposition reads less like a rule and more like prose, which Copilot treats as background context rather than as constraints to apply.

V2 changes:

- Cuts to ~3500 chars by condensing the prioritization tiers and removing per-class HIGH/MED/LOW exposition (the priority-order list carries the same info in 5 lines).
- Leads with a single review philosophy line ("if uncertain, do not comment"), the highest-leverage rule borrowed from deno's copilot-instructions.md.
- Names specific past Copilot noise patterns from this repo's PR history (env-mutex/serial_test repeated 8+ times across #19/#22/#23; dead-doc-reference 4x across #20/#23/#25) so the "don't repeat across PRs" rule has teeth.
- Cross-references a new path-scoped file at `.github/instructions/rust.instructions.md` (`applyTo: "**/*.rs"`), which carries the Rust-specific review nits (`#[expect]` over `#[allow]`, `// SAFETY:` requirement, `Mutex` across `.await`, `select!` cancellation safety, `kill_on_drop`, `&str` over `&String`, `LazyLock<Regex>` for hot paths, etc.). Path-scoped instructions are loaded only when a PR touches a file matching `applyTo`, so Rust-specific rules don't burn the global 4000-char budget on PRs that only touch docs / TOML / shell.

* chore(.github): restore AGENTS.md + CLAUDE.md cross-refs in copilot instructions

Verified GitHub's copilot-instructions docs do not specify the ~4000-char read window I'd assumed in the previous commit; that was the older Copilot Chat feature, not the Copilot code-review one. With no real length pressure, the AGENTS.md / CLAUDE.md pointers (dropped in v2 to save chars) are worth restoring. CLAUDE.md's "never commit these" list is exactly the kind of leakage Copilot should be enforcing on diff.

* docs: split AGENTS.md into workspace-wide root + host-agent crate file

The root `AGENTS.md` was titled "Calciforge Host-Agent" and carried host-agent-specific build/architecture rules at the repo root, where agents (Claude Code, Codex, Copilot cloud agent, OpenClaw) read it as workspace-wide guidance. The mismatch meant agents working in any non-host-agent crate were getting irrelevant rules ("ZFS snapshot delegation", "mTLS CN→Unix user") and missing the actually-shared ones (substitution-boundary order, sentinel string contract, public-repo secret discipline pointer). Restructure:

- Move the existing host-agent content verbatim to `crates/host-agent/AGENTS.md` (`git mv` so history is preserved).
- New root `AGENTS.md` covers the whole workspace: crate inventory, project vocabulary, mandatory rules every agent must follow (CLAUDE.md secret discipline, pre-commit gate, sentinel contract, substitution boundary order, no-secret-values-in-logs, fnox stdin mode), workspace build/test commands, and pointers into per-area files (`crates/host-agent/AGENTS.md`, `docs/rfcs/`, `docs/security-gateway.md`, `docs/model-gateway.md`). Cross-refs `.github/copilot-instructions.md` and `.github/instructions/rust.instructions.md` so agents that find AGENTS.md first can pick up the Copilot-specific tuning if relevant.

Pairs with the copilot-instructions tightening earlier on this branch.

* fix(.github): correct gitleaks allowlist description in copilot-instructions

Copilot's review caught a real factual error in v2: the line claimed specific literals (`+15555550100`, `7000000001`, `eyJ0eXAi...`) were allowlisted in `.gitleaks.toml`. They aren't; `7000000001` is even used in non-allowlisted source (`crates/calciforge/src/auth.rs`). The real allowlist mechanism is path-based (`tests/**/fixtures/`, `docs/rfcs/*.md`, lockfiles, etc.) plus a small regex list (loopback, RFC 5737, a few inherited-from-main values). Replace the misleading "these specific literals are allowlisted" claim with an accurate description of how the allowlist actually works, so Copilot doesn't downgrade real findings on the assumption they fall under a non-existent literal-match exemption.

Pleasingly meta: this is exactly the "verify against the codebase before commenting" rule from the same file working as intended on the PR that introduced the file.

Copilot AI review requested due to automatic review settings April 25, 2026 02:58

Copilot started reviewing on behalf of bglusman April 25, 2026 02:58

Copilot AI reviewed Apr 25, 2026


bglusman mentioned this pull request Apr 25, 2026

[integration] All non-conflicting code changes — evaluation target #42

Closed

bglusman closed this Apr 25, 2026

bglusman mentioned this pull request Apr 26, 2026

chore: tune Copilot PR review + restructure AGENTS.md as workspace-wide #56

Merged

bglusman wants to merge 1 commit into main from docs/browser-harness-spike
