feat: zeroclawed-mcp scaffold — agent-facing secret discovery (#30)#23
feat: zeroclawed-mcp scaffold — agent-facing secret discovery (#30)#23
Conversation
Adds `!secure` as a post-auth chat command dispatched before any agent
sees the message. Subcommands:
- `!secure set NAME=value` — stores the secret in fnox and replies
with "Stored `NAME`" + a retention-warning. Never echoes the value.
- `!secure list` — lists stored secret names (not values). Parses
fnox's output defensively; extra columns that look like values are
stripped before replying.
- `!secure help` / `!secure` — usage string.
Dispatch model (matches existing `!switch`/`!sessions`/`!default`):
- `CommandHandler::is_secure_command(&str)` — static, case-insensitive.
- `CommandHandler::handle_secure(text, identity_id) -> String` —
async (shells out to fnox), takes identity for future per-identity
audit/ACL though not used yet. Wired into telegram.rs in both
`handle_message_nonblocking` and `handle_message` paths, and added
to the unknown-command exclusion list so `!secure` doesn't get
intercepted as unknown before hitting the post-auth handler.
Subprocess approach today — `Command::new("fnox").args(["set", …])`.
Acknowledged-cost: three places now shell out to fnox (vault.rs,
secure_set, secure_list), which means three places that need an
availability check. **Follow-up PR (in flight) will migrate all three
to the fnox library crate via `fnox = { git = "https://github.com/jdx/fnox" }`**
— fnox already exposes public modules (secret_resolver, commands,
etc.) at `src/lib.rs`, so no upstream PR is needed.
Redaction discipline:
- The success reply names the stored secret but never echoes the
value; tested.
- The failure reply surfaces fnox's stderr (user needs to know why
it failed) but doesn't carry the value; tested.
- The telegram channel's `debug!` logs only that a !secure command
was handled, without the message text. A `!secure set FOO=bar`
would otherwise leak the value into ops logs.
- Name validation: `[A-Za-z0-9_-]+` only (matches the substitution
engine's accepted ref-name syntax — so a stored secret can be
immediately referenced as `{{secret:NAME}}` without a rename).
Tested with three invalid-char cases.
5 given/when/then behavioral tests:
- `secure_set_reply_includes_name_but_not_value`
- `secure_set_surfaces_fnox_error_without_echoing_value`
- `secure_unknown_subcommand_returns_help`
- `secure_set_rejects_invalid_name_chars`
- `secure_list_returns_names_only`
All use a fake-fnox shell script installed on PATH via TempDir, so
the tests are hermetic regardless of whether the real fnox is
installed on the dev/CI machine.
Explicit non-goals for this PR (follow-ups):
- `!secure remove NAME` — pending a decision on whether fnox's
own remove command or a config-file edit is the right approach.
- `!secure request NAME` — the out-of-band localhost-paste flow
that avoids chat-transport retention entirely (see
docs/rfcs/agent-secret-gateway.md §5).
- Matrix/WhatsApp channel wiring — same pattern as telegram, but
adds separate verification surface. Doing in a follow-up to keep
this PR reviewable.
- Library-based fnox integration (swap subprocess → lib calls) —
committed as next PR per user direction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User and I had a back-and-forth on whether to migrate fnox usage to
the upstream Rust library or keep subprocess. After investigating
fnox's actual library API (`fnox::secret_resolver::resolve_secret`
needs pre-loaded `Config` + per-secret `SecretConfig`; no clean
programmatic SET; ~30 transitive deps), subprocess is still the
right tool — but each call-site shouldn't be reinventing the wheel.
This commit lands `crates/onecli-client/src/fnox_client.rs` — one
typed wrapper around `fnox` that all current and future call-sites
use:
- `FnoxClient::get(name) -> Result<String, FnoxError>`
- `FnoxClient::set(name, value) -> Result<(), FnoxError>`
- `FnoxClient::list() -> Result<Vec<String>, FnoxError>`
- `FnoxClient::is_available() -> bool`
Errors are typed (`NotInstalled`, `EmptyValue`, `Failed { exit_code,
stderr }`, `InvalidUtf8`), not stringified-stderr, so callers can
pattern-match: e.g. `!secure set` gives a different message on
NotInstalled (suggest `brew install fnox`) vs Failed (surface fnox's
own error).
Migration in this PR:
- `commands.rs::secure_set` and `secure_list` now call FnoxClient.
Before: each had its own `Command::new("fnox")`, its own io::Error
-> "not available" stringification, its own stderr-to-string error.
After: one-liner calls + match on the typed error.
Migration left for follow-up (when PR #15 lands or this branch is
rebased on it):
- `vault.rs::get_secret_from_fnox` — same swap; needs PR #15 in
history first since that's where the function lives.
Tests:
- 11 new given/when/then tests for FnoxClient itself, using
`FnoxClient::with_binary(path)` to point at fake-fnox shell scripts
in TempDir. No PATH manipulation, no global env mutation, no
`serial_test` requirement (cleaner than the existing fake-fnox-on-
PATH pattern in `vault_fallthrough.rs` and `commands::tests`).
- Existing 5 `!secure` tests in `commands::tests` still pass
unchanged — they used PATH-prepend, which still works because
`FnoxClient::new()` defaults to `"fnox"` (PATH lookup).
- Notable adversarial assertions added:
- `get_empty_value_is_error_not_empty_string` — empty stdout
returned as `EmptyValue(name)` so a downstream `Authorization:
Bearer ` (empty) silent-anonymous failure is impossible.
- `set_passes_name_and_value_in_argv_order` — captures argv to a
temp file and asserts position; guards against an off-by-one
rearrangement that would store the value under the wrong name.
- `list_extracts_names_only_dropping_value_columns` — explicit
negative assertion that no value substring survives.
`tempfile` added as a dev-dep on `onecli-client`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per docs/rfcs/agent-secret-gateway.md §4. New crate
\`crates/zeroclawed-mcp\` exposes a deliberately narrow MCP surface
that lets agents:
- discover what secrets exist (\`list_secrets\` returns NAMES only,
never values)
- build canonical references for substitution
(\`secret_reference(name) → "{{secret:NAME}}"\`)
- initiate user-facing add flows
(\`add_secret_request(name, description, retention_ok)\` — currently
stub, future will return a short-lived localhost URL)
**Critically, the server does NOT expose \`get_secret\`.** That's the
core threat-model decision: any agent that connects to MCP could
otherwise enumerate names and pull values. Values flow through the
security-proxy substitution layer ONLY, where they're injected at the
network boundary and never enter agent context.
## Implementation
Built on \`rmcp\` v0.8 (the official Rust MCP SDK) with the
\`server + macros + transport-io + schemars\` features. \`#[tool_router]\`
+ \`#[tool_handler]\` attributes wire the three tools through stdio
JSON-RPC. \`schemars\` v1 (matching rmcp's transitive requirement —
catching this misalignment cost a build cycle).
Wraps \`onecli_client::FnoxClient\` for the underlying secret store.
Tests inject a fake-\`fnox\` shell script via
\`FnoxClient::with_binary(path)\` so they're hermetic — no PATH
manipulation, no env mutation.
## Why a separate binary, not a module on \`zeroclawed\`
MCP servers run as agent-spawned subprocesses over stdio. Splitting
this out:
1. Agents that don't need secret discovery don't pay the startup cost.
2. Server can be installed by an agent that doesn't have access to
the full zeroclawed daemon (e.g., \`claude\` in a sandbox).
3. Failures here don't take down the gateway.
## Validation symmetry
\`is_valid_name\` (in this crate) matches \`substitution::find_refs\`
name-shape rules in security-proxy exactly: \`[A-Za-z0-9_-]+\`. Tests
guard the symmetry — if MCP returned a token the substitution engine
wouldn't accept, the agent's request would silently fail downstream.
## Tests (7 given/when/then)
- \`list_secrets_returns_name_list_with_count\` — happy path with
3-name fixture
- \`list_secrets_returns_actionable_error_when_fnox_missing\` — error
path returns "install fnox" hint, not a stack trace
- \`secret_reference_returns_canonical_token\` — exact token shape +
network-boundary usage hint
- \`secret_reference_rejects_invalid_name_shape\` — name validation
prevents downstream silent failures
- \`add_secret_request_with_retention_ok_offers_chat_path\` — both
routes (host fnox + !secure set) suggested
- \`add_secret_request_with_retention_not_ok_rejects_chat_path\` —
high-value secrets must NOT slip through to chat just because the
agent forgot to set the flag (safety contract)
- \`server_advertises_tools_capability\` — guards against macro
registration silently losing a tool
## Stacks on PR #21
Depends on \`onecli_client::FnoxClient\` from PR #21 for the
\`fnox.list()\` call. PR #21 must merge first or this rebases.
## Follow-ups
- Wire the server into agent configs (#32 — install.sh seeds the
\`mcp_servers\` entry for claude/opencode/etc.)
- Implement the \`!secure request\` localhost-paste flow that
\`add_secret_request\` currently stubs out
- Add a \`describe_secret(name)\` tool returning rich metadata
(description, last-rotated, allowed-destinations) without ever
returning the value
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR scaffolds an agent-facing “secret discovery” MCP server (zeroclawed-mcp) and wires up a !secure chat command path backed by a shared FnoxClient wrapper, with the explicit goal of exposing secret names and canonical references without ever exposing secret values to agents.
Changes:
- Add
crates/zeroclawed-mcpMCP server (stdio transport) with tools:list_secrets,secret_reference,add_secret_request(stub) + tests. - Add
onecli_client::FnoxClientwrapper with typedFnoxErrorand hermetic tests viawith_binary. - Add
!securehandling in ZeroClawed (command parsing + Telegram interception) and add the new crate to the workspace.
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/zeroclawed/src/commands.rs | Adds !secure command detection/dispatch, secure_set/list/help, and associated tests. |
| crates/zeroclawed/src/channels/telegram.rs | Intercepts !secure post-auth to avoid routing secret values to agents and avoids logging message text. |
| crates/zeroclawed-mcp/src/main.rs | Runs the MCP server over rmcp stdio with tracing to stderr. |
| crates/zeroclawed-mcp/src/lib.rs | Implements the MCP tools and validates secret-name shape; includes tool-contract tests. |
| crates/zeroclawed-mcp/Cargo.toml | New crate manifest and rmcp dependency selection. |
| crates/onecli-client/src/lib.rs | Exposes the new fnox_client module and re-exports FnoxClient/FnoxError. |
| crates/onecli-client/src/fnox_client.rs | Implements the fnox subprocess wrapper + typed errors + tests. |
| crates/onecli-client/Cargo.toml | Adds tempfile dev-dependency for the new tests. |
| Cargo.toml | Adds zeroclawed-mcp to workspace members and default-members. |
| Cargo.lock | Records new dependencies (including an additional rmcp version). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Handle `!secure <subcommand>`. Dispatches to `fnox` as a | ||
| /// subprocess; nothing about the secret value transits an agent's | ||
| /// context. Never echoes back values — responses are name-only. | ||
| /// | ||
| /// Subcommands: | ||
| /// - `!secure set NAME=value` — store a secret | ||
| /// - `!secure list` — list stored secret names | ||
| /// - `!secure help` — usage string | ||
| /// | ||
| /// **Retention warning** (documented in first-time-use UX): | ||
| /// this command's text passes through the chat transport | ||
| /// (Telegram/Matrix/WhatsApp), which retains message history. | ||
| /// For values where chat-transport exposure is unacceptable, | ||
| /// a follow-up `!secure request NAME` flow (out-of-band) is | ||
| /// planned — see `docs/rfcs/agent-secret-gateway.md` §5. | ||
| pub async fn handle_secure(&self, text: &str, _identity_id: &str) -> String { | ||
| let trimmed = text.trim(); |
There was a problem hiding this comment.
The docstring says !secure runs post-auth “so we can audit who set what”, but identity_id is unused (_identity_id) and neither the handler nor secure_set/secure_list record the actor + secret name anywhere. Either remove the audit claim or emit an audit log/event that includes identity_id and the secret name (never the value).
| // Only allow the same name-shape substitution accepts, so callers | ||
| // can immediately use the stored name in a `{{secret:NAME}}` ref. | ||
| // See crates/security-proxy/src/substitution.rs. |
There was a problem hiding this comment.
This comment references crates/security-proxy/src/substitution.rs, but that file doesn’t exist in the repo (security-proxy has no substitution module). Please update the reference to the actual secret-substitution implementation (or remove the pointer) so future readers aren’t sent on a dead-end chase.
| // Only allow the same name-shape substitution accepts, so callers | |
| // can immediately use the stored name in a `{{secret:NAME}}` ref. | |
| // See crates/security-proxy/src/substitution.rs. | |
| // Only allow the same secret-name shape used by substitution, so callers | |
| // can immediately use the stored name in a `{{secret:NAME}}` ref. |
| // These use the same fake-fnox-on-PATH trick as | ||
| // `onecli-client/tests/vault_fallthrough.rs`: a temp dir holding a | ||
| // shell script named `fnox` goes to the FRONT of PATH; that script | ||
| // acts like fnox for the test's purposes. Real fnox presence on | ||
| // the dev machine doesn't affect the result. |
There was a problem hiding this comment.
The test scaffolding comment says “No PATH manipulation / no global env mutation” in the PR description, but these tests explicitly prepend a temp dir to PATH and rely on it. If PATH mutation is required here, please update the test/module comment (and/or PR description) so the hermeticity claim matches reality, or refactor to use FnoxClient::with_binary(...) injection like the other tests do.
| @@ -13,6 +14,7 @@ members = [ | |||
| # loom-tests requires RUSTFLAGS='--cfg loom' — exclude from default cargo test | |||
| default-members = [ | |||
| "crates/zeroclawed", | |||
| "crates/zeroclawed-mcp", | |||
| "crates/adversary-detector", | |||
There was a problem hiding this comment.
CI currently runs cargo test -p <crate> for a fixed crate matrix; adding crates/zeroclawed-mcp to workspace/default-members won’t automatically run its tests in that job. Please ensure the CI test matrix (or a cargo test --workspace) is updated so zeroclawed-mcp’s tool-contract tests actually execute on PRs.
| # Crate-specific | ||
| # Official Rust MCP SDK. The "server" feature pulls in the | ||
| # stdio transport and tool/resource registration macros. | ||
| rmcp = { version = "0.8", features = ["server", "macros", "transport-io", "schemars"] } |
There was a problem hiding this comment.
This pins rmcp = "0.8", but the workspace already pulls in rmcp 0.9.1 (via other deps, as evidenced by the Cargo.lock changes). Carrying two rmcp major/minor versions increases build time/binary size and can complicate future upgrades; consider aligning zeroclawed-mcp to the already-present rmcp version (or moving rmcp to a workspace dependency) to avoid the duplicate graph.
| rmcp = { version = "0.8", features = ["server", "macros", "transport-io", "schemars"] } | |
| rmcp = { version = "0.9.1", features = ["server", "macros", "transport-io", "schemars"] } |
| /// `fnox set <name> <value>` — store a secret. | ||
| /// | ||
| /// `value` is passed as a CLI argument, which fnox docs note is | ||
| /// visible in `ps` output during the call. For our use case | ||
| /// (`!secure set`) the value is already in the chat transport, | ||
| /// so this isn't a regression. Future improvement: pipe value | ||
| /// via stdin, which fnox supports when value is omitted. | ||
| pub async fn set(&self, name: &str, value: &str) -> Result<(), FnoxError> { | ||
| debug!("fnox set {}", name); | ||
| let output = self | ||
| .run(&["set", name, value]) | ||
| .await | ||
| .map_err(FnoxError::NotInstalled)?; |
There was a problem hiding this comment.
FnoxClient::set passes the secret value as a CLI argument (fnox set <name> <value>), which is typically visible to other local users via process listings while the subprocess runs. Since this path is explicitly for handling secrets, it would be safer to pass the value via stdin (fnox supports prompting/stdin when value is omitted) to avoid an additional local-side-channel leak.
| dir.path().to_path_buf() | ||
| } | ||
|
|
||
| struct PathGuard { | ||
| original: Option<String>, | ||
| } | ||
| impl PathGuard { | ||
| fn prepend(dir: &std::path::Path) -> Self { | ||
| let original = std::env::var("PATH").ok(); | ||
| let new_path = match &original { | ||
| Some(p) => format!("{}:{}", dir.display(), p), | ||
| None => dir.display().to_string(), | ||
| }; | ||
| // Safety: tests holding SECURE_ENV_MUTEX serialize env | ||
| // mutation. `std::env::set_var` is marked unsafe in | ||
| // Rust 2024 for this exact reason. | ||
| unsafe { | ||
| std::env::set_var("PATH", new_path); | ||
| } | ||
| Self { original } | ||
| } | ||
| } | ||
| impl Drop for PathGuard { | ||
| fn drop(&mut self) { | ||
| unsafe { | ||
| match &self.original { | ||
| Some(p) => std::env::set_var("PATH", p), | ||
| None => std::env::remove_var("PATH"), | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
These tests mutate the process-wide PATH via std::env::set_var. Even with SECURE_ENV_MUTEX, other tests can still read env vars concurrently, which is exactly the unsoundness that motivated making env mutation unsafe; this can lead to flaky/UB behavior under parallel cargo test. Prefer avoiding global env mutation by making the fnox binary injectable (e.g., store a FnoxClient on CommandHandler or pass a FnoxClient::with_binary(...) into secure_set/secure_list), so tests stay hermetic without touching PATH.
| dir.path().to_path_buf() | |
| } | |
| struct PathGuard { | |
| original: Option<String>, | |
| } | |
| impl PathGuard { | |
| fn prepend(dir: &std::path::Path) -> Self { | |
| let original = std::env::var("PATH").ok(); | |
| let new_path = match &original { | |
| Some(p) => format!("{}:{}", dir.display(), p), | |
| None => dir.display().to_string(), | |
| }; | |
| // Safety: tests holding SECURE_ENV_MUTEX serialize env | |
| // mutation. `std::env::set_var` is marked unsafe in | |
| // Rust 2024 for this exact reason. | |
| unsafe { | |
| std::env::set_var("PATH", new_path); | |
| } | |
| Self { original } | |
| } | |
| } | |
| impl Drop for PathGuard { | |
| fn drop(&mut self) { | |
| unsafe { | |
| match &self.original { | |
| Some(p) => std::env::set_var("PATH", p), | |
| None => std::env::remove_var("PATH"), | |
| } | |
| } | |
| } | |
| } | |
| bin | |
| } |
| /// Validates a secret name against the same shape the substitution | ||
| /// engine accepts (see `crates/security-proxy/src/substitution.rs`). | ||
| /// Keeping the two in sync is critical: if `secret_reference` | ||
| /// produced a token the substitution engine wouldn't accept, the | ||
| /// agent's request would silently fail downstream. | ||
| fn is_valid_name(name: &str) -> bool { |
There was a problem hiding this comment.
This doc points to crates/security-proxy/src/substitution.rs for the substitution engine, but that module/file doesn’t exist in crates/security-proxy/src. Please update the reference (or link to the actual substitution implementation) so the MCP’s threat-model docs remain accurate and verifiable.
| // `!secure ...` — split off the subcommand word. | ||
| let mut parts = trimmed.splitn(3, ' '); | ||
| let _lead = parts.next(); // `!secure` |
There was a problem hiding this comment.
handle_secure splits on the literal space character via splitn(3, ' '). With multiple spaces (common on mobile, e.g. "!secure set KEY=..."), sub becomes an empty string and the command falls into the help path instead of running set. Consider using split_whitespace() (or filtering empty segments) for parsing the lead/subcommand/rest so extra whitespace and tabs don’t break dispatch.
| // `!secure ...` — split off the subcommand word. | |
| let mut parts = trimmed.splitn(3, ' '); | |
| let _lead = parts.next(); // `!secure` | |
| // `!secure ...` — parse the subcommand using whitespace-aware splitting | |
| // so repeated spaces or tabs do not produce an empty subcommand. | |
| let after_lead = trimmed | |
| .strip_prefix("!secure") | |
| .map(str::trim_start) | |
| .unwrap_or(""); | |
| let mut parts = after_lead.splitn(2, char::is_whitespace); |
|
Codex integration sweep note: I reviewed the inline comments on this PR. GitHub rejected direct inline replies for these older/outdated review comments with HTTP 422, so I am responding top-level instead: 3141257686, 3141257692, 3141257703, 3141257709, 3141257720, 3141257730, 3141257738, 3141257745, 3141257756.\n\nI did not edit this branch. Items that overlap the secure/fnox/host-agent/digest integration work are addressed in draft PR #38 (codex-integration-code), including stdin-based fnox set, bounded fnox waits, whitespace-safe !secure parsing, identity-aware !secure audit logs, valid-input host-agent properties, real WhatsApp HMAC verification, loopback OneCLI default bind, and race-free digest temp paths. Remaining PR-specific findings stay actionable for this branch owner or a follow-up. |
Cross-cutting triage finding (PRs #20, #21, #23, #26, #28, #31): the !secure parser used `splitn(' ')` which mis-shapes multi-space input — `"!secure set NAME=v"` parsed as sub="" rest="set NAME=v", silently routing through the unknown-subcommand path with the secret in the help-error message. Switch to `split_whitespace()`. Also wires the previously-unused `identity_id` into an audit log so the `_` prefix can drop. The log records identity + subcommand, never the value (`rest`). Closes the "audit-claim docstring with no audit implementation" finding from triage.
…ning (#44) Squash-merge of integration/super-combined — 4 weeks of feature work + cross-PR security fixes + codex agent's hardening, all green CI (14/14 checks). ## Features landing - **fnox secret-resolver integration** (#15) + FnoxClient subprocess wrapper (#21) - **Adversarial commit-reviewer + mechanical pre-commit gate** (#18) - **{{secret:NAME}} substitution engine** in security-proxy URL/headers/body (#19) - **Per-secret destination allowlist** (#22) — RFC §11.1 attack defense - **!secure chat commands** (set/list) on Telegram (#20), Matrix (#28), WhatsApp (#31) - **zeroclawed-mcp** scaffold — agent-facing secret discovery server (#23) - **install.sh wires MCP** into Claude Code agent configs (#26) - **zeroclawed-secret-paste** — localhost web UI for one-shot secret input (#34) - **Bulk paste UI** — .env-style multi-secret onboarding with per-line results - **LAN-friendly defaults** — bind 0.0.0.0 + RFC 1918 Origin acceptance - **WhatsApp HMAC verification** (was always-true placeholder before — codex hardening) ## Security fixes folded in - /vault/:secret bearer auth + 127.0.0.1 default bind (#39) - URL-embedded secrets honor destination allowlist (#41) - Paste-flow: bearer URL only at debug, fnox set via stdin not argv (#40) - Paste-flow: graceful shutdown, exit-on-submit, reject Origin: null (#43) - Subprocess timeouts + kill_on_drop on FnoxClient - BrokenPipe-tolerant stdin write (Linux CI surface) - Header-value log redaction - OneCLI bound to 127.0.0.1 by default - Sanitized real API token + Telegram IDs from sample configs (#36) ## Architecture / refactors - Consolidated onecli binary into security-proxy (#17) - Hardcoded vault URL removed from onecli-client - security-proxy resolver wired into hot path - Extracted build_app router; migrated /vault/:secret route - !secure parser uses split_whitespace (was splitn), audit-logs invocations ## Test coverage added - security-proxy substitution engine + body/headers tests - onecli-client retry + Http(_) variant + adversarial fallthrough suite - onecli-client client.rs rewritten from tautologies to wiremock-backed - config/validator coverage (was zero, now 290-line module covered) - 16 zeroclawed-secret-paste tests including bulk-mode cases ## Docs / RFCs - agent-secret-gateway holistic architecture - consolidation-findings (what #28 must address) - secret-input-web-ui RFC (input-only, new-by-default) - browser-harness integration spike - test-quality-audit Round 1+2+3 (host-agent + zeroclawed priority files) ## Codex agent's hardening cherry-picks - Subprocess timeouts on fnox calls - map_spawn_error helper - Validator hardening + atomic-counter digest race fix - WhatsApp HMAC implementation + tests - proxy header-value log redaction CI: all 14 checks green at squash time. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
|
Subsumed by #44 (squashed to |
…ning (#44) Squash-merge of integration/super-combined — 4 weeks of feature work + cross-PR security fixes + codex agent's hardening, all green CI (14/14 checks). ## Features landing - **fnox secret-resolver integration** (#15) + FnoxClient subprocess wrapper (#21) - **Adversarial commit-reviewer + mechanical pre-commit gate** (#18) - **{{secret:NAME}} substitution engine** in security-proxy URL/headers/body (#19) - **Per-secret destination allowlist** (#22) — RFC §11.1 attack defense - **!secure chat commands** (set/list) on Telegram (#20), Matrix (#28), WhatsApp (#31) - **zeroclawed-mcp** scaffold — agent-facing secret discovery server (#23) - **install.sh wires MCP** into Claude Code agent configs (#26) - **zeroclawed-secret-paste** — localhost web UI for one-shot secret input (#34) - **Bulk paste UI** — .env-style multi-secret onboarding with per-line results - **LAN-friendly defaults** — bind 0.0.0.0 + RFC 1918 Origin acceptance - **WhatsApp HMAC verification** (was always-true placeholder before — codex hardening) ## Security fixes folded in - /vault/:secret bearer auth + 127.0.0.1 default bind (#39) - URL-embedded secrets honor destination allowlist (#41) - Paste-flow: bearer URL only at debug, fnox set via stdin not argv (#40) - Paste-flow: graceful shutdown, exit-on-submit, reject Origin: null (#43) - Subprocess timeouts + kill_on_drop on FnoxClient - BrokenPipe-tolerant stdin write (Linux CI surface) - Header-value log redaction - OneCLI bound to 127.0.0.1 by default - Sanitized real API token + Telegram IDs from sample configs (#36) ## Architecture / refactors - Consolidated onecli binary into security-proxy (#17) - Hardcoded vault URL removed from onecli-client - security-proxy resolver wired into hot path - Extracted build_app router; migrated /vault/:secret route - !secure parser uses split_whitespace (was splitn), audit-logs invocations ## Test coverage added - security-proxy substitution engine + body/headers tests - onecli-client retry + Http(_) variant + adversarial fallthrough suite - onecli-client client.rs rewritten from tautologies to wiremock-backed - config/validator coverage (was zero, now 290-line module covered) - 16 zeroclawed-secret-paste tests including bulk-mode cases ## Docs / RFCs - agent-secret-gateway holistic architecture - consolidation-findings (what #28 must address) - secret-input-web-ui RFC (input-only, new-by-default) - browser-harness integration spike - test-quality-audit Round 1+2+3 (host-agent + zeroclawed priority files) ## Codex agent's hardening cherry-picks - Subprocess timeouts on fnox calls - map_spawn_error helper - Validator hardening + atomic-counter digest race fix - WhatsApp HMAC implementation + tests - proxy header-value log redaction CI: all 14 checks green at squash time. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
V1 of `.github/copilot-instructions.md` was ~970 words and read more like
documentation than reviewer guidance. Two issues that hurt signal:
1. **Length** — Copilot's per-repo instructions read window is ~4000
chars; v1 was over that, so the trailing past-mistake list and
skip-list were getting truncated.
2. **Format** — long bulleted exposition reads less like a rule and
more like prose, which Copilot treats as background context rather
than as constraints to apply.
V2 changes:
- Cuts to ~3500 chars by condensing the prioritization tiers and
removing per-class HIGH/MED/LOW exposition (the priority-order list
carries the same info in 5 lines).
- Leads with a single review philosophy line ("if uncertain, do not
comment"), the highest-leverage rule borrowed from deno's
copilot-instructions.md.
- Names specific past Copilot noise patterns from this repo's PR
history (env-mutex/serial_test repeated 8+ times across #19/#22/#23;
dead-doc-reference 4x across #20/#23/#25) so the "don't repeat
across PRs" rule has teeth.
- Cross-references a new path-scoped file at
`.github/instructions/rust.instructions.md` (`applyTo: "**/*.rs"`),
which carries the Rust-specific review nits (`#[expect]` over
`#[allow]`, `// SAFETY:` requirement, `Mutex` across `.await`,
`select!` cancellation safety, `kill_on_drop`, `&str` over
`&String`, `LazyLock<Regex>` for hot paths, etc.).
Path-scoped instructions are loaded only when a PR touches a file
matching `applyTo`, so Rust-specific rules don't burn the global
4000-char budget on PRs that only touch docs / TOML / shell.
…de (#56) * chore(.github): add copilot-instructions.md to tune PR-review behavior GitHub Copilot supports per-repo review instructions at .github/copilot-instructions.md (≤2 pages, applied to every Copilot PR review automatically). Adds calciforge-specific guidance to improve signal-to-noise: - Skip what pre-commit already gates (fmt, clippy, gitleaks) - Prioritize HIGH-severity classes that bit us in past reviews: secret leakage in logs, substitution-boundary correctness, unwrap/expect outside tests, missing unsafe around set_var (edition 2024), blocking I/O in async, auth bypass paths - Tell Copilot what's NOT a bug despite looking like one: {{secret:NAME}} sentinel syntax, post-history-scrub fake test values, FnoxClient subprocess-by-design, clashd/zeroclaw_* upstream references, mixed Rust edition (known) - Past-mistake checklist (6 classes from real findings that landed and were caught later — substitution-after-bypass, None dest_host, bearer-in-info-log, fnox set argv leak, 0.0.0.0 default, hardcoded fallback URLs) - Skip even-if-correct: 'consider adding tests' without specifics, rename suggestions vs. functional convention, feature-creep proposals Cross-references AGENTS.md (host-agent coding standards) and CLAUDE.md (public-repo secret discipline) so Copilot follows both. 83 lines, well under the documented 2-page cap. * chore(.github): tighten copilot-instructions + add path-scoped Rust file V1 of `.github/copilot-instructions.md` was ~970 words and read more like documentation than reviewer guidance. Two issues that hurt signal: 1. **Length** — Copilot's per-repo instructions read window is ~4000 chars; v1 was over that, so the trailing past-mistake list and skip-list were getting truncated. 2. **Format** — long bulleted exposition reads less like a rule and more like prose, which Copilot treats as background context rather than as constraints to apply. V2 changes: - Cuts to ~3500 chars by condensing the prioritization tiers and removing per-class HIGH/MED/LOW exposition (the priority-order list carries the same info in 5 lines). - Leads with a single review philosophy line ("if uncertain, do not comment"), the highest-leverage rule borrowed from deno's copilot-instructions.md. - Names specific past Copilot noise patterns from this repo's PR history (env-mutex/serial_test repeated 8+ times across #19/#22/#23; dead-doc-reference 4x across #20/#23/#25) so the "don't repeat across PRs" rule has teeth. - Cross-references a new path-scoped file at `.github/instructions/rust.instructions.md` (`applyTo: "**/*.rs"`), which carries the Rust-specific review nits (`#[expect]` over `#[allow]`, `// SAFETY:` requirement, `Mutex` across `.await`, `select!` cancellation safety, `kill_on_drop`, `&str` over `&String`, `LazyLock<Regex>` for hot paths, etc.). Path-scoped instructions are loaded only when a PR touches a file matching `applyTo`, so Rust-specific rules don't burn the global 4000-char budget on PRs that only touch docs / TOML / shell. * chore(.github): restore AGENTS.md + CLAUDE.md cross-refs in copilot instructions Verified GitHub's copilot-instructions docs do not specify the ~4000-char read window I'd assumed in the previous commit — that was the older Copilot Chat feature, not the Copilot code-review one. With no real length pressure, the AGENTS.md / CLAUDE.md pointers (dropped in v2 to save chars) are worth restoring. CLAUDE.md's "never commit these" list is exactly the kind of leakage Copilot should be enforcing on diff. * docs: split AGENTS.md into workspace-wide root + host-agent crate file The root `AGENTS.md` was titled "Calciforge Host-Agent" and carried host-agent-specific build/architecture rules — at the repo root, where agents (Claude Code, Codex, Copilot cloud agent, OpenClaw) read it as workspace-wide guidance. The mismatch meant agents working in any non-host-agent crate were getting irrelevant rules ("ZFS snapshot delegation", "mTLS CN→Unix user") and missing the actually-shared ones (substitution-boundary order, sentinel string contract, public-repo secret discipline pointer). Restructure: - Move the existing host-agent content verbatim to `crates/host-agent/AGENTS.md` (`git mv` so history is preserved). - New root `AGENTS.md` covers the whole workspace: crate inventory, project vocabulary, mandatory rules every agent must follow (CLAUDE.md secret discipline, pre-commit gate, sentinel contract, substitution boundary order, no-secret-values-in-logs, fnox stdin mode), workspace build/test commands, and pointers into per-area files (`crates/host-agent/AGENTS.md`, `docs/rfcs/`, `docs/security-gateway.md`, `docs/model-gateway.md`). Cross-refs `.github/copilot-instructions.md` and `.github/instructions/rust.instructions.md` so agents that find AGENTS.md first can pick up the Copilot-specific tuning if relevant. Pairs with the copilot-instructions tightening earlier on this branch. * fix(.github): correct gitleaks allowlist description in copilot-instructions Copilot's review caught a real factual error in v2: the line claimed specific literals (`+15555550100`, `7000000001`, `eyJ0eXAi...`) were allowlisted in `.gitleaks.toml`. They aren't — `7000000001` is even used in non-allowlisted source (`crates/calciforge/src/auth.rs`). The real allowlist mechanism is path-based (`tests/**/fixtures/`, `docs/rfcs/*.md`, lockfiles, etc.) plus a small regex list (loopback, RFC 5737, a few inherited-from-main values). Replace the misleading "these specific literals are allowlisted" claim with an accurate description of how the allowlist actually works, so Copilot doesn't downgrade real findings on the assumption they fall under a non-existent literal-match exemption. Pleasingly meta: this is exactly the "verify against the codebase before commenting" rule from the same file working as intended on the PR that introduced the file.
Summary
Per `docs/rfcs/agent-secret-gateway.md` §4. New crate
`crates/zeroclawed-mcp` exposes a deliberately narrow MCP surface
that lets agents:
never values)
(`secret_reference(name) → "{{secret:NAME}}"`)
stub)
The core threat-model decision: this server does NOT expose
`get_secret`. Any agent that connects to MCP could otherwise
enumerate names and pull values. Values flow through the
security-proxy substitution layer ONLY, where they're injected at
the network boundary and never enter agent context.
Implementation
substitution engine — guards against returning a token that would
silently fail downstream
agent-spawned subprocesses; splitting out means agents that don't
need secret discovery don't pay startup cost, and gateway failures
here don't take down the main daemon
manipulation
7 given/when/then tests
safety contract: high-value secrets must NOT slip through to chat
Stacks on PR #21 (FnoxClient)
Depends on `onecli_client::FnoxClient` from PR #21.
Follow-ups
`add_secret_request` currently stubs
🤖 Generated with Claude Code