Skip to content

docs: test-quality-audit Round 3 (host-agent + zeroclawed priority files)#37

Closed
bglusman wants to merge 1 commit intomainfrom
docs/test-audit-round-3
Closed

docs: test-quality-audit Round 3 (host-agent + zeroclawed priority files)#37
bglusman wants to merge 1 commit intomainfrom
docs/test-audit-round-3

Conversation

@bglusman
Copy link
Copy Markdown
Owner

Subagent-driven Round 3 of the test audit, +334 lines on top of Rounds 1+2 baseline. 21 files audited (all 19 host-agent inline test modules + 2 zeroclawed priority files: config.rs, context.rs).

Top findings worth escalating:

  1. Real-looking 64-hex API token in config.rs SAMPLE_CONFIG — already filed as separate fix (PR security: sanitize real API token + Telegram ID missed by previous passes #36)
  2. adapters/exec.rs has zero tests on the privileged validate/execute decision tree — largest single coverage gap in host-agent
  3. auth/adapter.rs tests fictional types contradicting production policy (test says "Full autonomy never requires approval", production says it CANNOT bypass always_ask) — passing tests describing wrong behavior
  4. approval/mod.rs has only the negative path tested — full happy path / replay rejection / expiration / target-mismatch all uncovered
  5. identity_plugin.rs:203 tests str::starts_with, not the actual guard
  6. error.rs IntoResponse mapping unfenced — both tests are non-panic checks
  7. context.rs is the positive model — recommended reference pattern

Out-of-scope-but-flagged for next round: commands.rs (44 tests, 2158 LOC), install/ssh.rs, install/cli.rs, install/executor.rs, adapters/cli.rs.

Two real-looking secrets the audit discussed are redacted in the doc itself (truncated middle) so the doc doesn't trip gitleaks on the very thing it's flagging.

🤖 Generated with Claude Code

Consolidates all three rounds of the test-quality audit into one
file: \`docs/rfcs/test-quality-audit.md\` (906 lines, +334 from
Round 2's 572-line baseline).

## Round 3 scope (subagent-driven)

Files audited (21 total):

**Host-agent (all 19 inline test modules)**: error.rs, audit.rs,
metrics.rs, rate_limit.rs, perm_warn.rs, config.rs, auth/{adapter,
identity}.rs, adapters/{exec,systemd,registry,pct,git,zfs}.rs,
zfs/mod.rs, approval/{signal,identity_plugin,token,mod}.rs.

**Zeroclawed priority files**: config.rs (18 tests), context.rs
(19 tests).

**Files NOT audited (in-scope but past the 25-min time cap)**:
\`commands.rs\` (44 tests, 2158 LOC), \`install/ssh.rs\`,
\`install/cli.rs\`, \`install/executor.rs\`, \`adapters/cli.rs\`.
Documented in the round's scope-coverage summary so the next pass
can pick up cleanly.

## Top findings worth escalating

1. **Real-looking 64-hex API token in \`config.rs\` SAMPLE_CONFIG**
   (lines 807, 923) — \`zc_4f5c220eec86…\`. Round 2 sanitized the
   Telegram IDs in this file but missed the token. **Filed as
   separate fix in security/sanitize-config-rs-secrets branch (PR
   #36).**
2. **\`adapters/exec.rs\` has zero tests on the privileged
   validate/execute decision tree** — the largest single coverage
   gap in host-agent. Stdlib-tautology tests only.
3. **\`auth/adapter.rs\` tests fictional types** that contradict
   production policy (test-local says "Full autonomy never requires
   approval"; production at \`config.rs:447\` says "Full autonomy
   CANNOT bypass \`always_ask\`"). Tests are passing while
   describing wrong behavior.
4. **\`approval/mod.rs\` has only the negative path tested** — the
   full create→Signal-confirm→consume happy path, token-replay
   rejection, expiration, target-mismatch, and not-yet-approved
   branches are all uncovered. Largest authorization-flow gap in
   the crate.
5. **\`approval/identity_plugin.rs:203\`
   \`test_relative_path_rejected\`** tests \`str::starts_with\`,
   not the actual guard. Comment admits it.
6. **\`error.rs\` IntoResponse mapping** is unfenced — both tests
   are \`let _ = err.into_response()\` non-panic checks.
7. **\`context.rs\` is a positive example** — 19 uniformly
   behavioral, multi-assertion, edge-case tests. Recommended as
   the reference pattern for the codebase's other test modules.

## Why a separate branch

Lands the docs separately from any code changes so the audit
findings ship without waiting on the consolidation refactor (PR #17)
or the ongoing FnoxClient/MCP work to merge. PR #17's commit
\`b2614c9a\` introduces the file with Rounds 1+2 — when both branches
land, the merge is straightforward (this branch's content
strictly extends).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 25, 2026 03:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@bglusman
Copy link
Copy Markdown
Owner Author

Codex integration sweep: reviewed this open PR during the cross-PR pass. I did not find inline review comments to answer here, and I did not edit this branch. Any test-quality recommendations that overlap host-agent validator coverage are represented in draft integration PR #38.

@bglusman
Copy link
Copy Markdown
Owner Author

Subsumed by #44 (squashed to 9ed51fbc on main). All commits from this branch are present in the squash. Closing as redundant rather than merging again.

@bglusman bglusman closed this Apr 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants