feat(qa): add browse subcommand reference table #87

Closed
kaicianflone wants to merge 1 commit into garrytan:main from kaicianflone:consensus-guard/qa-skill-improvement

@kaicianflone kaicianflone commented Mar 16, 2026

Summary

  • Adds a Browse Binary — Subcommand Reference table to qa/SKILL.md documenting all 10 browse subcommands with flags, argument types, and descriptions
  • Adds element reference (@eN) lifecycle documentation warning about staleness after navigation
  • Generated and validated by the consensus-guard-demo: 5 AI guard agents evaluated and unanimously approved this improvement via weighted consensus voting

Eval Results

LLM-as-judge scores (claude-sonnet-4-6, each verified 3x for consistency):

| Version | Clarity | Completeness | Actionability | Avg |
|---|---|---|---|---|
| Before | 4/5 | 3/5 | 3/5 | 3.3/5 |
| After | 5/5 | 5/5 | 5/5 | 5.0/5 |

+50% improvement in the average score (3.33/5 to 5.0/5), consistent across 3 independent eval runs per version.
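
For reviewers who want to recheck the arithmetic, here is a minimal sketch that recomputes the averages and the relative gain from the per-criterion scores reported above. The function names are illustrative and are not part of consensus-guard-demo. Computed from the unrounded average (10/3), the gain is about 50%; starting from the rounded 3.3 instead gives roughly 51.5%.

```python
from statistics import mean

# Per-criterion judge scores as reported in this PR (out of 5).
before = {"clarity": 4, "completeness": 3, "actionability": 3}
after = {"clarity": 5, "completeness": 5, "actionability": 5}


def average(scores):
    """Unweighted mean of the per-criterion scores."""
    return mean(scores.values())


def relative_improvement(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


avg_before = average(before)
avg_after = average(after)
improvement = relative_improvement(avg_before, avg_after)

print(f"before={avg_before:.1f} after={avg_after:.1f} improvement={improvement:+.0f}%")
```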

Why this change

The judge consistently flagged that qa/SKILL.md uses browse commands ($B goto, $B snapshot -i, $B fill @e3, etc.) throughout but never formally documents what subcommands, flags, or argument types are valid. An AI agent had to infer usage from scattered examples, risking incorrect invocations.

Guard consensus

All 5 guard agents voted YES (combined risk: 0.17):

  • Doc Architect: "The structural change is well-executed"
  • API Accuracy: "The subcommand reference table is a genuine improvement"
  • Agent Usability: "The proposed change substantially improves usability"
  • Completeness Auditor: "The proposed addition substantially improves completeness"
  • Style Guardian: "The document is high quality and the proposed change is consistent"
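
This PR does not describe how the tool combines votes, so the following is only a sketch of one way weighted consensus voting could be tallied. It is not the consensus-guard-demo implementation: the `GuardVote` type, the weights, the per-guard risk values, and the 0.5 approval threshold are all hypothetical, and the numbers are not meant to reproduce the 0.17 figure above.

```python
from dataclasses import dataclass


@dataclass
class GuardVote:
    guard: str
    approve: bool
    weight: float  # hypothetical voting weight
    risk: float    # hypothetical per-guard risk estimate in [0, 1]


def tally(votes, threshold=0.5):
    """Return (approved, weighted yes share, weight-averaged risk)."""
    total_weight = sum(v.weight for v in votes)
    yes_weight = sum(v.weight for v in votes if v.approve)
    yes_share = yes_weight / total_weight
    combined_risk = sum(v.weight * v.risk for v in votes) / total_weight
    return yes_share > threshold, yes_share, combined_risk


# Illustrative inputs only: equal weights and a placeholder risk per guard.
votes = [
    GuardVote("Doc Architect", True, 1.0, 0.2),
    GuardVote("API Accuracy", True, 1.0, 0.2),
    GuardVote("Agent Usability", True, 1.0, 0.2),
    GuardVote("Completeness Auditor", True, 1.0, 0.2),
    GuardVote("Style Guardian", True, 1.0, 0.2),
]

approved, yes_share, combined_risk = tally(votes)
print(approved, yes_share, round(combined_risk, 2))
```

With unequal weights, a single high-weight guard voting NO could pull the yes share below the threshold even when most guards approve, which is the main design difference from a simple majority count.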

Test plan

  • LLM judge scores verified 3x before (3.3/5) and 3x after (5.0/5)
  • Guard consensus: 5/5 YES votes
  • Content review: subcommand table matches browse/SKILL.md command list
  • Manual review of added markdown formatting

🤖 Generated with Claude Code via consensus-guard-demo

Generated by consensus-guard-demo: 5 AI guard agents evaluated and
approved this improvement via weighted consensus voting.

The qa/SKILL.md referenced browse CLI commands throughout but never
formally documented subcommands, flags, or argument types. An AI agent
had to infer usage from scattered examples. This adds a Browse Binary
Subcommand Reference table covering all 10 subcommands with their
flags, argument types, and descriptions, plus element reference
lifecycle documentation.

Judge eval scores (claude-sonnet-4-6, verified 3x):
  Before: clarity=4, completeness=3, actionability=3 (avg 3.3/5)
  After:  clarity=5, completeness=5, actionability=5 (avg 5.0/5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>