feat(#218): audit-skill artefact persistence + canonical structure#222
Conversation
Tech design at docs/technical-designs/audit-artefact-persistence.md
covers the full feature: domain model, C4 L3 component diagram,
Mermaid DFD, shared lib API, JSON + MD frontmatter schema, storage
layout, launch-check backward-compat strategy, implementation tasks,
risks, and three open questions resolved in the AgDR.
AgDR-0019 records four load-bearing decisions:
- Paired JSON + MD per run (preserves /launch-check's existing pattern,
zero migration for adopters with committed run history)
- Rigid common-denominator findings shape {id, severity, status,
summary} with per-dimension nuance in the MD body via per-dim
templates
- Read both old + new paths for /launch-check, write only to new
(non-destructive backward compat; trend reader gains one branch)
- Four discrete shell-lib functions mirroring the existing _lib-*.sh
shape adopters already know (audit_resolve_dir / audit_run_persist /
audit_run_list / audit_render_trend)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generalises /launch-check's per-run JSON + per-run MD + opt-in commit
marker convention into a shared shell library that all audit skills
consume, so every audit run produces structurally consistent on-disk
artefacts and the trend across runs becomes legible.
Components:
- .claude/hooks/_lib-audit-history.sh — four discrete shell functions
(audit_resolve_dir / audit_run_persist / audit_run_list /
audit_render_trend) mirroring the existing _lib-*.sh shape adopters
already know. ~280 LOC bash + jq.
- .claude/hooks/tests/test_audit_history.sh — 11 cases covering
resolve / persist / stats derivation / preserved stats / marker
semantics / sorted listing / legacy launch-check path read /
silent-on-1-run / generic trend rendering / launch-check legacy
dispatch / legacy+canonical merge regression.
- templates/audits/threat-model.md, templates/audits/security-review.md
— canonical body skeletons for the two pilot dimensions (reference
material per AgDR-0019; per-skill bodies are inline).
- .claude/skills/threat-model/SKILL.md — added Step 5 "Persist the
run + render trend" calling the lib. Severity vocabulary in the
payload is lowercase (lib stats expect this); visible Step 3 table
keeps its conventional capitalisation.
- .claude/skills/security-review/SKILL.md — same shape; persists per
PR review with the same score/verdict formula.
- .claude/skills/launch-check/SKILL.md — refactored Step 6 to consume
audit_run_persist + audit_render_trend. JSON is a SUPERSET preserving
the legacy scores{} / branch / commit / top_risks[] fields so the
existing render-trend.sh continues to plot the same chart unchanged.
audit_run_list merges canonical (audits/launch-check/runs/) +
legacy (launch-check/runs/) paths so adopters' existing history is
preserved with no `mv` required.
Decision rationale: docs/agdr/AgDR-0019-audit-artefact-persistence.md
Technical design + diagrams: docs/technical-designs/audit-artefact-persistence.md
Follow-up #221 covers the remaining 7 audit skills' retrofit
(compliance-check, accessibility-audit, performance-audit, seo-audit,
monitoring-audit, docs-audit, analytics-audit) — mechanical retrofit
per the pattern proven by the two pilots in this PR.
Closes #218
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
atlas-apex
left a comment
There was a problem hiding this comment.
Code Review: PR #222
Commit: 0498641505d6ce5fc1cf08553744ae77bcc497b2
Summary
Generalises /launch-check's per-run persistence convention into a shared bash lib (_lib-audit-history.sh, 4 functions) and pilots it on /threat-model + /security-review. /launch-check is refactored to consume the lib without behaviour change — JSON is a SUPERSET preserving the legacy scores{} map AND adding canonical dimension/score/findings[]/stats{}/schema_version. Legacy path read-merged transparently. AgDR-0019 captures the four load-bearing decisions. 11 lib tests pass locally (re-run on review: all green).
Checklist Results
- Architecture & Design: Pass — clean separation; lib mirrors existing
_lib-*.shshape; dispatch-to-legacy for launch-check is a sensible bridge. - Code Quality: Pass — bash is readable, jq invocations are correct, error paths return non-zero with stderr context.
- Testing: Pass — 11 cases cover all AC claims (write/read/derive/preserve/marker/sort/legacy/silent/render/dispatch/regression-merge); case 11 specifically validates the load-bearing legacy+canonical merge for
/launch-check. - Security: Pass (N/A — no auth/crypto/secrets touched).
- Performance: Pass — jq calls are bounded (one per file in trend window, default 5).
- PR Description & Glossary: Pass — 8-row glossary covers every new term (audit-history lib, per-run JSON/MD, findings shape, headline score, legacy+canonical merge, opt-in commit marker, verdict).
- Technical Decisions (AgDR): Pass — AgDR-0019 captures all four decisions (paired JSON+MD, rigid common-denominator findings, non-destructive launch-check backcompat, four-function shell lib) with options-considered tables and explicit per-decision rationale per the AgDR template.
Verifications performed
audit_run_persistjq (lines 165-180 of lib): correct. Schema-augmentation uses$in + {…}so caller-provided fields like launch-check'sscores{}/branch/commit/top_risks[]survive untouched. Stats derivation reduces overfindings[]only whenstatsis absent — case 4 confirms explicit stats are preserved.- Frontmatter jq (lines 188-204): emits valid YAML,
((.stats // {}).by_severity.X // 0)correctly fallbacks to 0 for every severity bucket. audit_run_listlegacy merge (lines 249-268): canonical + legacy paths globbed, ts-sorted ascending. Cases 7 + 11 both green.- Launch-check Step 6 superset: payload example at SKILL.md:249-265 includes both legacy fields (
scores,branch,commit,top_risks) and canonical (findings[]derived from per-dimension scores). After lib augmentation, JSON contains both — no field loss. Legacyrender-trend.shkeeps readingts/scores/verdictunchanged. - Pilot SKILL retrofit consistency: both
/threat-modelStep 5 and/security-reviewPersist section use identical score formula (max(0, 100 - 25*crit - 10*high - 3*med - 1*low)), identical verdict mapping (worst-severity rule →pass/conditional/fail), identical lowercase-severity rule, identical marker semantics. - Commit format: both commits match
type(#218): description(docs + feat). - AgDR template conformance: AgDR-0019 has the canonical context/options-considered/decision/consequences/artifacts sections; each of the four decisions has its own options table.
Issues Found
None blocking.
Suggestions (non-blocking, optional)
- The launch-check dispatch in
audit_render_trend(line 309-318) stages files into a tmp dir thenrm -rfs it — works fine, but ifrender-trend.shever grows to tail-on-stdin (unlikely) the staging becomes redundant. Worth a comment pinning the contract; current comment at lines 304-308 already covers this. Fine as-is. audit_render_trend's generic ASCII chart (lines 359-384) duplicatesrender-trend.shshape. Acceptable for now; AgDR-0019 explicitly defers full absorption ofrender-trend.shto a follow-up.- Follow-up #221 captures the remaining 7 retrofits — mechanical, well-scoped, no review concerns from this PR.
Verdict
APPROVED
The two-commit shape (docs first, then feat) is exemplary. Schema-superset approach for launch-check + read-merged legacy path is the right backcompat call — the regression test (case 11) makes that explicit. Pilot retrofits are clean and consistent. Ship it.
🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: `0498641505d6ce5fc1cf08553744ae77bcc497b2`
CI's markdownlint-cli2 flagged the "Design rationale:" list at line 358 of launch-check/SKILL.md for MD032 (blanks-around-lists). One-line fix: insert blank line after the section header. No content change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
atlas-apex
left a comment
There was a problem hiding this comment.
Re-Review: PR #222 (follow-up commit)
Commit: fbda3d3e86a5daff58d930bbafc8e467d2b62c01
Verification
- Diff confirmed: +1/-0, single blank line inserted after
Design rationale:heading at.claude/skills/launch-check/SKILL.md:358to satisfy markdownlint MD032 - No content changes — purely a CI-fix
- All 4 CI checks green: markdownlint-cli2, lychee, shellcheck, Verify Ticket ID
- Load-bearing decisions untouched: shared lib API (
_lib-audit-history.sh), JSON+MD artefact pair, launch-check superset schema, and legacy+canonical merge strategy all unchanged from prior approval at0498641505d6ce5fc1cf08553744ae77bcc497b2
Verdict
APPROVED — prior approval still applies in spirit; the markdownlint fix is the minimum mechanical change to unblock CI.
(Posted as comment because GitHub disallows self-approval; approval marker written at .claude/session/reviews/222-rex.approved.)
🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: fbda3d3e86a5daff58d930bbafc8e467d2b62c01
Mechanical retrofit completing the follow-up filed alongside #218. Each of the 7 remaining audit skills now persists a structured JSON+MD pair via audit_run_persist + renders trend via audit_render_trend on every run. Same shape as the threat-model and security-review pilots from #222. Skills retrofitted: - /compliance-check — GDPR/ePrivacy - /accessibility-audit — WCAG 2.1 AA - /performance-audit — bundle / images / caching - /seo-audit — meta / sitemap / OG / structured data - /monitoring-audit — logs / errors / health / alerts / runbooks - /docs-audit — Diataxis quadrants + README quality + staleness - /analytics-audit — SDK / event taxonomy / funnels / dashboards Each gains: - A "Persist the run + render trend" section before its existing "## Rules" section, with subsections for project/score/verdict resolution, payload+body construction, persist call, trend render, and opt-in commit marker - Two new Rules entries: "Always persist via the lib" + "Severity vocabulary in the JSON is lowercase" - Per-dim severity-vocabulary mapping documented (e.g. "Documentation readiness: PARTIAL" → conditional, "Incident readiness: NOT READY" → fail, etc.) 7 new canonical templates at templates/audits/<dim>.md, each demonstrating the dimension's findings shape (POUR groupings for accessibility, Diataxis quadrants for docs, regulatory exposure for compliance, etc.) per the convention shipped in #222. After this PR, all 9 audit skills (this 7 + threat-model + security-review) consume the shared lib uniformly. /launch-check already consumed the lib via its #222 refactor with backward-compat read-merge for adopters' existing history. Closes #221 Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
) * docs(#218): technical design + AgDR-0019 for audit-artefact persistence Tech design at docs/technical-designs/audit-artefact-persistence.md covers the full feature: domain model, C4 L3 component diagram, Mermaid DFD, shared lib API, JSON + MD frontmatter schema, storage layout, launch-check backward-compat strategy, implementation tasks, risks, and three open questions resolved in the AgDR. AgDR-0019 records four load-bearing decisions: - Paired JSON + MD per run (preserves /launch-check's existing pattern, zero migration for adopters with committed run history) - Rigid common-denominator findings shape {id, severity, status, summary} with per-dimension nuance in the MD body via per-dim templates - Read both old + new paths for /launch-check, write only to new (non-destructive backward compat; trend reader gains one branch) - Four discrete shell-lib functions mirroring the existing _lib-*.sh shape adopters already know (audit_resolve_dir / audit_run_persist / audit_run_list / audit_render_trend) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(#218): audit-skill artefact persistence + canonical structure Generalises /launch-check's per-run JSON + per-run MD + opt-in commit marker convention into a shared shell library that all audit skills consume, so every audit run produces structurally consistent on-disk artefacts and the trend across runs becomes legible. Components: - .claude/hooks/_lib-audit-history.sh — four discrete shell functions (audit_resolve_dir / audit_run_persist / audit_run_list / audit_render_trend) mirroring the existing _lib-*.sh shape adopters already know. ~280 LOC bash + jq. - .claude/hooks/tests/test_audit_history.sh — 11 cases covering resolve / persist / stats derivation / preserved stats / marker semantics / sorted listing / legacy launch-check path read / silent-on-1-run / generic trend rendering / launch-check legacy dispatch / legacy+canonical merge regression. - templates/audits/threat-model.md, templates/audits/security-review.md — canonical body skeletons for the two pilot dimensions (reference material per AgDR-0019; per-skill bodies are inline). - .claude/skills/threat-model/SKILL.md — added Step 5 "Persist the run + render trend" calling the lib. Severity vocabulary in the payload is lowercase (lib stats expect this); visible Step 3 table keeps its conventional capitalisation. - .claude/skills/security-review/SKILL.md — same shape; persists per PR review with the same score/verdict formula. - .claude/skills/launch-check/SKILL.md — refactored Step 6 to consume audit_run_persist + audit_render_trend. JSON is a SUPERSET preserving the legacy scores{} / branch / commit / top_risks[] fields so the existing render-trend.sh continues to plot the same chart unchanged. audit_run_list merges canonical (audits/launch-check/runs/) + legacy (launch-check/runs/) paths so adopters' existing history is preserved with no `mv` required. Decision rationale: docs/agdr/AgDR-0019-audit-artefact-persistence.md Technical design + diagrams: docs/technical-designs/audit-artefact-persistence.md Follow-up #221 covers the remaining 7 audit skills' retrofit (compliance-check, accessibility-audit, performance-audit, seo-audit, monitoring-audit, docs-audit, analytics-audit) — mechanical retrofit per the pattern proven by the two pilots in this PR. Closes #218 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(#218): blank line before list to satisfy markdownlint MD032 CI's markdownlint-cli2 flagged the "Design rationale:" list at line 358 of launch-check/SKILL.md for MD032 (blanks-around-lists). One-line fix: insert blank line after the section header. No content change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mechanical retrofit completing the follow-up filed alongside #218. Each of the 7 remaining audit skills now persists a structured JSON+MD pair via audit_run_persist + renders trend via audit_render_trend on every run. Same shape as the threat-model and security-review pilots from #222. Skills retrofitted: - /compliance-check — GDPR/ePrivacy - /accessibility-audit — WCAG 2.1 AA - /performance-audit — bundle / images / caching - /seo-audit — meta / sitemap / OG / structured data - /monitoring-audit — logs / errors / health / alerts / runbooks - /docs-audit — Diataxis quadrants + README quality + staleness - /analytics-audit — SDK / event taxonomy / funnels / dashboards Each gains: - A "Persist the run + render trend" section before its existing "## Rules" section, with subsections for project/score/verdict resolution, payload+body construction, persist call, trend render, and opt-in commit marker - Two new Rules entries: "Always persist via the lib" + "Severity vocabulary in the JSON is lowercase" - Per-dim severity-vocabulary mapping documented (e.g. "Documentation readiness: PARTIAL" → conditional, "Incident readiness: NOT READY" → fail, etc.) 7 new canonical templates at templates/audits/<dim>.md, each demonstrating the dimension's findings shape (POUR groupings for accessibility, Diataxis quadrants for docs, regulatory exposure for compliance, etc.) per the convention shipped in #222. After this PR, all 9 audit skills (this 7 + threat-model + security-review) consume the shared lib uniformly. /launch-check already consumed the lib via its #222 refactor with backward-compat read-merge for adopters' existing history. Closes #221 Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
.claude/hooks/_lib-audit-history.sh(4 functions:audit_resolve_dir,audit_run_persist,audit_run_list,audit_render_trend) generalises/launch-check's existing per-run persistence convention to the whole audit-skill family/threat-modeland/security-review— both now persist a JSON + MD pair per run and render the trend section after each invocation/launch-checkrefactored to consume the lib without behaviour change. JSON is a superset preserving the legacyscores{}map (so the existing chart renderer plots the same shape) plus the canonical generic fields (so the lib's trend renderer reads it). Adopters' existing history at the legacy path is read-merged transparently — nomvrequired, no destructive migrationtemplates/audits/threat-model.mdandtemplates/audits/security-review.md(reference material; per-skill bodies are inline per AgDR-0019)Why this matters
Before this PR the audit family was inconsistent:
/launch-checkpersisted (since #183) and rendered a trend chart; the other nine audit skills wrote only to stdout and disappeared as soon as the terminal scrolled. Comparing two threat models from a quarter apart was an exercise in reading two different free-form essays. After this PR every audit run produces a structurally consistent on-disk artefact with parseable frontmatter, and the trend renderer renders identically across all dimensions.The launch-check refactor is the load-bearing piece: it changes how persistence happens (lib-driven instead of inline) without changing what operators see (chart shape unchanged, existing history preserved). The regression test (case 11) confirms the legacy + canonical path merge keeps adopters' existing history visible on the first post-refactor run.
The decision airing in AgDR-0019 covers the four load-bearing calls (paired JSON+MD, rigid common-denominator findings shape, non-destructive backward compat, four discrete shell functions). The technical design at docs/technical-designs/audit-artefact-persistence.md embeds a Mermaid C4 L3 component diagram and a Mermaid DFD showing the audit subsystem boundaries.
Testing
bash .claude/hooks/tests/test_audit_history.sh→ 11 passed, 0 failedbash -n .claude/hooks/_lib-audit-history.sh→ OKstats.by_severity.high=2, .medium=1runs/ignored; present →!runs/*.jsonun-ignoredaudit_run_persistand the frontmatter-generation jq next to it are the two non-trivial pieces)scores{}/branch/commit/top_risks[]AND newdimension/score/findings[]/stats{}/schema_version)/threat-modelagainst any registered project, confirmprojects/<name>/audits/threat-model/<ts>.mdis written with valid frontmatterGlossary
.claude/hooks/_lib-audit-history.sh— shared shell library with four functions handling persistence + trend rendering for all audit skills. Sourced by SKILL.md flows.projects/<name>/audits/<dimension>/runs/<ts>.json. Schema includests,dimension,verdict,score,findings[],stats{}, plus dimension-specific extras (e.g. launch-check preservesscores{}+branch+commit+top_risks[]).projects/<name>/audits/<dimension>/<ts>.md. Frontmatter is generated by the lib from the JSON; body is freeform per dimension.{id, severity, status, summary}per finding. Per-dimension nuance (STRIDE category, OWASP class, WCAG criterion) lives in the MD body via per-dim templates./launch-checkkeeps its existingmean(scores.*)derivation for byte-equal chart output.audit_run_listreads JSON files from BOTHprojects/<name>/audits/<dim>/runs/(canonical) ANDprojects/<name>/launch-check/runs/(legacy, launch-check only) so adopters' existing trend history isn't orphaned by the path change..audit-history-tracked— presence-only file inside the dimension's audit dir that flips theruns/.gitignorefrom "ignore everything" to "un-ignore *.json". MD artefacts are committed unconditionally.pass/conditional/fail(generic vocabulary)./launch-checkkeeps its four-state vocabulary (go/go-with-warnings/conditional-go/no-go) in its body, but maps to the generic three-state in the frontmatter for cross-dim consistency.🤖 Generated with Claude Code