feat(#218): audit-skill artefact persistence + canonical structure by atlas-apex · Pull Request #222 · me2resh/apexyard

atlas-apex · 2026-05-11T21:50:20Z

Summary

New shared library .claude/hooks/_lib-audit-history.sh (4 functions: audit_resolve_dir, audit_run_persist, audit_run_list, audit_render_trend) generalises /launch-check's existing per-run persistence convention to the whole audit-skill family
Pilot retrofits on /threat-model and /security-review — both now persist a JSON + MD pair per run and render the trend section after each invocation
/launch-check refactored to consume the lib without behaviour change. JSON is a superset preserving the legacy scores{} map (so the existing chart renderer plots the same shape) plus the canonical generic fields (so the lib's trend renderer reads it). Adopters' existing history at the legacy path is read-merged transparently — no mv required, no destructive migration
11 lib tests covering write/read/derive/preserve/marker/sort/legacy-path/silent/render/dispatch/regression-merge — all passing
Per-dim canonical templates at templates/audits/threat-model.md and templates/audits/security-review.md (reference material; per-skill bodies are inline per AgDR-0019)
Follow-up [Chore] Retrofit remaining 7 audit skills onto _lib-audit-history.sh #221 filed for the remaining 7 audit skills (compliance-check, accessibility-audit, performance-audit, seo-audit, monitoring-audit, docs-audit, analytics-audit) — mechanical retrofit per the pattern proven here

Why this matters

Before this PR the audit family was inconsistent: /launch-check persisted (since #183) and rendered a trend chart; the other nine audit skills wrote only to stdout and disappeared as soon as the terminal scrolled. Comparing two threat models from a quarter apart was an exercise in reading two different free-form essays. After this PR every audit run produces a structurally consistent on-disk artefact with parseable frontmatter, and the trend renderer renders identically across all dimensions.

The launch-check refactor is the load-bearing piece: it changes how persistence happens (lib-driven instead of inline) without changing what operators see (chart shape unchanged, existing history preserved). The regression test (case 11) confirms the legacy + canonical path merge keeps adopters' existing history visible on the first post-refactor run.

The decision airing in AgDR-0019 covers the four load-bearing calls (paired JSON+MD, rigid common-denominator findings shape, non-destructive backward compat, four discrete shell functions). The technical design at docs/technical-designs/audit-artefact-persistence.md embeds a Mermaid C4 L3 component diagram and a Mermaid DFD showing the audit subsystem boundaries.

Testing

Lib tests: bash .claude/hooks/tests/test_audit_history.sh → 11 passed, 0 failed
Lib syntax: bash -n .claude/hooks/_lib-audit-history.sh → OK
Regression test for launch-check (case 11): one run in legacy path + one run in canonical path → both dates appear in the rendered chart
Stats derivation (case 3): payload with 2 high + 1 medium → stats.by_severity.high=2, .medium=1
Marker semantics (case 5): absent → runs/ ignored; present → !runs/*.json un-ignored
Reviewer can spot-check the lib's jq invocations (the schema-augmentation jq in audit_run_persist and the frontmatter-generation jq next to it are the two non-trivial pieces)
Reviewer can confirm the launch-check Step 6 rewrite preserves the JSON superset shape (legacy scores{} / branch / commit / top_risks[] AND new dimension / score / findings[] / stats{} / schema_version)
Manual smoke test post-merge: invoke /threat-model against any registered project, confirm projects/<name>/audits/threat-model/<ts>.md is written with valid frontmatter

Glossary

Term	Definition
Audit-history lib	`.claude/hooks/_lib-audit-history.sh` — shared shell library with four functions handling persistence + trend rendering for all audit skills. Sourced by SKILL.md flows.
Per-run JSON	The trend-renderer's input file at `projects/<name>/audits/<dimension>/runs/<ts>.json`. Schema includes `ts`, `dimension`, `verdict`, `score`, `findings[]`, `stats{}`, plus dimension-specific extras (e.g. launch-check preserves `scores{}` + `branch` + `commit` + `top_risks[]`).
Per-run MD	The durable human-readable artefact at `projects/<name>/audits/<dimension>/<ts>.md`. Frontmatter is generated by the lib from the JSON; body is freeform per dimension.
Findings shape	The common denominator across all nine audit dimensions: `{id, severity, status, summary}` per finding. Per-dimension nuance (STRIDE category, OWASP class, WCAG criterion) lives in the MD body via per-dim templates.
Headline score	A single 0-100 number per run plotted on the trend chart's Y-axis. Generic skills derive it from severity-weighted finding count; `/launch-check` keeps its existing `mean(scores.*)` derivation for byte-equal chart output.
Legacy + canonical merge	`audit_run_list` reads JSON files from BOTH `projects/<name>/audits/<dim>/runs/` (canonical) AND `projects/<name>/launch-check/runs/` (legacy, launch-check only) so adopters' existing trend history isn't orphaned by the path change.
Opt-in commit marker	`.audit-history-tracked` — presence-only file inside the dimension's audit dir that flips the `runs/` `.gitignore` from "ignore everything" to "un-ignore *.json". MD artefacts are committed unconditionally.
Verdict	One of `pass` / `conditional` / `fail` (generic vocabulary). `/launch-check` keeps its four-state vocabulary (`go` / `go-with-warnings` / `conditional-go` / `no-go`) in its body, but maps to the generic three-state in the frontmatter for cross-dim consistency.

🤖 Generated with Claude Code

Tech design at docs/technical-designs/audit-artefact-persistence.md covers the full feature: domain model, C4 L3 component diagram, Mermaid DFD, shared lib API, JSON + MD frontmatter schema, storage layout, launch-check backward-compat strategy, implementation tasks, risks, and three open questions resolved in the AgDR. AgDR-0019 records four load-bearing decisions: - Paired JSON + MD per run (preserves /launch-check's existing pattern, zero migration for adopters with committed run history) - Rigid common-denominator findings shape {id, severity, status, summary} with per-dimension nuance in the MD body via per-dim templates - Read both old + new paths for /launch-check, write only to new (non-destructive backward compat; trend reader gains one branch) - Four discrete shell-lib functions mirroring the existing _lib-*.sh shape adopters already know (audit_resolve_dir / audit_run_persist / audit_run_list / audit_render_trend) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Generalises /launch-check's per-run JSON + per-run MD + opt-in commit marker convention into a shared shell library that all audit skills consume, so every audit run produces structurally consistent on-disk artefacts and the trend across runs becomes legible. Components: - .claude/hooks/_lib-audit-history.sh — four discrete shell functions (audit_resolve_dir / audit_run_persist / audit_run_list / audit_render_trend) mirroring the existing _lib-*.sh shape adopters already know. ~280 LOC bash + jq. - .claude/hooks/tests/test_audit_history.sh — 11 cases covering resolve / persist / stats derivation / preserved stats / marker semantics / sorted listing / legacy launch-check path read / silent-on-1-run / generic trend rendering / launch-check legacy dispatch / legacy+canonical merge regression. - templates/audits/threat-model.md, templates/audits/security-review.md — canonical body skeletons for the two pilot dimensions (reference material per AgDR-0019; per-skill bodies are inline). - .claude/skills/threat-model/SKILL.md — added Step 5 "Persist the run + render trend" calling the lib. Severity vocabulary in the payload is lowercase (lib stats expect this); visible Step 3 table keeps its conventional capitalisation. - .claude/skills/security-review/SKILL.md — same shape; persists per PR review with the same score/verdict formula. - .claude/skills/launch-check/SKILL.md — refactored Step 6 to consume audit_run_persist + audit_render_trend. JSON is a SUPERSET preserving the legacy scores{} / branch / commit / top_risks[] fields so the existing render-trend.sh continues to plot the same chart unchanged. audit_run_list merges canonical (audits/launch-check/runs/) + legacy (launch-check/runs/) paths so adopters' existing history is preserved with no `mv` required. Decision rationale: docs/agdr/AgDR-0019-audit-artefact-persistence.md Technical design + diagrams: docs/technical-designs/audit-artefact-persistence.md Follow-up #221 covers the remaining 7 audit skills' retrofit (compliance-check, accessibility-audit, performance-audit, seo-audit, monitoring-audit, docs-audit, analytics-audit) — mechanical retrofit per the pattern proven by the two pilots in this PR. Closes #218 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

atlas-apex

Code Review: PR #222

Commit: 0498641505d6ce5fc1cf08553744ae77bcc497b2

Summary

Generalises /launch-check's per-run persistence convention into a shared bash lib (_lib-audit-history.sh, 4 functions) and pilots it on /threat-model + /security-review. /launch-check is refactored to consume the lib without behaviour change — JSON is a SUPERSET preserving the legacy scores{} map AND adding canonical dimension/score/findings[]/stats{}/schema_version. Legacy path read-merged transparently. AgDR-0019 captures the four load-bearing decisions. 11 lib tests pass locally (re-run on review: all green).

Checklist Results

Architecture & Design: Pass — clean separation; lib mirrors existing _lib-*.sh shape; dispatch-to-legacy for launch-check is a sensible bridge.
Code Quality: Pass — bash is readable, jq invocations are correct, error paths return non-zero with stderr context.
Testing: Pass — 11 cases cover all AC claims (write/read/derive/preserve/marker/sort/legacy/silent/render/dispatch/regression-merge); case 11 specifically validates the load-bearing legacy+canonical merge for /launch-check.
Security: Pass (N/A — no auth/crypto/secrets touched).
Performance: Pass — jq calls are bounded (one per file in trend window, default 5).
PR Description & Glossary: Pass — 8-row glossary covers every new term (audit-history lib, per-run JSON/MD, findings shape, headline score, legacy+canonical merge, opt-in commit marker, verdict).
Technical Decisions (AgDR): Pass — AgDR-0019 captures all four decisions (paired JSON+MD, rigid common-denominator findings, non-destructive launch-check backcompat, four-function shell lib) with options-considered tables and explicit per-decision rationale per the AgDR template.

Verifications performed

audit_run_persist jq (lines 165-180 of lib): correct. Schema-augmentation uses $in + {…} so caller-provided fields like launch-check's scores{} / branch / commit / top_risks[] survive untouched. Stats derivation reduces over findings[] only when stats is absent — case 4 confirms explicit stats are preserved.
Frontmatter jq (lines 188-204): emits valid YAML, ((.stats // {}).by_severity.X // 0) correctly fallbacks to 0 for every severity bucket.
audit_run_list legacy merge (lines 249-268): canonical + legacy paths globbed, ts-sorted ascending. Cases 7 + 11 both green.
Launch-check Step 6 superset: payload example at SKILL.md:249-265 includes both legacy fields (scores, branch, commit, top_risks) and canonical (findings[] derived from per-dimension scores). After lib augmentation, JSON contains both — no field loss. Legacy render-trend.sh keeps reading ts/scores/verdict unchanged.
Pilot SKILL retrofit consistency: both /threat-model Step 5 and /security-review Persist section use identical score formula (max(0, 100 - 25*crit - 10*high - 3*med - 1*low)), identical verdict mapping (worst-severity rule → pass/conditional/fail), identical lowercase-severity rule, identical marker semantics.
Commit format: both commits match type(#218): description (docs + feat).
AgDR template conformance: AgDR-0019 has the canonical context/options-considered/decision/consequences/artifacts sections; each of the four decisions has its own options table.

Issues Found

None blocking.

Suggestions (non-blocking, optional)

The launch-check dispatch in audit_render_trend (line 309-318) stages files into a tmp dir then rm -rfs it — works fine, but if render-trend.sh ever grows to tail-on-stdin (unlikely) the staging becomes redundant. Worth a comment pinning the contract; current comment at lines 304-308 already covers this. Fine as-is.
audit_render_trend's generic ASCII chart (lines 359-384) duplicates render-trend.sh shape. Acceptable for now; AgDR-0019 explicitly defers full absorption of render-trend.sh to a follow-up.
Follow-up #221 captures the remaining 7 retrofits — mechanical, well-scoped, no review concerns from this PR.

Verdict

APPROVED

The two-commit shape (docs first, then feat) is exemplary. Schema-superset approach for launch-check + read-merged legacy path is the right backcompat call — the regression test (case 11) makes that explicit. Pilot retrofits are clean and consistent. Ship it.

🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: `0498641505d6ce5fc1cf08553744ae77bcc497b2`

CI's markdownlint-cli2 flagged the "Design rationale:" list at line 358 of launch-check/SKILL.md for MD032 (blanks-around-lists). One-line fix: insert blank line after the section header. No content change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

atlas-apex

Re-Review: PR #222 (follow-up commit)

Commit: fbda3d3e86a5daff58d930bbafc8e467d2b62c01

Verification

Diff confirmed: +1/-0, single blank line inserted after Design rationale: heading at .claude/skills/launch-check/SKILL.md:358 to satisfy markdownlint MD032
No content changes — purely a CI-fix
All 4 CI checks green: markdownlint-cli2, lychee, shellcheck, Verify Ticket ID
Load-bearing decisions untouched: shared lib API (_lib-audit-history.sh), JSON+MD artefact pair, launch-check superset schema, and legacy+canonical merge strategy all unchanged from prior approval at 0498641505d6ce5fc1cf08553744ae77bcc497b2

Verdict

APPROVED — prior approval still applies in spirit; the markdownlint fix is the minimum mechanical change to unblock CI.

(Posted as comment because GitHub disallows self-approval; approval marker written at .claude/session/reviews/222-rex.approved.)

🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: fbda3d3e86a5daff58d930bbafc8e467d2b62c01

Mechanical retrofit completing the follow-up filed alongside #218. Each of the 7 remaining audit skills now persists a structured JSON+MD pair via audit_run_persist + renders trend via audit_render_trend on every run. Same shape as the threat-model and security-review pilots from #222. Skills retrofitted: - /compliance-check — GDPR/ePrivacy - /accessibility-audit — WCAG 2.1 AA - /performance-audit — bundle / images / caching - /seo-audit — meta / sitemap / OG / structured data - /monitoring-audit — logs / errors / health / alerts / runbooks - /docs-audit — Diataxis quadrants + README quality + staleness - /analytics-audit — SDK / event taxonomy / funnels / dashboards Each gains: - A "Persist the run + render trend" section before its existing "## Rules" section, with subsections for project/score/verdict resolution, payload+body construction, persist call, trend render, and opt-in commit marker - Two new Rules entries: "Always persist via the lib" + "Severity vocabulary in the JSON is lowercase" - Per-dim severity-vocabulary mapping documented (e.g. "Documentation readiness: PARTIAL" → conditional, "Incident readiness: NOT READY" → fail, etc.) 7 new canonical templates at templates/audits/<dim>.md, each demonstrating the dimension's findings shape (POUR groupings for accessibility, Diataxis quadrants for docs, regulatory exposure for compliance, etc.) per the convention shipped in #222. After this PR, all 9 audit skills (this 7 + threat-model + security-review) consume the shared lib uniformly. /launch-check already consumed the lib via its #222 refactor with backward-compat read-merge for adopters' existing history. Closes #221 Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

) * docs(#218): technical design + AgDR-0019 for audit-artefact persistence Tech design at docs/technical-designs/audit-artefact-persistence.md covers the full feature: domain model, C4 L3 component diagram, Mermaid DFD, shared lib API, JSON + MD frontmatter schema, storage layout, launch-check backward-compat strategy, implementation tasks, risks, and three open questions resolved in the AgDR. AgDR-0019 records four load-bearing decisions: - Paired JSON + MD per run (preserves /launch-check's existing pattern, zero migration for adopters with committed run history) - Rigid common-denominator findings shape {id, severity, status, summary} with per-dimension nuance in the MD body via per-dim templates - Read both old + new paths for /launch-check, write only to new (non-destructive backward compat; trend reader gains one branch) - Four discrete shell-lib functions mirroring the existing _lib-*.sh shape adopters already know (audit_resolve_dir / audit_run_persist / audit_run_list / audit_render_trend) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(#218): audit-skill artefact persistence + canonical structure Generalises /launch-check's per-run JSON + per-run MD + opt-in commit marker convention into a shared shell library that all audit skills consume, so every audit run produces structurally consistent on-disk artefacts and the trend across runs becomes legible. Components: - .claude/hooks/_lib-audit-history.sh — four discrete shell functions (audit_resolve_dir / audit_run_persist / audit_run_list / audit_render_trend) mirroring the existing _lib-*.sh shape adopters already know. ~280 LOC bash + jq. - .claude/hooks/tests/test_audit_history.sh — 11 cases covering resolve / persist / stats derivation / preserved stats / marker semantics / sorted listing / legacy launch-check path read / silent-on-1-run / generic trend rendering / launch-check legacy dispatch / legacy+canonical merge regression. - templates/audits/threat-model.md, templates/audits/security-review.md — canonical body skeletons for the two pilot dimensions (reference material per AgDR-0019; per-skill bodies are inline). - .claude/skills/threat-model/SKILL.md — added Step 5 "Persist the run + render trend" calling the lib. Severity vocabulary in the payload is lowercase (lib stats expect this); visible Step 3 table keeps its conventional capitalisation. - .claude/skills/security-review/SKILL.md — same shape; persists per PR review with the same score/verdict formula. - .claude/skills/launch-check/SKILL.md — refactored Step 6 to consume audit_run_persist + audit_render_trend. JSON is a SUPERSET preserving the legacy scores{} / branch / commit / top_risks[] fields so the existing render-trend.sh continues to plot the same chart unchanged. audit_run_list merges canonical (audits/launch-check/runs/) + legacy (launch-check/runs/) paths so adopters' existing history is preserved with no `mv` required. Decision rationale: docs/agdr/AgDR-0019-audit-artefact-persistence.md Technical design + diagrams: docs/technical-designs/audit-artefact-persistence.md Follow-up #221 covers the remaining 7 audit skills' retrofit (compliance-check, accessibility-audit, performance-audit, seo-audit, monitoring-audit, docs-audit, analytics-audit) — mechanical retrofit per the pattern proven by the two pilots in this PR. Closes #218 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(#218): blank line before list to satisfy markdownlint MD032 CI's markdownlint-cli2 flagged the "Design rationale:" list at line 358 of launch-check/SKILL.md for MD032 (blanks-around-lists). One-line fix: insert blank line after the section header. No content change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Mechanical retrofit completing the follow-up filed alongside #218. Each of the 7 remaining audit skills now persists a structured JSON+MD pair via audit_run_persist + renders trend via audit_render_trend on every run. Same shape as the threat-model and security-review pilots from #222. Skills retrofitted: - /compliance-check — GDPR/ePrivacy - /accessibility-audit — WCAG 2.1 AA - /performance-audit — bundle / images / caching - /seo-audit — meta / sitemap / OG / structured data - /monitoring-audit — logs / errors / health / alerts / runbooks - /docs-audit — Diataxis quadrants + README quality + staleness - /analytics-audit — SDK / event taxonomy / funnels / dashboards Each gains: - A "Persist the run + render trend" section before its existing "## Rules" section, with subsections for project/score/verdict resolution, payload+body construction, persist call, trend render, and opt-in commit marker - Two new Rules entries: "Always persist via the lib" + "Severity vocabulary in the JSON is lowercase" - Per-dim severity-vocabulary mapping documented (e.g. "Documentation readiness: PARTIAL" → conditional, "Incident readiness: NOT READY" → fail, etc.) 7 new canonical templates at templates/audits/<dim>.md, each demonstrating the dimension's findings shape (POUR groupings for accessibility, Diataxis quadrants for docs, regulatory exposure for compliance, etc.) per the convention shipped in #222. After this PR, all 9 audit skills (this 7 + threat-model + security-review) consume the shared lib uniformly. /launch-check already consumed the lib via its #222 refactor with backward-compat read-merge for adopters' existing history. Closes #221 Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

me2resh and others added 2 commits May 11, 2026 22:33

github-actions Bot added github-issue has-ticket labels May 11, 2026

atlas-apex commented May 11, 2026

View reviewed changes

This was referenced May 11, 2026

[Chore] Add Data Flow Diagram section to threat-model template #223

Closed

chore(#223): add Data Flow Diagram section to threat-model template #225

Merged

atlas-apex commented May 12, 2026

View reviewed changes

atlas-apex merged commit 42c8a72 into dev May 12, 2026
4 checks passed

atlas-apex deleted the feature/GH-218-audit-artefact-persistence branch May 12, 2026 07:56

This was referenced May 14, 2026

feat(#224): add architecture vision + DFD + sequence templates #226

Merged

chore(#221): retrofit 7 audit skills onto _lib-audit-history.sh #239

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(#218): audit-skill artefact persistence + canonical structure#222

feat(#218): audit-skill artefact persistence + canonical structure#222
atlas-apex merged 3 commits into
devfrom
feature/GH-218-audit-artefact-persistence

atlas-apex commented May 11, 2026

Uh oh!

atlas-apex left a comment

Uh oh!

atlas-apex left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

atlas-apex commented May 11, 2026

Summary

Why this matters

Testing

Glossary

Uh oh!

atlas-apex left a comment

Choose a reason for hiding this comment

Code Review: PR #222

Summary

Checklist Results

Verifications performed

Issues Found

Suggestions (non-blocking, optional)

Verdict

Uh oh!

atlas-apex left a comment

Choose a reason for hiding this comment

Re-Review: PR #222 (follow-up commit)

Verification

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants