feat(#218): audit-skill artefact persistence + canonical structure #222

Merged
atlas-apex merged 3 commits into dev from feature/GH-218-audit-artefact-persistence on May 12, 2026

@atlas-apex

Summary

  • New shared library .claude/hooks/_lib-audit-history.sh (4 functions: audit_resolve_dir, audit_run_persist, audit_run_list, audit_render_trend) generalises /launch-check's existing per-run persistence convention to the whole audit-skill family
  • Pilot retrofits on /threat-model and /security-review — both now persist a JSON + MD pair per run and render the trend section after each invocation
  • /launch-check refactored to consume the lib without behaviour change. JSON is a superset preserving the legacy scores{} map (so the existing chart renderer plots the same shape) plus the canonical generic fields (so the lib's trend renderer reads it). Adopters' existing history at the legacy path is read-merged transparently — no mv required, no destructive migration
  • 11 lib tests covering write/read/derive/preserve/marker/sort/legacy-path/silent/render/dispatch/regression-merge — all passing
  • Per-dim canonical templates at templates/audits/threat-model.md and templates/audits/security-review.md (reference material; per-skill bodies are inline per AgDR-0019)
  • Follow-up #221 ([Chore] Retrofit remaining 7 audit skills onto _lib-audit-history.sh) filed to cover compliance-check, accessibility-audit, performance-audit, seo-audit, monitoring-audit, docs-audit, and analytics-audit: a mechanical retrofit per the pattern proven here

Why this matters

Before this PR the audit family was inconsistent: /launch-check persisted its runs (since #183) and rendered a trend chart, while the other nine audit skills wrote only to stdout, so their output was lost as soon as the terminal scrolled. Comparing two threat models from a quarter apart was an exercise in reading two different free-form essays. After this PR every audit run produces a structurally consistent on-disk artefact with parseable frontmatter, and the trend renderer produces the same output format for every dimension.

The launch-check refactor is the load-bearing piece: it changes how persistence happens (lib-driven instead of inline) without changing what operators see (chart shape unchanged, existing history preserved). The regression test (case 11) confirms the legacy + canonical path merge keeps adopters' existing history visible on the first post-refactor run.

The decision airing in AgDR-0019 covers the four load-bearing calls (paired JSON+MD, rigid common-denominator findings shape, non-destructive backward compat, four discrete shell functions). The technical design at docs/technical-designs/audit-artefact-persistence.md embeds a Mermaid C4 L3 component diagram and a Mermaid DFD showing the audit subsystem boundaries.

Testing

  • Lib tests: bash .claude/hooks/tests/test_audit_history.sh → 11 passed, 0 failed
  • Lib syntax: bash -n .claude/hooks/_lib-audit-history.sh → OK
  • Regression test for launch-check (case 11): one run in legacy path + one run in canonical path → both dates appear in the rendered chart
  • Stats derivation (case 3): payload with 2 high + 1 medium → stats.by_severity.high=2, .medium=1
  • Marker semantics (case 5): absent → runs/ ignored; present → !runs/*.json un-ignored
  • Reviewer can spot-check the lib's jq invocations (the schema-augmentation jq in audit_run_persist and the frontmatter-generation jq next to it are the two non-trivial pieces)
  • Reviewer can confirm the launch-check Step 6 rewrite preserves the JSON superset shape (legacy scores{} / branch / commit / top_risks[] AND new dimension / score / findings[] / stats{} / schema_version)
  • Manual smoke test post-merge: invoke /threat-model against any registered project, confirm projects/<name>/audits/threat-model/<ts>.md is written with valid frontmatter
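
The stats derivation exercised by case 3 can be illustrated with a small jq filter. This is a sketch under assumptions, not the lib's actual program: the helper name derive_stats and the sample payload are hypothetical, and jq is assumed to be on PATH. It counts findings per severity only when the caller has not supplied stats, which matches the behaviour described for cases 3 and 4.

```shell
#!/usr/bin/env bash
# Sketch only: derive stats.by_severity from findings[] when stats is absent.
# derive_stats is a hypothetical name, not one of the four lib functions.
set -euo pipefail

derive_stats() {
  jq -c '
    if has("stats") then .            # caller-supplied stats are preserved as-is
    else . + {
      stats: {
        by_severity: (
          # Count findings per severity; a missing key starts at null, and null + 1 = 1.
          reduce (.findings // [])[] as $f ({}; .[$f.severity] += 1)
        )
      }
    } end
  '
}

# Sample payload (made-up values): 2 high + 1 medium, lowercase severities.
payload='{"dimension":"threat-model","findings":[
  {"id":"T1","severity":"high","status":"open","summary":"a"},
  {"id":"T2","severity":"high","status":"open","summary":"b"},
  {"id":"T3","severity":"medium","status":"open","summary":"c"}]}'

result=$(printf '%s' "$payload" | derive_stats)
printf '%s\n' "$result" | jq '.stats.by_severity'   # high is 2, medium is 1
```

Because the severity string is used directly as the bucket key, a payload that mixes "High" and "high" would produce two separate buckets, which is one reason to insist on lowercase severities in the payload.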

Glossary

| Term | Definition |
| --- | --- |
| Audit-history lib | .claude/hooks/_lib-audit-history.sh: shared shell library with four functions handling persistence + trend rendering for all audit skills. Sourced by SKILL.md flows. |
| Per-run JSON | The trend-renderer's input file at projects/<name>/audits/<dimension>/runs/<ts>.json. Schema includes ts, dimension, verdict, score, findings[], stats{}, plus dimension-specific extras (e.g. launch-check preserves scores{} + branch + commit + top_risks[]). |
| Per-run MD | The durable human-readable artefact at projects/<name>/audits/<dimension>/<ts>.md. Frontmatter is generated by the lib from the JSON; body is freeform per dimension. |
| Findings shape | The common denominator across all nine audit dimensions: {id, severity, status, summary} per finding. Per-dimension nuance (STRIDE category, OWASP class, WCAG criterion) lives in the MD body via per-dim templates. |
| Headline score | A single 0-100 number per run plotted on the trend chart's Y-axis. Generic skills derive it from severity-weighted finding count; /launch-check keeps its existing mean(scores.*) derivation for byte-equal chart output. |
| Legacy + canonical merge | audit_run_list reads JSON files from BOTH projects/<name>/audits/<dim>/runs/ (canonical) AND projects/<name>/launch-check/runs/ (legacy, launch-check only) so adopters' existing trend history isn't orphaned by the path change. |
| Opt-in commit marker | .audit-history-tracked: presence-only file inside the dimension's audit dir that flips the runs/ .gitignore from "ignore everything" to "un-ignore *.json". MD artefacts are committed unconditionally. |
| Verdict | One of pass / conditional / fail (generic vocabulary). /launch-check keeps its four-state vocabulary (go / go-with-warnings / conditional-go / no-go) in its body, but maps to the generic three-state in the frontmatter for cross-dim consistency. |
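
The opt-in commit marker can be pictured as a toggle over a .gitignore file. The sketch below is an assumption-laden illustration: write_runs_gitignore is a hypothetical helper, and placing the .gitignore in the dimension's audit dir is a guess (the lib may keep it elsewhere). Only the marker file name and the two resulting states come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: rewrite an audit dir's .gitignore based on the opt-in marker.
set -euo pipefail

# write_runs_gitignore is a hypothetical helper name, not a lib function.
write_runs_gitignore() {
  local audit_dir=$1
  mkdir -p "$audit_dir/runs"
  if [ -e "$audit_dir/.audit-history-tracked" ]; then
    # Marker present: ignore the contents of runs/ but un-ignore per-run JSON.
    # git cannot re-include files under an ignored directory, so the pattern
    # must be "runs/*" rather than "runs/" for the negation to take effect.
    printf '%s\n' 'runs/*' '!runs/*.json' > "$audit_dir/.gitignore"
  else
    # Marker absent: the whole runs/ directory stays local.
    printf '%s\n' 'runs/' > "$audit_dir/.gitignore"
  fi
}

demo=$(mktemp -d)
write_runs_gitignore "$demo/audits/threat-model"        # marker absent
cat "$demo/audits/threat-model/.gitignore"
touch "$demo/audits/threat-model/.audit-history-tracked"
write_runs_gitignore "$demo/audits/threat-model"        # marker present
cat "$demo/audits/threat-model/.gitignore"
```

A presence-only marker keeps the opt-in decision visible in the tree and reversible by deleting one file, with no configuration value to parse.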

🤖 Generated with Claude Code

me2resh and others added 2 commits May 11, 2026 22:33
docs(#218): technical design + AgDR-0019 for audit-artefact persistence

Tech design at docs/technical-designs/audit-artefact-persistence.md
covers the full feature: domain model, C4 L3 component diagram,
Mermaid DFD, shared lib API, JSON + MD frontmatter schema, storage
layout, launch-check backward-compat strategy, implementation tasks,
risks, and three open questions resolved in the AgDR.

AgDR-0019 records four load-bearing decisions:
- Paired JSON + MD per run (preserves /launch-check's existing pattern,
  zero migration for adopters with committed run history)
- Rigid common-denominator findings shape {id, severity, status,
  summary} with per-dimension nuance in the MD body via per-dim
  templates
- Read both old + new paths for /launch-check, write only to new
  (non-destructive backward compat; trend reader gains one branch)
- Four discrete shell-lib functions mirroring the existing _lib-*.sh
  shape adopters already know (audit_resolve_dir / audit_run_persist /
  audit_run_list / audit_render_trend)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(#218): audit-skill artefact persistence + canonical structure

Generalises /launch-check's per-run JSON + per-run MD + opt-in commit
marker convention into a shared shell library that all audit skills
consume, so every audit run produces structurally consistent on-disk
artefacts and the trend across runs becomes legible.

Components:

- .claude/hooks/_lib-audit-history.sh — four discrete shell functions
  (audit_resolve_dir / audit_run_persist / audit_run_list /
  audit_render_trend) mirroring the existing _lib-*.sh shape adopters
  already know. ~280 LOC bash + jq.
- .claude/hooks/tests/test_audit_history.sh — 11 cases covering
  resolve / persist / stats derivation / preserved stats / marker
  semantics / sorted listing / legacy launch-check path read /
  silent-on-1-run / generic trend rendering / launch-check legacy
  dispatch / legacy+canonical merge regression.
- templates/audits/threat-model.md, templates/audits/security-review.md
  — canonical body skeletons for the two pilot dimensions (reference
  material per AgDR-0019; per-skill bodies are inline).
- .claude/skills/threat-model/SKILL.md — added Step 5 "Persist the
  run + render trend" calling the lib. Severity vocabulary in the
  payload is lowercase (lib stats expect this); visible Step 3 table
  keeps its conventional capitalisation.
- .claude/skills/security-review/SKILL.md — same shape; persists per
  PR review with the same score/verdict formula.
- .claude/skills/launch-check/SKILL.md — refactored Step 6 to consume
  audit_run_persist + audit_render_trend. JSON is a SUPERSET preserving
  the legacy scores{} / branch / commit / top_risks[] fields so the
  existing render-trend.sh continues to plot the same chart unchanged.
  audit_run_list merges canonical (audits/launch-check/runs/) +
  legacy (launch-check/runs/) paths so adopters' existing history is
  preserved with no `mv` required.
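
The canonical + legacy read-merge described in the last component can be sketched in plain bash. This is an illustration only: list_runs is a hypothetical stand-in for audit_run_list, its arguments are assumed, and sorting by file name assumes the <ts> in the name sorts lexicographically (the real lib may sort on the ts field inside each JSON instead).

```shell
#!/usr/bin/env bash
# Sketch only: list per-run JSON files from canonical + legacy paths, oldest first.
set -euo pipefail

# list_runs PROJECT_DIR DIMENSION  (hypothetical signature)
list_runs() {
  local project_dir=$1 dim=$2
  local dirs=("$project_dir/audits/$dim/runs")
  # Only launch-check has a legacy location to merge in.
  if [ "$dim" = "launch-check" ]; then
    dirs+=("$project_dir/launch-check/runs")
  fi
  local d f
  for d in "${dirs[@]}"; do
    [ -d "$d" ] || continue
    for f in "$d"/*.json; do
      [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
      # Emit "<basename> <path>" so the sort key is the timestamped file name.
      printf '%s %s\n' "$(basename "$f")" "$f"
    done
  done | sort | cut -d' ' -f2-
}

# Demo with made-up timestamps: one legacy run, one canonical run.
demo=$(mktemp -d)
mkdir -p "$demo/audits/launch-check/runs" "$demo/launch-check/runs"
echo '{}' > "$demo/launch-check/runs/2026-01-10T09-00-00Z.json"
echo '{}' > "$demo/audits/launch-check/runs/2026-05-12T07-30-00Z.json"
list_runs "$demo" launch-check   # legacy run first, canonical run second
```

Reading both locations and writing to only one means nothing on disk has to move, which is why no `mv` is needed for existing history.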

Decision rationale: docs/agdr/AgDR-0019-audit-artefact-persistence.md
Technical design + diagrams: docs/technical-designs/audit-artefact-persistence.md

Follow-up #221 covers the remaining 7 audit skills' retrofit
(compliance-check, accessibility-audit, performance-audit, seo-audit,
monitoring-audit, docs-audit, analytics-audit) — mechanical retrofit
per the pattern proven by the two pilots in this PR.

Closes #218

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@atlas-apex atlas-apex left a comment

Code Review: PR #222

Commit: 0498641505d6ce5fc1cf08553744ae77bcc497b2

Summary

Generalises /launch-check's per-run persistence convention into a shared bash lib (_lib-audit-history.sh, 4 functions) and pilots it on /threat-model + /security-review. /launch-check is refactored to consume the lib without behaviour change — JSON is a SUPERSET preserving the legacy scores{} map AND adding canonical dimension/score/findings[]/stats{}/schema_version. Legacy path read-merged transparently. AgDR-0019 captures the four load-bearing decisions. 11 lib tests pass locally (re-run on review: all green).
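
To make the superset idea concrete, here is a minimal jq sketch of augmenting a caller payload with canonical fields while leaving its own fields intact. It is written under assumptions: augment is a hypothetical helper, the schema_version value of 1 and the sample payload values are made up, and jq is assumed to be installed. Only the `$in + {…}` pattern and the field names come from the PR.

```shell
#!/usr/bin/env bash
# Sketch only: add canonical fields to a payload without dropping caller fields.
set -euo pipefail

# augment DIMENSION  (hypothetical helper; reads JSON on stdin)
augment() {
  local dimension=$1
  jq -c --arg dim "$dimension" '
    . as $in
    # Object addition keeps every key of $in; the right-hand side only fills
    # canonical keys, each defaulting to the caller value when one exists.
    | $in + {
        dimension: ($in.dimension // $dim),
        schema_version: ($in.schema_version // 1),   # assumed value
        findings: ($in.findings // [])
      }
  '
}

# Legacy-style launch-check payload with made-up sample values.
legacy='{"ts":"2026-05-12T07:30:00Z","scores":{"security":80,"docs":60},"branch":"dev","commit":"abc1234","top_risks":["x"]}'
out=$(printf '%s' "$legacy" | augment launch-check)
printf '%s\n' "$out" | jq .
```

The augmented object still carries scores, branch, commit, and top_risks, so a renderer that reads only the legacy keys sees the same input as before.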

Checklist Results

  • Architecture & Design: Pass — clean separation; lib mirrors existing _lib-*.sh shape; dispatch-to-legacy for launch-check is a sensible bridge.
  • Code Quality: Pass — bash is readable, jq invocations are correct, error paths return non-zero with stderr context.
  • Testing: Pass — 11 cases cover all AC claims (write/read/derive/preserve/marker/sort/legacy/silent/render/dispatch/regression-merge); case 11 specifically validates the load-bearing legacy+canonical merge for /launch-check.
  • Security: Pass (N/A — no auth/crypto/secrets touched).
  • Performance: Pass — jq calls are bounded (one per file in trend window, default 5).
  • PR Description & Glossary: Pass — 8-row glossary covers every new term (audit-history lib, per-run JSON/MD, findings shape, headline score, legacy+canonical merge, opt-in commit marker, verdict).
  • Technical Decisions (AgDR): Pass — AgDR-0019 captures all four decisions (paired JSON+MD, rigid common-denominator findings, non-destructive launch-check backcompat, four-function shell lib) with options-considered tables and explicit per-decision rationale per the AgDR template.

Verifications performed

  1. audit_run_persist jq (lines 165-180 of lib): correct. Schema-augmentation uses $in + {…} so caller-provided fields like launch-check's scores{} / branch / commit / top_risks[] survive untouched. Stats derivation reduces over findings[] only when stats is absent — case 4 confirms explicit stats are preserved.
  2. Frontmatter jq (lines 188-204): emits valid YAML; ((.stats // {}).by_severity.X // 0) correctly falls back to 0 for every severity bucket.
  3. audit_run_list legacy merge (lines 249-268): canonical + legacy paths globbed, ts-sorted ascending. Cases 7 + 11 both green.
  4. Launch-check Step 6 superset: payload example at SKILL.md:249-265 includes both legacy fields (scores, branch, commit, top_risks) and canonical (findings[] derived from per-dimension scores). After lib augmentation, JSON contains both — no field loss. Legacy render-trend.sh keeps reading ts/scores/verdict unchanged.
  5. Pilot SKILL retrofit consistency: both /threat-model Step 5 and /security-review Persist section use identical score formula (max(0, 100 - 25*crit - 10*high - 3*med - 1*low)), identical verdict mapping (worst-severity rule → pass/conditional/fail), identical lowercase-severity rule, identical marker semantics.
  6. Commit format: both commits match type(#218): description (docs + feat).
  7. AgDR template conformance: AgDR-0019 has the canonical context/options-considered/decision/consequences/artifacts sections; each of the four decisions has its own options table.
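
The score formula quoted in item 5 is simple enough to check by hand. The sketch below implements it in plain bash arithmetic; the function names are hypothetical, and the thresholds in worst_severity_verdict are an assumption, since the review names a worst-severity rule without spelling out which severity maps to which verdict.

```shell
#!/usr/bin/env bash
# Sketch only: headline score and a worst-severity verdict.
set -euo pipefail

# headline_score CRIT HIGH MED LOW
# Implements max(0, 100 - 25*crit - 10*high - 3*med - 1*low).
headline_score() {
  local crit=$1 high=$2 med=$3 low=$4
  local score=$(( 100 - 25 * crit - 10 * high - 3 * med - low ))
  if (( score < 0 )); then score=0; fi
  printf '%d\n' "$score"
}

# worst_severity_verdict CRIT HIGH MED
# ASSUMED mapping: any critical or high -> fail, any medium -> conditional,
# otherwise pass. The actual thresholds live in the SKILL.md files.
worst_severity_verdict() {
  local crit=$1 high=$2 med=$3
  if (( crit > 0 || high > 0 )); then
    echo fail
  elif (( med > 0 )); then
    echo conditional
  else
    echo pass
  fi
}

headline_score 0 2 1 0          # 100 - 20 - 3 = 77
headline_score 5 0 0 0          # 100 - 125 clamps to 0
worst_severity_verdict 0 2 1    # fail under the assumed mapping
```

With these weights a single critical finding costs as much as 25 low findings, so the trend line reacts sharply to critical regressions and only gradually to accumulated minor ones.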

Issues Found

None blocking.

Suggestions (non-blocking, optional)

  • The launch-check dispatch in audit_render_trend (lines 309-318) stages files into a tmp dir and then removes it with rm -rf. This works fine, but if render-trend.sh ever grows to tail-on-stdin (unlikely) the staging becomes redundant. Worth a comment pinning the contract; the current comment at lines 304-308 already covers this. Fine as-is.
  • audit_render_trend's generic ASCII chart (lines 359-384) duplicates render-trend.sh shape. Acceptable for now; AgDR-0019 explicitly defers full absorption of render-trend.sh to a follow-up.
  • Follow-up #221 captures the remaining 7 retrofits — mechanical, well-scoped, no review concerns from this PR.

Verdict

APPROVED

The two-commit shape (docs first, then feat) is exemplary. Schema-superset approach for launch-check + read-merged legacy path is the right backcompat call — the regression test (case 11) makes that explicit. Pilot retrofits are clean and consistent. Ship it.


🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: `0498641505d6ce5fc1cf08553744ae77bcc497b2`

fix(#218): blank line before list to satisfy markdownlint MD032

CI's markdownlint-cli2 flagged the "Design rationale:" list at line
358 of launch-check/SKILL.md for MD032 (blanks-around-lists).
One-line fix: insert blank line after the section header. No content
change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@atlas-apex atlas-apex left a comment

Re-Review: PR #222 (follow-up commit)

Commit: fbda3d3e86a5daff58d930bbafc8e467d2b62c01

Verification

  • Diff confirmed: +1/-0, single blank line inserted after Design rationale: heading at .claude/skills/launch-check/SKILL.md:358 to satisfy markdownlint MD032
  • No content changes — purely a CI-fix
  • All 4 CI checks green: markdownlint-cli2, lychee, shellcheck, Verify Ticket ID
  • Load-bearing decisions untouched: shared lib API (_lib-audit-history.sh), JSON+MD artefact pair, launch-check superset schema, and legacy+canonical merge strategy all unchanged from prior approval at 0498641505d6ce5fc1cf08553744ae77bcc497b2

Verdict

APPROVED — prior approval still applies in spirit; the markdownlint fix is the minimum mechanical change to unblock CI.

(Posted as comment because GitHub disallows self-approval; approval marker written at .claude/session/reviews/222-rex.approved.)


🤖 Reviewed by Rex (Code Reviewer Agent)
📌 Reviewed commit: fbda3d3e86a5daff58d930bbafc8e467d2b62c01

@atlas-apex atlas-apex merged commit 42c8a72 into dev May 12, 2026
4 checks passed
@atlas-apex atlas-apex deleted the feature/GH-218-audit-artefact-persistence branch May 12, 2026 07:56
atlas-apex added a commit that referenced this pull request May 15, 2026
Mechanical retrofit completing the follow-up filed alongside #218.
Each of the 7 remaining audit skills now persists a structured
JSON+MD pair via audit_run_persist + renders trend via
audit_render_trend on every run. Same shape as the threat-model and
security-review pilots from #222.

Skills retrofitted:
- /compliance-check  — GDPR/ePrivacy
- /accessibility-audit — WCAG 2.1 AA
- /performance-audit  — bundle / images / caching
- /seo-audit          — meta / sitemap / OG / structured data
- /monitoring-audit   — logs / errors / health / alerts / runbooks
- /docs-audit         — Diataxis quadrants + README quality + staleness
- /analytics-audit    — SDK / event taxonomy / funnels / dashboards

Each gains:
- A "Persist the run + render trend" section before its existing
  "## Rules" section, with subsections for project/score/verdict
  resolution, payload+body construction, persist call, trend render,
  and opt-in commit marker
- Two new Rules entries: "Always persist via the lib" + "Severity
  vocabulary in the JSON is lowercase"
- Per-dim severity-vocabulary mapping documented (e.g. "Documentation
  readiness: PARTIAL" → conditional, "Incident readiness: NOT READY"
  → fail, etc.)
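
The per-dimension vocabulary mapping and the lowercase-severity rule can be sketched as two small helpers. Both function names are hypothetical. The PARTIAL and NOT READY rows come from the examples above; the READY row is an assumption added to complete the three-state mapping.

```shell
#!/usr/bin/env bash
# Sketch only: map a skill's own readiness label to the generic verdict, and
# normalise a displayed severity label to the lowercase form used in the JSON.
set -euo pipefail

# map_readiness_verdict LABEL  (hypothetical helper)
map_readiness_verdict() {
  case "$1" in
    "NOT READY") echo fail ;;          # e.g. "Incident readiness: NOT READY"
    "PARTIAL")   echo conditional ;;   # e.g. "Documentation readiness: PARTIAL"
    "READY")     echo pass ;;          # assumed label, not quoted in the source
    *) echo "unknown readiness label: $1" >&2; return 1 ;;
  esac
}

# to_payload_severity LABEL  (hypothetical helper)
to_payload_severity() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

map_readiness_verdict "PARTIAL"     # conditional
map_readiness_verdict "NOT READY"   # fail
to_payload_severity "High"          # high
```

Rejecting unknown labels with a non-zero status, instead of defaulting to pass, keeps a typo in a skill's vocabulary from silently producing a green verdict.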

7 new canonical templates at templates/audits/<dim>.md, each
demonstrating the dimension's findings shape (POUR groupings for
accessibility, Diataxis quadrants for docs, regulatory exposure for
compliance, etc.) per the convention shipped in #222.

After this PR, all 9 audit skills (this 7 + threat-model +
security-review) consume the shared lib uniformly. /launch-check
already consumed the lib via its #222 refactor with backward-compat
read-merge for adopters' existing history.

Closes #221

Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
me2resh added a commit that referenced this pull request Jun 5, 2026
* docs(#218): technical design + AgDR-0019 for audit-artefact persistence

* feat(#218): audit-skill artefact persistence + canonical structure

* fix(#218): blank line before list to satisfy markdownlint MD032

---------

Co-authored-by: me2resh <ahmed.abdelaliem@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
me2resh added a commit that referenced this pull request Jun 5, 2026