explanations: add test coverage for generator, formatters, and persistence
Context
Safety Invariant #2 (CLAUDE.md) requires every decision that results in an action (or a deliberate non-action) to produce an ExplanationRecord. The @skytwin/explanations package is the single point that produces these records — it's how SkyTwin meets its promise that the user can always understand why something happened.
The package has a working implementation (explanation-generator.ts, 361 LOC) but zero test files. There is no __tests__/ directory under packages/explanations/src/. Behavior across the three public methods (generate, formatForUser, formatForAudit) and six private helpers is not locked in. A rename, a branch rewrite, or a contract change on DecisionOutcome could silently break user-facing explanation output and nothing would fail CI.
This is not a stub-filling job — the code already produces meaningful strings. It's a coverage job: pin the current behavior so future changes are intentional.
Claude Code estimate: ~2-3h
Current State (verified 2026-04-23)
Package layout
packages/explanations/src/
explanation-generator.ts 361 LOC, exports ExplanationGenerator, ExplanationRepositoryPort, AuditRecord
index.ts 5 LOC, barrel
(no __tests__/)
No vitest config at the package level; relies on the workspace root.
Public surface
ExplanationGenerator (class):
generate(decision, outcome, context): Promise<ExplanationRecord> — builds the record, calls repository.save(record), returns it.
formatForUser(record): string — multi-line human-readable format.
formatForAudit(record): AuditRecord — structured object for compliance logs.
Branching that needs coverage
buildSummary (lines 176-210) — 4 branches: no selectedAction, autoExecute, requiresApproval, fall-through.
buildConfidenceReasoning (lines 256-304) — counts high/low confidence preferences, zero-preference path, single-vs-multi candidate path.
buildActionRationale (lines 306-321) — no-action path, autoExecute suffix, plain path.
buildCorrectionGuidance (lines 327-360) — autoExecute + reversible, autoExecute + non-reversible, approval path.
gatherEvidenceReferences (lines 212-241) — rawData.source present/absent, preferences with 0/1/many evidenceIds (slice(0,3) cap).
gatherPreferenceReferences (lines 243-254) — empty vs populated.
Integration risk
generate() calls repository.save(record) once and awaits before returning. No current test verifies that a repository save failure surfaces as a rejected promise vs. a swallowed error. Invariant #2 ("always log explanations") depends on this propagating.
Field shape
ExplanationRecord type lives in @skytwin/shared-types. Field naming (evidenceUsed, preferencesInvoked, confidenceReasoning, etc.) is consumed by apps/web and apps/mobile detail views — a rename ripples across apps. No test locks the contract.
Proposed Change
Add packages/explanations/src/__tests__/explanation-generator.test.ts with vitest, using an in-memory fake repository that records saves. Build a small fixtures module for DecisionObject, DecisionContext, DecisionOutcome, Preference, RiskAssessment. Cover every branch listed above, plus the two public formatters.
No implementation changes except:
- If a test reveals a branch that's unreachable or genuinely broken, fix narrowly and note it in the PR.
- Extract fixture builders into
packages/explanations/src/__tests__/fixtures.ts to keep the test file readable.
Acceptance Criteria
packages/explanations/src/__tests__/explanation-generator.test.ts exists with ≥20 passing tests covering all branches listed above.
generate() — auto-executed outcome → returned record has non-empty summary, actionRationale, correctionGuidance, empty escalationRationale.
generate() — requiresApproval outcome → escalationRationale is non-empty and equals outcome.reasoning.
generate() — no selectedAction → summary contains the word "escalated" (or equivalent), actionRationale starts with "No action was selected".
generate() — calls repository.save exactly once with the built record; returns the same object reference that was saved.
generate() — when repository.save rejects, the rejection propagates (not swallowed).
buildSummary branches produce distinct strings for autoExecute, requiresApproval, no-action, and default fall-through.
buildCorrectionGuidance — reversible auto-executed action includes "Undo this action"; non-reversible omits it and renumbers steps.
gatherEvidenceReferences — caps preference evidence at 3 per preference (respects slice(0, 3)); drops rawData reference when rawData.source is missing.
gatherPreferenceReferences — returns empty array when context.relevantPreferences is empty; maps 1:1 otherwise with howUsed containing the preference value.
formatForUser — output contains --- Decision Explanation ---, the summary, Risk level:, and How to correct this: sections, in order.
formatForUser — omits the "Why approval was needed" section when escalationRationale is undefined.
formatForAudit — autoExecuted is true iff escalationRationale is undefined (this test will catch the current implementation — verify it matches intent).
formatForAudit — evidenceCount and preferencesCount match the array lengths on the source record.
- Running
pnpm --filter @skytwin/explanations test exits 0.
- Running
pnpm test at the root exits 0 with no skipped or flaky tests added.
- PR passes
/review before merge.
Testing Plan
| Layer |
What |
Count |
| Unit |
generate() — three outcome shapes + record saved + save-rejects-propagates |
+5 |
| Unit |
buildSummary — 4 branches |
+4 |
| Unit |
buildConfidenceReasoning — zero prefs / mixed prefs / multi-candidate |
+3 |
| Unit |
buildActionRationale — no action / auto / plain |
+3 |
| Unit |
buildCorrectionGuidance — auto+reversible / auto+non-reversible / approval |
+3 |
| Unit |
gatherEvidenceReferences — rawData absent / preferences evidence cap |
+2 |
| Unit |
gatherPreferenceReferences — empty / populated |
+2 |
| Unit |
formatForUser — contains required sections / omits escalation when absent |
+2 |
| Unit |
formatForAudit — autoExecuted derivation / count fields |
+2 |
Target: ~26 new tests, all passing.
Effort Estimate
- Fixtures module: ~30min
- Test file authorship: ~90min
- Debugging any branches that don't match expected behavior: ~30min buffer
Total: ~2-3h Claude Code time
Files Reference
| File |
Change |
packages/explanations/src/__tests__/explanation-generator.test.ts |
New — full test suite |
packages/explanations/src/__tests__/fixtures.ts |
New — builders for DecisionObject/Context/Outcome/Preference |
packages/explanations/package.json |
Verify test script exists; add if missing |
packages/explanations/vitest.config.ts |
Add only if package-local config is required |
Non-Goals
- Not rewriting explanation copy or changing user-facing strings. Pin current behavior; rewrites are a separate product task.
- Not adding i18n, templating, or LLM-generated prose.
- Not changing the
ExplanationRecord schema in @skytwin/shared-types.
- Not covering persistence backends —
ExplanationRepositoryPort is stubbed for tests; the DB adapter has its own coverage in @skytwin/db.
Related
- Companion issue: "Safety kernel test: prove policy denial blocks auto-execution end-to-end" — shares a fixture for
blocked-by-policy ExplanationRecord shape.
explanations: add test coverage for generator, formatters, and persistence
Context
Safety Invariant #2 (CLAUDE.md) requires every decision that results in an action (or a deliberate non-action) to produce an
ExplanationRecord. The@skytwin/explanationspackage is the single point that produces these records — it's how SkyTwin meets its promise that the user can always understand why something happened.The package has a working implementation (
explanation-generator.ts, 361 LOC) but zero test files. There is no__tests__/directory underpackages/explanations/src/. Behavior across the three public methods (generate,formatForUser,formatForAudit) and six private helpers is not locked in. A rename, a branch rewrite, or a contract change onDecisionOutcomecould silently break user-facing explanation output and nothing would fail CI.This is not a stub-filling job — the code already produces meaningful strings. It's a coverage job: pin the current behavior so future changes are intentional.
Claude Code estimate: ~2-3h
Current State (verified 2026-04-23)
Package layout
No
vitestconfig at the package level; relies on the workspace root.Public surface
ExplanationGenerator(class):generate(decision, outcome, context): Promise<ExplanationRecord>— builds the record, callsrepository.save(record), returns it.formatForUser(record): string— multi-line human-readable format.formatForAudit(record): AuditRecord— structured object for compliance logs.Branching that needs coverage
buildSummary(lines 176-210) — 4 branches: noselectedAction,autoExecute,requiresApproval, fall-through.buildConfidenceReasoning(lines 256-304) — counts high/low confidence preferences, zero-preference path, single-vs-multi candidate path.buildActionRationale(lines 306-321) — no-action path, autoExecute suffix, plain path.buildCorrectionGuidance(lines 327-360) — autoExecute + reversible, autoExecute + non-reversible, approval path.gatherEvidenceReferences(lines 212-241) — rawData.source present/absent, preferences with 0/1/many evidenceIds (slice(0,3) cap).gatherPreferenceReferences(lines 243-254) — empty vs populated.Integration risk
generate()callsrepository.save(record)once and awaits before returning. No current test verifies that a repository save failure surfaces as a rejected promise vs. a swallowed error. Invariant #2 ("always log explanations") depends on this propagating.Field shape
ExplanationRecordtype lives in@skytwin/shared-types. Field naming (evidenceUsed,preferencesInvoked,confidenceReasoning, etc.) is consumed byapps/webandapps/mobiledetail views — a rename ripples across apps. No test locks the contract.Proposed Change
Add
packages/explanations/src/__tests__/explanation-generator.test.tswith vitest, using an in-memory fake repository that records saves. Build a small fixtures module forDecisionObject,DecisionContext,DecisionOutcome,Preference,RiskAssessment. Cover every branch listed above, plus the two public formatters.No implementation changes except:
packages/explanations/src/__tests__/fixtures.tsto keep the test file readable.Acceptance Criteria
packages/explanations/src/__tests__/explanation-generator.test.tsexists with ≥20 passing tests covering all branches listed above.generate()— auto-executed outcome → returned record has non-emptysummary,actionRationale,correctionGuidance, emptyescalationRationale.generate()—requiresApprovaloutcome →escalationRationaleis non-empty and equalsoutcome.reasoning.generate()— noselectedAction→summarycontains the word "escalated" (or equivalent),actionRationalestarts with "No action was selected".generate()— callsrepository.saveexactly once with the built record; returns the same object reference that was saved.generate()— whenrepository.saverejects, the rejection propagates (not swallowed).buildSummarybranches produce distinct strings forautoExecute,requiresApproval, no-action, and default fall-through.buildCorrectionGuidance— reversible auto-executed action includes "Undo this action"; non-reversible omits it and renumbers steps.gatherEvidenceReferences— caps preference evidence at 3 per preference (respectsslice(0, 3)); drops rawData reference whenrawData.sourceis missing.gatherPreferenceReferences— returns empty array whencontext.relevantPreferencesis empty; maps 1:1 otherwise withhowUsedcontaining the preference value.formatForUser— output contains--- Decision Explanation ---, the summary,Risk level:, andHow to correct this:sections, in order.formatForUser— omits the "Why approval was needed" section whenescalationRationaleis undefined.formatForAudit—autoExecutedistrueiffescalationRationaleis undefined (this test will catch the current implementation — verify it matches intent).formatForAudit—evidenceCountandpreferencesCountmatch the array lengths on the source record.pnpm --filter @skytwin/explanations testexits 0.pnpm testat the root exits 0 with no skipped or flaky tests added./reviewbefore merge.Testing Plan
generate()— three outcome shapes + record saved + save-rejects-propagatesbuildSummary— 4 branchesbuildConfidenceReasoning— zero prefs / mixed prefs / multi-candidatebuildActionRationale— no action / auto / plainbuildCorrectionGuidance— auto+reversible / auto+non-reversible / approvalgatherEvidenceReferences— rawData absent / preferences evidence capgatherPreferenceReferences— empty / populatedformatForUser— contains required sections / omits escalation when absentformatForAudit—autoExecutedderivation / count fieldsTarget: ~26 new tests, all passing.
Effort Estimate
Total: ~2-3h Claude Code time
Files Reference
packages/explanations/src/__tests__/explanation-generator.test.tspackages/explanations/src/__tests__/fixtures.tspackages/explanations/package.jsontestscript exists; add if missingpackages/explanations/vitest.config.tsNon-Goals
ExplanationRecordschema in@skytwin/shared-types.ExplanationRepositoryPortis stubbed for tests; the DB adapter has its own coverage in@skytwin/db.Related
blocked-by-policyExplanationRecord shape.