feat(commands): add 6 lifecycle slash commands#1611

Merged
rjmurillo merged 8 commits into main from feat/lifecycle-commands on Apr 11, 2026
Conversation

@rjmurillo rjmurillo commented Apr 11, 2026

Owner

Pull Request

Summary

Replace legacy numbered workflow commands with 6 lifecycle slash commands following the Osmani agent-skills pattern. Commands are stack-agnostic, platform-agnostic, and host-agnostic.

Added:

  • .claude/commands/spec.md - Define what to build
  • .claude/commands/plan.md - Plan how to build
  • .claude/commands/build.md - Build incrementally
  • .claude/commands/test.md - Multi-dimensional quality validation
  • .claude/commands/review.md - Five-axis code review
  • .claude/commands/ship.md - Pre-flight validation and PR creation

Deleted:

  • .claude/commands/workflow/0-init.md
  • .claude/commands/workflow/1-plan.md
  • .claude/commands/workflow/2-impl.md
  • .claude/commands/workflow/3-qa.md
  • .claude/commands/workflow/4-security.md
  • .claude/commands/workflow/9-sync.md

Modified:

  • AGENTS.md - Lifecycle reference replaces workflow reference
  • CLAUDE.md - Separate skill routing from lifecycle commands
  • .markdownlint-cli2.yaml - Exclude lifecycle commands from linting
  • .claude/skills/workflow/SKILL.md - Marked DEPRECATED
  • .claude/skills/prompt-engineer/SKILL.md - Model bumped to claude-sonnet-4-6

Specification References

| Type | Reference | Description |
|------|-----------|-------------|
| Issue | Closes #1609 | Ship 6 core skills for Claude Code / Copilot |

Changes

  • Add 6 lifecycle slash commands following Osmani agent-skills pattern
  • Delete 6 unused workflow commands (0-init through 9-sync)
  • Update AGENTS.md and CLAUDE.md routing
  • Deprecate workflow skill
  • Apply prompt engineering: role personas, structured output, empty-args guards
  • Action commands use Bash(*) for platform/stack portability

Type of Change

  • New feature (non-breaking change adding functionality)
  • Documentation update

Testing

  • No testing required (documentation only)
  • validate-slash-commands CI passes
  • All referenced agents and skills verified

Agent Review

Security Review

  • No security-critical changes in this PR

Other Agent Reviews

  • Architect reviewed design changes
  • Critic validated implementation plan
  • QA verified test coverage

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Documentation updated (if applicable)
  • No new warnings introduced

Related Issues

Closes #1609

Add Osmani-pattern lifecycle commands for Claude Code:
- /spec: problem to requirements with CVA analysis
- /plan: specs to milestones with risk coverage
- /build: TDD implementation with atomic commits
- /test: layered testing with hypothesis debugging
- /review: 5-axis review (architecture, security, quality, tests, standards)
- /ship: pre-flight validation delegating to /push-pr

Remove unused workflow/ commands (0-init through 9-sync). Update AGENTS.md
and CLAUDE.md routing. Exclude .claude/commands/ from markdownlint MD041.

Closes #1609

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 11, 2026 05:14
@github-actions github-actions Bot added the enhancement (New feature or request) and area-infrastructure (Build, CI/CD, configuration) labels Apr 11, 2026
@rjmurillo rjmurillo requested a review from rjmurillo-bot April 11, 2026 05:15
@github-actions

Contributor

PR Validation Report

Note

Status: PASS

Description Validation

| Check | Status |
|-------|--------|
| Description matches diff | PASS |

PR Standards

| Check | Status |
|-------|--------|
| Issue linking keywords | PASS |
| Template compliance | WARN |

QA Validation

| Check | Status |
|-------|--------|
| Code changes detected | True |
| QA report exists | false |

⚡ Warnings

  • Template compliance: 1/4 sections complete
  • QA report not found for code changes (recommended before merge)

Powered by PR Validation workflow

@coderabbitai coderabbitai Bot added the agent-orchestrator (Task coordination agent), area-workflows (GitHub Actions workflows), and area-skills (Skills documentation and patterns) labels Apr 11, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request replaces the legacy numbered workflow commands with a new lifecycle-based command set including /spec, /plan, /build, /test, /review, and /ship, while updating documentation and linting configurations. A critical issue was identified regarding the removal of the legacy commands, which will break existing PowerShell tests that hardcode references to the old command names.

I am having trouble creating individual review comments. Click here to see my feedback.

.claude/commands/workflow/0-init.md (1-48)

high

The removal of the legacy workflow/ commands will break the existing test suite. Multiple test files, including tests/Invoke-WorkflowCommand.Tests.ps1 (line 40) and tests/WorkflowHelpers.Tests.ps1 (line 52), contain hardcoded references to these specific command names (e.g., 0-init, 1-plan, 4-security). These tests must be updated to validate the new lifecycle commands (/spec, /plan, etc.) or removed if the underlying workflow logic is being retired to ensure CI/CD pipeline health.

References
  1. When analyzing a diff for breaking changes, consider the full context of the file, not just the changed lines, to ensure that removals are not compensated for elsewhere.
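
One way to check the claim above before merging is to scan the test files for the deleted command names. The following is a minimal sketch, not a script from this repository: the name list comes from the PR's "Deleted" section, while `find_legacy_references` and the sample text are hypothetical.

```python
import re

# Legacy command names taken from the PR's "Deleted" list.
LEGACY_COMMANDS = ["0-init", "1-plan", "2-impl", "3-qa", "4-security", "9-sync"]

# Match a legacy name only when it is not embedded in a longer token.
_PATTERN = re.compile(
    r"(?<![\w-])(" + "|".join(re.escape(c) for c in LEGACY_COMMANDS) + r")(?![\w-])"
)


def find_legacy_references(text):
    """Return (line_number, command_name) pairs for each legacy name found."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits


# Hypothetical test-file content, used only to exercise the function.
sample = "Describe 'workflow' {\n  It 'runs 0-init' { }\n  It 'runs 4-security' { }\n}\n"
print(find_legacy_references(sample))
```

Running the same function over the contents of the files named in the comment would show whether the hardcoded references actually exist.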

@github-actions

github-actions Bot commented Apr 11, 2026

Contributor

AI Quality Gate Review

Warning

⚠️ Final Verdict: WARN

Walkthrough

This PR was reviewed by six AI agents in parallel, analyzing different aspects of the changes:

  • Security Agent: Scans for vulnerabilities, secrets exposure, and security anti-patterns
  • QA Agent: Evaluates test coverage, error handling, and code quality
  • Analyst Agent: Assesses code quality, impact analysis, and maintainability
  • Architect Agent: Reviews design patterns, system boundaries, and architectural concerns
  • DevOps Agent: Evaluates CI/CD, build pipelines, and infrastructure changes
  • Roadmap Agent: Assesses strategic alignment, feature scope, and user value

Review Summary

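| Agent | Verdict | Category | Status |
|-------|---------|----------|--------|
| Security | PASS | N/A | |
| QA | PASS | N/A | |
| Analyst | WARN | N/A | ⚠️ |
| Architect | PASS | N/A | |
| DevOps | WARN | N/A | ⚠️ |
| Roadmap | PASS | N/A | |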

💡 Quick Access: Click on individual agent jobs (e.g., "🔒 security Review", "🧪 qa Review") in the workflow run to see detailed findings and step summaries.
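
The WARN final verdict is consistent with a "most severe verdict wins" rule applied to the six agent results. The workflow's actual aggregation logic is not shown in this PR, so the sketch below is an assumption: the `SEVERITY` ordering and `final_verdict` are hypothetical names.

```python
# Severity order is an assumption: FAIL outranks WARN, which outranks PASS.
SEVERITY = {"PASS": 0, "WARN": 1, "FAIL": 2}


def final_verdict(verdicts):
    """Return the most severe verdict among the agent results."""
    if not verdicts:
        raise ValueError("at least one agent verdict is required")
    return max(verdicts.values(), key=lambda v: SEVERITY[v])


# Verdicts as reported in the Review Summary.
agent_verdicts = {
    "security": "PASS",
    "qa": "PASS",
    "analyst": "WARN",
    "architect": "PASS",
    "devops": "WARN",
    "roadmap": "PASS",
}
print(final_verdict(agent_verdicts))
```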

Security Review Details

Now let me check if any old workflow commands are being removed and examine the changes to AGENTS.md and CLAUDE.md:

Let me check if the workflow commands were removed as stated in the PR description:

The workflow directory no longer exists as expected. Let me verify the model alias used in the prompt-engineer skill (the memory mentioned model aliases must be claude-{opus|sonnet|haiku}-4-5):

The prompt-engineer SKILL.md has model: claude-sonnet-4-6 which is NOT in the valid model aliases. This is a validation finding but not a security issue.

Now let me complete my security analysis. I've reviewed all the files in this PR:

Security Review Summary

PR Type Classification: PROMPT/DOCS/CONFIG

Changed files:

  • .claude/commands/*.md (6 new lifecycle commands) - PROMPT
  • .claude/commands/workflow/*.md (deleted) - PROMPT
  • .claude/skills/prompt-engineer/SKILL.md - PROMPT
  • .claude/skills/workflow/SKILL.md - PROMPT
  • .markdownlint-cli2.yaml - CONFIG
  • AGENTS.md, CLAUDE.md - DOCS

Findings

| Severity | Category | Finding | Location | CWE |
|----------|----------|---------|----------|-----|
| Low | misconfiguration | Invalid model alias claude-sonnet-4-6 not in VALID_MODEL_ALIASES | .claude/skills/prompt-engineer/SKILL.md:7 | N/A |

Analysis

Prompt Injection Surface Review:

  • All 6 new lifecycle commands use $ARGUMENTS for user input
  • Commands delegate to Task subagents (implementer, qa, security, architect, etc.)
  • No direct command execution from user input
  • Bash tool access is scoped with Bash(*) pattern in allowed-tools
  • Agent instructions properly separate control flow from user data

Secret Detection:

  • No hardcoded credentials, API keys, or tokens found
  • No .env file exposure patterns

Configuration Security:

  • .markdownlint-cli2.yaml changes only add exclusions for the new command files (MD041 exemption for frontmatter-only files)
  • No overly permissive settings

Workflow Skill Deprecation:

  • Properly deprecated with clear documentation
  • No removal of security controls

Recommendations

  1. Update model: claude-sonnet-4-6 to a valid alias like model: claude-sonnet-4-5 in .claude/skills/prompt-engineer/SKILL.md. This is a frontmatter validation issue, not a security issue.

Verdict

VERDICT: PASS
MESSAGE: No security vulnerabilities found. PR adds lifecycle commands with proper agent delegation patterns. One minor frontmatter validation issue (invalid model alias) is non-security related.
{
  "verdict": "PASS",
  "message": "No security vulnerabilities found. PR adds lifecycle commands with proper agent delegation. One minor non-security frontmatter issue noted.",
  "agent": "security",
  "timestamp": "2026-04-11T06:41:00.000Z",
  "findings": [
    {
      "severity": "low",
      "category": "misconfiguration",
      "description": "Invalid model alias claude-sonnet-4-6 not in VALID_MODEL_ALIASES (claude-{opus|sonnet|haiku}-4-5 or dated snapshots)",
      "location": ".claude/skills/prompt-engineer/SKILL.md:7",
      "cwe": "N/A",
      "recommendation": "Change model: claude-sonnet-4-6 to model: claude-sonnet-4-5"
    }
  ]
}
QA Review Details

Now I have enough context to perform the QA review. Let me analyze the PR systematically.


QA Review: PR #1611

PR Type Classification

PR TYPE: DOCS
FILES: 
  - DOCS: .claude/commands/spec.md, plan.md, build.md, test.md, review.md, ship.md (6 new)
  - DOCS: .claude/commands/workflow/0-init.md through 9-sync.md (6 deleted)
  - DOCS: .claude/skills/workflow/SKILL.md (deprecated marker)
  - DOCS: .claude/skills/prompt-engineer/SKILL.md (model bump)
  - CONFIG: .markdownlint-cli2.yaml (exclude new commands)
  - DOCS: AGENTS.md, CLAUDE.md (reference updates)

Classification Rationale: All changed files are markdown (.md) containing prompt/command definitions, or YAML configuration. No executable code (.py, .ps1, .ts, .js, .cs) was modified. The "code" in these files is agent instruction text, not executable logic.

Test Coverage Assessment

N/A - DOCS only PR

Per PR Type Detection rules: "Files without executable logic do not require tests." The 6 new lifecycle command files are Claude Code slash command definitions (markdown with YAML frontmatter). They contain no executable code paths requiring unit tests.

The PR description confirms validation approach:

  • YAML frontmatter validated by existing validate-slash-commands CI
  • Referenced agents exist (structural check)
  • Referenced skills exist (structural check)
  • Markdownlint passes

Quality Concerns

| Severity | Issue | Location | Evidence | Required Fix |
|----------|-------|----------|----------|--------------|
| LOW | Model alias uses 4-6 version | .claude/skills/prompt-engineer/SKILL.md:7 | model: claude-sonnet-4-6 | Per repository memory, only 4-5 aliases valid. Verify validator acceptance. |

Investigation of model alias concern: The repository memory states "Skill frontmatter validator only accepts model aliases claude-{opus|sonnet|haiku}-4-5". However, this is a DOCS-only PR and the PR passed CI (validate-slash-commands CI: PASS). If the validator rejected claude-sonnet-4-6, CI would fail. The memory may be outdated or the validator was updated.
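
To make the rule quoted from repository memory concrete, here is a minimal sketch of an alias check. It is reconstructed from the review text only and is not the code in scripts/validation/skill_frontmatter.py. The pattern, the eight-digit snapshot suffix, and `is_accepted_alias` are assumptions, and the real validator may already accept 4-6 aliases, as the CI result suggests.

```python
import re

# Assumed rule, reconstructed from the review text only:
# family in {opus, sonnet, haiku}, version "4-5", optional 8-digit date suffix.
ALIAS_PATTERN = re.compile(r"claude-(opus|sonnet|haiku)-4-5(-\d{8})?")


def is_accepted_alias(model):
    """Return True when the model string matches the assumed alias rule."""
    return ALIAS_PATTERN.fullmatch(model) is not None


for candidate in ("claude-sonnet-4-5", "claude-sonnet-4-6"):
    print(candidate, is_accepted_alias(candidate))
```

Under this assumed rule `claude-sonnet-4-6` is rejected, which is the behavior the agents expected. A passing CI run would then imply the allowed list had been extended.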

Regression Risk Assessment

  • Risk Level: Low
  • Affected Components: Agent command routing (CLAUDE.md), lifecycle reference (AGENTS.md)
  • Breaking Changes: Legacy commands deleted (/0-init through /9-sync), but skill marked DEPRECATED with migration path
  • Required Testing: Functional testing of each command (noted as deferred in PR description)

Evidence

VERDICT: PASS
MESSAGE: DOCS-only PR with valid structure; no executable code requires tests.

PR TYPE: DOCS

EVIDENCE:
- Tests found: N/A - DOCS only, no executable code paths
- Test execution: PASS (pytest 7002 passed, 3 skipped - validates existing codebase unaffected)
- Edge cases: N/A - no user input handling code
- Error handling: N/A - no executable error paths
- Blocking issues: 0

Validation Summary

| Check | Status | Notes |
|-------|--------|-------|
| Broken links | [PASS] | Referenced agents/skills verified in PR checklist |
| YAML frontmatter | [PASS] | validate-slash-commands CI: PASS per PR description |
| Markdownlint | [PASS] | Commands excluded in .markdownlint-cli2.yaml (lines 143-148) |
| Structural integrity | [PASS] | All 6 commands have consistent structure |
| Deprecation path | [PASS] | SKILL.md clearly marked DEPRECATED with migration to new commands |

{
  "verdict": "PASS",
  "message": "DOCS-only PR replacing workflow commands with lifecycle commands; no executable code requires tests",
  "agent": "qa",
  "timestamp": "2026-04-11T06:42:00Z",
  "findings": [
    {
      "severity": "low",
      "category": "code-quality",
      "description": "Model alias claude-sonnet-4-6 in prompt-engineer SKILL.md may not match validator expectations",
      "location": ".claude/skills/prompt-engineer/SKILL.md:7",
      "recommendation": "Verify skill frontmatter validator accepts 4-6 aliases or revert to claude-sonnet-4-5"
    }
  ]
}
Analyst Review Details

Let me search for the correct repository and examine the local files directly.

Now let me check if the old workflow commands were removed:

Good, the workflow directory was removed. Let me verify the documentation alignment:

PR #1611 Analysis: Lifecycle Slash Commands

Code Quality Score

| Criterion | Score (1-5) | Notes |
|-----------|-------------|-------|
| Readability | 5 | Clear YAML frontmatter, well-structured markdown with tables and step-by-step processes |
| Maintainability | 5 | Each command is self-contained with explicit agent delegation via Task() calls |
| Consistency | 5 | All 6 commands follow identical structure: frontmatter → context → process → principles → output |
| Simplicity | 4 | Commands appropriately detailed for orchestration; /test command is comprehensive (145 lines) but justified for 6-gate testing |

Overall: 4.75/5
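
The 4.75 figure is consistent with an unweighted mean of the four criterion scores. The review does not state how the overall score was computed, so the averaging rule below is an assumption.

```python
from statistics import mean

# Scores as listed in the Code Quality Score table.
scores = {"Readability": 5, "Maintainability": 5, "Consistency": 5, "Simplicity": 4}

# Assumption: every criterion carries equal weight.
overall = mean(scores.values())
print(f"{overall:.2f}/5")
```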

Impact Assessment

  • Scope: Module-wide (affects .claude/commands/ and documentation)
  • Risk Level: Low
  • Affected Components:
    • .claude/commands/ (6 new files, 6 removed)
    • .claude/skills/workflow/SKILL.md (deprecation notice added)
    • AGENTS.md and CLAUDE.md (routing tables updated)
    • .markdownlint-cli2.yaml (exclusions added)

Findings

| Priority | Category | Finding | Location |
|----------|----------|---------|----------|
| Low | Consistency | prompt-engineer/SKILL.md uses model claude-sonnet-4-6 while repository memory says only 4-5 aliases are valid | .claude/skills/prompt-engineer/SKILL.md:7 |

Recommendations

  1. Verify claude-sonnet-4-6 model alias is valid per the skill frontmatter validator. Repository memory indicates only 4-5 dated snapshots are accepted.

Verdict

VERDICT: WARN
MESSAGE: Clean refactoring from numbered workflow to lifecycle commands; verify model alias claude-sonnet-4-6 is valid per skill_frontmatter.py validation rules

Detailed Assessment

Strengths:

  1. Clear semantic naming: /spec, /plan, /build, /test, /review, /ship maps directly to development lifecycle phases
  2. Each command delegates to appropriate specialized agents via Task() calls
  3. Proper deprecation notice in workflow/SKILL.md preserving backward compatibility for script references
  4. Markdownlint exclusions are specific to the 6 new commands only, not a blanket exclusion
  5. Documentation in AGENTS.md and CLAUDE.md updated consistently

Architecture:

  • Commands properly use @CLAUDE.md reference at top
  • YAML frontmatter includes description, allowed-tools, and argument-hint
  • Each command includes evaluation criteria, principles, and structured output format
  • Agent delegation follows established Task(subagent_type=...) pattern
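
The frontmatter expectation in the list above (description, allowed-tools, argument-hint) can be sketched as a small structural check. This is not the validate-slash-commands implementation: `frontmatter_keys`, `missing_keys`, and the example values for allowed-tools and argument-hint are hypothetical. The parser only looks at top-level keys and is not a full YAML reader.

```python
REQUIRED_KEYS = ("description", "allowed-tools", "argument-hint")


def frontmatter_keys(text):
    """Return the top-level keys of a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return set()
    keys = set()
    for line in lines[1:]:
        if line.strip() == "---":
            return keys
        # Top-level keys start at column 0; indented lines are nested values.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return set()  # no closing delimiter: treat as missing frontmatter


def missing_keys(text):
    """List the required keys that the frontmatter does not define."""
    present = frontmatter_keys(text)
    return [key for key in REQUIRED_KEYS if key not in present]


# Hypothetical command file; only the description text comes from the PR.
example = """---
description: Define what to build
allowed-tools: Task, Skill, Read
argument-hint: <problem statement>
---
@CLAUDE.md
"""
print(missing_keys(example))
```

An empty list means all three keys are present. A file without a closed frontmatter block reports every key as missing.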
{
  "verdict": "WARN",
  "message": "Clean lifecycle command refactoring; verify model alias claude-sonnet-4-6 validity",
  "agent": "analyst",
  "timestamp": "2026-04-11T06:41:00Z",
  "findings": [
    {
      "severity": "low",
      "category": "consistency",
      "description": "prompt-engineer/SKILL.md uses model claude-sonnet-4-6 but repository memory indicates only 4-5 aliases are validated",
      "location": ".claude/skills/prompt-engineer/SKILL.md:7",
      "recommendation": "Verify claude-sonnet-4-6 is added to scripts/validation/skill_frontmatter.py:35-53 accepted aliases or revert to claude-sonnet-4-5"
    }
  ]
}
Architect Review Details

Let me check the deleted workflow commands to understand what's being removed:

Design Quality Assessment

| Aspect | Rating (1-5) | Notes |
|--------|--------------|-------|
| Pattern Adherence | 5 | Follows established skill/command patterns from ADR-030. Clean frontmatter. Multi-agent delegation per AGENTS.md conventions. |
| Boundary Respect | 5 | Clear separation: commands invoke skills/agents. Commands orchestrate, skills execute. No boundary violations. |
| Coupling | 4 | Commands depend on existing skills and agents (cva-analysis, code-qualities-assessment, etc.). Explicit coupling via Skill() and Task() calls is appropriate. |
| Cohesion | 5 | Each command has single purpose: /spec = requirements, /plan = milestones, /build = implementation, /test = validation, /review = quality, /ship = deployment. |
| Extensibility | 5 | Frontmatter-based commands are pluggable. Each follows identical structure. New lifecycle commands can be added without modifying existing ones. |

Overall Design Score: 4.8/5

Architectural Concerns

| Severity | Concern | Location | Recommendation |
|----------|---------|----------|----------------|
| Low | Model alias in prompt-engineer | .claude/skills/prompt-engineer/SKILL.md:7 | claude-sonnet-4-6 not in validated aliases per repository memory. Should use claude-sonnet-4-5. |

Breaking Change Assessment

  • Breaking Changes: Yes (removal of /0-init through /4-security commands)
  • Impact Scope: Minor (old commands deprecated, not removed from skill scripts)
  • Migration Required: No (workflow skill marked DEPRECATED, scripts retained for reference)
  • Migration Path: SKILL.md clearly documents new lifecycle commands as replacement. AGENTS.md updated with routing.

Technical Debt Analysis

  • Debt Added: Low (6 clean command files with consistent structure)
  • Debt Reduced: Medium (removes 6 numbered workflow commands with complex PowerShell dependencies)
  • Net Impact: Improved

ADR Assessment

  • ADR Required: No
  • Decisions Identified: Command naming convention changed from numbered (0-init) to semantic (spec, plan, build, test, review, ship). This is a UX improvement, not an architectural change.
  • Existing ADR: ADR-006 (Thin Workflows) and ADR-030 (Skills Pattern) remain applicable. Commands are thin orchestrators that delegate to agents/skills.
  • Recommendation: N/A. The change aligns with existing ADRs. The workflow skill is correctly deprecated. No new architectural decision required.

Recommendations

  1. [LOW] Fix model alias in .claude/skills/prompt-engineer/SKILL.md line 7: change claude-sonnet-4-6 to claude-sonnet-4-5 to match validated aliases.
  2. [INFO] Consider removing the deleted workflow commands from .claude/commands/workflow/ directory entirely in a follow-up cleanup PR.

Verdict

The PR introduces 6 well-structured lifecycle commands that follow established patterns (ADR-030 skills superiority, ADR-006 thin workflows). Each command has:

  1. Valid YAML frontmatter with description, allowed-tools, argument-hint
  2. Single responsibility (one lifecycle phase per command)
  3. Clear delegation to agents via Task() and skills via Skill()
  4. Quality principles documented inline

The deprecation of numbered workflow commands is properly communicated. Markdownlint exclusions for MD041 (first line heading) are justified because command files use frontmatter.

VERDICT: PASS
MESSAGE: Well-designed lifecycle command system following ADR-030 skill patterns. Clean separation of concerns. One low-severity model alias issue identified.
{
  "verdict": "PASS",
  "message": "Well-designed lifecycle command system following ADR-030 skill patterns with clean separation of concerns",
  "agent": "architect",
  "timestamp": "2026-04-11T06:42:00Z",
  "findings": [
    {
      "severity": "low",
      "category": "tech-debt",
      "description": "Model alias claude-sonnet-4-6 not in validated alias list per repository memory",
      "location": ".claude/skills/prompt-engineer/SKILL.md:7",
      "recommendation": "Change to claude-sonnet-4-5 to match skill_frontmatter.py validated aliases"
    }
  ]
}
DevOps Review Details

Now I have enough context to perform the DevOps review. Let me check if the workflow directory still exists or if the old files were properly removed:

Now I have enough information to complete the DevOps review.


DevOps Review: PR #1611

PR Scope Detection

| Category | Files | Patterns |
|----------|-------|----------|
| DOCS | AGENTS.md, CLAUDE.md | *.md (non-.github/) |
| CONFIG | .markdownlint-cli2.yaml | *.yaml (non-workflow) |
| TEMPLATE | .claude/commands/*.md | Agent command files |

PR TYPE: DOCS/CONFIG/TEMPLATE hybrid. Running gates: Schema validation, Template review.

This PR does not modify:

  • GitHub Actions workflows (.github/workflows/)
  • Composite actions (.github/actions/)
  • Build scripts (scripts/)
  • CI/CD pipelines

Pipeline Impact Assessment

| Area | Impact | Notes |
|------|--------|-------|
| Build | None | No build configuration changes |
| Test | None | No test infrastructure changes |
| Deploy | None | No deployment workflow changes |
| Cost | None | No CI runner changes |

CI/CD Quality Checks

| Check | Status | Location |
|-------|--------|----------|
| YAML syntax valid | | .markdownlint-cli2.yaml |
| Actions pinned | N/A | No workflow changes |
| Secrets secure | N/A | No secrets referenced |
| Permissions minimal | N/A | No workflow changes |
| Shell scripts robust | N/A | No shell scripts changed |

Template Assessment

New Lifecycle Commands: Adequate

All 6 new commands (spec.md, plan.md, build.md, test.md, review.md, ship.md) have:

  • Valid YAML frontmatter with description, allowed-tools, argument-hint
  • Clear process documentation
  • Appropriate agent routing (Task subagent_type references)
  • Skill invocations for quality gates

Removed Commands: workflow/0-init through workflow/9-sync removed as expected per PR description.

Workflow SKILL.md: Properly marked DEPRECATED with clear migration guidance.


Findings

| Severity | Category | Finding | Location | Fix |
|----------|----------|---------|----------|-----|
| Low | templates | prompt-engineer/SKILL.md uses model claude-sonnet-4-6 which is invalid per repository memory | .claude/skills/prompt-engineer/SKILL.md:7 | Change to claude-sonnet-4-5 per skill frontmatter validator |

Configuration Review

.markdownlint-cli2.yaml changes:

  • Lines 141-149 add ignores for the 6 new lifecycle commands
  • Rationale documented: "New lifecycle command files have only YAML frontmatter, no H1 heading. MD041 not applicable."
  • This is valid. Commands with only frontmatter cannot satisfy MD041 (first line heading).

Automation Opportunities

| Opportunity | Type | Benefit | Effort |
|-------------|------|---------|--------|
| None identified | - | - | - |

The lifecycle commands already leverage skills and agents appropriately. No redundant patterns detected.


Recommendations

  1. Fix model alias in .claude/skills/prompt-engineer/SKILL.md from claude-sonnet-4-6 to claude-sonnet-4-5 (per repository memory: skill frontmatter validator only accepts 4-5 aliases).

Verdict

VERDICT: WARN
MESSAGE: Minor model alias issue in prompt-engineer skill. No CI/CD or pipeline changes in this PR. Safe to merge after fix.

{
  "verdict": "WARN",
  "message": "Model alias claude-sonnet-4-6 in prompt-engineer/SKILL.md is invalid per skill frontmatter validator; use claude-sonnet-4-5",
  "agent": "devops",
  "timestamp": "2026-04-11T06:41:30.000Z",
  "findings": [
    {
      "severity": "low",
      "category": "templates",
      "description": "Skill frontmatter uses model claude-sonnet-4-6 which is not in the validator allowed list",
      "location": ".claude/skills/prompt-engineer/SKILL.md:7",
      "recommendation": "Change model to claude-sonnet-4-5 per scripts/validation/skill_frontmatter.py allowed model list"
    }
  ]
}
Roadmap Review Details

Now let me check the issue referenced and see the workflow files that were removed:

Now I have enough context to provide my strategic alignment assessment.


Strategic Alignment Assessment

| Criterion | Rating | Notes |
|-----------|--------|-------|
| Aligns with project goals | High | Improves multi-agent workflow adoption by providing clearer lifecycle phases |
| Priority appropriate | High | Developer experience is core to the vision of minimal friction adoption |
| User value clear | High | Six well-defined commands reduce cognitive load vs numbered workflow system |
| Investment justified | High | Removes unused cruft (6 workflow files), adds structured value (6 lifecycle commands) |

Feature Completeness

  • Scope Assessment: Right-sized
  • Ship Ready: Yes
  • MVP Complete: Yes
  • Enhancement Opportunities: None immediate; commands are well-documented with clear agent routing

Impact Analysis

| Dimension | Assessment | Notes |
|-----------|------------|-------|
| User Value | High | Clearer naming (/spec vs /0-init), logical progression, explicit agent delegation |
| Business Impact | Medium | Improves developer velocity, reduces onboarding friction |
| Technical Leverage | High | Reuses existing agents (spec-generator, milestone-planner, implementer, qa, architect, devops) |
| Competitive Position | Improved | Osmani-pattern lifecycle is industry-recognized, makes project more approachable |

KANO Classification

Performance Feature: The more intuitive the command names, the more satisfied users become. Users expected to learn numbered commands before; now they get semantic names that match their mental model.

Concerns

| Priority | Concern | Recommendation |
|----------|---------|----------------|
| Low | Skill deprecation notice is added but old workflow/SKILL.md still exists | Acceptable; deprecation notice is clear and old scripts may still be referenced |
| Low | No explicit migration path documented for users of /0-init through /4-security | Consider adding migration table to CHANGELOG or README for existing users |

Positive Observations

  1. Clean removal: Six unused workflow commands deleted, reducing maintenance burden
  2. Documentation updated: AGENTS.md and CLAUDE.md reflect new lifecycle commands
  3. Deprecation handled correctly: workflow/SKILL.md marked as deprecated with clear replacement guidance
  4. Markdownlint exclusion: Correctly excludes command files from MD041 (first-line heading) since frontmatter comes first

Recommendations

  1. The PR delivers clear user value with a well-scoped change
  2. The lifecycle pattern (spec → plan → build → test → review → ship) maps directly to development workflow, improving discoverability
  3. Each command delegates to the appropriate specialized agent, maintaining the orchestrator pattern

Verdict

VERDICT: PASS
MESSAGE: Lifecycle commands deliver measurable user value by replacing numbered workflow commands with semantic names that match developer mental models
{
  "verdict": "PASS",
  "message": "Lifecycle commands deliver measurable user value by replacing numbered workflow commands with semantic names that match developer mental models",
  "agent": "roadmap",
  "timestamp": "2026-04-11T06:41:00Z",
  "findings": [
    {
      "severity": "low",
      "category": "documentation",
      "description": "No explicit migration path for users of deprecated /0-init through /4-security commands",
      "location": ".claude/commands/",
      "recommendation": "Consider adding migration table to CHANGELOG for existing users"
    }
  ]
}

Run Details
| Property | Value |
|----------|-------|
| Run ID | 24276858254 |
| Triggered by | pull_request on 1611/merge |
| Commit | d43b121e0694c01a88769ad7ba66d84d7170280b |

Powered by AI Quality Gate workflow

Copilot AI left a comment

Contributor

Pull request overview

Adds a new “lifecycle” set of Osmani-style slash commands under .claude/commands/ and updates top-level routing docs/config to prefer them over the older numbered workflow commands.

Changes:

  • Added 6 lifecycle slash commands: /spec, /plan, /build, /test, /review, /ship.
  • Removed the legacy numbered workflow command markdown files (.claude/commands/workflow/*).
  • Updated routing documentation (CLAUDE.md, AGENTS.md) and adjusted markdownlint config for command files.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
|------|-------------|
| CLAUDE.md | Updates routing rules to point at the new lifecycle commands. |
| AGENTS.md | Replaces “workflow” command references with lifecycle command list. |
| .markdownlint-cli2.yaml | Excludes .claude/commands/** from markdownlint (intended to address MD041). |
| .claude/commands/spec.md | New /spec command for requirements + acceptance criteria generation. |
| .claude/commands/plan.md | New /plan command for milestone/task decomposition and risk planning. |
| .claude/commands/build.md | New /build command for TDD implementation + atomic commits workflow. |
| .claude/commands/test.md | New /test command for layered testing + debugging guidance. |
| .claude/commands/review.md | New /review command for 5-axis review orchestration. |
| .claude/commands/ship.md | New /ship command for pre-flight checks + PR creation. |
| .claude/commands/workflow/0-init.md | Removed legacy numbered workflow command. |
| .claude/commands/workflow/1-plan.md | Removed legacy numbered workflow command. |
| .claude/commands/workflow/2-impl.md | Removed legacy numbered workflow command. |
| .claude/commands/workflow/3-qa.md | Removed legacy numbered workflow command. |
| .claude/commands/workflow/4-security.md | Removed legacy numbered workflow command. |
| .claude/commands/workflow/9-sync.md | Removed legacy numbered workflow command. |

Comment thread CLAUDE.md Outdated
Comment thread .markdownlint-cli2.yaml Outdated
Comment thread .claude/commands/ship.md Outdated
Comment thread AGENTS.md
Comment thread CLAUDE.md Outdated
Comment thread .claude/commands/spec.md Outdated
Comment thread .claude/commands/test.md Outdated
@coderabbitai

coderabbitai Bot commented Apr 11, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Replaces numbered workflow with six lifecycle slash commands (/spec, /plan, /build, /test, /review, /ship); adds new command docs, deletes legacy workflow files, updates AGENTS.md and CLAUDE.md routing, updates markdownlint ignores, and bumps one skill model metadata. No code/API changes.

Changes

| Cohort | File(s) | Summary |
|--------|---------|---------|
| New Lifecycle Commands | .claude/commands/spec.md, .claude/commands/plan.md, .claude/commands/build.md, .claude/commands/test.md, .claude/commands/review.md, .claude/commands/ship.md | Added six agent-skills-style command definitions with frontmatter, stepwise control flows delegating to Task/Skill subagents, evaluation rubrics, structured outputs, and guardrails (e.g., pre-flight checks, commit conventions). |
| Removed Workflow Commands | .claude/commands/workflow/0-init.md, .claude/commands/workflow/1-plan.md, .claude/commands/workflow/2-impl.md, .claude/commands/workflow/3-qa.md, .claude/commands/workflow/4-security.md, .claude/commands/workflow/9-sync.md | Deleted legacy numbered workflow command docs and associated invocation/docs. |
| Routing & Docs | AGENTS.md, CLAUDE.md | Replaced Workflow: directive with Lifecycle: /spec, /plan, /build, /test, /review, /ship; adjusted CLAUDE.md routing to prefer skill-based invocations and added a “Lifecycle commands” mapping; removed “resume” trigger. |
| Linter Config | .markdownlint-cli2.yaml | Added ignores for the new .claude/commands/{spec,plan,build,test,review,ship}.md files to avoid MD041-first-line-heading lint failures. |
| Skill Metadata | .claude/skills/prompt-engineer/SKILL.md | Bumped metadata model from claude-sonnet-4-5 to claude-sonnet-4-6. |
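
The MD041 failure mentioned in the Linter Config entry can be illustrated with a rough sketch. It assumes the rule looks at the first content line after the YAML frontmatter and expects a top-level heading; the helper names below are hypothetical and this is not markdownlint's actual implementation. The sample text is abridged from the spec.md preview in this PR.

```python
def first_content_line(markdown_text: str) -> str:
    """Return the first non-blank line after an optional YAML frontmatter block."""
    lines = markdown_text.splitlines()
    start = 0
    if lines and lines[0].strip() == "---":
        # Skip to the line after the closing '---' of the frontmatter.
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                start = index + 1
                break
    for line in lines[start:]:
        if line.strip():
            return line
    return ""


def starts_with_top_level_heading(markdown_text: str) -> bool:
    """Approximation of the MD041 idea: the first content line is an H1."""
    return first_content_line(markdown_text).startswith("# ")


# Abridged command file: frontmatter followed by an import line, not a heading.
command_file = """---
description: Define what to build.
allowed-tools: Task, Skill, Read, Glob, Grep
---

@CLAUDE.md

Invoke the cva-analysis and decision-critic skills.
"""

print(starts_with_top_level_heading(command_file))
```

Under these assumptions the command file is reported as not starting with a heading, which is consistent with the lint failure the ignore entries are meant to avoid.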

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant CLI as "Lifecycle Command (/spec/.../ship)"
  participant Agent as "Task(subagent)"
  participant Skill as "Skill(s)"
  participant GitCI as "Git / CI"

  User->>CLI: invoke /spec or /plan or /build or /test or /review or /ship
  CLI->>Agent: Task(subagent_type="spec-generator"/"planner"/"implementer"/"qa"/"architect"/"devops")
  Agent->>Skill: call Skill(...) (cva-analysis, milestone-planner, code-qualities-assessment, quality-grades, pipeline-validator, etc.)
  Agent->>GitCI: read refs / create PRs / run pipeline checks
  Skill-->>Agent: analysis / plan / verdicts
  Agent-->>CLI: structured output (milestones, tasks, tests, findings, PR)
  CLI-->>User: report (structured artifacts, PASS/FAIL, PR link)
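
The dispatch shown in the diagram can be sketched as a small lookup. Pairing each command with a subagent by reading the two slash-separated lists positionally is an assumption, and the function names are hypothetical; individual command files may name additional subagents (plan.md, for example, refers to milestone-planner and task-decomposer).

```python
from typing import Optional

# Order and subagent names copied from the walkthrough and the diagram;
# the positional pairing is an assumption.
LIFECYCLE = [
    ("/spec", "spec-generator"),
    ("/plan", "planner"),
    ("/build", "implementer"),
    ("/test", "qa"),
    ("/review", "architect"),
    ("/ship", "devops"),
]


def primary_subagent(command: str) -> str:
    """Return the subagent type the diagram associates with a command."""
    for name, subagent in LIFECYCLE:
        if name == command:
            return subagent
    raise KeyError(f"unknown lifecycle command: {command}")


def next_command(command: str) -> Optional[str]:
    """Return the command that follows in the lifecycle, or None after /ship."""
    names = [name for name, _ in LIFECYCLE]
    position = names.index(command)  # raises ValueError for unknown commands
    return names[position + 1] if position + 1 < len(names) else None


print(primary_subagent("/test"), next_command("/review"))
```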

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

documentation

Suggested reviewers

  • rjmurillo
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Title check | ✅ Passed | Title follows conventional commit format with type 'feat', scope 'commands', and clear summary of adding 6 lifecycle slash commands. |
| Linked Issues check | ✅ Passed | PR meets all acceptance criteria from #1609: 6 command files added with agent-skills structure, frontmatter present, scoped Bash permissions, AGENTS.md/CLAUDE.md updated, workflow/ commands removed. |
| Out of Scope Changes check | ✅ Passed | All changes align with #1609 objectives: new lifecycle commands, deleted unused workflows, updated routing, markdown linting config. No unrelated changes detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The PR description directly describes the changeset: replacing legacy numbered workflow commands with 6 lifecycle slash commands, listing all added/deleted/modified files, and referencing issue #1609. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 4

🧹 Nitpick comments (2)
.claude/commands/plan.md (2)

1-5: Add model: opus to frontmatter.

Planning with 5-axis evaluation, pre-mortem, and risk analysis is complex reasoning. Specify opus model in frontmatter.

📝 Proposed fix
 ---
 description: Plan how to build it. Decompose specs into milestones with dependencies and risk mitigations. Run after /spec.
 allowed-tools: Task, Skill, Read, Glob, Grep
 argument-hint: [spec-output-or-issue-number]
+model: opus
 ---

As per coding guidelines: "Slash command frontmatter must specify the model field with appropriate Claude model selection based on task complexity (haiku for simple, sonnet for standard, opus for complex reasoning)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/commands/plan.md around lines 1 - 5, The frontmatter at the top of
the command definition is missing the required model selection; update the YAML
frontmatter block (the leading --- section) to include the key "model: opus" so
the command explicitly uses the opus Claude model for complex reasoning tasks;
ensure the new "model: opus" line is added alongside the existing keys
(description, allowed-tools, argument-hint) in the frontmatter.

7-36: Consider adding ultrathink keyword for deep planning analysis.

Planning with 5-axis evaluation, pre-mortem, dependency ordering, and risk analysis benefits from extended thinking. Add the ultrathink keyword to activate up to 31,999 tokens for deep reasoning.

💡 Example placement
 Invoke the milestone-planner and task-decomposer agents.
 
+ultrathink
+
 Plan how to build: $ARGUMENTS

As per coding guidelines: "Slash command files with complex reasoning tasks (e.g., architectural design, multi-step debugging, trade-off analysis, edge case analysis) SHOULD include the ultrathink keyword in the prompt text to activate extended thinking mode."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/commands/plan.md around lines 7 - 36, Add the ultrathink keyword to
the command prompt so the planner uses extended reasoning; specifically update
the prompt text that begins "Invoke the planner and execution-plans skills." /
"Plan how to build: $ARGUMENTS" to include the literal token ultrathink (e.g.,
prefix or append the invocation line with ultrathink) so the planner/subagent
receives the extended-thinking mode for deep planning analysis across the five
axes and pre-mortem steps.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/plan.md:
- Around line 7-11: Update the wording on the first line to use the same agent
naming used later: replace "planner and execution-plans skills" with a phrase
that references the agent subagent_type names (e.g., "milestone-planner and
task-decomposer agents") or explicitly clarify the mapping, so the invocation
lines that call Task(subagent_type="milestone-planner") and
Task(subagent_type="task-decomposer") match the descriptive text.

In @.claude/commands/review.md:
- Line 4: The frontmatter key "argument-hint" is too broad (currently
"argument-hint: [branch-or-pr-number]") while the implementation only handles
branch diffs via "git diff main...HEAD"; update the contract to match the code
by changing the argument-hint to only accept branch identifiers (e.g.,
"argument-hint: [branch]") or alternatively implement PR resolution logic that
maps a PR number to a branch/commit before running "git diff main...HEAD";
ensure references to "argument-hint", the prompt body, and any validation logic
enforce the new narrower contract so the prompt and implementation remain
consistent.

In @.markdownlint-cli2.yaml:
- Around line 141-142: Remove the blanket exclusion "- \".claude/commands/**\""
and instead add a targeted rule override that disables only MD041 for files
under the ".claude/commands/**" pattern (or more specific pattern like
".claude/commands/**/*.md") so other markdownlint rules (e.g., MD040) still run;
update the adjacent comment to accurately state that only MD041 is being
suppressed for slash command files with YAML frontmatter. Ensure the override
references the rule ID MD041 and the exact pattern ".claude/commands/**" (or a
more specific glob) so the change is narrowly scoped and self-describing.

In `@CLAUDE.md`:
- Around line 42-48: Clarify that the listed lifecycle actions (the bullets that
say "invoke spec", "invoke plan", "invoke build", "invoke test", "invoke
review", "invoke ship", "invoke analyze") are skill/tool invocations rather than
generic command routing: update the wording to explicitly distinguish "command
routing" (agent-level or CLI commands) from "skill routing" (calls into the
Skill tool), e.g., change each "invoke X" to "use Skill:X" or "call Skill X" and
add a short prefix sentence like "Use the Skill tool for the following lifecycle
actions; do not treat these as plain command routing." Ensure the change touches
the bullet list entries and the surrounding explanatory sentence so readers
cannot confuse invoking a skill with issuing a command.
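
The review.md finding above (an argument hint that accepts a PR number while the process only runs a branch diff) can be made concrete with a short sketch. The function names and the rule "digits, optionally prefixed with #, mean a PR number" are assumptions for illustration; only the `git diff main...HEAD` form comes from the command file.

```python
import re
from typing import List, Tuple


def classify_review_target(argument: str) -> Tuple[str, str]:
    """Classify the /review argument as a branch or a PR number (heuristic)."""
    value = argument.strip()
    if not value:
        # No argument: review the current branch, as review.md states.
        return ("branch", "HEAD")
    if re.fullmatch(r"#?\d+", value):
        # Assumption: an all-digit argument is a PR number, not a branch name.
        return ("pr", value.lstrip("#"))
    return ("branch", value)


def diff_command(argument: str, base: str = "main") -> List[str]:
    """Build the three-dot diff command for a branch target."""
    kind, value = classify_review_target(argument)
    if kind == "pr":
        # The command as written has no step that maps a PR number to a ref.
        raise ValueError("PR numbers must be resolved to a branch before diffing")
    return ["git", "diff", f"{base}...{value}"]


print(diff_command(""))
print(classify_review_target("#1611"))
```

Either narrowing the hint to branches or adding a resolution step before the diff would close the gap; the sketch only shows where the two inputs diverge.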

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 080b12e8-428b-4f87-b38f-98e8eaabe6ae

📥 Commits

Reviewing files that changed from the base of the PR and between 9260bca and c487ce6.

📒 Files selected for processing (15)
  • .claude/commands/build.md
  • .claude/commands/plan.md
  • .claude/commands/review.md
  • .claude/commands/ship.md
  • .claude/commands/spec.md
  • .claude/commands/test.md
  • .claude/commands/workflow/0-init.md
  • .claude/commands/workflow/1-plan.md
  • .claude/commands/workflow/2-impl.md
  • .claude/commands/workflow/3-qa.md
  • .claude/commands/workflow/4-security.md
  • .claude/commands/workflow/9-sync.md
  • .markdownlint-cli2.yaml
  • AGENTS.md
  • CLAUDE.md
💤 Files with no reviewable changes (6)
  • .claude/commands/workflow/9-sync.md
  • .claude/commands/workflow/4-security.md
  • .claude/commands/workflow/2-impl.md
  • .claude/commands/workflow/1-plan.md
  • .claude/commands/workflow/3-qa.md
  • .claude/commands/workflow/0-init.md

Comment thread .claude/commands/plan.md Outdated
Comment thread .claude/commands/review.md
Comment thread .markdownlint-cli2.yaml Outdated
Comment thread CLAUDE.md Outdated
- Fix CLAUDE.md skill routing to reference actual skills (analyze, github,
  pr-comment-responder, reflect, session-end, quality-grades) and separate
  lifecycle commands (/spec, /plan, /build, /test, /review, /ship) into
  their own section
- Add @CLAUDE.md import to spec.md, plan.md, test.md, and ship.md for
  consistent project context loading
- Add Write and Edit to test.md allowed-tools to enable writing tests
  and fixing code as documented in steps 5-6
@coderabbitai coderabbitai Bot added the automation Automated workflows and processes label Apr 11, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/plan.md:
- Around line 1-5: Update the slash-command frontmatter to include the required
"model" key and revise the description field to use trigger phrasing;
specifically add a "model" entry alongside existing keys (allowed-tools,
argument-hint) and change the description line (the "description" frontmatter
entry) from declarative wording to start with "Use when ..." so it validates as
a slash-command (adjust the text "Plan how to build it..." to a trigger form).

In @.claude/commands/test.md:
- Line 2: Update the frontmatter description in .claude/commands/test.md to
follow the trigger-based pattern: replace the current generic "description" text
with a sentence that begins with "Use when..." (or an explicit trigger like "Run
when...") that states the conditions under which the command should be invoked
(e.g., "Use when you want to run layered tests after /build to prove changes and
debug failures"). Edit the description field in the file (the frontmatter key
"description") to this trigger-style phrasing so it clearly specifies WHEN to
run the command.
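
The two frontmatter findings above could be checked mechanically. The following is a minimal sketch, assuming flat `key: value` frontmatter and the two rules quoted in the review (a `model` key is present, and the description starts with "Use when"); the function names are hypothetical, and a real check would likely use a YAML parser.

```python
from typing import Dict, List


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Naively parse flat 'key: value' pairs between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so values may contain colons.
        key, separator, value = line.partition(":")
        if separator and not key.startswith((" ", "-")):
            fields[key.strip()] = value.strip()
    return fields


def frontmatter_findings(text: str) -> List[str]:
    """Return the review findings that apply to a command file's frontmatter."""
    fields = parse_frontmatter(text)
    findings = []
    if "model" not in fields:
        findings.append("missing model field")
    if not fields.get("description", "").lower().startswith("use when"):
        findings.append("description does not start with 'Use when'")
    return findings


# Frontmatter of plan.md as shown in this PR's preview.
plan_md = """---
description: Plan how to build it. Decompose specs into milestones with dependencies and risk mitigations. Run after /spec.
allowed-tools: Task, Skill, Read, Glob, Grep
argument-hint: [spec-output-or-issue-number]
---
"""

print(frontmatter_findings(plan_md))
```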
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 312701be-4d35-423f-b28c-10749511b7ab

📥 Commits

Reviewing files that changed from the base of the PR and between c487ce6 and 043db08.

📒 Files selected for processing (5)
  • .claude/commands/plan.md
  • .claude/commands/ship.md
  • .claude/commands/spec.md
  • .claude/commands/test.md
  • CLAUDE.md
✅ Files skipped from review due to trivial changes (2)
  • .claude/commands/spec.md
  • .claude/commands/ship.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • CLAUDE.md

Comment thread .claude/commands/plan.md
Comment thread .claude/commands/test.md Outdated

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Overly broad markdownlint ignore disables all rules for commands
    • Replaced the broad .claude/commands/** ignore with an explicit list of only the 6 new lifecycle command files, preserving full linting for pre-existing command files.
  • ✅ Fixed: Ship command invokes commands as if they were skills
    • Changed instructions to correctly reference /validate-pr-description and /push-pr as slash commands instead of incorrectly calling them skills via the Skill tool.
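
The difference between the broad ignore and the explicit list in the first fix can be sketched with Python's `fnmatch` module. This is only an approximation of the glob matching markdownlint-cli2 performs, and the path `.claude/commands/push-pr.md` is a hypothetical stand-in for a pre-existing command file.

```python
from fnmatch import fnmatchcase
from typing import List

BROAD_IGNORE = [".claude/commands/**"]
EXPLICIT_IGNORE = [
    f".claude/commands/{name}.md"
    for name in ("spec", "plan", "build", "test", "review", "ship")
]


def is_ignored(path: str, patterns: List[str]) -> bool:
    """Return True if any ignore pattern matches the path."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


# Hypothetical pre-existing command file that should still be linted.
existing_command = ".claude/commands/push-pr.md"

print(is_ignored(existing_command, BROAD_IGNORE))
print(is_ignored(existing_command, EXPLICIT_IGNORE))
print(is_ignored(".claude/commands/ship.md", EXPLICIT_IGNORE))
```

With the broad pattern the pre-existing file is skipped by the linter; with the explicit list only the six lifecycle files are.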
Preview (48d9d4876e)
diff --git a/.claude/commands/build.md b/.claude/commands/build.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/build.md
@@ -1,0 +1,48 @@
+---
+description: Build incrementally. Implement changes in thin vertical slices with TDD and atomic commits. Run after /plan.
+allowed-tools: Task, Skill, Read, Write, Edit, Glob, Grep, Bash(git add:*), Bash(git commit:*), Bash(git status:*), Bash(git diff:*), Bash(python3:*)
+argument-hint: [plan-step-or-task-description]
+---
+
+@CLAUDE.md
+
+Invoke the code-qualities-assessment and golden-principles skills.
+
+Build: $ARGUMENTS
+
+Use Task(subagent_type="implementer") as the primary agent. If no argument provided, check for recent /plan output or ask what to build.
+
+Evaluate across all 5 axes:
+
+1. **Test-first discipline** - Red before green before refactor. No code without a failing test.
+2. **Commit atomicity** - Each commit is one logical change, rollback-safe.
+3. **Code quality** - Cohesion, coupling, encapsulation, testability, non-redundancy.
+4. **Complexity budget** - Cyclomatic complexity <=10. Methods <=60 lines. No nesting.
+5. **Standards compliance** - Golden principles, style enforcement, naming conventions.
+
+## Software Hierarchy of Needs
+
+Design emerges bottom-up. Enforce qualities before reaching for patterns.
+
+1. Qualities: Cohesion, Coupling, DRY, Encapsulation, Testability
+2. Principles: Open-Closed, Encapsulate by Policy, Separation of Concerns
+3. Practices: Programming by Intention, State Always Private, CVA
+4. Wisdom: Design to interfaces, Favor delegation over inheritance, Encapsulate what varies, Separate use from creation
+5. Patterns: Only when the problem demands it. Three similar lines beat a premature abstraction.
+
+## Process
+
+1. Read the task or plan step
+2. Write a failing test (red)
+3. Write the minimum code to pass (green)
+4. Refactor toward the hierarchy of needs (refactor)
+5. Run golden-principles and taste-lints before committing
+6. Commit atomically with conventional message
+7. Repeat for next slice
+
+## Guardrails
+
+- Hard to test? Fix the design, not the test. Indicates tight coupling or weak encapsulation.
+- Ask "how would I test this?" even without tests.
+- Every method should read like a sentence (Programming by Intention).
+- Favor delegation over inheritance. A makes B, or A uses B. Never both.

diff --git a/.claude/commands/plan.md b/.claude/commands/plan.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/plan.md
@@ -1,0 +1,47 @@
+---
+description: Plan how to build it. Decompose specs into milestones with dependencies and risk mitigations. Run after /spec.
+allowed-tools: Task, Skill, Read, Glob, Grep
+argument-hint: [spec-output-or-issue-number]
+---
+
+@CLAUDE.md
+
+Invoke the planner and execution-plans skills.
+
+Plan how to build: $ARGUMENTS
+
+Use Task(subagent_type="milestone-planner") for milestone breakdown, then Task(subagent_type="task-decomposer") for atomic tasks. If no argument provided, check for recent /spec output or ask what to plan.
+
+Evaluate across all 5 axes:
+
+1. **Scope integrity** - Nothing unnecessary, nothing missing
+2. **Dependency ordering** - Can tasks execute in the stated sequence?
+3. **Risk coverage** - All P0 risks have mitigations
+4. **Estimate confidence** - Complexity-based sizing (S/M/L), not time-based
+5. **Reversibility** - Which steps are hard to undo?
+
+## Principles
+
+- **Programming by Intention**: Sergeant methods direct workflow. Each task should read like an intent, not an implementation.
+- **OODA Loop**: Observe (read the spec), Orient (map to existing code), Decide (sequence tasks), Act (commit the plan). Faster loops win.
+- **First Principles**: Question the requirement, try to delete the step, then optimize, then speed up, then automate. Never automate something that should not exist.
+
+## Process
+
+1. Read the spec or issue
+2. Map sub-problems to existing code (what already exists?)
+3. Break into milestones with clear exit criteria
+4. Decompose milestones into atomic tasks (each independently verifiable)
+5. Sequence by dependencies, flag parallel opportunities
+6. Run pre-mortem on the plan itself
+7. Route to Task(subagent_type="critic") for validation
+
+## Output
+
+Structured plan with:
+
+- Milestones (numbered, with exit criteria)
+- Tasks per milestone (atomic, with acceptance criteria)
+- Dependency graph (what blocks what)
+- Risk register (risk, likelihood, mitigation)
+- Deferred items (explicitly out of scope for this plan)

diff --git a/.claude/commands/review.md b/.claude/commands/review.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/review.md
@@ -1,0 +1,49 @@
+---
+description: Review before merge. Five-axis code review across architecture, security, quality, tests, and standards. Run after /test.
+allowed-tools: Task, Skill, Read, Glob, Grep, Bash(git diff:*), Bash(git log:*), Bash(git status:*)
+argument-hint: [branch-or-pr-number]
+---
+
+@CLAUDE.md
+
+Invoke the analyze, code-qualities-assessment, and security-scan skills.
+
+Review the current changes across all five axes: $ARGUMENTS
+
+If no argument, review the current branch diff against main.
+
+Sequential evaluation order:
+
+1. **Architecture** - Follows existing patterns? Clean boundaries? Right abstraction level? Coupling intentional? ADR conformance?
+2. **Security** - Input validated? Secrets safe? Auth checked? OWASP top 10? STRIDE threats? CWE scan? (Use Task(subagent_type="security"))
+3. **Code quality** - Score all 5 qualities: cohesion, coupling, encapsulation, testability, non-redundancy. Cyclomatic complexity <=10? Methods <=60 lines?
+4. **Test completeness** - Every new code path has a test? Failure paths covered? Acceptance criteria verified?
+5. **Standards** - Golden principles, taste lints, style enforcement, naming conventions
+
+## Principles
+
+- **Design to interfaces**: Review signatures from the consumer perspective. Hidden implementation details should stay hidden.
+- **Encapsulate what varies**: If the diff introduces variation, is it encapsulated? Or scattered?
+- **Chesterton's Fence**: Before removing code, verify you understand why it existed.
+- **Principle of Least Privilege**: New permissions, scopes, or access? Challenge each one.
+
+## Process
+
+1. Read the diff (git diff main...HEAD)
+2. Architecture pass: Task(subagent_type="architect") evaluates structure
+3. Security pass: Task(subagent_type="security") evaluates threats
+4. Quality pass: invoke code-qualities-assessment skill
+5. Test pass: Task(subagent_type="qa") evaluates coverage
+6. Standards pass: invoke golden-principles and taste-lints skills
+7. Synthesize findings
+
+## Output
+
+Categorize each finding as **Critical**, **Important**, or **Suggestion**.
+
+Structured review with:
+
+- Finding (what is wrong)
+- Location (file:line)
+- Severity (Critical/Important/Suggestion)
+- Fix (specific recommendation)

diff --git a/.claude/commands/ship.md b/.claude/commands/ship.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/ship.md
@@ -1,0 +1,44 @@
+---
+description: Ship it. Pre-flight validation, CI check, and PR creation. Run after /review.
+allowed-tools: Task, Skill, Read, Glob, Grep, Bash(git diff:*), Bash(git log:*), Bash(git status:*), Bash(git push:*), Bash(python3:*)
+argument-hint: [target-branch]
+---
+
+@CLAUDE.md
+
+Invoke the pipeline-validator skill.
+
+Ship the current branch: $ARGUMENTS
+
+Use Task(subagent_type="devops") as the primary agent. Default target is main unless specified.
+
+Pre-flight checks (all must pass):
+
+1. **Pipeline health** - All CI checks green? No suppressed failures? Run pipeline-validator.
+2. **Security posture** - Final security-scan clean? No new CWE findings? No secrets in diff?
+3. **Review complete** - Has /review been run? Any unresolved Critical findings?
+4. **Tests passing** - All tests green? No skipped tests without justification?
+5. **Standards clean** - Golden principles and taste lints pass?
+
+## Principles
+
+- **Faster is safer**: Small, frequent shipments reduce blast radius. Ship early.
+- **No deliberate debt**: If it is not ready, do not ship it. Fix it or defer it.
+- **Observability first**: If you cannot measure it, you cannot ship it safely.
+
+## Process
+
+1. Run pre-flight checks (all 5 above)
+2. If any check fails: report what failed, why, and how to fix. Stop.
+3. If all pass: validate PR description (run /validate-pr-description command)
+4. Create PR via /push-pr command
+5. Report: what shipped, PR link, any warnings
+
+## Output
+
+Ship report:
+
+- Pre-flight results (pass/fail per check)
+- PR link (if created)
+- Warnings (non-blocking concerns)
+- Next steps (monitoring, follow-up items)

diff --git a/.claude/commands/spec.md b/.claude/commands/spec.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/spec.md
@@ -1,0 +1,45 @@
+---
+description: Define what to build. Transform a problem into testable requirements with acceptance criteria.
+allowed-tools: Task, Skill, Read, Glob, Grep
+argument-hint: [problem-statement-or-issue-number]
+---
+
+@CLAUDE.md
+
+Invoke the cva-analysis and decision-critic skills.
+
+Define what to build for: $ARGUMENTS
+
+Use Task(subagent_type="spec-generator") to produce requirements. If no argument provided, ask what problem to solve.
+
+Evaluate across all 5 axes:
+
+1. **Problem clarity** - Is the right problem being solved? Could a reframing yield 10x impact?
+2. **Requirement testability** - Can each requirement be verified pass/fail?
+3. **Completeness** - No gaps between problem statement and acceptance criteria?
+4. **Traceability** - REQ to DESIGN to TASK linkage established?
+5. **Feasibility** - Buildable within constraints? Existing code to leverage?
+
+## Principles
+
+- **CVA**: Identify commonalities first, then variabilities, then relationships. Greatest risk is the wrong abstraction.
+- **YAGNI**: Only specify what is needed now. Speculative requirements create waste.
+- **Separation of Concerns**: Each requirement addresses one concern. Mixed concerns signal a missing decomposition.
+
+## Process
+
+1. Clarify the problem (what, who, why, constraints)
+2. Search for existing solutions in the codebase (grep for related patterns)
+3. Apply CVA: what is common across use cases? What varies?
+4. Write requirements as testable acceptance criteria
+5. Run pre-mortem: what fails first?
+6. Run decision-critic: challenge assumptions before committing
+
+## Output
+
+Structured requirements with:
+
+- Problem statement (1-2 sentences)
+- Acceptance criteria (numbered, testable)
+- Out of scope (explicit exclusions)
+- Open questions (unresolved unknowns)

diff --git a/.claude/commands/test.md b/.claude/commands/test.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/test.md
@@ -1,0 +1,47 @@
+---
+description: Prove it works. Run layered tests and debug failures with hypothesis-driven investigation. Run after /build.
+allowed-tools: Task, Skill, Read, Write, Edit, Glob, Grep, Bash(git diff:*), Bash(git status:*), Bash(python3:*), Bash(pytest:*), Bash(npm test:*)
+argument-hint: [component-or-failure-description]
+---
+
+@CLAUDE.md
+
+Invoke the code-qualities-assessment and quality-grades skills.
+
+Test: $ARGUMENTS
+
+Use Task(subagent_type="qa") as the primary agent. For security testing, also invoke Task(subagent_type="security"). If no argument provided, test the current branch diff against main.
+
+Evaluate across all 5 axes:
+
+1. **Unit coverage** - Each method in isolation, dependencies injected
+2. **Integration coverage** - Contracts between components verified
+3. **Acceptance coverage** - Each requirement has a passing test
+4. **Security coverage** - OWASP top 10 scenarios exercised
+5. **Failure coverage** - Error paths tested, chaos hypotheses validated
+
+## Principles
+
+- **Testability is design feedback**: Hard to test means poor encapsulation, tight coupling, Law of Demeter violation, weak cohesion, or procedural code.
+- **Tests are proof**: A passing test is evidence. A missing test is a gap in knowledge.
+- **Hypothesis-driven debugging**: When a test fails, form a hypothesis before changing code. Verify the hypothesis. Then fix.
+
+## Process
+
+1. Identify what changed (git diff against main)
+2. Map changes to test coverage: which tests cover this code?
+3. Run existing tests first (catch regressions)
+4. Identify coverage gaps: new code paths without tests
+5. Write missing tests (unit first, then integration)
+6. For failures: hypothesis, verify, fix (never change code without understanding why)
+7. Run security-scan for CWE patterns
+8. Report: passing, failing, gaps, recommendations
+
+## Output
+
+Structured test report:
+
+- Tests run (count, pass/fail)
+- Coverage gaps (specific files and functions)
+- Security findings (CWE references)
+- Recommendations (what to add, what to fix)

diff --git a/.claude/commands/workflow/0-init.md b/.claude/commands/workflow/0-init.md
deleted file mode 100644
--- a/.claude/commands/workflow/0-init.md
+++ /dev/null
@@ -1,48 +1,0 @@
----
-description: Session initialization - enforce ADR-007 memory-first architecture at session start. Loads project context, creates session log, and declares current branch via Invoke-Init.ps1.
-argument-hint: [--session-number N] [--objective "text"]
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(pwsh .claude/skills/session-init/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-model: sonnet
----
-
-# /0-init — Session Initialization
-
-Enforce ADR-007 memory-first architecture at session start.
-
-## Context
-
-Recent sessions: !`ls -1 .agents/sessions/ | tail -5`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Init.ps1 $ARGUMENTS
-```
-
-## What This Command Does
-
-1. **Load project context** — initializes session state via Agent Orchestration MCP (graceful fallback if unavailable)
-2. **Load initial instructions** — read AGENTS.md for current project rules
-3. **Read HANDOFF.md** — load prior session context (read-only)
-4. **Surface prior context** — retrieves relevant session history via Agent Orchestration MCP (graceful fallback if unavailable)
-5. **Create session log** — via `New-SessionLog.ps1`
-6. **Declare current branch** — output git branch for orientation
-7. **Record evidence** — persist session state (graceful fallback if unavailable)
-
-## Arguments
-
-- `--session-number N`: Optional. Auto-detected from `.agents/sessions/`.
-- `--objective "text"`: Optional. Derived from branch name if omitted.
-
-## Related
-
-- Protocol: `.agents/SESSION-PROTOCOL.md`
-- ADR-007: `.agents/architecture/ADR-007-memory-first-architecture.md`
-- Session Init Skill: `.claude/skills/session-init/SKILL.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/1-plan.md b/.claude/commands/workflow/1-plan.md
deleted file mode 100644
--- a/.claude/commands/workflow/1-plan.md
+++ /dev/null
@@ -1,54 +1,0 @@
----
-description: Planning phase - route task to planner (default), architect (--arch), or roadmap→high-level-advisor chain (--strategic).
-argument-hint: [--arch] [--strategic] <task-description>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-  - mcp__agent_orchestration__get_routing_recommendation
-model: sonnet
----
-
-# /1-plan — Planning Phase
-
-Route a planning task to the appropriate agent.
-
-## Context
-
-Current branch: !`git branch --show-current`
-
-Recent commits: !`git log --oneline -5`
-
-Planning artifacts: !`ls -1 .agents/planning/ 2>/dev/null | tail -10`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Plan.ps1 $ARGUMENTS
-```
-
-## Variants
-
-| Flag | Agent | Use When |
-|------|-------|----------|
-| *(none)* | `planner` | Standard feature/task planning |
-| `--arch` | `architect` | Design decisions, ADR-worthy choices |
-| `--strategic` | `roadmap → high-level-advisor` | Roadmap, epics, strategic alignment |
-
-## Arguments
-
-- `--arch`: Use architect agent instead of planner.
-- `--strategic`: Chain roadmap agent → high-level-advisor.
-- Remaining text: Task description passed to agent.
-
-## Output
-
-Planning artifacts stored in `.agents/planning/`.
-
-## Related
-
-- ADR-013: `.agents/architecture/ADR-013-agent-orchestration-mcp.md`
-- Agent Orchestration Spec: `.agents/specs/agent-orchestration-mcp-spec.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/2-impl.md b/.claude/commands/workflow/2-impl.md
deleted file mode 100644
--- a/.claude/commands/workflow/2-impl.md
+++ /dev/null
@@ -1,49 +1,0 @@
----
-description: Implementation phase - invoke implementer agent (default), or run full sequential chain (--full), or parallel execution of implementer+qa+security (--parallel).
-argument-hint: [--full] [--parallel] <implementation-task>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-  - mcp__agent_orchestration__start_parallel_execution
-  - mcp__agent_orchestration__aggregate_parallel_results
-  - mcp__agent_orchestration__resolve_conflict
-model: sonnet
----
-
-# /2-impl — Implementation Phase
-
-Invoke the implementer agent, optionally chaining QA and security.
-
-## Context
-
-Planning artifacts: !`ls -1 .agents/planning/ 2>/dev/null | tail -10`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Impl.ps1 $ARGUMENTS
-```
-
-## Execution Modes
-
-| Flag | Mode | Description |
-|------|------|-------------|
-| *(none)* | Default | Implementer agent only |
-| `--full` | Sequential | implementer → qa → security |
-| `--parallel` | Parallel | implementer + parallel(qa, security) |
-
-## Arguments
-
-- `--full`: Run full sequential chain after implementation.
-- `--parallel`: Run QA and security in parallel after implementation.
-- Remaining text: Implementation task description.
-
-## Related
-
-- ADR-013: `.agents/architecture/ADR-013-agent-orchestration-mcp.md`
\ No newline at end of file
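The three execution modes of the deleted `/2-impl` command differ only in how QA and security are scheduled after the implementer. The sketch below models that as a list of stages, where agents inside one stage may run concurrently. `impl_stages` is a hypothetical name, and giving `--parallel` precedence over `--full` is an assumption the source does not state.

```python
def impl_stages(arguments: str) -> list[list[str]]:
    """Return execution stages; agents within one stage may run concurrently."""
    flags = set(arguments.split())
    if "--parallel" in flags:
        # implementer first, then qa and security side by side
        return [["implementer"], ["qa", "security"]]
    if "--full" in flags:
        # strictly sequential chain
        return [["implementer"], ["qa"], ["security"]]
    return [["implementer"]]
```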

diff --git a/.claude/commands/workflow/3-qa.md b/.claude/commands/workflow/3-qa.md
deleted file mode 100644
--- a/.claude/commands/workflow/3-qa.md
+++ /dev/null
@@ -1,45 +1,0 @@
----
-description: Quality assurance - invoke QA agent, validate test coverage, check acceptance criteria, and report results.
-argument-hint: [--coverage-threshold N] <verification-scope>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-model: sonnet
----
-
-# /3-qa — Quality Assurance
-
-Invoke the QA agent and validate implementation quality.
-
-## Context
-
-Implementation artifacts: !`ls -1 .agents/sessions/ | tail -3`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-QA.ps1 $ARGUMENTS
-```
-
-## What This Command Does
-
-1. Invoke `qa` agent via Agent Orchestration MCP
-2. Validate test coverage against threshold (default: 80%)
-3. Check acceptance criteria from planning artifacts
-4. Report pass/fail with details
-5. Track handoff back to orchestrator
-
-## Arguments
-
-- `--coverage-threshold N`: Minimum coverage percentage (default: 80).
-- Remaining text: Verification scope.
-
-## Related
-
-- ADR-006: `.agents/architecture/ADR-006-thin-workflows-testable-modules.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/4-security.md b/.claude/commands/workflow/4-security.md
deleted file mode 100644
--- a/.claude/commands/workflow/4-security.md
+++ /dev/null
@@ -1,46 +1,0 @@
----
-description: Security review - invoke security agent with OWASP Top 10 check, secret detection, and dependency audit.
-argument-hint: [--owasp-only] [--secrets-only] <security-scope>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-model: opus
----
-
-# /4-security — Security Review
-
-Comprehensive security assessment using the security agent.
-
-## Context
-
-Implementation artifacts: !`ls -1 .agents/sessions/ | tail -3`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Security.ps1 $ARGUMENTS
-```
-
-## What This Command Does
-
-1. Invoke `security` agent via Agent Orchestration MCP (model: opus per ADR-013)
-2. OWASP Top 10 check (skipped with `--secrets-only`)
-3. Secret detection scan (skipped with `--owasp-only`)
-4. Dependency audit for known vulnerabilities
-5. Generate security report with findings
-
-## Arguments
-
-- `--owasp-only`: Run only OWASP Top 10 check.
-- `--secrets-only`: Run only secret detection.
-- Remaining text: Security scope.
-
-## Related
-
-- ADR-013: `.agents/architecture/ADR-013-agent-orchestration-mcp.md`
\ No newline at end of file
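The numbered steps of the deleted `/4-security` command say which check each flag skips. The sketch below follows those steps literally. `security_checks` and the check labels are hypothetical, and the assumption that the dependency audit always runs comes from the step list; the Arguments section ("Run only ...") could be read as skipping it.

```python
def security_checks(arguments: str) -> list[str]:
    """Return the checks to run, in order, for a /4-security style argument string."""
    flags = set(arguments.split())
    checks = []
    if "--secrets-only" not in flags:
        checks.append("owasp-top-10")
    if "--owasp-only" not in flags:
        checks.append("secret-detection")
    # Assumption: the step list attaches no skip condition to the audit.
    checks.append("dependency-audit")
    return checks
```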

diff --git a/.claude/commands/workflow/9-sync.md b/.claude/commands/workflow/9-sync.md
deleted file mode 100644
--- a/.claude/commands/workflow/9-sync.md
+++ /dev/null
@@ -1,116 +1,0 @@
----
-description: Auto-generate session documentation. Queries session history, generates workflow diagrams, updates session logs, and syncs memory. Use at the end of any workflow to capture what happened.
-model: sonnet
-argument-hint: [--dry-run]
-allowed-tools:
-  - Bash(python .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - mcp__serena__*
-  - mcp__forgetful__*
----
-
-# /9-sync — Auto-Documentation & Memory Sync
-
-Generate comprehensive session documentation automatically.
-
-## Overview
-
-This command closes the workflow loop by documenting what happened during a session. It:
-
-1. Collects session history (agents invoked, tools used, files changed)
-2. Generates a workflow sequence diagram (Mermaid)
-3. Extracts key decisions and artifacts
-4. Appends documentation to the session log
-5. Syncs context to Serena memory for cross-session persistence
-6. Suggests retrospective learnings
-
-## Execution Steps
-
-### Step 1: Gather Session Context
-
-Collect the current session state:
-
-```bash
-# Get current branch and recent commits
-git log --oneline -20 --since="$(date -d '8 hours ago' --iso-8601)" 2>/dev/null || git log --oneline -20
-
-# Get files changed in this session
-git diff --stat HEAD~10..HEAD 2>/dev/null || git diff --stat main..HEAD
-
-# Get current session log if it exists
-ls -t .agents/sessions/*.json 2>/dev/null | head -1
-```
-
-### Step 2: Generate Session Documentation
-
-Run the sync script to produce the session documentation:
-
-```bash
-python .claude/skills/workflow/scripts/sync_session_documentation.py $ARGUMENTS
-```
-
-This script will:
-
-- Scan git history for session commits
-- Identify agents referenced in commit messages
-- Generate a Mermaid sequence diagram
-- Produce a structured session summary
-
-### Step 3: Extract Decisions and Artifacts
-
-From the session context, identify:
-
-- **Decisions made**: ADRs created/modified, design choices documented
-- **Artifacts created**: New files, modified scripts, PRs opened
-- **Issues referenced**: GitHub issues addressed or discovered
-- **Risks identified**: Any blockers or concerns raised
-
-### Step 4: Update Session Log
-
-Append the sync output to the current session log in `.agents/sessions/`. The entry MUST include:
-
-| Field | Description |
-|-------|-------------|
-| `agents_invoked` | Ordered list of agents used (with duration estimates) |
-| `decisions_made` | Key decisions with rationale |
-| `artifacts_created` | Files, commits, issues, PRs |
-| `workflow_diagram` | Mermaid sequence diagram |
-| `retrospective_learnings` | Suggested improvements |
-
-### Step 5: Sync to Memory Systems
-
-Update persistent memory for cross-session context:
-
-1. **Serena**: Store key decisions and outcomes via `mcp__serena__save_memory`
-2. **Forgetful**: Record learnings via `mcp__forgetful__save_memory`
-
-### Step 6: Suggest Retrospective Learnings
-
-Based on the session, suggest:
-
-- What went well (patterns to repeat)
-- What could improve (process gaps)
-- What to watch for (emerging risks)
-
-## Arguments
-
-| Argument | Description |
-|----------|-------------|
-| `--dry-run` | Preview documentation without writing to session log |
-
-## Output
-
-The command produces a session sync report with sections for Workflow Diagram (Mermaid),
-Agents Invoked, Decisions Made, Artifacts Created, and Retrospective Learnings.
-
-## Dependencies
-
-- Session State MCP (`agents://history` resource) — graceful fallback to git history when unavailable
-- Serena MCP — for memory persistence
-- Forgetful MCP — for learning extraction
-
-## Related
-
-- [SESSION-PROTOCOL.md](../../../.agents/SESSION-PROTOCOL.md) — Session requirements
-- [ADR-007: Memory-First Architecture](../../../.agents/architecture/ADR-007-memory-first-architecture.md)
-- [PRD: Workflow Orchestration Enhancement](../../../.agents/planning/prd-workflow-orchestration-enhancement.md)
\ No newline at end of file
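Step 4 of the deleted `/9-sync` command requires five fields in every session log entry. A small validation sketch, assuming the entry is a plain dictionary keyed by those field names, could look like this. `missing_fields` is a hypothetical helper, not part of `sync_session_documentation.py`.

```python
# Field names copied from the Step 4 table.
REQUIRED_FIELDS = (
    "agents_invoked",
    "decisions_made",
    "artifacts_created",
    "workflow_diagram",
    "retrospective_learnings",
)


def missing_fields(entry: dict) -> list[str]:
    """Return the required fields absent from a session log entry."""
    return [name for name in REQUIRED_FIELDS if name not in entry]
```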

diff --git a/.markdownlint-cli2.yaml b/.markdownlint-cli2.yaml
--- a/.markdownlint-cli2.yaml
+++ b/.markdownlint-cli2.yaml
@@ -137,3 +137,12 @@
   # CLAUDE.md files are managed by the claude-mem plugin which prepends <claude-mem-context>
   # HTML tags, violating MD033 and MD041. These are tool-managed metadata, not authored markdown.
   - "**/CLAUDE.md"
+
+  # New lifecycle command files have only YAML frontmatter, no H1 heading. MD041 not applicable.
+  # Only the 6 new lifecycle commands are excluded; other command files retain full linting.
+  - ".claude/commands/spec.md"
+  - ".claude/commands/plan.md"
+  - ".claude/commands/build.md"
+  - ".claude/commands/test.md"
+  - ".claude/commands/review.md"
+  - ".claude/commands/ship.md"
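The effect of listing the six files explicitly, rather than ignoring the whole directory, can be sketched as a membership test. This assumes exact path comparison is an adequate stand-in for markdownlint-cli2's glob matching, which is reasonable here only because the six entries contain no wildcards. `is_lint_exempt` is a hypothetical name.

```python
# The six ignore entries added in this hunk.
LIFECYCLE_IGNORES = {
    ".claude/commands/spec.md",
    ".claude/commands/plan.md",
    ".claude/commands/build.md",
    ".claude/commands/test.md",
    ".claude/commands/review.md",
    ".claude/commands/ship.md",
}


def is_lint_exempt(path: str) -> bool:
    """True if the path is one of the explicitly ignored lifecycle commands."""
    return path in LIFECYCLE_IGNORES
```

Under this sketch a pre-existing command such as `.claude/commands/push-pr.md` is not exempt and keeps full linting.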

diff --git a/AGENTS.md b/AGENTS.md
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -42,7 +42,7 @@
 |PRs: GitHub|Reviews: pr-comment-responder|Conflicts: merge-resolver
 |Session: session-init, session-end|CI fix: session-log-fixer|Push: /push-pr
 |Security: security-detection|Quality: analyze|Learn: reflect
-|Workflow: workflow (0-init, 1-plan, 2-impl, 3-qa, 4-security)
+|Lifecycle: /spec, /plan, /build, /test, /review, /ship
 
 ### ADR Review (BLOCKING)
 

diff --git a/CLAUDE.md b/CLAUDE.md
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -39,11 +39,19 @@
 The skill has specialized workflows that produce better results than ad-hoc answers.
 
 Key routing rules:
-- Bugs, errors, "why is this broken", 500 errors → invoke analyze
-- Ship, deploy, push, create PR → invoke github
-- QA, test the site, find bugs → invoke qa (subagent_type)
-- Code review, check my diff → invoke pr-comment-responder
-- Weekly retro → invoke reflect
-- Architecture review → invoke analyze
-- Save progress, checkpoint, resume → invoke session-end
-- Code quality, health check → invoke quality-grades
+- Bugs, errors, "why is this broken" → invoke analyze skill
+- PRs, issues, GitHub operations → invoke github skill
+- PR review threads, comment triage → invoke pr-comment-responder skill
+- Weekly retro → invoke reflect skill
+- Save progress, checkpoint → invoke session-end skill
+- Code quality, health check → invoke quality-grades skill
+
+## Lifecycle commands
+
+For development lifecycle phases, use these slash commands (not skills):
+- Define requirements, "what should we build" → /spec
+- Plan work, break down tasks, estimate → /plan
+- Implement, code, build features → /build
+- Test, prove it works, debug failures → /test
+- Review code, check my diff → /review
+- Ship, deploy, push, create PR → /ship


Comment thread .markdownlint-cli2.yaml Outdated
Comment thread .claude/commands/ship.md Outdated
…command

- .markdownlint-cli2.yaml: Replace overly broad .claude/commands/** ignore
  with specific list of 6 new lifecycle command files. This preserves
  full linting for pre-existing command files (push-pr.md, pr-review.md,
  validate-pr-description.md, etc.)

- .claude/commands/ship.md: Correct references to validate-pr-description
  and push-pr. These are slash commands (.claude/commands/*.md), not
  skills (.claude/skills/*/SKILL.md). Update instructions to use command
  invocation syntax instead of Skill tool references.
Copilot AI review requested due to automatic review settings April 11, 2026 05:28
@rjmurillo rjmurillo review requested due to automatic review settings April 11, 2026 05:28
@rjmurillo

Owner Author

Review Triage Required

Note

Priority: NORMAL - Human approval required before bot responds

Review Summary

| Source | Reviews | Comments |
|--------|---------|----------|
| Human | 3 | 9 |
| Bot | 3 | 6 |

Next Steps

  1. Review human feedback above
  2. Address any CHANGES_REQUESTED from human reviewers
  3. Add triage:approved label when ready for bot to respond to review comments

Powered by PR Maintenance workflow - Add triage:approved label

…cle commands

- Add role personas to every Task(subagent_type) invocation
- Move skill invocations from top-level directives into process steps
- Add empty $ARGUMENTS guards to all commands
- Add @CLAUDE.md import to all 6 for consistency
- Add PR type classification to /test (skip irrelevant gates)
- Add structured verdict output format to /test and /ship
- Add Write to spec.md and plan.md allowed-tools
- Add Bash(uv:*) to build.md and test.md, Bash(gh:*) to ship.md
- Fix review.md hard-coded main branch reference
- Expand test.md from 5 axes to 6 quality gates with multi-agent dispatch
- Update prompt-engineer skill to claude-sonnet-4-6

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 11, 2026 05:51
@github-actions

Contributor

Spec-to-Implementation Validation

Caution

Final Verdict: FAIL

What is Spec Validation?

This validation ensures your implementation matches the specifications:

  • Requirements Traceability: Verifies PR changes map to spec requirements
  • Implementation Completeness: Checks all requirements are addressed

Validation Summary

| Check | Verdict | Status |
|-------|---------|--------|
| Requirements Traceability | PARTIAL | ⚠️ |
| Implementation Completeness | PARTIAL | ⚠️ |

Spec References

| Type | References |
|------|------------|
| Specs | None |
| Issues | 1609 |

Requirements Traceability Details

Requirements Coverage Matrix

| Requirement | Description | Status | Evidence |
|-------------|-------------|--------|----------|
| AC-1 | 6 command files in .claude/commands/ | COVERED | spec.md, plan.md, build.md, test.md, review.md, ship.md all exist |
| AC-2 | Each follows agent-skills command structure (frontmatter + prompt) | COVERED | All 6 files have YAML frontmatter with description, allowed-tools, argument-hint |
| AC-3 | Each command has scoped Bash permissions (no wildcards) | PARTIAL | build.md uses Bash(*) wildcard; other 5 commands use scoped permissions |
| AC-4 | AGENTS.md updated (workflow reference replaced) | COVERED | AGENTS.md line 45: Lifecycle: /spec, /plan, /build, /test, /review, /ship |
| AC-5 | workflow/ commands deleted | COVERED | Path /home/runner/work/ai-agents/ai-agents/.claude/commands/workflow does not exist |
| REQ-SPEC | /spec uses spec-generator, cva-analysis, decision-critic, pre-mortem | COVERED | spec.md:17-21 invokes cva-analysis, decision-critic, pre-mortem via analyst and critic agents |
| REQ-PLAN | /plan uses milestone-planner, planner, execution-plans, cynefin-classifier | PARTIAL | plan.md uses milestone-planner, task-decomposer, execution-plans. Cynefin-classifier not invoked. |
| REQ-BUILD | /build uses implementer, code-qualities-assessment, golden-principles | PARTIAL | build.md:15 uses implementer, line 28 invokes code-qualities-assessment. golden-principles not invoked. |
| REQ-TEST | /test uses qa, quality-grades, security-scan, codeql-scan | PARTIAL | test.md uses qa, security-scan, quality-grades. codeql-scan not invoked. |
| REQ-REVIEW | /review uses architect, analyze, threat-modeling, style-enforcement | PARTIAL | review.md uses architect, security, qa, code-qualities-assessment, golden-principles. analyze and threat-modeling skills not invoked directly. |
| REQ-SHIP | /ship uses devops, pipeline-validator, validate-pr-description | COVERED | ship.md:17-21 invokes pipeline-validator, security-scan, validate-pr-description, push-pr via devops agent |
| MD041-EXCLUDE | Exclude .claude/commands/ from markdownlint MD041 rule | COVERED | .markdownlint-cli2.yaml:142-148 explicitly ignores all 6 lifecycle command files |

Summary

  • Total Requirements: 12
  • Covered: 7 (58%)
  • Partially Covered: 5 (42%)
  • Not Covered: 0 (0%)
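The percentages in this summary follow directly from the per-requirement statuses. The sketch below recomputes them; the statuses are copied from the coverage matrix, while `coverage_summary` and rounding to the nearest whole percent are assumptions about how the validator arrived at its figures.

```python
from collections import Counter

# Statuses copied from the requirements coverage matrix.
STATUSES = {
    "AC-1": "COVERED", "AC-2": "COVERED", "AC-3": "PARTIAL",
    "AC-4": "COVERED", "AC-5": "COVERED", "REQ-SPEC": "COVERED",
    "REQ-PLAN": "PARTIAL", "REQ-BUILD": "PARTIAL", "REQ-TEST": "PARTIAL",
    "REQ-REVIEW": "PARTIAL", "REQ-SHIP": "COVERED", "MD041-EXCLUDE": "COVERED",
}


def coverage_summary(statuses: dict) -> dict:
    """Map each status to (count, percentage rounded to the nearest integer)."""
    counts = Counter(statuses.values())
    total = len(statuses)
    return {
        status: (counts[status], round(100 * counts[status] / total))
        for status in ("COVERED", "PARTIAL", "NOT COVERED")
    }
```

With 12 requirements, 7 covered gives 7/12 ≈ 58% and 5 partial gives 5/12 ≈ 42%, matching the figures reported here.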

Gaps

  1. AC-3 (Bash permissions): build.md uses Bash(*) wildcard instead of scoped permissions. Issue #1609 (Ship 6 core skills for Claude Code / Copilot) explicitly requires "scoped Bash permissions (no wildcards)".

  2. REQ-PLAN: cynefin-classifier skill not invoked in /plan command as specified in issue requirements table.

  3. REQ-BUILD: golden-principles skill not invoked in /build command as specified in issue requirements table.

  4. REQ-TEST: codeql-scan skill not invoked in /test command as specified in issue requirements table.

  5. REQ-REVIEW: analyze and threat-modeling skills not directly invoked in /review command. Uses security-scan and agents instead.

[!WARNING]
VERDICT: PARTIAL
All 6 commands exist with correct structure, and AGENTS.md/CLAUDE.md are updated. However, 5 skill invocations from the original requirements table are missing or substituted, and build.md uses a Bash wildcard violating AC-3.

Implementation Completeness Details

Acceptance Criteria Checklist

  • 6 command files in .claude/commands/ - SATISFIED

    • Evidence: spec.md, plan.md, build.md, test.md, review.md, ship.md all present
  • Each follows agent-skills command structure (frontmatter + prompt) - SATISFIED

    • Evidence: All 6 files have YAML frontmatter with description, allowed-tools, argument-hint plus @CLAUDE.md reference and structured prompt
  • Each command has scoped Bash permissions (no wildcards) - NOT SATISFIED

    • build.md:3 uses Bash(*) which is a wildcard
    • spec.md: No Bash - PASS
    • plan.md: No Bash - PASS
    • test.md: Bash(git diff:*), Bash(git status:*), etc. - scoped, PASS
    • review.md: Bash(git diff:*), Bash(git log:*), etc. - scoped, PASS
    • ship.md: Bash(git diff:*), Bash(gh:*), etc. - scoped, PASS
  • AGENTS.md updated (workflow reference replaced) - SATISFIED

    • Evidence: AGENTS.md:45 shows Lifecycle: /spec, /plan, /build, /test, /review, /ship
  • workflow/ commands deleted - SATISFIED

    • Evidence: .claude/commands/workflow directory does not exist
  • [~] Anti-patterns addressed via autoplan review findings - PARTIALLY SATISFIED

    • Implemented: Structured frontmatter, agent delegation, mostly scoped tools
    • Missing: build.md uses Bash(*) wildcard which violates acceptance criterion 3 (AC-3)

Missing Functionality

  1. build.md uses Bash(*) wildcard - Acceptance criterion explicitly requires "scoped Bash permissions (no wildcards)"

Edge Cases Not Covered

  1. The build command needs full Bash access for running tests, compiling, etc., but the spec requires scoped permissions. A resolution is needed.

Implementation Quality

  • Completeness: 83% of acceptance criteria satisfied (5/6)
  • Quality: Commands are well-structured, follow the Osmani pattern, and integrate with existing agents/skills

[!WARNING]
VERDICT: PARTIAL
5 of 6 acceptance criteria satisfied. build.md line 3 uses Bash(*) wildcard which violates the "scoped Bash permissions (no wildcards)" requirement. Either scope the Bash permissions (e.g., Bash(git:*), Bash(pytest:*), Bash(npm:*), Bash(uv:*)) or document why build requires unrestricted access and update the acceptance criteria.
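The wildcard check behind this verdict can be sketched as a scan of a command's `allowed-tools` value. This is an illustration only: `unscoped_bash_entries` is a hypothetical name, and treating only `Bash(*)` and an empty `Bash()` as unscoped (while accepting prefixed forms such as `Bash(gh:*)`) is an assumption inferred from which entries the report marks PASS.

```python
import re


def unscoped_bash_entries(allowed_tools: str) -> list[str]:
    """Return Bash entries in an allowed-tools string that grant unrestricted access."""
    entries = re.findall(r"Bash\(([^)]*)\)", allowed_tools)
    # A bare "*" (or nothing) inside the parentheses means no command prefix.
    return [f"Bash({entry})" for entry in entries if entry.strip() in ("", "*")]
```

Applied to the `allowed-tools` line from build.md it reports `Bash(*)`; applied to the line from review.md it reports nothing.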


Run Details
| Property | Value |
|----------|-------|
| Run ID | 24276065750 |
| Triggered by | pull_request on 1611/merge |

Powered by AI Spec Validator workflow

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.

Comment thread .claude/commands/build.md
Comment thread .claude/commands/review.md
Comment thread .claude/commands/ship.md Outdated
Comment thread .markdownlint-cli2.yaml
Comment thread .claude/skills/prompt-engineer/SKILL.md
Comment thread .claude/commands/test.md Outdated

@cursor cursor Bot left a comment



Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Test command hardcodes origin/main as base branch
    • Updated test.md Step 0 to detect base branch from gh pr view --json baseRefName with fallback to main, and added Bash(gh:*) to allowed-tools.
  • ✅ Fixed: Review command lacks permission to run gh CLI
    • Added Bash(gh:*) to allowed-tools in review.md to enable the documented base branch detection via gh pr view.
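Both fixes rely on the same detection step: ask `gh pr view --json baseRefName` for the base branch and fall back to `main` when that fails. A minimal Python sketch of that logic follows. The function name `detect_base_branch` and the injectable `run` parameter (there only so the sketch can be exercised without the `gh` CLI) are assumptions, not part of the command files.

```python
import json
import subprocess


def detect_base_branch(run=subprocess.run, default: str = "main") -> str:
    """Return the PR base branch reported by gh, or the default on any failure."""
    try:
        result = run(
            ["gh", "pr", "view", "--json", "baseRefName"],
            capture_output=True,
            text=True,
            check=True,
        )
        return json.loads(result.stdout)["baseRefName"] or default
    except (OSError, subprocess.CalledProcessError,
            json.JSONDecodeError, KeyError, TypeError):
        # gh missing, no PR for this branch, or unexpected output.
        return default
```

The returned name would then feed a diff such as `git diff origin/<base-branch> --name-only`, as described for test.md.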
Preview (8301c93bd9)
diff --git a/.claude/commands/build.md b/.claude/commands/build.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/build.md
@@ -1,0 +1,42 @@
+---
+description: Build incrementally. Implement changes in thin vertical slices with TDD and atomic commits. Run after /plan.
+allowed-tools: Task, Skill, Read, Write, Edit, Glob, Grep, Bash(*)
+argument-hint: [plan-step-or-task-description]
+---
+
+@CLAUDE.md
+
+Build: $ARGUMENTS
+
+If $ARGUMENTS is empty, check for recent /plan output in the conversation. If none found, ask the user what to build.
+
+## Agent
+
+Task(subagent_type="implementer"): You are a senior engineer. Discover the project's tech stack, coding patterns, and test conventions by reading the codebase. Build in thin vertical slices. Test-first when the project has tests. Commit atomically.
+
+For each slice:
+
+1. Read the task
+2. Understand the existing code patterns (read related files, check test conventions)
+3. Write a failing test if the project has a test framework
+4. Write the minimum code to pass
+5. Refactor toward quality (cohesion, encapsulation, simplicity)
+6. Commit with a conventional message
+
+## Quality Signals
+
+After implementation, invoke Skill(skill="code-qualities-assessment") to score the result.
+
+The agent should self-check:
+
+- Is this hard to test? That indicates a design problem, not a test problem.
+- Does every method read like a sentence? (Programming by Intention)
+- Is coupling intentional or accidental?
+- Would a stranger understand this code without asking questions?
+
+## Guardrails
+
+- Atomic commits. Each commit is one logical change, rollback-safe.
+- No code without understanding the existing patterns first.
+- Favor delegation over inheritance. A makes B, or A uses B. Never both.
+- Three similar lines beat a premature abstraction.

diff --git a/.claude/commands/plan.md b/.claude/commands/plan.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/plan.md
@@ -1,0 +1,45 @@
+---
+description: Plan how to build it. Decompose specs into milestones with dependencies and risk mitigations. Run after /spec.
+allowed-tools: Task, Skill, Read, Write, Glob, Grep
+argument-hint: [spec-output-or-issue-number]
+---
+
+@CLAUDE.md
+
+Plan: $ARGUMENTS
+
+If $ARGUMENTS is empty, check for recent /spec output in the conversation. If none found, ask the user what to plan.
+
+## Process
+
+1. Read the spec or issue
+2. Map sub-problems to existing code (what already exists? use Grep/Glob to verify)
+3. Task(subagent_type="milestone-planner"): You are a project planner. Break the spec into milestones with clear exit criteria. Each milestone is independently shippable. Sequence by dependencies. Flag parallel opportunities.
+4. Task(subagent_type="task-decomposer"): You are a work breakdown specialist. Decompose each milestone into atomic tasks. Each task is independently verifiable with a clear done definition. Size by complexity (S/M/L), not time.
+5. Invoke Skill(skill="execution-plans") to persist the plan as a versioned artifact.
+6. Task(subagent_type="analyst"): You are a risk analyst. Run a pre-mortem on this plan. What fails first? What dependencies are fragile? What assumptions are untested?
+7. Task(subagent_type="critic"): You are a plan reviewer. Validate: is scope complete? Can tasks execute in the stated sequence? Are estimates credible? Is anything missing?
+
+## Evaluation Axes
+
+1. **Scope integrity** - Nothing unnecessary, nothing missing
+2. **Dependency ordering** - Can tasks execute in the stated sequence?
+3. **Risk coverage** - All P0 risks have mitigations
+4. **Estimate confidence** - Complexity-based sizing (S/M/L), not time-based
+5. **Reversibility** - Which steps are hard to undo?
+
+## Principles
+
+- **Programming by Intention**: Each task should read like an intent, not an implementation detail.
+- **OODA Loop**: Observe (read the spec), Orient (map to existing code), Decide (sequence tasks), Act (commit the plan). Faster loops win.
+- **First Principles**: Question the requirement, try to delete the step, then optimize, then speed up, then automate. Never automate something that should not exist.
+
+## Output
+
+Structured plan:
+
+- **Milestones** (numbered, with exit criteria)
+- **Tasks per milestone** (atomic, with acceptance criteria and S/M/L sizing)
+- **Dependency graph** (what blocks what, what can run in parallel)
+- **Risk register** (risk, likelihood, impact, mitigation)
+- **Deferred items** (explicitly out of scope for this plan)

diff --git a/.claude/commands/review.md b/.claude/commands/review.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/review.md
@@ -1,0 +1,80 @@
+---
+description: Review before merge. Five-axis code review across architecture, security, quality, tests, and standards. Run after /test.
+allowed-tools: Task, Skill, Read, Glob, Grep, Bash(git diff:*), Bash(git log:*), Bash(git status:*), Bash(gh:*)
+argument-hint: [branch-or-pr-number]
+---
+
+@CLAUDE.md
+
+Review: $ARGUMENTS
+
+If no argument, review the current branch diff against the base branch. Detect the base branch from `gh pr view --json baseRefName` or fall back to `main`.
+
+## Process
+
+Run axes sequentially. Each axis produces findings categorized as Critical, Important, or Suggestion.
+
+1. Read the diff (git diff against detected base branch)
+2. **Architecture pass**: Task(subagent_type="architect")
+3. **Security pass**: Task(subagent_type="security")
+4. **Quality pass**: Invoke Skill(skill="code-qualities-assessment")
+5. **Test pass**: Task(subagent_type="qa")
+6. **Standards pass**: Invoke Skill(skill="golden-principles") and Skill(skill="taste-lints")
+7. Synthesize findings across all axes
+
+## Axis 1: Architecture
+
+Task(subagent_type="architect"): You are a software architect reviewing for structural integrity. Check ADR conformance in .agents/architecture/. Evaluate from the consumer perspective, not the implementer perspective. Findings must cite file:line.
+
+- Follows existing patterns? Clean boundaries? Right abstraction level?
+- Coupling intentional? Cohesion strong?
+- ADR conformance? Any decisions that need a new ADR?
+
+## Axis 2: Security
+
+Invoke Skill(skill="security-scan") for CWE pattern detection.
+
+Task(subagent_type="security"): You are a security auditor. Assume every input is malicious. Reference CWE numbers. Evaluate:
+
+- Input validated? Secrets safe? Auth checked?
+- OWASP top 10? STRIDE threats?
+- New permissions, scopes, or access? Challenge each one (Principle of Least Privilege).
+
+## Axis 3: Code Quality
+
+Invoke Skill(skill="code-qualities-assessment") to score all 5 qualities: cohesion, coupling, encapsulation, testability, non-redundancy.
+
+- Cyclomatic complexity <=10? Methods <=60 lines?
+- DRY violations? Premature abstractions?
+
+## Axis 4: Test Completeness
+
+Task(subagent_type="qa"): You are a QA engineer verifying coverage. For every new code path in the diff, verify a corresponding test exists. Flag gaps with specific file:line references.
+
+- Every new code path has a test? Failure paths covered?
+- Acceptance criteria verified?
+
+## Axis 5: Standards
+
+Invoke Skill(skill="golden-principles") and Skill(skill="taste-lints").
+
+- Golden principle violations? Naming conventions?
+- Style enforcement? Consistency with existing patterns?
+
+## Principles
+
+- **Design to interfaces**: Review signatures from the consumer perspective. Hidden implementation details should stay hidden.
+- **Encapsulate what varies**: If the diff introduces variation, is it encapsulated? Or scattered?
+- **Chesterton's Fence**: Before removing code, verify you understand why it existed.
+- **Principle of Least Privilege**: New permissions, scopes, or access? Challenge each one.
+
+## Output
+
+Categorize each finding as **Critical**, **Important**, or **Suggestion**.
+
+Per-finding format:
+
+- Finding (what is wrong)
+- Location (file:line)
+- Severity (Critical/Important/Suggestion)
+- Fix (specific recommendation)

diff --git a/.claude/commands/ship.md b/.claude/commands/ship.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/ship.md
@@ -1,0 +1,53 @@
+---
+description: Ship it. Pre-flight validation, CI check, and PR creation. Run after /review.
+allowed-tools: Task, Skill, Read, Glob, Grep, Bash(git diff:*), Bash(git log:*), Bash(git status:*), Bash(git push:*), Bash(gh:*), Bash(python3:*)
+argument-hint: [target-branch]
+---
+
+@CLAUDE.md
+
+Ship: $ARGUMENTS
+
+Default target is main unless specified. If $ARGUMENTS names a different branch, use that as the target.
+
+## Pre-flight Checks
+
+Task(subagent_type="devops"): You are a release engineer. Run all 5 pre-flight checks below. Report pass/fail for each with specific evidence. Any failure blocks shipping.
+
+1. **Pipeline health** - Invoke Skill(skill="pipeline-validator"). All CI checks green? No suppressed failures?
+2. **Security posture** - Invoke Skill(skill="security-scan"). No new CWE findings? No secrets in diff?
+3. **Review complete** - Has /review been run on this branch? Any unresolved Critical findings? Check review logs.
+4. **Tests passing** - All tests green? No skipped tests without justification?
+5. **Standards clean** - Invoke Skill(skill="golden-principles") and Skill(skill="taste-lints"). Both pass?
+
+## Process
+
+1. Run all 5 pre-flight checks
+2. If any check fails: report what failed, why, and how to fix. Stop.
+3. If all pass: invoke Skill(skill="validate-pr-description") to validate PR metadata
+4. Create PR: invoke Skill(skill="push-pr") to commit, push, and open PR
+5. Report: what shipped, PR link, any warnings
+
+## Principles
+
+- **Faster is safer**: Small, frequent shipments reduce blast radius. Ship early.
+- **No deliberate debt**: If it is not ready, do not ship it. Fix it or defer it.
+- **Observability first**: If you cannot measure it, you cannot ship it safely.
+
+## Output
+
+Ship report:
+
+```text
+PRE-FLIGHT:
+  Pipeline:  PASS|FAIL (evidence)
+  Security:  PASS|FAIL (evidence)
+  Review:    PASS|FAIL (evidence)
+  Tests:     PASS|FAIL (evidence)
+  Standards: PASS|FAIL (evidence)
+
+RESULT: SHIPPED|BLOCKED
+PR: [link if created]
+WARNINGS: [non-blocking concerns]
+NEXT: [monitoring, follow-up items]
+```
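The gating rule in ship.md ("Any failure blocks shipping") reduces to a conjunction over the five pre-flight checks. The sketch below shows only that decision, not PR creation. `ship_result` is a hypothetical name, and counting a missing check as a failure is an assumption.

```python
# Check names taken from the ship report template.
CHECKS = ("Pipeline", "Security", "Review", "Tests", "Standards")


def ship_result(results: dict[str, bool]) -> str:
    """Return SHIPPED only if every pre-flight check passed; otherwise BLOCKED."""
    # Assumption: a check with no recorded result is treated as failed.
    passed = all(results.get(name, False) for name in CHECKS)
    return "SHIPPED" if passed else "BLOCKED"
```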

diff --git a/.claude/commands/spec.md b/.claude/commands/spec.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/spec.md
@@ -1,0 +1,45 @@
+---
+description: Define what to build. Transform a problem into testable requirements with acceptance criteria.
+allowed-tools: Task, Skill, Read, Write, Glob, Grep
+argument-hint: [problem-statement-or-issue-number]
+---
+
+@CLAUDE.md
+
+Spec: $ARGUMENTS
+
+If $ARGUMENTS is empty, ask the user what problem to solve. Do not proceed without a problem statement.
+
+## Process
+
+1. Clarify the problem (what, who, why, constraints)
+2. Search for existing solutions in the codebase (grep for related patterns)
+3. Invoke Skill(skill="cva-analysis"): identify commonalities across use cases, then variabilities, then relationships
+4. Write requirements as testable acceptance criteria
+5. Task(subagent_type="analyst"): You are a requirements analyst. Your job is to find gaps, ambiguities, and untestable requirements. For each requirement, ask: can this be verified pass/fail? Flag anything vague.
+6. Invoke Skill(skill="decision-critic"): challenge assumptions before committing
+7. Task(subagent_type="critic"): You are a skeptical reviewer. Run a pre-mortem: assume this spec ships and fails. What broke first? What was missing?
+
+## Evaluation Axes
+
+1. **Problem clarity** - Is the right problem being solved? Could a reframing yield 10x impact?
+2. **Requirement testability** - Can each requirement be verified pass/fail?
+3. **Completeness** - No gaps between problem statement and acceptance criteria?
+4. **Traceability** - REQ to DESIGN to TASK linkage established?
+5. **Feasibility** - Buildable within constraints? Existing code to leverage?
+
+## Principles
+
+- **CVA**: Identify commonalities first, then variabilities, then relationships. Greatest risk is the wrong abstraction.
+- **YAGNI**: Only specify what is needed now. Speculative requirements create waste.
+- **Separation of Concerns**: Each requirement addresses one concern. Mixed concerns signal a missing decomposition.
+
+## Output
+
+Structured requirements document:
+
+- **Problem statement** (1-2 sentences)
+- **Acceptance criteria** (numbered, each independently testable as pass/fail)
+- **Out of scope** (explicit exclusions to prevent creep)
+- **Open questions** (unresolved unknowns with owners)
+- **CVA summary** (what is common, what varies, what relationships exist)

diff --git a/.claude/commands/test.md b/.claude/commands/test.md
new file mode 100644
--- /dev/null
+++ b/.claude/commands/test.md
@@ -1,0 +1,144 @@
+---
+description: Prove it works. Multi-dimensional quality validation across functional, non-functional, security, DevOps, DX, and observability. Run after /build.
+allowed-tools: Task, Skill, Read, Glob, Grep, Bash(git diff:*), Bash(git status:*), Bash(git log:*), Bash(gh:*), Bash(python3:*), Bash(pytest:*), Bash(npm test:*), Bash(uv:*), Bash(pester:*)
+argument-hint: [component-or-failure-description]
+---
+
+@CLAUDE.md
+
+Test: $ARGUMENTS
+
+If $ARGUMENTS is empty, test the current branch diff against the base branch.
+
+## Step 0: Classify PR Type
+
+Detect the base branch from `gh pr view --json baseRefName` or fall back to `main`. Run `git diff origin/<base-branch> --name-only` and classify changed files:
+
+| Type | Patterns | Gates to Run |
+|------|----------|--------------|
+| CODE | *.py, *.ps1, *.ts, *.js, *.cs | All 6 gates |
+| WORKFLOW | *.yml in .github/workflows/ | Gates 1, 3, 4 |
+| CONFIG | *.json, *.yaml (non-workflow) | Gates 3, 4 |
+| DOCS | *.md, *.txt, *.rst | Gate 5 only |
+| MIXED | Combination | Apply per-file rules |
+
+Print: `PR TYPE: [type]. Running gates: [list].`
+
+Skip non-applicable gates. Do not waste agent invocations on irrelevant dimensions.
+
+## Gate 1: Functional Testing
+
+Invoke Skill(skill="code-qualities-assessment") for quality baseline.
+
+Task(subagent_type="qa"): You are a senior QA engineer. Your job is to catch issues that will cause production incidents. Be skeptical. Cite specific file:line evidence for every finding. Evaluate:
+
+1. **Unit coverage** - Each method in isolation, dependencies injected. Every new function has at least 1 test.
+2. **Integration coverage** - Contracts between components verified. Cross-module boundaries exercised.
+3. **Acceptance coverage** - Each requirement has a passing test. Map to acceptance criteria from /spec output.
+4. **Edge cases** - Null/empty/boundary values, invalid types, concurrent access where applicable.
+5. **Error paths** - Every catch/error branch tested. No silent swallowing. Resources cleaned up on failure.
+6. **Regression risk** - High-risk areas (auth, data persistence, payments) require full coverage regardless of change size.
+
+Output: `VERDICT: PASS|WARN|CRITICAL_FAIL` with findings array.
+
+## Gate 2: Non-Functional Testing
+
+Task(subagent_type="analyst"): You are a performance and reliability engineer. Focus on failure modes, not the happy path. Use measurable criteria, not subjective judgments. Evaluate:
+
+1. **Performance** - No N+1 queries, no O(n*m) in hot paths, no blocking calls in async context.
+2. **Scalability** - Will this bottleneck under load? Connection pooling, caching strategy, pagination.
+3. **Reliability** - Retry logic, circuit breakers, graceful degradation. Failure modes tested.
+4. **Complexity** - Cyclomatic complexity <=10. Methods <=60 lines. No deep nesting.
+5. **Maintainability** - Readability, naming clarity, consistency with existing patterns.
+
+Output: `VERDICT: PASS|WARN|CRITICAL_FAIL` with findings array.
+
+## Gate 3: Security Testing
+
+Invoke Skill(skill="security-scan") for CWE pattern detection.
+
+Task(subagent_type="security"): You are a security auditor performing OWASP Top 10 review. Assume every input is malicious. Reference CWE numbers for every finding. Evaluate:
+
+1. **Injection** - Shell (CWE-78), XSS (CWE-79), SQL (CWE-89). No string interpolation in queries.
+2. **Authentication** - Session handling, credential storage, token validation.
+3. **Secrets** - No hardcoded API keys, passwords, tokens in diff. Secrets via environment only.
+4. **Input validation** - All user-facing inputs validated. LLM output treated as untrusted.
+5. **Dependencies** - New packages reviewed for known vulnerabilities. Versions pinned.
+
+Output: `VERDICT: PASS|WARN|CRITICAL_FAIL` with findings array including CWE references.
+
+## Gate 4: DevOps Testing
+
+Task(subagent_type="devops"): You are a build and release engineer. Focus on pipeline safety, reproducibility, and supply chain security. Evaluate:
+
+1. **Pipeline impact** - Do changes affect CI/CD? Are workflow files valid YAML?
+2. **Actions security** - Pinned to SHA? Permissions scoped minimally? No secrets in logs?
+3. **Shell quality** - Input sanitization, exit code handling, error propagation.
+4. **Build reproducibility** - Deterministic builds, locked dependencies, no floating versions.
+5. **Artifact integrity** - Correct upload/download, retention policy, no sensitive data in artifacts.
+
+Output: `VERDICT: PASS|WARN|CRITICAL_FAIL` with findings array.
+
+## Gate 5: Developer Experience (DX)
+
+Task(subagent_type="critic"): You are a developer advocate reviewing from the consumer perspective. Would a new contributor understand this code? Would the API frustrate or delight? Evaluate:
+
+1. **API ergonomics** - Consumer perspective. Are signatures intuitive? Error messages helpful?
+2. **Documentation** - Is changed behavior documented? Are code comments accurate (not stale)?
+3. **Debuggability** - Can a developer diagnose failures from logs alone? Stack traces preserved?
+4. **Onboarding** - Would a new contributor understand this code? Are conventions followed?
+5. **Tooling** - Does this work with existing linters, formatters, IDE support?
+
+Output: `VERDICT: PASS|WARN|CRITICAL_FAIL` with findings array.
+
+## Gate 6: Observability and Monitoring
+
+Task(subagent_type="architect"): You are an SRE reviewing production readiness. If this code fails at 3am, can oncall diagnose it without reading the source? Evaluate:
+
+1. **Logging** - Are meaningful events logged? Structured logging with correlation IDs?
+2. **Metrics** - Are SLIs defined for new features? Latency, error rate, throughput tracked?
+3. **Alerting** - Would failures trigger alerts? Are thresholds appropriate?
+4. **Tracing** - Are distributed traces propagated? Span context preserved across boundaries?
+5. **Health checks** - New services have liveness/readiness probes? Degradation detectable?
+
+Output: `VERDICT: PASS|WARN|CRITICAL_FAIL` with findings array.
+
+## Principles
+
+- **Testability is design feedback**: Hard to test means poor encapsulation, tight coupling, Law of Demeter violation, weak cohesion, or procedural code.
+- **Tests are proof**: A passing test is evidence. A missing test is a gap in knowledge.
+- **Hypothesis-driven debugging**: When a test fails, form a hypothesis before changing code. Verify the hypothesis. Then fix.
+- **Defense in depth**: Assume the happy path works. Focus on failure modes.
+
+## Process
+
+1. Identify what changed (git diff against base branch)
+2. Classify PR type (Step 0). Skip non-applicable gates.
+3. Run applicable gates sequentially. Each gate dispatches its own agent.
+4. If any gate produces CRITICAL_FAIL: continue remaining gates (findings are additive). Mark overall verdict as CRITICAL_FAIL immediately.
+5. For test failures: hypothesis, verify, fix (never change code without understanding why)
+6. Invoke Skill(skill="quality-grades") to synthesize gate verdicts into overall quality score.
+
+## Output
+
+Each gate MUST produce a verdict line and findings array:
+
+```text
+GATE: [name]
+VERDICT: PASS|WARN|CRITICAL_FAIL
+FINDINGS:
+- [SEVERITY] (file:line) description — recommendation
+```
+
+Synthesize into overall report:
+
+| Gate | Verdict | Findings | Evidence |
+|------|---------|----------|----------|
+| Functional | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
+| Non-Functional | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
+| Security | PASS/WARN/CRITICAL_FAIL | Count | CWE references |
+| DevOps | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
+| DX | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
+| Observability | PASS/WARN/CRITICAL_FAIL | Count | file:line citations |
+
+**Overall verdict**: CRITICAL_FAIL if any gate fails. WARN if any gate warns. PASS if all gates pass.
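
Reviewer note: the Step 0 classification table and the overall-verdict rule in `test.md` can be sketched in a few lines of Python. This is a minimal illustration, not code from the PR. The names `classify_file`, `classify_pr`, `format_banner`, and `overall_verdict` are hypothetical, and the `UNKNOWN` fallback (run all six gates for files that match no row of the table) is an assumption the command file does not state.

```python
# Sketch of Step 0 (classify PR type) and the overall-verdict rule.
# All function names here are hypothetical; they do not exist in the PR.

GATES_BY_TYPE = {
    "CODE": {1, 2, 3, 4, 5, 6},
    "WORKFLOW": {1, 3, 4},
    "CONFIG": {3, 4},
    "DOCS": {5},
}
ALL_GATES = {1, 2, 3, 4, 5, 6}


def classify_file(path: str) -> str:
    """Map one changed file to a type from the Step 0 table."""
    if path.startswith(".github/workflows/") and path.endswith(".yml"):
        return "WORKFLOW"
    if path.endswith((".py", ".ps1", ".ts", ".js", ".cs")):
        return "CODE"
    if path.endswith((".json", ".yaml")):
        return "CONFIG"
    if path.endswith((".md", ".txt", ".rst")):
        return "DOCS"
    # Assumption: the table has no row for other files.
    return "UNKNOWN"


def classify_pr(paths):
    """Return (pr_type, sorted gate numbers) for a list of changed files."""
    if not paths:
        raise ValueError("no changed files to classify")
    types = {classify_file(p) for p in paths}
    pr_type = next(iter(types)) if len(types) == 1 else "MIXED"
    gates = set()
    for t in types:
        # MIXED applies per-file rules: the union of each file's gates.
        # Assumption: UNKNOWN files conservatively trigger all gates.
        gates |= GATES_BY_TYPE.get(t, ALL_GATES)
    return pr_type, sorted(gates)


def format_banner(pr_type, gates):
    """Render the line that Step 0 asks the command to print."""
    return f"PR TYPE: {pr_type}. Running gates: {', '.join(map(str, gates))}."


def overall_verdict(gate_verdicts):
    """CRITICAL_FAIL if any gate fails, WARN if any warns, else PASS."""
    if "CRITICAL_FAIL" in gate_verdicts:
        return "CRITICAL_FAIL"
    if "WARN" in gate_verdicts:
        return "WARN"
    return "PASS"
```

For example, `classify_pr(["README.md"])` yields `("DOCS", [5])`, and a PR touching a workflow file plus a JSON config is `MIXED` with gates 1, 3, and 4. Taking the union for `MIXED` is one reading of "Apply per-file rules"; a stricter reading would run each gate only against the files that require it.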

diff --git a/.claude/commands/workflow/0-init.md b/.claude/commands/workflow/0-init.md
deleted file mode 100644
--- a/.claude/commands/workflow/0-init.md
+++ /dev/null
@@ -1,48 +1,0 @@
----
-description: Session initialization - enforce ADR-007 memory-first architecture at session start. Loads project context, creates session log, and declares current branch via Invoke-Init.ps1.
-argument-hint: [--session-number N] [--objective "text"]
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(pwsh .claude/skills/session-init/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-model: sonnet
----
-
-# /0-init — Session Initialization
-
-Enforce ADR-007 memory-first architecture at session start.
-
-## Context
-
-Recent sessions: !`ls -1 .agents/sessions/ | tail -5`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Init.ps1 $ARGUMENTS
-```
-
-## What This Command Does
-
-1. **Load project context** — initializes session state via Agent Orchestration MCP (graceful fallback if unavailable)
-2. **Load initial instructions** — read AGENTS.md for current project rules
-3. **Read HANDOFF.md** — load prior session context (read-only)
-4. **Surface prior context** — retrieves relevant session history via Agent Orchestration MCP (graceful fallback if unavailable)
-5. **Create session log** — via `New-SessionLog.ps1`
-6. **Declare current branch** — output git branch for orientation
-7. **Record evidence** — persist session state (graceful fallback if unavailable)
-
-## Arguments
-
-- `--session-number N`: Optional. Auto-detected from `.agents/sessions/`.
-- `--objective "text"`: Optional. Derived from branch name if omitted.
-
-## Related
-
-- Protocol: `.agents/SESSION-PROTOCOL.md`
-- ADR-007: `.agents/architecture/ADR-007-memory-first-architecture.md`
-- Session Init Skill: `.claude/skills/session-init/SKILL.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/1-plan.md b/.claude/commands/workflow/1-plan.md
deleted file mode 100644
--- a/.claude/commands/workflow/1-plan.md
+++ /dev/null
@@ -1,54 +1,0 @@
----
-description: Planning phase - route task to planner (default), architect (--arch), or roadmap→high-level-advisor chain (--strategic).
-argument-hint: [--arch] [--strategic] <task-description>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-  - mcp__agent_orchestration__get_routing_recommendation
-model: sonnet
----
-
-# /1-plan — Planning Phase
-
-Route a planning task to the appropriate agent.
-
-## Context
-
-Current branch: !`git branch --show-current`
-
-Recent commits: !`git log --oneline -5`
-
-Planning artifacts: !`ls -1 .agents/planning/ 2>/dev/null | tail -10`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Plan.ps1 $ARGUMENTS
-```
-
-## Variants
-
-| Flag | Agent | Use When |
-|------|-------|----------|
-| *(none)* | `planner` | Standard feature/task planning |
-| `--arch` | `architect` | Design decisions, ADR-worthy choices |
-| `--strategic` | `roadmap → high-level-advisor` | Roadmap, epics, strategic alignment |
-
-## Arguments
-
-- `--arch`: Use architect agent instead of planner.
-- `--strategic`: Chain roadmap agent → high-level-advisor.
-- Remaining text: Task description passed to agent.
-
-## Output
-
-Planning artifacts stored in `.agents/planning/`.
-
-## Related
-
-- ADR-013: `.agents/architecture/ADR-013-agent-orchestration-mcp.md`
-- Agent Orchestration Spec: `.agents/specs/agent-orchestration-mcp-spec.md`
\ No newline at end of file
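
Reviewer note: for readers comparing the deleted `/1-plan` command with the new `/plan`, the flag-to-agent routing in its Variants table can be sketched as follows. This is an illustration only. The real routing lived in `Invoke-Plan.ps1`, which this diff does not show, so the function name `route_plan` and the precedence of `--strategic` over `--arch` when both are passed are assumptions.

```python
import argparse


def route_plan(argv):
    """Return (agent_chain, task_description) for /1-plan style arguments.

    Hypothetical helper mirroring the Variants table; not the PR's code.
    """
    parser = argparse.ArgumentParser(prog="1-plan")
    parser.add_argument("--arch", action="store_true")
    parser.add_argument("--strategic", action="store_true")
    parser.add_argument("task", nargs="*")
    args = parser.parse_args(argv)

    if args.strategic:
        # Assumption: --strategic wins if both flags are given.
        chain = ["roadmap", "high-level-advisor"]
    elif args.arch:
        chain = ["architect"]
    else:
        chain = ["planner"]
    # Remaining text is the task description passed to the agent.
    return chain, " ".join(args.task)
```

With no flags, `route_plan(["add", "login"])` routes to `planner`; `--arch` selects `architect`, and `--strategic` produces the two-step `roadmap` then `high-level-advisor` chain.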

diff --git a/.claude/commands/workflow/2-impl.md b/.claude/commands/workflow/2-impl.md
deleted file mode 100644
--- a/.claude/commands/workflow/2-impl.md
+++ /dev/null
@@ -1,49 +1,0 @@
----
-description: Implementation phase - invoke implementer agent (default), or run full sequential chain (--full), or parallel execution of implementer+qa+security (--parallel).
-argument-hint: [--full] [--parallel] <implementation-task>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-  - mcp__agent_orchestration__start_parallel_execution
-  - mcp__agent_orchestration__aggregate_parallel_results
-  - mcp__agent_orchestration__resolve_conflict
-model: sonnet
----
-
-# /2-impl — Implementation Phase
-
-Invoke the implementer agent, optionally chaining QA and security.
-
-## Context
-
-Planning artifacts: !`ls -1 .agents/planning/ 2>/dev/null | tail -10`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Impl.ps1 $ARGUMENTS
-```
-
-## Execution Modes
-
-| Flag | Mode | Description |
-|------|------|-------------|
-| *(none)* | Default | Implementer agent only |
-| `--full` | Sequential | implementer → qa → security |
-| `--parallel` | Parallel | implementer + parallel(qa, security) |
-
-## Arguments
-
-- `--full`: Run full sequential chain after implementation.
-- `--parallel`: Run QA and security in parallel after implementation.
-- Remaining text: Implementation task description.
-
-## Related
-
-- ADR-013: `.agents/architecture/ADR-013-agent-orchestration-mcp.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/3-qa.md b/.claude/commands/workflow/3-qa.md
deleted file mode 100644
--- a/.claude/commands/workflow/3-qa.md
+++ /dev/null
@@ -1,45 +1,0 @@
----
-description: Quality assurance - invoke QA agent, validate test coverage, check acceptance criteria, and report results.
-argument-hint: [--coverage-threshold N] <verification-scope>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-model: sonnet
----
-
-# /3-qa — Quality Assurance
-
-Invoke the QA agent and validate implementation quality.
-
-## Context
-
-Implementation artifacts: !`ls -1 .agents/sessions/ | tail -3`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-QA.ps1 $ARGUMENTS
-```
-
-## What This Command Does
-
-1. Invoke `qa` agent via Agent Orchestration MCP
-2. Validate test coverage against threshold (default: 80%)
-3. Check acceptance criteria from planning artifacts
-4. Report pass/fail with details
-5. Track handoff back to orchestrator
-
-## Arguments
-
-- `--coverage-threshold N`: Minimum coverage percentage (default: 80).
-- Remaining text: Verification scope.
-
-## Related
-
-- ADR-006: `.agents/architecture/ADR-006-thin-workflows-testable-modules.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/4-security.md b/.claude/commands/workflow/4-security.md
deleted file mode 100644
--- a/.claude/commands/workflow/4-security.md
+++ /dev/null
@@ -1,46 +1,0 @@
----
-description: Security review - invoke security agent with OWASP Top 10 check, secret detection, and dependency audit.
-argument-hint: [--owasp-only] [--secrets-only] <security-scope>
-allowed-tools:
-  - Bash(pwsh .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - Bash(ls:*)
-  - Read
-  - mcp__agent_orchestration__invoke_agent
-  - mcp__agent_orchestration__track_handoff
-model: opus
----
-
-# /4-security — Security Review
-
-Comprehensive security assessment using the security agent.
-
-## Context
-
-Implementation artifacts: !`ls -1 .agents/sessions/ | tail -3`
-
-Current branch: !`git branch --show-current`
-
-## Invocation
-
-```bash
-pwsh .claude/skills/workflow/scripts/Invoke-Security.ps1 $ARGUMENTS
-```
-
-## What This Command Does
-
-1. Invoke `security` agent via Agent Orchestration MCP (model: opus per ADR-013)
-2. OWASP Top 10 check (skipped with `--secrets-only`)
-3. Secret detection scan (skipped with `--owasp-only`)
-4. Dependency audit for known vulnerabilities
-5. Generate security report with findings
-
-## Arguments
-
-- `--owasp-only`: Run only OWASP Top 10 check.
-- `--secrets-only`: Run only secret detection.
-- Remaining text: Security scope.
-
-## Related
-
-- ADR-013: `.agents/architecture/ADR-013-agent-orchestration-mcp.md`
\ No newline at end of file

diff --git a/.claude/commands/workflow/9-sync.md b/.claude/commands/workflow/9-sync.md
deleted file mode 100644
--- a/.claude/commands/workflow/9-sync.md
+++ /dev/null
@@ -1,116 +1,0 @@
----
-description: Auto-generate session documentation. Queries session history, generates workflow diagrams, updates session logs, and syncs memory. Use at the end of any workflow to capture what happened.
-model: sonnet
-argument-hint: [--dry-run]
-allowed-tools:
-  - Bash(python .claude/skills/workflow/scripts/*)
-  - Bash(git:*)
-  - mcp__serena__*
-  - mcp__forgetful__*
----
-
-# /9-sync — Auto-Documentation & Memory Sync
-
-Generate comprehensive session documentation automatically.
-
-## Overview
-
-This command closes the workflow loop by documenting what happened during a session. It:
-
-1. Collects session history (agents invoked, tools used, files changed)
-2. Generates a workflow sequence diagram (Mermaid)
-3. Extracts key decisions and artifacts
-4. Appends documentation to the session log
-5. Syncs context to Serena memory for cross-session persistence
-6. Suggests retrospective learnings
-
-## Execution Steps
-
-### Step 1: Gather Session Context
-
-Collect the current session state:
-
-```bash
-# Get current branch and recent commits
-git log --oneline -20 --since="$(date -d '8 hours ago' --iso-8601)" 2>/dev/null || git log --oneline -20
-
-# Get files changed in this session
-git diff --stat HEAD~10..HEAD 2>/dev/null || git diff --stat main..HEAD
-
-# Get current session log if it exists
-ls -t .agents/sessions/*.json 2>/dev/null | head -1
-```
-
-### Step 2: Generate Session Documentation
-
-Run the sync script to produce the session documentation:
-
-```bash
-python .claude/skills/workflow/scripts/sync_session_documentation.py $ARGUMENTS
-```
-
-This script will:
-
-- Scan git history for session commits
-- Identify agents referenced in commit messages
-- Generate a Mermaid sequence diagram
-- Produce a structured session summary
-
-### Step 3: Extract Decisions and Artifacts
-
-From the session context, identify:
-
-- **Decisions made**: ADRs created/modified, design choices documented
-- **Artifacts created**: New files, modified scripts, PRs opened
-- **Issues referenced**: GitHub issues addressed or discovered
-- **Risks identified**: Any blockers or concerns raised
-
-### Step 4: Update Session Log
-
-Append the sync output to the current session log in `.agents/sessions/`. The entry MUST include:
-
-| Field | Description |
-|-------|-------------|
... diff truncated: showing 800 of 919 lines


Comment thread .claude/commands/test.md Outdated
Comment thread .claude/commands/review.md Outdated
@rjmurillo rjmurillo merged commit 1e43523 into main Apr 11, 2026
102 of 107 checks passed
@rjmurillo rjmurillo deleted the feat/lifecycle-commands branch April 11, 2026 07:20

Labels

- agent-orchestrator: Task coordination agent
- area-infrastructure: Build, CI/CD, configuration
- area-skills: Skills documentation and patterns
- area-workflows: GitHub Actions workflows
- automation: Automated workflows and processes
- documentation: Improvements or additions to documentation
- enhancement: New feature or request

Development

Successfully merging this pull request may close these issues.

Ship 6 core skills for Claude Code / Copilot

4 participants