Skip to content

feat(skills): add requirements-interview skill (grill-me pattern) for /spec#1812

Merged
rjmurillo merged 14 commits into
mainfrom
feat/1798-autonomous
Apr 29, 2026
Merged

feat(skills): add requirements-interview skill (grill-me pattern) for /spec#1812
rjmurillo merged 14 commits into
mainfrom
feat/1798-autonomous

Conversation

@rjmurillo-bot

@rjmurillo-bot rjmurillo-bot commented Apr 27, 2026

Copy link
Copy Markdown
Collaborator

Pull Request

Summary

Adds requirements-interview skill (Matt Pocock's grill-me pattern) and wires it through /spec end-to-end. The skill walks the design tree adversarially, recommends an answer per question, greps the codebase before asking the user, and emits a structured PRD. /spec now carries every PRD section through downstream steps and hands the result to spec-generator for durable REQ-NNN/DESIGN-NNN/TASK-NNN artifacts in .agents/specs/.

Closes the alignment-before-generation gap documented in #1798. JetBrains ICSE 2026 and Faros 2026 both quantify the cost of skipping alignment (>50% review/edit time, +200% review time, +242% incidents per PR). This skill is the forcing function.

Specification References

Type Reference Description
Issue Closes #1798 feat(agents): Add adversarial requirements interview to /spec command (grill-me pattern)

Implements Issue #1798 Option C in spirit (skill is reusable + wired into /spec end-to-end). Direct analyst-agent invocation deferred to a follow-up issue (the /spec integration covers the primary alignment goal).

Changes

  • New skill: .claude/skills/requirements-interview/SKILL.md (113 lines, prompt-only). Frontmatter declares name, version, model, license, user-invocable, and least-privilege allowed-tools. Defines the 7-branch checklist (user stories, data model, integrations, failure modes, security, observability, scope), question discipline (numbered, recommended-answer-required, codebase-first), and ## Process and ## Verification sections aligned to the SkillForge validator schema.
  • New tests: .claude/skills/requirements-interview/tests/test_skill_contract.py. 10 contract tests imported from the canonical validator (no duplicated constants, no external dependency, regex-based XML check).
  • Modified: .claude/commands/spec.md. Step 2 invokes the skill before complexity classification. Steps 3-5 carry the PRD forward and consume specific sections. Step 5 makes CVA conditional on tier and use-case count. Step 6 invokes Task(subagent_type="spec-generator") with the full PRD plus tier and CVA summary. Output section mirrors the PRD schema (12 sections instead of 5).
  • Modified: .claude-plugin/marketplace.json. Bumped project-toolkit reusable-skill count to 66.
  • New: .agents/sessions/2026-04-27-session-1761-1798-grill-me-requirements-interview.json. Session log. Renumbered 1760 → 1761 to resolve an episode-file collision with main's session-1760.
  • New: .agents/memory/episodes/episode-2026-04-27-session-1761.json (auto-extracted from the session log).
  • New: .serena/memories/skills/skills-requirements-interview-pattern.md. Captures the skill design, end-to-end /spec flow, validator gotchas, test-contract sync rule, and deferred-work pointers.

Type of Change

  • New feature (non-breaking change adding functionality)
  • Refactoring of /spec step 6 to carry the PRD schema (separated from the new behavior into its own commit)

Testing

  • Tests added/updated (10 contract tests)
  • Manual testing completed (skill_frontmatter PASS, skill_size PASS at 113/500, SkillForge 18/19, markdownlint clean, pytest 10/10)

Agent Review

Security Review

  • No security-critical changes in this PR (no auth, secrets, or infrastructure)
  • Security patterns applied: allowed-tools declared for least-privilege scope

Other Agent Reviews

  • Architect-style review via gstack /review (5 specialists + adversarial Codex pass) on the original commit
  • Critic validated implementation plan (CodeRabbit + Gemini bots)
  • QA verified test coverage (10/10 contract tests)

Checklist

  • Code follows project style guidelines (markdownlint, ruff)
  • Self-review completed (/review + /pr-review)
  • Comments added for complex logic (the skill's Process section documents the method)
  • Documentation updated (Serena memory entry under skills/)
  • No new warnings introduced

Related Issues

Closes #1798

Test Plan

python3 -m scripts.validation.skill_frontmatter --path .claude/skills/requirements-interview/SKILL.md   # PASS
python3 -m scripts.validation.skill_size --path .claude/skills/requirements-interview/SKILL.md          # PASS (113 / 500)
python3 .claude/skills/SkillForge/scripts/validate-skill.py .claude/skills/requirements-interview/      # 18/19 (1 non-blocking warning)
python3 -m pytest .claude/skills/requirements-interview/tests/                                          # 10 passed
python3 build/scripts/validate_marketplace_counts.py                                                    # up to date

Pre-push hook summary: 11 PASS, 1 WARN (PR size), 0 FAIL.

Notes

Deferred follow-ups (out of PR scope, tracked separately):

- Analyst agent prompt should invoke this skill in its default flow (Issue #1798 Option C narrow read).
  Touches templates/agents/analyst.shared.md plus regenerated per-platform copies.
- Auto-extractor in .githooks/pre-commit should embed the full session-log basename in the episode
  filename to prevent future parallel-session collisions (the bug that caused the 1760 collision
  with main).

🤖 Generated with Claude Code

… wire into /spec

Add `requirements-interview` skill implementing Pocock's grill-me pattern:
adversarial design-tree traversal with a recommended answer per question,
codebase-first lookup, and a structured PRD output. Wire it into the /spec
command as step 2 so every spec starts with shared understanding before any
classification, CVA, or critic pass runs.

Closes the alignment-before-generation gap documented in issue #1798.

Files:
- .claude/skills/requirements-interview/SKILL.md (new, 105 lines, prompt-only)
- .claude/skills/requirements-interview/tests/test_skill_contract.py (new, 10 tests)
- .claude/commands/spec.md (insert step 2, renumber 3..9)
- .agents/sessions/2026-04-27-session-1760-*.json (session log)

Validation: skill_frontmatter PASS, skill_size PASS, markdownlint clean,
10/10 contract tests pass.

Fixes #1798

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added enhancement New feature or request area-skills Skills documentation and patterns labels Apr 27, 2026
@rjmurillo-bot rjmurillo-bot enabled auto-merge (squash) April 27, 2026 18:29
@github-actions

Copy link
Copy Markdown
Contributor

PR Validation Report

Caution

Status: FAIL

Description Validation

Check Status
Description matches diff FAIL

PR Standards

Check Status
Issue linking keywords PASS
Template compliance WARN

QA Validation

Check Status
Code changes detected True
QA report exists false

⚠️ Blocking Issues

  • PR description does not match actual changes

⚡ Warnings

  • Template compliance: 1/4 sections complete
  • QA report not found for code changes (recommended before merge)

Powered by PR Validation workflow

@coderabbitai coderabbitai Bot requested a review from rjmurillo April 27, 2026 18:29
@github-actions

Copy link
Copy Markdown
Contributor

Session Protocol Compliance Report

Caution

Overall Verdict: CRITICAL_FAIL

All session protocol requirements satisfied.

What is Session Protocol?

Session logs document agent work sessions and must comply with RFC 2119 requirements:

  • MUST: Required for compliance (blocking failures)
  • SHOULD: Recommended practices (warnings)
  • MAY: Optional enhancements

See .agents/SESSION-PROTOCOL.md for full specification.

Compliance Summary

Session File Verdict MUST Failures
sessions-2026-04-27-session-1760-1798-grill-me-requirements-interview.md ❔ NON_COMPLIANT 0

Detailed Validation Results

Click each session to see the complete validation report with specific requirement failures.

📄 sessions-2026-04-27-session-1760-1798-grill-me-requirements-interview

=== Session Validation ===
File: /home/runner/work/ai-agents/ai-agents/.agents/sessions/2026-04-27-session-1760-1798-grill-me-requirements-interview.json

[FAIL] Validation errors:

  • Incomplete MUST: sessionEnd.serenaMemoryUpdated

✨ Zero-Token Validation

This validation uses deterministic script analysis instead of AI:

  • Zero tokens consumed (previously 300K-900K per debug cycle)
  • Instant feedback - see exact failures in this summary
  • No artifact downloads needed to diagnose issues
  • 10x-100x faster debugging

Powered by validate_session_json.py

📊 Run Details
Property Value
Run ID 25012459273
Files Checked 1
Validation Method Deterministic script analysis

Powered by Session Protocol Validator workflow

@coderabbitai coderabbitai Bot added agent-orchestrator Task coordination agent agent-analyst Research and investigation agent agent-architect Design and ADR agent area-workflows GitHub Actions workflows labels Apr 27, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Spec-to-Implementation Validation

Tip

Final Verdict: PASS

What is Spec Validation?

This validation ensures your implementation matches the specifications:

  • Requirements Traceability: Verifies PR changes map to spec requirements
  • Implementation Completeness: Checks all requirements are addressed

Validation Summary

Check Verdict Status
Requirements Traceability PASS
Implementation Completeness PASS

Spec References

Type References
Specs None
Issues 1798
Requirements Traceability Details

Requirements Coverage Matrix

Requirement Description Status Evidence
REQ-001 Create requirements-interview skill implementing grill-me pattern COVERED .claude/skills/requirements-interview/SKILL.md created (105 lines)
REQ-002 Interview relentlessly until shared understanding COVERED SKILL.md lines 17-18: "adversarial requirements interviewer...shared understanding before any code"
REQ-003 Walk design tree branch by branch, resolve dependencies COVERED SKILL.md lines 44-47: "Build the design tree...Walk one branch at a time, depth first. Resolve dependencies before siblings"
REQ-004 For each question, provide recommended answer COVERED SKILL.md line 46: "For every question, propose your recommended answer"
REQ-005 If codebase can answer, grep instead of asking user COVERED SKILL.md line 47: "If the codebase can answer it, answer it. Grep before you ask"
REQ-006 Cover user stories, data model, edge cases, failure modes COVERED SKILL.md lines 63-71: Branch Checklist covers all areas
REQ-007 Output structured PRD with acceptance criteria COVERED SKILL.md lines 93-94: Structured Output section with EARS syntax
REQ-008 Skill reusable by orchestrator, architect, analyst COVERED SKILL.md line 7: user-invocable: true; PR confirms "Skill is reusable"
REQ-009 Wire into /spec command workflow COVERED spec.md line 16: Step 2 invokes Skill(skill="requirements-interview")
REQ-010 Interview happens BEFORE architect/implementer routing COVERED spec.md: Step 2 (interview) precedes Step 3 (complexity classification)
REQ-011 Option C implementation (skill + analyst integration) PARTIAL Skill created and wired to /spec; analyst agent enhancement not included
REQ-012 Cover security branch in interview COVERED SKILL.md line 69: Security branch with rule citations
REQ-013 Cover observability in interview COVERED SKILL.md line 70: Observability branch defined
REQ-014 Cover scope boundaries COVERED SKILL.md line 71: Scope boundaries branch defined
REQ-015 Contract tests for skill COVERED tests/test_skill_contract.py with 10 tests

Summary

  • Total Requirements: 15
  • Covered: 14 (93%)
  • Partially Covered: 1 (7%)
  • Not Covered: 0 (0%)

Gaps

  1. REQ-011 (Option C partial): Issue recommended "Option C — Both" where the skill is created AND analyst agent gets behavior wired into its default flow. The PR creates the skill and wires it to /spec, but does not modify the analyst agent (analyst.agent.md) to invoke the skill as part of its standard workflow. The /spec integration achieves the primary goal; analyst enhancement could be a follow-up.

[!TIP]
VERDICT: PASS
All core requirements from Issue #1798 are covered. The implementation delivers the grill-me skill with design tree traversal, recommended-answer discipline, codebase-first lookup, structured PRD output, and /spec integration. The partial gap (analyst agent enhancement) is an optional enhancement per Option C, not a blocking requirement. The PR explicitly chose a minimal-blast-radius approach by wiring via /spec only.

Implementation Completeness Details

Acceptance Criteria Checklist

Based on Issue #1798, the acceptance criteria are:

  • AC1: Grill-me pattern implementation - SATISFIED

    • Evidence: SKILL.md lines 15-17 implement adversarial interview behavior; lines 43-50 define the Method with "walk the design tree branch by branch"
  • AC2: Design tree traversal (Brooks' concept) - SATISFIED

    • Evidence: SKILL.md lines 44-47 define depth-first branch traversal; lines 63-71 enumerate 7 branches (user stories, data model, integrations, failure modes, security, observability, scope boundaries)
  • AC3: "For each question, provide your recommended answer" - SATISFIED

    • Evidence: SKILL.md line 46: "For every question, propose your recommended answer. Cite the source: code path, ADR, prior art, or stated assumption."
  • AC4: Codebase-first principle - SATISFIED

    • Evidence: SKILL.md line 47: "If the codebase can answer it, answer it. Grep before you ask. Cite the file and line."
  • AC5: Structured PRD output - SATISFIED

    • Evidence: SKILL.md lines 92-94 define output format with Problem, User stories, Data model, Integrations, Failure modes, Security, Observability, Acceptance criteria, Out of scope, Deferred, Open questions
  • AC6: /spec integration - SATISFIED

    • Evidence: spec.md line 16 inserts Skill(skill="requirements-interview") as step 2, before complexity classification
  • AC7: Reusable skill (Option C recommendation) - SATISFIED

    • Evidence: SKILL.md line 7 user-invocable: true; triggers in lines 8-12 allow standalone invocation by orchestrator, architect, analyst
  • AC8: Resolve dependencies between decisions - SATISFIED

    • Evidence: SKILL.md line 45: "Resolve dependencies before siblings. A storage decision constrains the consistency model; ask the storage question first."
  • AC9: Tests guard the prompt contract - SATISFIED

    • Evidence: test_skill_contract.py provides 10 tests covering frontmatter, required sections, grill-me reference, recommended-answer discipline, codebase-first principle

Missing Functionality

None identified. All explicit requirements from Issue #1798 are addressed.

Edge Cases Not Covered

  1. Analyst agent enhancement (Option A) - Not implemented. Issue recommended Option C (skill + analyst integration). The skill is standalone; analyst agent file was not modified. This is acceptable since Option C states "analyst agent invokes it as part of its standard workflow" and /spec already invokes the skill before delegating to analyst.

  2. Interview transcript persistence - SKILL.md specifies output at .agents/specs/interviews/INTERVIEW-<slug>.md but no directory or template exists. This is acceptable for a prompt-only skill; the agent creates the file on invocation.

Implementation Quality

  • Completeness: 9/9 acceptance criteria satisfied (100%)
  • Quality: Implementation exceeds minimum requirements. Branch checklist covers 7 areas vs. 4 in the issue. Anti-patterns section guards against common failure modes. Quality gates define completion criteria.
Criterion Status
Grill-me pattern PASS
Design tree traversal PASS
Recommended answer per question PASS
Codebase-first lookup PASS
Structured PRD output PASS
/spec integration PASS
Reusable skill PASS
Dependency resolution PASS
Contract tests PASS

[!TIP]
VERDICT: PASS
Implementation satisfies all 9 acceptance criteria from Issue #1798. The skill correctly implements the grill-me pattern with design tree traversal, recommended-answer discipline, and codebase-first lookup. Integration into /spec as step 2 ensures alignment-before-generation. Test coverage guards the prompt contract.


Run Details
Property Value
Run ID 25012459272
Triggered by pull_request on 1812/merge

Powered by AI Spec Validator workflow

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the requirements-interview skill, implementing an adversarial 'grill-me' pattern to elicit testable requirements before implementation. It updates the .claude/commands/spec.md process to incorporate this interview as a mandatory step and adds a contract test suite to validate the skill's structure and frontmatter. Review feedback focuses on improving the test implementation by utilizing existing internal validation scripts to parse frontmatter, define constants, and perform regex-based XML tag checks, thereby reducing external dependencies and preventing logic duplication.

Comment thread .claude/skills/requirements-interview/tests/test_skill_contract.py Outdated
Comment thread .claude/skills/requirements-interview/tests/test_skill_contract.py Outdated
Comment thread .claude/skills/requirements-interview/tests/test_skill_contract.py Outdated
@github-actions

github-actions Bot commented Apr 27, 2026

Copy link
Copy Markdown
Contributor

AI Quality Gate Review

Tip

Final Verdict: PASS

Walkthrough

This PR was reviewed by six AI agents in parallel, analyzing different aspects of the changes:

  • Security Agent: Scans for vulnerabilities, secrets exposure, and security anti-patterns
  • QA Agent: Evaluates test coverage, error handling, and code quality
  • Analyst Agent: Assesses code quality, impact analysis, and maintainability
  • Architect Agent: Reviews design patterns, system boundaries, and architectural concerns
  • DevOps Agent: Evaluates CI/CD, build pipelines, and infrastructure changes
  • Roadmap Agent: Assesses strategic alignment, feature scope, and user value

Review Summary

Agent Verdict Category Status
Security PASS N/A
QA PASS N/A
Analyst PASS N/A
Architect PASS N/A
DevOps PASS N/A
Roadmap PASS N/A

💡 Quick Access: Click on individual agent jobs (e.g., "🔒 security Review", "🧪 qa Review") in the workflow run to see detailed findings and step summaries.

Security Review Details

Now I have reviewed all the changed files. Let me compile my security analysis.

Security Review: PR #1812

PR Type Classification

File Category Security Scrutiny
.claude/skills/requirements-interview/SKILL.md PROMPT Prompt injection surface
.claude/skills/requirements-interview/tests/test_skill_contract.py CODE Full review
.claude/commands/spec.md PROMPT Prompt injection surface
.claude-plugin/marketplace.json CONFIG Schema only
.serena/memories/skills/skills-requirements-interview-pattern.md DOCS None required
.agents/sessions/*.json CONFIG Schema only
.agents/memory/episodes/*.json CONFIG Schema only

Findings

Severity Category Finding Location CWE
None - No security vulnerabilities identified - -

Analysis Summary

PROMPT Files (SKILL.md, spec.md)

  • [PASS] No dynamic code execution patterns
  • [PASS] allowed-tools declares least-privilege scope: Read, Glob, Grep, Write (no shell/bash)
  • [PASS] No user input directly passed to system prompts without structure
  • [PASS] Tool invocations use typed delegation (Task(subagent_type=...), Skill(skill=...))
  • [PASS] No external URL fetching or arbitrary code execution

CODE File (test_skill_contract.py)

  • [PASS] Read-only test file; reads local SKILL.md only
  • [PASS] No subprocess calls or shell execution
  • [PASS] No network operations
  • [PASS] Uses standard pytest patterns with Path.read_text()
  • [PASS] Imports from internal validation module only

CONFIG Files (JSON)

  • [PASS] No secrets or credentials present
  • [PASS] marketplace.json increments skill count only (66)
  • [PASS] Session logs contain no sensitive data (timestamps, filenames, protocol compliance)

DOCS File (Serena memory)

  • [PASS] Documentation only; no executable content
  • [PASS] External URLs are reference links, not fetched

Recommendations

None required. The PR follows security best practices:

  1. Skill declares minimal allowed-tools (no shell access)
  2. Test file is isolated and read-only
  3. No hardcoded secrets or credentials
  4. No injection vectors in prompt templates

Verdict

VERDICT: PASS
MESSAGE: No security vulnerabilities found. Skill uses least-privilege tool scope, no shell access, no secrets exposure.
{
  "verdict": "PASS",
  "message": "No security vulnerabilities found. Skill uses least-privilege tool scope, no shell access, no secrets exposure.",
  "agent": "security",
  "timestamp": "2026-04-29T03:42:29.442Z",
  "findings": []
}
QA Review Details

Now I have a complete picture of the PR. Let me produce my QA review verdict.


QA Review: PR #1812

PR TYPE: MIXED

Category Files
CODE .claude/skills/requirements-interview/tests/test_skill_contract.py
CONFIG .claude-plugin/marketplace.json
DOCS .claude/skills/requirements-interview/SKILL.md, .claude/commands/spec.md, .serena/memories/skills/skills-requirements-interview-pattern.md
SESSION ARTIFACTS .agents/sessions/*.json, .agents/memory/episodes/*.json

Test Coverage Assessment

Area Status Evidence Files Checked
Unit tests Adequate test_skill_contract.py: 10 contract tests validating SKILL.md structure SKILL.md
Edge cases Covered Tests validate: missing fields, name pattern, description length, XML injection, model validation, required sections Frontmatter constraints
Error paths Tested parse_frontmatter error handling tested via assert result.is_valid pattern Lines 57-59
Assertions Present Each test contains explicit assert statements 10 assertions across 10 tests

Quality Concerns

Severity Issue Location Evidence Required Fix
LOW Test imports private patterns test_skill_contract.py:22-28 Imports _NAME_PATTERN, _XML_TAG_PATTERN (underscore-prefixed) Documented as deferred in Serena memory; acceptable
LOW Description length relaxed test_skill_contract.py:34 DESCRIPTION_MAX = 1024 while memory says frontmatter descriptions should be under 200 chars Non-blocking; skill validates against canonical validator
NONE Session log outcome: failure episode-*.json:5 Misleading label; session completed successfully per PR description Auto-extracted metadata quirk; no code impact

Regression Risk Assessment

  • Risk Level: Low
  • Affected Components: /spec command workflow, new skill (no existing functionality modified beyond spec.md)
  • Breaking Changes: None. /spec gains new step 2 behavior but existing usage patterns preserved.
  • Required Testing: Contract tests validate SKILL.md structure; pytest 7362 passed confirms no regression.

Code Quality Verification

Metric Value Threshold Status
SKILL.md lines 112 500 [PASS]
Function length All < 20 lines 50 [PASS]
Test isolation Each test independent Required [PASS]
Test assertions 10 tests, 10+ assertions Present [PASS]

Pre-Executed Test Results

  • pytest: 7362 passed, 3 skipped, 43 warnings in 44.17s
  • Status: [PASS]
  • Skill-specific tests: 10/10 contract tests in test_skill_contract.py passed (confirmed in PR description)

Fail-Safe Pattern Verification

Pattern Status Evidence
Input validation [PASS] Frontmatter parsed with explicit error handling; required fields enforced
Error handling [PASS] parse_frontmatter returns errors list; tests assert result.is_valid
Timeout handling [N/A] Prompt-only skill; no external calls
Fallback behavior [N/A] Skill invocation pattern; no runtime fallback needed

Test-Implementation Alignment

Acceptance Criterion Test Coverage Status
Skill file exists test_skill_file_exists [PASS]
Within size limit test_skill_within_size_limit [PASS]
Required frontmatter test_frontmatter_required_fields [PASS]
Name pattern test_name_matches_pattern [PASS]
Description constraints test_description_constraints [PASS]
Model validation test_model_is_supported [PASS]
Required sections test_required_sections_present [PASS]
Grill-me pattern referenced test_grill_me_pattern_referenced [PASS]
Recommended answer discipline test_recommended_answer_discipline [PASS]
Codebase-first principle test_codebase_first_principle [PASS]

Coverage: 10/10 acceptance criteria covered (100%)


VERDICT: PASS
MESSAGE: New skill with comprehensive contract tests (10/10) and pytest suite (7362 passed); no blocking issues.

PR TYPE: MIXED (CODE + DOCS + CONFIG + SESSION ARTIFACTS)

EVIDENCE:
- Tests found: 10 contract tests for 1 new skill (test_skill_contract.py)
- Test execution: PASS (7362 passed, 3 skipped, 43 warnings in 44.17s)
- Edge cases: Covered (null/empty fields, pattern mismatches, length limits, XML injection)
- Error handling: Tested (frontmatter parsing validation)
- Blocking issues: 0
{
  "verdict": "PASS",
  "message": "New skill with comprehensive contract tests (10/10) and pytest suite (7362 passed); no blocking issues.",
  "agent": "qa",
  "timestamp": "2026-04-29T03:43:38.154Z",
  "findings": [
    {
      "severity": "low",
      "category": "code-quality",
      "description": "Test imports private validator patterns (_NAME_PATTERN, _XML_TAG_PATTERN)",
      "location": "test_skill_contract.py:22-28",
      "recommendation": "Documented as deferred in Serena memory; track for follow-up refactor"
    },
    {
      "severity": "low",
      "category": "test-coverage",
      "description": "DESCRIPTION_MAX=1024 differs from 200-char memory guideline for short descriptions",
      "location": "test_skill_contract.py:34",
      "recommendation": "Acceptable; validates against canonical validator maximum, not style preference"
    }
  ]
}
Analyst Review Details

Now I have enough context to perform the analysis. Let me compile the findings.


Analysis: PR #1812 - Requirements Interview Skill (grill-me pattern)

Code Quality Score

Criterion Score (1-5) Notes
Readability 5 Clear markdown structure. Branch checklist, process steps, and anti-patterns documented.
Maintainability 4 Tests import from canonical validator (no drift). One concern: 11 required sections may need sync if SKILL.md evolves.
Consistency 5 Follows existing skill patterns (decision-critic, pre-mortem). Frontmatter matches project schema.
Simplicity 4 Prompt-only skill (no scripts). Complexity is appropriate for adversarial interview pattern.

Overall: 4.5/5

Impact Assessment

  • Scope: Module-wide (touches /spec command, adds new skill, updates marketplace count)
  • Risk Level: Low
  • Affected Components:
    • .claude/commands/spec.md (step 2 and step 6 modified)
    • .claude/skills/requirements-interview/ (new skill)
    • .claude-plugin/marketplace.json (count bump: 66 skills)
    • .serena/memories/skills/ (new memory entry)
    • .agents/sessions/ and .agents/memory/episodes/ (session artifacts)

Findings

Priority Category Finding Location
Low maintainability Test file duplicates VALID_MODEL_ALIASES, _NAME_PATTERN, _XML_TAG_PATTERN constants from validator. Future drift risk if validator changes. tests/test_skill_contract.py:22-28
Low documentation Description length (289 chars) exceeds the 200-char best practice from repository memories, though under the 1024-char hard limit. SKILL.md:4
Low consistency Session log episode outcome marked "failure" but metrics show 8 commits and all validations passed. Misleading metadata. episode-2026-04-27-session-1761.json:5
Low documentation Deferred analyst-agent integration documented in PR notes and Serena memory. Tracked separately per PR description. PR description, .serena/memories/skills/skills-requirements-interview-pattern.md:66-77

Recommendations

  1. Future PR: Promote validator private constants (_NAME_PATTERN, _XML_TAG_PATTERN) to public exports and import them in tests to eliminate duplication (already tracked in memory as deferred work).
  2. Consider: Shortening the frontmatter description to under 200 chars per steering guidance, though not blocking.
  3. Session log: Fix episode outcome to "success" since validations passed and commits landed.

Verdict

The skill is well-structured, follows established patterns, and includes 10 contract tests. The /spec integration correctly wires the skill at step 2 and hands the PRD to spec-generator at step 6. Marketplace count is updated. Tests import from the canonical validator to prevent drift on core validation logic.

The test constant duplication is a known trade-off documented in the Serena memory entry and PR notes. No architectural violations or blocking issues.

VERDICT: PASS
MESSAGE: Clean skill implementation with comprehensive tests. Minor maintainability debt (test constant duplication) is tracked for follow-up.
{
  "verdict": "PASS",
  "message": "Clean skill implementation with comprehensive tests. Minor maintainability debt (test constant duplication) is tracked for follow-up.",
  "agent": "analyst",
  "timestamp": "2026-04-29T03:42:34.235Z",
  "findings": [
    {
      "severity": "low",
      "category": "maintainability",
      "description": "Test file duplicates VALID_MODEL_ALIASES, _NAME_PATTERN, _XML_TAG_PATTERN constants from validator. Future drift risk if validator changes.",
      "location": "tests/test_skill_contract.py:22-28",
      "recommendation": "Future PR: promote validator private constants to public exports and import them in tests."
    },
    {
      "severity": "low",
      "category": "documentation",
      "description": "Description length (289 chars) exceeds 200-char best practice from steering guidance, though under 1024-char hard limit.",
      "location": "SKILL.md:4",
      "recommendation": "Consider shortening description in future iteration."
    },
    {
      "severity": "low",
      "category": "consistency",
      "description": "Session log episode outcome marked 'failure' but metrics show 8 commits and all validations passed.",
      "location": "episode-2026-04-27-session-1761.json:5",
      "recommendation": "Fix episode outcome to 'success'."
    },
    {
      "severity": "low",
      "category": "documentation",
      "description": "Analyst-agent integration deferred. Documented in PR notes and Serena memory.",
      "location": "PR description",
      "recommendation": "Track in separate issue per PR notes."
    }
  ]
}
Architect Review Details

Now I have enough context to perform the architectural review.


Architect Review: PR #1812 — requirements-interview skill

Design Quality Assessment

Aspect Rating (1-5) Notes
Pattern Adherence 5 Single-responsibility skill. Follows established skill structure (frontmatter, Process, Verification sections).
Boundary Respect 5 Skill produces structured PRD; spec-generator formalizes artifacts. Clear separation.
Coupling 4 Explicit interface: skill outputs PRD → /spec carries it → spec-generator consumes it. Minor coupling to EARS syntax.
Cohesion 5 Skill does one thing: adversarial requirements interview.
Extensibility 4 Branch checklist is explicit; new branches can be added. No plugin mechanism, but acceptable for prompt-only skill.

Overall Design Score: 4.6/5

Architectural Concerns

Severity Concern Location Recommendation
Low Description in frontmatter exceeds 200-char guideline (currently 340 chars) SKILL.md:4 Shorten to verb phrase under 200 chars per steering doc
Low Memory changes bundled with skill PR .serena/memories/skills/ Per memory guidance, memory updates should ship in separate PR

Breaking Change Assessment

  • Breaking Changes: No
  • Impact Scope: None
  • Migration Required: No
  • Migration Path: N/A

Technical Debt Analysis

  • Debt Added: Low (test imports from canonical validator reduce duplication)
  • Debt Reduced: Medium (structured PRD eliminates ad-hoc requirements gathering)
  • Net Impact: Improved

ADR Assessment

  • ADR Required: No
  • Decisions Identified: None requiring new ADR. The skill pattern is established (ADR-030). EARS syntax is documented (ADR-032). /spec wiring is a feature enhancement, not an architectural change.
  • Existing ADR: ADR-030 (skills-pattern-superiority), ADR-032 (ears-requirements-syntax)
  • Recommendation: N/A

Recommendations

  1. Shorten frontmatter description to comply with 200-char guideline. Example: Adversarial requirements interview using the grill-me pattern. Walks the design tree, proposes answers, greps first, emits structured PRD.
  2. Split memory file into follow-up PR per repository memory guidance (Serena memory changes separate from skill changes).

Verdict

VERDICT: PASS
MESSAGE: Well-structured skill with clear boundaries. Two low-severity findings are non-blocking.
{
  "verdict": "PASS",
  "message": "Well-structured skill with clear boundaries. Two low-severity findings are non-blocking.",
  "agent": "architect",
  "timestamp": "2026-04-29T03:42:25.869Z",
  "findings": [
    {
      "severity": "low",
      "category": "design-pattern",
      "description": "Frontmatter description exceeds 200-char steering guideline (340 chars)",
      "location": ".claude/skills/requirements-interview/SKILL.md:4",
      "recommendation": "Shorten to under 200 chars per .agents/steering/claude-skills.md"
    },
    {
      "severity": "low",
      "category": "boundaries",
      "description": "Serena memory file bundled with skill PR",
      "location": ".serena/memories/skills/skills-requirements-interview-pattern.md",
      "recommendation": "Move memory changes to separate PR per repository guidance"
    }
  ]
}
DevOps Review Details

Let me check the local repository state and files mentioned in the PR description.

Based on my review, I now have a complete picture of this PR. Let me compile the DevOps review.


DevOps Review: PR #1812

PR Scope Detection

Category Files Changed Review Scope
DOCS .claude/skills/requirements-interview/SKILL.md, .claude/commands/spec.md, .serena/memories/skills/skills-requirements-interview-pattern.md None required
CONFIG .claude-plugin/marketplace.json, session logs (JSON) Schema validation only
CODE .claude/skills/requirements-interview/tests/test_skill_contract.py Build impact only

Assessment: This PR contains no workflow files, no shell scripts, no GitHub Actions changes. It is primarily a skill addition (documentation + tests) with a marketplace count bump.


Pipeline Impact Assessment

Area Impact Notes
Build None No build scripts modified
Test Low New pytest file added; existing test infrastructure handles it
Deploy None No deployment changes
Cost None No CI workflow changes

CI/CD Quality Checks

Check Status Location
YAML syntax valid N/A No workflows modified
Actions pinned N/A No actions modified
Secrets secure N/A No secrets referenced
Permissions minimal N/A No permissions changes
Shell scripts robust N/A No shell scripts added

Findings

Severity Category Finding Location Fix
Low automation marketplace.json count bumped to 66 .claude-plugin/marketplace.json:21 Verified: count matches skill directory (66 skills)

Template Assessment

  • PR Template: Adequate (PR follows template, includes test plan)
  • Issue Templates: Not modified
  • Template Issues: None

Automation Opportunities

Opportunity Type Benefit Effort
None identified N/A N/A N/A

This PR follows existing patterns. The skill tests use the existing pytest infrastructure. The marketplace count validation script (build/scripts/validate_marketplace_counts.py) already validates skill counts in CI.

Recommendations

  1. No DevOps changes required for this PR.
  2. The test file imports from scripts.validation.skill_frontmatter, which is already in the existing test infrastructure.

Verdict

VERDICT: PASS
MESSAGE: DOCS/CONFIG-only PR with no CI/CD, workflow, or infrastructure changes. New pytest tests integrate with existing infrastructure. Marketplace count properly bumped.

{
  "verdict": "PASS",
  "message": "DOCS/CONFIG-only PR with no CI/CD, workflow, or infrastructure changes. New pytest tests integrate with existing infrastructure.",
  "agent": "devops",
  "timestamp": "2026-04-29T03:42:26.166Z",
  "findings": [
    {
      "severity": "low",
      "category": "automation",
      "description": "Marketplace skill count bumped from 65 to 66",
      "location": ".claude-plugin/marketplace.json:21",
      "recommendation": "Verified correct; count matches skill directory enumeration"
    }
  ]
}
Roadmap Review Details

I have reviewed the key files. Now I'll provide the strategic alignment assessment.


Roadmap Review: PR #1812

Strategic Alignment Assessment

Criterion Rating Notes
Aligns with project goals High Addresses generation-before-alignment failure mode documented in issue #1798. Supports "enable coordinated multi-agent AI workflows" objective.
Priority appropriate High Requirements clarity is foundational. Research cited shows >50% review time reduction. Wiring into /spec maximizes reach.
User value clear High Prevents wasted implementation effort through early requirements validation. Structured PRD output reduces downstream rework.
Investment justified High 113-line prompt-only skill with 10 tests. Minimal implementation cost for high-leverage workflow improvement.

Feature Completeness

  • Scope Assessment: Right-sized
  • Ship Ready: Yes
  • MVP Complete: Yes
  • Enhancement Opportunities:
    • Analyst agent auto-invocation (explicitly deferred, tracked)
    • Session collision fix (explicitly deferred, tracked)

Impact Analysis

Dimension Assessment Notes
User Value High Prevents "generation without alignment" across all /spec invocations. Forces structured thinking before code.
Business Impact High JetBrains/Faros research quantifies >50% review time reduction. Scales to every new feature.
Technical Leverage High Skill is reusable (user-invocable: true). Integrates with existing /spec, decision-critic, and spec-generator.
Competitive Position Improved Differentiates agent system with adversarial requirements discipline. Few competitors enforce pre-implementation alignment.

RICE Score (Estimated)

Factor Value Rationale
Reach ~50 features/quarter Every /spec invocation (primary feature entry point)
Impact 2 (High) Prevents costly rework; research shows 50%+ review time saved
Confidence 80% Pattern proven externally (Matt Pocock); codebase integration tested
Effort 0.2 person-months 113-line skill + 10 tests; minimal footprint
Score 400 (50 x 2 x 0.8) / 0.2

KANO Classification

Must-Be (becoming): Requirements validation is expected in mature agent systems. Absence creates dissatisfaction when implementation diverges from intent. This skill converts what was an implicit expectation into an explicit gate.

Concerns

Priority Concern Recommendation
Low Serena memory file included in PR Consider splitting .serena/memories/ changes to separate PR per repository memory about skill PRs
Low Session log collision workaround Deferred fix is tracked; acceptable for this PR
Low SkillForge 18/19 (1 non-blocking warning) Acceptable; warning is non-blocking

Recommendations

  1. Ship as-is. The skill addresses a documented pain point with quantified impact and minimal implementation cost.
  2. Track the deferred analyst-agent integration in a follow-up issue to complete Option C from feat(agents): Add adversarial requirements interview to /spec command (grill-me pattern) #1798.
  3. Monitor adoption: Track /spec invocations pre/post to validate the 50% review time claim.

Verdict

VERDICT: PASS
MESSAGE: High-impact, right-sized skill that closes the alignment-before-generation gap. Research-backed, minimal cost, strong integration with /spec workflow.

{
  "verdict": "PASS",
  "message": "High-impact requirements-interview skill closes alignment-before-generation gap with research-backed methodology and minimal implementation cost",
  "agent": "roadmap",
  "timestamp": "2026-04-29T03:42:30.789Z",
  "findings": [
    {
      "severity": "low",
      "category": "scope",
      "description": "Serena memory file included in skill PR",
      "location": ".serena/memories/skills/skills-requirements-interview-pattern.md",
      "recommendation": "Repository convention prefers memory changes in separate PRs. Acceptable for this PR but note for future."
    },
    {
      "severity": "low",
      "category": "documentation",
      "description": "Deferred work items properly tracked",
      "location": "PR description Notes section",
      "recommendation": "Create follow-up issue for analyst-agent auto-invocation to complete #1798 Option C."
    }
  ]
}

Run Details
Property Value
Run ID 25089749733
Triggered by pull_request on 1812/merge
Commit bfd6cdc1de1fd1d7eae5151fc3f939abec37c4ae

Powered by AI Quality Gate workflow

@rjmurillo

Copy link
Copy Markdown
Owner

Review Triage Required

Note

Priority: NORMAL - Human approval required before bot responds

Review Summary

Source Reviews Comments
Human 0 0
Bot 1 3

Next Steps

  1. Review human feedback above
  2. Address any CHANGES_REQUESTED from human reviewers
  3. Add triage:approved label when ready for bot to respond to review comments

Powered by PR Maintenance workflow - Add triage:approved label

@coderabbitai

coderabbitai Bot commented Apr 27, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces a mandatory adversarial requirements-interview step into the /spec workflow that generates a structured PRD before proceeding to complexity classification and specification generation. It adds a new reusable requirements-interview skill that walks a design tree and records decisions with EARS acceptance criteria, along with validation tests and updated workflow documentation.

Changes

Cohort / File(s) Summary
Spec workflow redesign
.claude/commands/spec.md
Adds mandatory requirements-interview step at start of spec generation; moves complexity tier classification to after PRD exists; scopes repository search using PRD data; makes CVA conditional based on tier; introduces new spec-generator stage to formalize artifacts; requires EARS-syntax acceptance criteria; expands analyst review to validate all PRD sections for testability/gaps; redefines final output schema to match full PRD structure.
Requirements-interview skill
.claude/skills/requirements-interview/SKILL.md
New skill implementing adversarial requirements elicitation with design-tree traversal, depth-first branch walking, decision recording (CONFIRMED/OVERRIDDEN/DEFERRED/OUT_OF_SCOPE), codebase-first exploration to avoid redundant user questions, failure-mode prompts to surface unknown-unknowns, EARS-syntax acceptance criteria generation, and structured PRD handoff to spec-generator stage.
Skill validation tests
.claude/skills/requirements-interview/tests/test_skill_contract.py
New pytest suite enforcing contract for requirements-interview skill: file presence, 500-line max, frontmatter validation (name, description, model fields), required section headings, and mandatory concept references ("grill-me", "design tree", "recommended answer", codebase exploration via grep/explore).
Plugin metadata
.claude-plugin/marketplace.json
Increments project-toolkit plugin skill count from 65 to 66.
Session memory
.agents/memory/episodes/episode-2026-04-27-session-1761.json
New episode log recording failed session outcome, implementation decisions, validation events, metrics, and lessons from end-to-end /spec flow wiring.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Analyst as Analyst Agent<br/>(spec command)
    participant Interview as requirements-interview<br/>skill
    participant Codebase as Codebase
    participant SpecGen as spec-generator<br/>stage

    User->>Analyst: /spec [feature request]
    Analyst->>Interview: invoke (grill-me pattern)
    
    Interview->>Interview: restate & confirm problem
    Interview->>Interview: construct design tree
    
    loop for each branch (depth-first)
        Interview->>Interview: walk to leaf decision
        Interview->>Codebase: grep/explore for answer
        alt answer found in codebase
            Codebase-->>Interview: evidence
            Interview->>Interview: recommend answer from code
        else unknown answer
            Interview->>User: ask decision question
            User-->>Interview: provide answer
        end
        Interview->>Interview: record (CONFIRMED|OVERRIDDEN|DEFERRED|OUT_OF_SCOPE)
    end
    
    Interview->>Interview: finalize PRD (user stories,<br/>data model, integrations,<br/>failure modes, EARS criteria)
    Interview-->>SpecGen: structured PRD (unchanged)
    
    SpecGen->>SpecGen: compute complexity tier<br/>(driven by PRD)
    SpecGen->>SpecGen: conditional CVA analysis
    SpecGen->>SpecGen: formalize REQ/DESIGN/TASK<br/>artifacts
    SpecGen-->>Analyst: durable artifacts
    Analyst-->>User: spec output
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • rjmurillo
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title follows conventional commit format with type feat, scope skills, and clear description of the new skill and integration point.
Linked Issues check ✅ Passed PR implements all core objectives from #1798: adversarial requirements interview (grill-me pattern), design-tree traversal, recommended answers, codebase-first exploration, structured PRD output, and integration into /spec as step 2 before classification.
Out of Scope Changes check ✅ Passed All changes directly support #1798 requirements: new skill file, contract tests, spec.md integration, and session logs. No unrelated modifications present.
Description check ✅ Passed The PR description directly addresses the changeset: adds requirements-interview skill, wires it into /spec, updates spec.md workflow, adds tests, and includes supporting artifacts. All claims link to concrete file changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/1798-autonomous

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 2

🧹 Nitpick comments (2)
.claude/skills/requirements-interview/tests/test_skill_contract.py (2)

31-41: Avoid duplicating the model allowlist in tests.

VALID_MODELS is duplicated here and in the validation script path referenced by this file. This will drift. Pull the allowlist from one shared source used by both validators and tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/requirements-interview/tests/test_skill_contract.py around
lines 31 - 41, Replace the duplicated VALID_MODELS set in the test with a single
import from the shared canonical allowlist used by the validator: remove the
local VALID_MODELS declaration and import the shared constant (e.g.,
VALID_MODELS or MODEL_ALLOWLIST) from the validator module referenced by this
test, then use that imported symbol in assertions (wrap/convert to a set if the
shared value is a list). Ensure the test still treats the imported allowlist as
the same type (set/list) expected by the assertions so no other changes are
needed.

1-133: Move skill contract tests to the central CI test tree.

Keep skill directories execution-focused and relocate this test to .github/tests/skills/requirements-interview/ to match repo test layout conventions.

As per coding guidelines: "Tests should be moved from skill directories to .github/tests/skills/{skill-name}/. Keep skill directories focused on execution-time content only."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/requirements-interview/tests/test_skill_contract.py around
lines 1 - 133, Move the test module out of the skill directory into the central
CI tree and update the SKILL_PATH constant so the tests still locate the skill’s
SKILL.md from the new test location: relocate the file to
.github/tests/skills/requirements-interview/, then edit SKILL_PATH (the
module-level symbol) so it points at the original skill's SKILL.md (use a
correct relative Path from the new test file to the skill directory), leaving
all test functions (e.g. test_skill_file_exists,
test_frontmatter_required_fields, test_skill_within_size_limit) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/spec.md:
- Around line 16-24: The interview Skill(skill="requirements-interview") must
emit and persist the full structured PRD (including user stories, data model,
integrations, failure modes, security, and observability) and downstream steps
must consume that full schema instead of only acceptance criteria; update the
orchestration so Task(subagent_type="analyst") and subsequent steps (including
Skill(skill="cva-analysis") and the acceptance-criteria writer) accept the PRD
object as input, preserve its fields, and explicitly reference/pass the
interview schema when calling step 3 (complexity tier classification), step 4
(codebase search), step 5 (cva-analysis), and step 6 (write requirements) so no
interview decisions are dropped.

In @.claude/skills/requirements-interview/SKILL.md:
- Around line 1-13: The frontmatter for the skill (fields like name:
requirements-interview, model: claude-sonnet-4-6, user-invocable: true) is
missing an allowed-tools declaration; add an allowed-tools YAML array to the
top-level frontmatter that explicitly lists only the minimal tool IDs this skill
requires (or an empty list if it needs no tools) so the MCP tool scope is
constrained for this user-invocable skill, and keep the rest of the frontmatter
unchanged.

---

Nitpick comments:
In @.claude/skills/requirements-interview/tests/test_skill_contract.py:
- Around line 31-41: Replace the duplicated VALID_MODELS set in the test with a
single import from the shared canonical allowlist used by the validator: remove
the local VALID_MODELS declaration and import the shared constant (e.g.,
VALID_MODELS or MODEL_ALLOWLIST) from the validator module referenced by this
test, then use that imported symbol in assertions (wrap/convert to a set if the
shared value is a list). Ensure the test still treats the imported allowlist as
the same type (set/list) expected by the assertions so no other changes are
needed.
- Around line 1-133: Move the test module out of the skill directory into the
central CI tree and update the SKILL_PATH constant so the tests still locate the
skill’s SKILL.md from the new test location: relocate the file to
.github/tests/skills/requirements-interview/, then edit SKILL_PATH (the
module-level symbol) so it points at the original skill's SKILL.md (use a
correct relative Path from the new test file to the skill directory), leaving
all test functions (e.g. test_skill_file_exists,
test_frontmatter_required_fields, test_skill_within_size_limit) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 9f5b6c3d-6f40-4ee9-a61d-17de15618514

📥 Commits

Reviewing files that changed from the base of the PR and between 550634c and f6ae8c8.

⛔ Files ignored due to path filters (1)
  • .agents/sessions/2026-04-27-session-1760-1798-grill-me-requirements-interview.json is excluded by !.agents/sessions/**
📒 Files selected for processing (3)
  • .claude/commands/spec.md
  • .claude/skills/requirements-interview/SKILL.md
  • .claude/skills/requirements-interview/tests/test_skill_contract.py

Comment thread .claude/commands/spec.md Outdated
Comment thread .claude/skills/requirements-interview/SKILL.md
rjmurillo-bot and others added 5 commits April 28, 2026 07:19
Pre-commit SkillForge validator at .claude/skills/SkillForge/scripts/validate-skill.py
flagged three structural issues introduced by f6ae8c8:

  1. Frontmatter "triggers:" key not in ALLOWED_PROPERTIES (only allowed:
     agent, allowed-tools, context, description, hooks, license, metadata,
     model, name, user-invocable, version). Drop the redundant frontmatter
     field; the markdown "## Triggers" section already provides the
     trigger phrases the validator parses (validate_triggers regex).
  2. Section heading "## Method" must be "## Process" or "### Phase N"
     for validate_process to pass.
  3. Section heading "## Quality Gates" must be "## Verification" /
     "## Success Criteria" / "## Checklist" for validate_verification
     to pass; checkboxes added so verification.checkboxes warning clears.

Updated test_skill_contract.py REQUIRED_SECTIONS to match the new
section names. 10/10 contract tests still pass.

Validator now reports 18/19 checks pass (warning only: optional
"Extension Points" section). No behavior change.

Refs #1798

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… steps

Previously /spec step 6 narrowed output to "testable acceptance criteria",
dropping every other PRD section that step 2's requirements-interview skill
emitted: user stories, data model, integrations, failure modes, security,
observability, out-of-scope, deferred, open questions. Steps 3-5 also
treated input as a raw problem statement, not as the structured PRD.

Behavior preserved: no new file artifacts, no new agent invocations.
This commit only routes the existing PRD through downstream steps and
expands the Output section to mirror the PRD schema.

SKILL.md handoff section reworded to match current /spec behavior
(inline carry only). spec-generator integration follows in a separate
commit per the refactor-then-feature discipline in
.claude/rules/refactoring.md.

Refs #1798
Addresses CodeRabbit comment on .claude/commands/spec.md (PR #1812)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…facts

/spec step 6 now invokes Task(subagent_type=spec-generator) with the full
PRD from step 2 plus the complexity tier from step 3 and CVA summary from
step 5. The spec-generator agent (.claude/agents/spec-generator.md, model:
sonnet) writes:

  - .agents/specs/requirements/REQ-NNN-{slug}.md (EARS, one per requirement)
  - .agents/specs/design/DESIGN-NNN-{slug}.md
  - .agents/specs/tasks/TASK-NNN-{slug}.md

This realizes Issue #1798 Option C end-to-end: the requirements-interview
skill produces the structured PRD, /spec carries every section through
downstream steps (per the prior refactor commit), and the spec-generator
agent formalizes the result as durable on-disk artifacts. The full PRD is
passed in unchanged so the agent does not re-ask questions the interview
already answered.

requirements-interview SKILL.md handoff section updated to match the new
behavior: the skill produces the structured input the formalizer needs,
and the formalizer writes the REQ/DESIGN/TASK files.

Behavior change: /spec runs now produce on-disk artifacts in
.agents/specs/. Pre-existing inline-AC summary still surfaces via the
Output section. Callers depending on the "no files written" contract
should either invoke requirements-interview directly (skill is reusable)
or pin their flow before this commit.

Refs #1798

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…st memory

- Update session log with workLog entries for the review pass and the
  three follow-up commits (5924cff fix, f036ad5 refactor, d6284f8 feat).
- Set sessionEnd.serenaMemoryUpdated.Complete to true; evidence cites the
  new memory file. Closes the MUST violation flagged by the Session
  Protocol Validator on PR #1812.
- Set endingCommit to d6284f8.
- nextSteps now lists: push branch, reply to bot comments, open follow-up
  issues for analyst-agent integration and test-file refactor.
- Write .serena/memories/skills/skills-requirements-interview-pattern.md
  with end-to-end /spec flow, SkillForge validator gotchas, and
  deferred-work pointers. Future sessions get the full picture without
  replaying PR #1812. Naming follows ADR-017 (skills-{description},
  matching the skills/ subdirectory).
- Episode .agents/memory/episodes/episode-2026-04-27-session-1760.json
  auto-extracted by the pre-commit hook from the session log.

Refs #1798

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (62 → 63)

Adding the requirements-interview skill in f6ae8c8 should have bumped
the project-toolkit description from 62 to 63 reusable skills. Pre-push
test_validate_marketplace_counts.py::test_current_counts_are_valid caught
the drift on /ship.

Validator (build/scripts/validate_marketplace_counts.py) counts immediate
subdirectories of .claude/skills/ that contain a SKILL.md. After this PR's
addition, that count is 63.

Refs #1798

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

♻️ Duplicate comments (1)
.claude/skills/requirements-interview/SKILL.md (1)

1-8: ⚠️ Potential issue | 🟠 Major

Add allowed-tools to this user-invocable skill.

Line 7 sets user-invocable: true, but there is no tool scope in frontmatter. Keep tool access explicit and minimal.

Suggested patch
 ---
 name: requirements-interview
 version: 0.1.0
 description: Adversarial requirements interview that walks the design tree to elicit testable requirements before any code is written. Implements the grill-me pattern - ask relentlessly, recommend an answer for every question, and resolve dependencies between decisions one branch at a time. Skip any question the codebase can already answer.
 model: claude-sonnet-4-6
 license: MIT
 user-invocable: true
+allowed-tools: Read, Glob, Grep, Write
 ---

As per coding guidelines: "Declare allowed-tools in skill definition files to scope MCP tool visibility to the skill context."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/requirements-interview/SKILL.md around lines 1 - 8, The
frontmatter for the skill "requirements-interview" declares user-invocable: true
but omits the required allowed-tools key; add an explicit allowed-tools entry in
the frontmatter (e.g., allowed-tools: [] or list the minimal MCP tools needed)
directly below the existing fields so that the skill's metadata includes
allowed-tools alongside name, version, description, model and user-invocable;
ensure the key name is exactly allowed-tools and scope is as minimal as possible
for this skill.
🧹 Nitpick comments (1)
.claude/skills/requirements-interview/tests/test_skill_contract.py (1)

31-41: Stop duplicating the model allowlist in test code.

This set can drift from scripts/validation/skill_frontmatter.py. Use one shared source so contract tests and validator cannot disagree.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/requirements-interview/tests/test_skill_contract.py around
lines 31 - 41, The test duplicates the model allowlist via the local
VALID_MODELS constant; instead import and use the canonical allowlist from the
validator module so they can't drift apart. Replace the local VALID_MODELS in
test_skill_contract.py with an import of the authoritative symbol exported by
the skill frontmatter validator (e.g., the allowlist constant or accessor such
as ALLOWED_MODELS or get_allowed_models in the skill_frontmatter module) and
reference that imported name in the test assertions so the test and the
validator share one source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/spec.md:
- Around line 17-24: The CVA skip path is inconsistent: when Tier 1-2
single-use-case allows skipping CVA (step 3) you must still provide a stable
input to Task(subagent_type="spec-generator") in step 6 which currently expects
output from Skill(skill="cva-analysis"); update the flow to either (a) produce
an explicit empty/placeholder CVA object (e.g., cva_summary = {skipped: true,
reason: "Tier 1-2 single-use-case"}) whenever the skip branch is taken, or (b)
add a boolean flag (e.g., cva_required=false) alongside existing PRD payload so
Task(subagent_type="spec-generator") can handle missing CVA gracefully; update
the code paths that invoke Skill(skill="cva-analysis") and the call to the
spec-generator to pass this placeholder/flag and ensure
Task(subagent_type="spec-generator") checks for cva_summary before consuming it.

---

Duplicate comments:
In @.claude/skills/requirements-interview/SKILL.md:
- Around line 1-8: The frontmatter for the skill "requirements-interview"
declares user-invocable: true but omits the required allowed-tools key; add an
explicit allowed-tools entry in the frontmatter (e.g., allowed-tools: [] or list
the minimal MCP tools needed) directly below the existing fields so that the
skill's metadata includes allowed-tools alongside name, version, description,
model and user-invocable; ensure the key name is exactly allowed-tools and scope
is as minimal as possible for this skill.

---

Nitpick comments:
In @.claude/skills/requirements-interview/tests/test_skill_contract.py:
- Around line 31-41: The test duplicates the model allowlist via the local
VALID_MODELS constant; instead import and use the canonical allowlist from the
validator module so they can't drift apart. Replace the local VALID_MODELS in
test_skill_contract.py with an import of the authoritative symbol exported by
the skill frontmatter validator (e.g., the allowlist constant or accessor such
as ALLOWED_MODELS or get_allowed_models in the skill_frontmatter module) and
reference that imported name in the test assertions so the test and the
validator share one source of truth.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 882e0c19-14dc-4df5-94d6-85aa2ce1226e

📥 Commits

Reviewing files that changed from the base of the PR and between f6ae8c8 and b1a3e45.

⛔ Files ignored due to path filters (2)
  • .agents/sessions/2026-04-27-session-1760-1798-grill-me-requirements-interview.json is excluded by !.agents/sessions/**
  • .serena/memories/skills/skills-requirements-interview-pattern.md is excluded by !.serena/memories/**
📒 Files selected for processing (5)
  • .agents/memory/episodes/episode-2026-04-27-session-1760.json
  • .claude-plugin/marketplace.json
  • .claude/commands/spec.md
  • .claude/skills/requirements-interview/SKILL.md
  • .claude/skills/requirements-interview/tests/test_skill_contract.py
✅ Files skipped from review due to trivial changes (2)
  • .claude-plugin/marketplace.json
  • .agents/memory/episodes/episode-2026-04-27-session-1760.json

Comment thread .claude/commands/spec.md Outdated
rjmurillo and others added 3 commits April 28, 2026 08:01
…terview contract tests

Replace python-frontmatter with parse_frontmatter from
scripts/validation/skill_frontmatter.py and import VALID_MODEL_ALIASES,
DATED_SNAPSHOT_PATTERN, _NAME_PATTERN, and _XML_TAG_PATTERN from the same
module. Eliminates a third-party dependency in the test path, removes a
hardcoded VALID_MODELS set that was already drifting (missing
claude-3-7-sonnet-latest), and ensures the contract test cannot disagree
with production validation logic.

Refs #1798

Co-Authored-By: Claude <noreply@anthropic.com>
…st-privilege scope

Add allowed-tools (Read, Glob, Grep, Write) to the user-invocable skill
so MCP tool visibility is constrained to what the interview process
actually needs: reading source files and ADRs, globbing/grepping the
codebase before asking questions, and writing the interview transcript
to .agents/specs/interviews/. Matches the convention used by sibling
skills (analyze, SkillForge) and the parent /spec command.

Refs #1798

Co-Authored-By: Claude <noreply@anthropic.com>
Step 3 in /spec instructs Tier 1-2 single-use-case work to skip CVA, but
step 5 unconditionally invoked Skill(skill="cva-analysis") and step 6
expected its output. That left the spec-generator with a missing input on
the skip path. Reword step 5 so the CVA invocation is gated on tier (and
multi-use-case status) and otherwise sets an explicit
"CVA summary: N/A (single-use-case Tier 1-2)" placeholder. Step 6 carries
this placeholder forward so the spec-generator always receives a stable
PRD payload.

Refs #1798

Co-Authored-By: Claude <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

🧹 Nitpick comments (1)
.claude/skills/requirements-interview/tests/test_skill_contract.py (1)

1-13: Move this contract test out of the skill directory.

This file lives under .claude/skills/.../tests/, but repository rules keep skill directories execution-only and place CI tests under .github/tests/skills/{skill-name}/.

As per coding guidelines, **/.claude/skills/**/test?(s)/**: Tests should be moved from skill directories to .github/tests/skills/{skill-name}/.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/requirements-interview/tests/test_skill_contract.py around
lines 1 - 13, The test module test_skill_contract.py currently lives inside the
skill directory and must be relocated to the repository's CI test area for
skills; move the entire file out of the .claude/skills tree into the central CI
tests folder for skills, update any import or relative references if necessary,
and ensure CI/test discovery still finds it (rename paths or package references
so the module import name remains valid); after moving, remove the old file from
the skill directory and run the test suite to confirm no import or path breaks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/requirements-interview/tests/test_skill_contract.py:
- Around line 49-50: The fixture function skill_text currently reads SKILL_PATH
directly which can raise FileNotFoundError before the test that asserts
existence runs; update skill_text to explicitly check SKILL_PATH.exists() (or
similar) and assert the file exists with a clear message before calling
SKILL_PATH.read_text(encoding="utf-8"), so failures are deterministic and point
to the missing SKILL file.

---

Nitpick comments:
In @.claude/skills/requirements-interview/tests/test_skill_contract.py:
- Around line 1-13: The test module test_skill_contract.py currently lives
inside the skill directory and must be relocated to the repository's CI test
area for skills; move the entire file out of the .claude/skills tree into the
central CI tests folder for skills, update any import or relative references if
necessary, and ensure CI/test discovery still finds it (rename paths or package
references so the module import name remains valid); after moving, remove the
old file from the skill directory and run the test suite to confirm no import or
path breaks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: eab630af-b5b4-4493-b203-3ee4cac38563

📥 Commits

Reviewing files that changed from the base of the PR and between b1a3e45 and c51a9f8.

📒 Files selected for processing (3)
  • .claude/commands/spec.md
  • .claude/skills/requirements-interview/SKILL.md
  • .claude/skills/requirements-interview/tests/test_skill_contract.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .claude/skills/requirements-interview/SKILL.md

Comment thread .claude/skills/requirements-interview/tests/test_skill_contract.py
…ce check

Adds an assertion in the module-scoped skill_text fixture so any
fixture-using test surfaces a deterministic, actionable error message
when SKILL.md is missing, instead of the implicit FileNotFoundError
that surfaces with a less obvious traceback. Test ordering already
runs test_skill_file_exists first, so this is defensive only.

Refs #1798

Co-Authored-By: Claude <noreply@anthropic.com>
@coderabbitai coderabbitai Bot added the agent-qa Testing and verification agent label Apr 28, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 28, 2026
This was referenced May 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-analyst Research and investigation agent agent-architect Design and ADR agent agent-orchestrator Task coordination agent agent-qa Testing and verification agent area-skills Skills documentation and patterns area-workflows GitHub Actions workflows enhancement New feature or request needs-split PR has too many commits and should be split

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(agents): Add adversarial requirements interview to /spec command (grill-me pattern)

2 participants