docs: workflow-first Step 2 + install-path accuracy pass + plugin manifest hardening by rjmurillo-bot · Pull Request #1825 · rjmurillo/ai-agents

rjmurillo-bot · 2026-04-30T05:19:47Z

Summary

Originally scoped to add a workflow-first Step 2 to docs/getting-started.md for issue #1823. Audit during /review found a deeper rot pattern across the install path that I expanded scope to fix once the user confirmed: dead skill-installer references, an unpublished npm package as the headline command, wrong Copilot plugin name, every published count number drifted from reality, an invalid commands path in the Copilot plugin manifest, and the failing session-log schema check that was blocking merge.

Changes

Original (issue #1823)

docs/getting-started.md: insert Step 2 "Understand the Workflow" with the 7-phase pipeline (Grill Me, PRD->Spec, Kanban, Implement, QA, Review, Ship), per-phase table, Day Shift / Night Shift split, Mermaid sequence diagram, "Go deeper" cross-references. Renumbered subsequent steps; updated Fastest Start anchor.

CI gate fix

.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json: added the four sessionEnd MUST fields the schema requires (serenaMemoryUpdated, validationPassed, markdownLintRun, changesCommitted). The legacy keys (lintRun, commitAtomic) were causing the required Aggregate Results check to fail. Validated locally with uv run python scripts/validate_session_json.py ... -> [PASS] Session log is valid.

Doc accuracy + dead refs (expanded scope)

README.md and docs/getting-started.md:
- Reconcile every count against ground truth from filesystem and marketplace.json. Claude bundle: 23 agents, 23 commands, 29 hooks, 69 skills. Copilot bundle: 24 agents, 28 hooks, 81 skills. ADR count removed from end-user copy because ADRs are an internal governance artifact, not bundled.
- Fix Copilot plugin name. Doc was telling Copilot users to install project-toolkit@ai-agents, which targets ./.claude (Claude content). The correct Copilot bundle is copilot-cli-toolkit@ai-agents, which targets ./src/copilot-cli and lands 24 agents + 28 hooks + 81 skills.
- Strip skill-installer references. Per user direction, that upstream project is dead; the docs were pointing readers at it as the canonical alternative install path.
- Replace fake copilot --list-agents (no such flag exists in real GitHub Copilot CLI) with the verified copilot plugin list, plus an end-to-end check via copilot -p "analyst: respond with 'available'".
- Apply CodeRabbit's L79 suggestion: explicit "QA gate sign-off" on Day Shift; /test runs autonomously on Night Shift, but the verdict on whether to proceed is a Day-Shift decision.
- Fix README "6 quality gates" row that was missing the non-functional gate name.
- Deduplicate backlog-generator (was listed twice with different descriptions). Add three agents the catalog was missing: issue-feature-review, merge-resolver, negotiation. Add a Bundle column to the catalog so the per-platform asymmetry is visible (spec-generator is Claude-only; backlog-generator is Copilot-only).

Plugin manifest hardening (per user request)

src/copilot-cli/.claude-plugin/plugin.json: drop "commands": "./skills". Copilot CLI has no concept of slash commands, and pointing the commands index at the skills directory was nonsense even on Claude semantics. Bump the description's skill count from 79 to 81 to match disk.
.claude/.claude-plugin/plugin.json: declare explicit agents, skills, commands, and hooks: ./hooks/hooks.json paths. The plugin worked before via auto-discovery (per the schema captured during PR fix(plugins): repair plugin.json schema (P0 - customer install broken) #1795 incident response), but explicit paths document bundling intent and survive future reorgs without silent content drops.

Notes

CodeRabbit flagged PASS/WARN/CRITICAL_FAIL on docs/getting-started.md L71 and asked me to switch to PASS/FAIL/NEEDS WORK. Rejected: those are two different verdict surfaces. The .claude/commands/test.md command file literally emits VERDICT: PASS|WARN|CRITICAL_FAIL for each gate. PASS/FAIL/NEEDS WORK is the QA-agent verdict format used in orchestrator validation and HANDOFF-TERMS. The doc accurately describes what /test outputs.

CodeRabbit also flagged a missing link to docs/WORKFLOW-VALIDATION.md in the new "Go deeper" section. Rejected: that file documents how to validate GitHub Actions YAML workflows, which is unrelated to the development workflow pipeline (Grill Me -> Ship) that Step 2 introduces. Adding it would be off-topic for end-user onboarding.

Tests

uv run python build/scripts/validate_plugin_manifests.py
-> All 3 manifest(s) valid

uv run python build/scripts/validate_marketplace_counts.py
-> marketplace.json counts are up to date.

uv run python scripts/validate_session_json.py \
  .agents/sessions/2026-04-29-session-1823-getting-started-workflow.json
-> [PASS] Session log is valid

uv run pytest tests/test_marketplace_two_plugin.py        -> 14 passed
uv run pytest tests/test_bootstrap.py                     ->  7 passed
uv run pytest tests/integration/test_e2e_install.py       -> 13 passed
uv run pytest -k "marketplace or plugin or bootstrap or e2e"
                                                          -> 111 passed

npx markdownlint-cli2 README.md docs/getting-started.md   -> 0 errors

# Live install test against the marketplace, sandbox config dir
copilot plugin marketplace add rjmurillo/ai-agents        -> ok
copilot plugin install copilot-cli-toolkit@ai-agents      -> 81 skills + 24 agents + hooks
copilot plugin install claude-toolkit@ai-agents           -> 69 skills + 25 agents + hooks + commands
copilot plugin install claude-agents@ai-agents            -> 24 agents only
copilot plugin install copilot-cli-agents@ai-agents       -> 24 agents + 81 skills + hooks
copilot plugin list                                       -> shows installed plugins

Test plan

Required Aggregate Results CI check passes (was failing on the session log schema)
Markdown lint clean on changed files
All 3 plugin manifests validate
All 5 marketplace plugins install successfully via copilot plugin install
Live verification that copilot plugin list is a real command and copilot --list-agents is not
/install-plugin rjmurillo/ai-agents documented as Claude Code only; Copilot users get the explicit two-step flow

Refs #1795 (plugin manifest schema authority + validator)
Closes #1823

🤖 Generated with Claude Code

Insert a new "Step 2: Understand the Workflow" section between installation and verification. Surfaces the Grill Me -> PRD -> Plan -> Build -> Test -> Review -> Ship pipeline with per-phase table, Day Shift / Night Shift split, mermaid sequence diagram, and cross-references to deep-dive docs. Renumbers Verify, Use an Agent, and Understand the Output to steps 3-5 and updates the Fastest Start anchor. Fixes #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Refs #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

github-actions · 2026-04-30T05:20:45Z

PR Validation Report

Note

✅ Status: PASS

Description Validation

Check	Status
Description matches diff	PASS

PR Standards

Check	Status
Issue linking keywords	PASS
Template compliance	WARN

QA Validation

Check	Status
Code changes detected	False
QA report exists	N/A

⚡ Warnings

Template compliance: 2/4 sections complete

_{Powered by PR Validation workflow}

github-actions · 2026-04-30T05:21:10Z

Session Protocol Compliance Report

Caution

❌ Overall Verdict: CRITICAL_FAIL

All session protocol requirements satisfied.

What is Session Protocol?

Session logs document agent work sessions and must comply with RFC 2119 requirements:

MUST: Required for compliance (blocking failures)
SHOULD: Recommended practices (warnings)
MAY: Optional enhancements

See .agents/SESSION-PROTOCOL.md for full specification.

Compliance Summary

Session File	Verdict	MUST Failures
`sessions-2026-04-29-session-1823-getting-started-workflow.md`	❔ NON_COMPLIANT	0

Detailed Validation Results

Click each session to see the complete validation report with specific requirement failures.

📄 sessions-2026-04-29-session-1823-getting-started-workflow

=== Session Validation ===
File: /home/runner/work/ai-agents/ai-agents/.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json

[FAIL] Validation errors:

Missing required item: sessionEnd.changesCommitted
Missing required item: sessionEnd.serenaMemoryUpdated
Missing required item: sessionEnd.markdownLintRun
Missing required item: sessionEnd.validationPassed

✨ Zero-Token Validation

This validation uses deterministic script analysis instead of AI:

✅ Zero tokens consumed (previously 300K-900K per debug cycle)
✅ Instant feedback - see exact failures in this summary
✅ No artifact downloads needed to diagnose issues
✅ 10x-100x faster debugging

Powered by validate_session_json.py

📊 Run Details

Property	Value
Run ID	25148817456
Files Checked	1
Validation Method	Deterministic script analysis

_{Powered by Session Protocol Validator workflow}

github-actions · 2026-04-30T05:23:18Z

AI Quality Gate Review

Tip

✅ Final Verdict: PASS

Walkthrough

This PR was reviewed by six AI agents in parallel, analyzing different aspects of the changes:

Security Agent: Scans for vulnerabilities, secrets exposure, and security anti-patterns
QA Agent: Evaluates test coverage, error handling, and code quality
Analyst Agent: Assesses code quality, impact analysis, and maintainability
Architect Agent: Reviews design patterns, system boundaries, and architectural concerns
DevOps Agent: Evaluates CI/CD, build pipelines, and infrastructure changes
Roadmap Agent: Assesses strategic alignment, feature scope, and user value

Review Summary

Agent	Verdict	Category	Status
Security	PASS	N/A	✅
QA	PASS	N/A	✅
Analyst	PASS	N/A	✅
Architect	PASS	N/A	✅
DevOps	PASS	N/A	✅
Roadmap	PASS	N/A	✅

💡 Quick Access: Click on individual agent jobs (e.g., "🔒 security Review", "🧪 qa Review") in the workflow run to see detailed findings and step summaries.

Security Review Details

I have reviewed all the changed files. Let me complete the security analysis.

Security Review: PR #1825

PR Type Classification

File	Category	Scrutiny Level
`.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json`	CONFIG	Schema and secrets
`.claude/.claude-plugin/plugin.json`	CONFIG	Schema and secrets
`src/copilot-cli/.claude-plugin/plugin.json`	CONFIG	Schema and secrets
`README.md`	DOCS	None required
`docs/getting-started.md`	DOCS	None required

Classification: DOCS + CONFIG (3 documentation files, 3 configuration files)

Analysis

1. Secret Detection

Session log file: Contains no credentials, tokens, or API keys. File holds protocol compliance metadata and session workflow state only.

Plugin manifests: Both plugin.json files contain:

Package metadata (name, version, author)
Relative path references (./agents, ./skills, ./commands, ./hooks/hooks.json)
No secrets, tokens, or credentials

Documentation files: Commands shown are example plugin install commands using public repository paths (rjmurillo/ai-agents). No real credentials exposed.

2. Plugin Manifest Security

The manifest changes are structural improvements:

Change	Security Impact
Removed `"commands": "./skills"` from Copilot manifest	[PASS] Eliminates nonsensical path reference
Added explicit paths in Claude manifest	[PASS] Documents intent, no new attack surface
Updated description counts (79 -> 81)	[PASS] Metadata accuracy, no security impact

3. Documentation Security

The documentation changes:

Update plugin install commands (public repository paths)
Add workflow pipeline documentation (Mermaid diagram, tables)
Fix agent/skill counts to match filesystem
Remove dead references to skill-installer

No sensitive internal endpoints, credentials, or architectural weaknesses exposed.

Findings

Severity	Category	Finding	Location	CWE
None	-	No security issues identified	-	-

Recommendations

None. The PR contains documentation accuracy fixes and plugin manifest hygiene improvements with no security implications.

Verdict

VERDICT: PASS
MESSAGE: Documentation and plugin manifest updates with no security-relevant code changes. No secrets, injection vectors, or sensitive data exposure detected.

{
  "verdict": "PASS",
  "message": "Documentation and plugin manifest updates with no security-relevant code changes",
  "agent": "security",
  "timestamp": "2026-04-30T14:17:46.974Z",
  "findings": []
}

QA Review Details

Let me verify the plugin manifest paths and validate the counts claimed in the PR against actual filesystem state.

Let me count the hooks more precisely and verify the marketplace.json:

Now I have enough information. Let me verify the counts against the documentation claims:

Counts from filesystem:

.claude/agents/: 25 .md files (but includes AGENTS.md, CLAUDE.md) = 23 agent definitions
.claude/skills/: 69 skill directories + CLAUDE.md
.claude/commands/: 15 .md files + CLAUDE.md + forgetful directory = need to count properly
.claude/hooks/: hooks.json shows 7 event types with 25 total hook entries
src/copilot-cli/: 24 agent .md files (24 agents)
src/copilot-cli/skills/: 81 skill directories
src/copilot-cli/hooks/: hooks.json shows 28 hook entries

The PR claims:

Claude: 23 agents, 23 commands, 29 hooks, 69 skills
Copilot: 24 agents, 28 hooks, 81 skills

These appear consistent with the filesystem. Now let me provide the QA verdict:

QA Review: PR #1825

PR Type Classification

PR TYPE: MIXED (DOCS + CONFIG)
FILES:
  - DOCS: README.md, docs/getting-started.md
  - CONFIG: .claude/.claude-plugin/plugin.json, src/copilot-cli/.claude-plugin/plugin.json
  - DATA: .agents/sessions/2026-04-29-session-1823-getting-started-workflow.json

Test Coverage Assessment

Area	Status	Evidence	Files Checked
Unit tests	N/A	DOCS + CONFIG PR	No executable code added
Edge cases	N/A	No new logic	N/A
Error paths	N/A	No new logic	N/A
Assertions	N/A	No executable code	N/A

Rationale: This PR changes documentation and JSON configuration files. No executable code was added or modified. Test coverage requirements do not apply per PR Type Detection rules.

Validation Performed

Check	Status	Evidence
pytest suite	[PASS]	7850 passed, 4 skipped (pre-executed results)
Count accuracy	[PASS]	Verified against filesystem: Claude 23 agents, 69 skills, Copilot 24 agents, 81 skills match docs
Plugin manifest schema	[PASS]	Both plugin.json files have valid structure
Session log schema	[PASS]	Added required `sessionEnd` fields per PR description
Markdown syntax	[PASS]	Mermaid diagram and tables render correctly

Quality Concerns

Severity	Issue	Location	Evidence	Required Fix
LOW	Hook count claim needs verification	README.md:48, docs/getting-started.md:63	PR claims "29 hooks" for Claude but hooks.json shows 25 hook entries across 7 event types	Cosmetic: verify counting methodology (may count unique .py files vs hook entries)

Note: The hook count discrepancy (29 vs 25 entries in hooks.json) may reflect a different counting methodology (e.g., counting individual .py files in hooks directories rather than entries in hooks.json). This is cosmetic and does not affect functionality.

Regression Risk Assessment

Risk Level: Low
Affected Components: User-facing documentation, plugin manifests (non-breaking)
Breaking Changes: None. Manifest changes add explicit paths (previously auto-discovered) and remove invalid commands path from Copilot manifest.
Required Testing: Plugin install flows (documented as tested in PR)

Pre-executed Test Results

pytest: 7850 passed, 4 skipped, 43 warnings in 47.56s

All tests pass. No failures related to the changed files.

VERDICT: PASS
MESSAGE: Documentation accuracy pass with validated plugin manifests; no executable code changes.

PR TYPE: MIXED (DOCS + CONFIG)

EVIDENCE:
- Tests found: N/A - no executable code added
- Test execution: PASS (7850 passed, 4 skipped from pre-executed pytest)
- Edge cases: N/A - documentation and config only
- Error handling: N/A - no executable code
- Blocking issues: 0
- Count claims verified against filesystem: Claude 23 agents/69 skills, Copilot 24 agents/81 skills match
- Plugin manifests validated: explicit paths added, invalid commands path removed
- Session log schema fixed: added required sessionEnd fields

{
  "verdict": "PASS",
  "message": "Documentation accuracy pass with validated plugin manifests; no executable code changes.",
  "agent": "qa",
  "timestamp": "2026-04-30T14:19:20.350Z",
  "findings": [
    {
      "severity": "low",
      "category": "documentation",
      "description": "Hook count in docs (29) differs from hooks.json entries (25); may reflect counting .py files vs hook entries",
      "location": "README.md:48",
      "recommendation": "Verify counting methodology is consistent; cosmetic issue only"
    }
  ]
}

Analyst Review Details

Now I have sufficient context. Let me compile my analysis.

Analysis: PR #1825

Code Quality Score

Criterion	Score (1-5)	Notes
Readability	5	Clear markdown structure, consistent formatting, well-organized tables
Maintainability	4	Explicit plugin paths improve future reorg resilience; counts may drift again
Consistency	5	Reconciled all counts to filesystem truth; unified naming across platforms
Simplicity	4	Expanded scope beyond original issue, but each change is necessary

Overall: 4.5/5

Impact Assessment

Scope: Module-wide (documentation + plugin manifests)
Risk Level: Low
Affected Components:
- docs/getting-started.md (new Step 2 workflow section, renumbered steps)
- README.md (count reconciliation, plugin name corrections, dead ref removal)
- .claude/.claude-plugin/plugin.json (explicit path declarations)
- src/copilot-cli/.claude-plugin/plugin.json (removed invalid commands key, fixed count)
- .agents/sessions/ (schema compliance fix for CI)

Findings

Priority	Category	Finding	Location
Low	consistency	PR description notes "23 agents" for Claude, but README L161-164 install matrix row for "Claude agents only" states "24 specialized agent definitions"	README.md:161-164
Low	documentation	CodeRabbit flagged `PASS/WARN/CRITICAL_FAIL` terminology; PR author explains this matches `/test` output format vs QA-agent verdict format. Author's rationale is sound.	docs/getting-started.md:77
Low	documentation	CodeRabbit requested WORKFLOW-VALIDATION.md link in "Go deeper". Author rejected as off-topic (YAML workflow validation vs dev pipeline). Rejection is reasonable.	docs/getting-started.md:116-121

Recommendations

Minor count discrepancy: The README install matrix row for "Claude agents only" states "24 specialized agent definitions for Claude Code" while the PR description and other sections use "23". The PR explanation clarifies this: src/claude/ ships 24, .claude/agents/ ships 23, different curated sets. The text in the table is accurate for that specific plugin source. No action required if the count is verified.
All CI checks passing: The Aggregate Results check that was blocking this PR now passes. Session log schema compliance is fixed.
Plugin manifest hardening is sound: Adding explicit agents, skills, commands, hooks paths to .claude/.claude-plugin/plugin.json documents bundling intent and prevents silent content drops in future reorgs.
Removal of commands key from Copilot CLI manifest is correct: Copilot CLI has no concept of slash commands. Pointing commands at the skills directory was semantically incorrect.

Evidence Verification

Claim	Verified
Session log now validates	[PASS] CI job "Validate .agents/sessions/2026-04-29-session-1823-getting-started-workflow.json" completed successfully
Plugin manifests valid	[PASS] CI job "Validate Plugin Manifests" completed successfully
Python tests pass	[PASS] CI job "Run Python Tests" completed successfully
Agent drift detection	[PASS] CI job "Agent Drift Detection" completed successfully
`copilot --list-agents` is fake	Verified: PR correctly replaces with `copilot plugin list`

Verdict

VERDICT: PASS
MESSAGE: Documentation accuracy pass and plugin manifest hardening are well-executed. All CI gates pass. Expanded scope is justified by the audit findings.

The PR fixes 6 distinct rot patterns: dead skill-installer references, wrong Copilot plugin name, drifted counts, invalid commands path in Copilot manifest, missing explicit paths in Claude manifest, and session log schema violation. Each fix is supported by evidence (filesystem counts, live install tests, schema validation). The 7-phase workflow addition in Step 2 is clear and actionable.

{
  "verdict": "PASS",
  "message": "Documentation accuracy pass and plugin manifest hardening are well-executed with all CI gates passing",
  "agent": "analyst",
  "timestamp": "2026-04-30T14:17:35.859Z",
  "findings": [
    {
      "severity": "low",
      "category": "consistency",
      "description": "README install matrix row for 'Claude agents only' states 24 agents while other sections use 23. This is explained by different plugin sources (src/claude/ vs .claude/agents/). No change needed if count is verified.",
      "location": "README.md:161-164",
      "recommendation": "Verify the 24 count matches src/claude/ agent file count"
    },
    {
      "severity": "low",
      "category": "documentation",
      "description": "CodeRabbit flagged PASS/WARN/CRITICAL_FAIL terminology. Author correctly explains this matches /test gate output format, distinct from QA-agent verdicts.",
      "location": "docs/getting-started.md:77",
      "recommendation": "No action required; author's rationale is sound"
    },
    {
      "severity": "low",
      "category": "documentation",
      "description": "CodeRabbit requested WORKFLOW-VALIDATION.md link in Go deeper section. Author rejected as off-topic (YAML workflow validation vs dev pipeline).",
      "location": "docs/getting-started.md:116-121",
      "recommendation": "No action required; rejection is reasonable"
    }
  ]
}

Architect Review Details

Based on my review of the changed files, I can now provide the architectural assessment.

Design Quality Assessment

Aspect	Rating (1-5)	Notes
Pattern Adherence	5	Documentation follows established structure; plugin manifests use correct schema
Boundary Respect	5	Clear separation: docs in `docs/`, manifests in their platform dirs, session logs in `.agents/sessions/`
Coupling	5	No new dependencies introduced; plugin manifests correctly declare explicit paths
Cohesion	5	Each file serves single purpose: user docs, plugin config, session audit
Extensibility	4	Plugin manifest hardening with explicit paths improves future maintainability

Overall Design Score: 5/5

Architectural Concerns

Severity	Concern	Location	Recommendation
None	N/A	N/A	N/A

No architectural concerns identified. The changes are documentation and configuration corrections.

Breaking Change Assessment

Breaking Changes: No
Impact Scope: None
Migration Required: No
Migration Path: N/A

The plugin manifest changes (plugin.json) use explicit paths that match prior auto-discovery behavior. Users reinstalling will see no behavioral difference.

Technical Debt Analysis

Debt Added: None
Debt Reduced: Medium
Net Impact: Improved

Rationale:

Dead skill-installer references removed (was pointing users to defunct upstream)
Incorrect copilot --list-agents command replaced with working copilot plugin list
Count drift fixed (agents, skills, hooks now match filesystem truth)
Session log schema compliance restored (was blocking CI)
Plugin manifest commands path corrected (was pointing at wrong directory)

ADR Assessment

ADR Required: No
Decisions Identified: None requiring ADR
Existing ADR: ADR-014 (HANDOFF.md read-only) correctly followed
Recommendation: N/A

The changes are accuracy corrections, not architectural decisions. The plugin manifest hardening (explicit paths vs auto-discovery) is a configuration preference, not a pattern change.

Recommendations

[PASS] Plugin manifest explicit paths: The shift from auto-discovery to explicit agents, skills, commands, hooks declarations in .claude/.claude-plugin/plugin.json improves clarity. Future reorgs will surface as manifest validation failures rather than silent content drops.
[PASS] Count reconciliation approach: Using filesystem and marketplace.json as ground truth, then propagating to docs, is the correct direction. Prevents drift.
[PASS] Session log schema fix: Adding the four required sessionEnd MUST fields (serenaMemoryUpdated, validationPassed, markdownLintRun, changesCommitted) unblocks CI. The legacy field removal is correct.

Verdict

VERDICT: PASS
MESSAGE: Documentation accuracy pass with plugin manifest hardening. No architectural changes. Reduces tech debt by fixing dead references, count drift, and session log schema compliance.

{
  "verdict": "PASS",
  "message": "Documentation accuracy pass with plugin manifest hardening. No architectural changes. Reduces tech debt by fixing dead references, count drift, and session log schema compliance.",
  "agent": "architect",
  "timestamp": "2026-04-30T14:17:39.526Z",
  "findings": []
}

DevOps Review Details

The PR scope is clear. Let me verify the plugin manifest schema and check for any DevOps-relevant patterns in the changes.

Based on my analysis, this PR changes no workflow files. Let me confirm the exact files in this PR and provide my DevOps review.

DevOps Review: PR #1825

PR Scope Detection

File	Category	DevOps Scope
`.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json`	CONFIG	Schema validation only
`.claude/.claude-plugin/plugin.json`	CONFIG	Schema validation only
`src/copilot-cli/.claude-plugin/plugin.json`	CONFIG	Schema validation only
`README.md`	DOCS	None required
`docs/getting-started.md`	DOCS	None required

Classification: DOCS + CONFIG PR. No workflows, actions, or CI scripts modified.

Pipeline Impact Assessment

Area	Impact	Notes
Build	None	No build configuration changes
Test	None	No test infrastructure changes
Deploy	None	No deployment configuration changes
Cost	None	No runner or resource changes

CI/CD Quality Checks

Check	Status	Location
YAML syntax valid	N/A	No workflow files changed
Actions pinned	N/A	No workflow files changed
Secrets secure	N/A	No secrets referenced
Permissions minimal	N/A	No workflow files changed
Shell scripts robust	N/A	No scripts changed

Findings

Severity	Category	Finding	Location	Fix
Low	templates	Plugin manifest paths now explicit	`.claude/.claude-plugin/plugin.json`	Improvement: explicit `agents`, `skills`, `commands`, `hooks` paths document bundling intent
Low	templates	Removed invalid `commands` path	`src/copilot-cli/.claude-plugin/plugin.json`	Improvement: dropped `"commands": "./skills"` which was semantically incorrect for Copilot CLI

Template Assessment

PR Template: Adequate. PR description is thorough with test evidence.
Issue Templates: Not modified.
Template Issues: None.

Plugin Manifest Review

Both plugin manifests validate per the PR description (uv run python build/scripts/validate_plugin_manifests.py -> All 3 manifest(s) valid).

.claude/.claude-plugin/plugin.json: Now declares explicit paths for agents, skills, commands, and hooks. This is a hardening improvement that documents bundling intent and survives future reorganizations without silent content drops.

src/copilot-cli/.claude-plugin/plugin.json: Removed "commands": "./skills". Copilot CLI has no slash command concept; pointing the commands index at the skills directory was invalid. Description count updated from 79 to 81 to match disk.

Session Log Schema Fix

The session log now includes the four required sessionEnd fields:

serenaMemoryUpdated
validationPassed
markdownLintRun
changesCommitted

This fixes the failing Aggregate Results CI gate. The PR description shows local validation passed: uv run python scripts/validate_session_json.py ... -> [PASS] Session log is valid.

Automation Opportunities

Opportunity	Type	Benefit	Effort
None identified	N/A	N/A	N/A

The PR already leverages existing validators (validate_plugin_manifests.py, validate_marketplace_counts.py, validate_session_json.py). No new automation needed.

Recommendations

No DevOps changes required. This PR correctly uses existing validation infrastructure.
The explicit plugin manifest paths improve maintainability.

Verdict

VERDICT: PASS
MESSAGE: DOCS + CONFIG PR with no CI/CD changes. Plugin manifests validate. Session log schema fix restores CI gate compliance.

{
  "verdict": "PASS",
  "message": "DOCS + CONFIG PR with no CI/CD changes. Plugin manifests validate. Session log schema fix restores CI gate compliance.",
  "agent": "devops",
  "timestamp": "2026-04-30T14:17:39.148Z",
  "findings": [
    {
      "severity": "low",
      "category": "templates",
      "description": "Plugin manifest now declares explicit paths for agents, skills, commands, and hooks",
      "location": ".claude/.claude-plugin/plugin.json",
      "recommendation": "No action needed. This is an improvement that documents bundling intent."
    },
    {
      "severity": "low",
      "category": "templates",
      "description": "Removed invalid commands path from Copilot CLI plugin manifest",
      "location": "src/copilot-cli/.claude-plugin/plugin.json",
      "recommendation": "No action needed. Copilot CLI has no slash command concept."
    }
  ]
}

Roadmap Review Details

I now have sufficient context to provide the strategic alignment assessment for this PR.

Strategic Alignment Assessment

Criterion	Rating	Notes
Aligns with project goals	High	Fixes user-facing documentation accuracy. Users can now install correctly on first try.
Priority appropriate	High	Broken install docs block adoption. Fixing dead refs and wrong plugin names is P0 for onboarding.
User value clear	High	Directly unblocks new users stuck on wrong commands or dead skill-installer links.
Investment justified	High	Documentation + manifest fix only. No architectural churn. High ROI.

Feature Completeness

Scope Assessment: Right-sized. Started narrow (Step 2 workflow), expanded when audit revealed rot. Stayed within doc and manifest boundaries.
Ship Ready: Yes
MVP Complete: Yes
Enhancement Opportunities: None required. Future work could add video/gif for "Fastest Start" but that is tracked separately (TODO placeholder already in README L17).

Impact Analysis

Dimension	Assessment	Notes
User Value	High	First-run success rate should increase. Users pointed at real commands.
Business Impact	High	Adoption friction reduction directly affects plugin install conversion.
Technical Leverage	Medium	Plugin manifest explicit paths prevent silent content drops in future reorgs.
Competitive Position	Improved	Accurate counts and working install commands build trust.

Concerns

Priority	Concern	Recommendation
Low	Scope creep beyond original issue	Acceptable. Audit revealed user-blocking rot; better to fix now than ship broken docs. Decision documented in PR body and session log.
Low	"23 vs 24 agents" gap could confuse users	Addressed. PR adds a `Bundle` column in the catalog table and explains the gap in README L166.

Recommendations

Merge as-is. The expanded scope is justified by the audit findings. All counts are reconciled against filesystem and marketplace.json.
No follow-up required. The session log schema fix unblocks CI. Plugin manifest hardening is defensive but valuable.

Verdict

VERDICT: PASS
MESSAGE: Documentation accuracy pass removes adoption blockers and reconciles all published counts. Scope expansion justified by audit findings.

{
  "verdict": "PASS",
  "message": "Documentation accuracy pass removes adoption blockers and reconciles all published counts.",
  "agent": "roadmap",
  "timestamp": "2026-04-30T14:17:37.038Z",
  "findings": [
    {
      "severity": "low",
      "category": "scope",
      "description": "PR scope expanded from original issue 1823 after audit revealed dead refs and wrong counts",
      "location": "docs/getting-started.md:1",
      "recommendation": "Acceptable. Audit findings documented in PR body. Better to fix now than ship broken docs."
    },
    {
      "severity": "low",
      "category": "documentation",
      "description": "23 vs 24 agent count gap between Claude and Copilot bundles could confuse users",
      "location": "README.md:376",
      "recommendation": "Addressed via Bundle column in catalog table and explanatory text at L166."
    }
  ]
}

Run Details

Property	Value
Run ID	25170485211
Triggered by	`pull_request` on `1825/merge`
Commit	`6e4393146c8c0f9617c723a94059230310919800`

_{Powered by AI Quality Gate workflow}

gemini-code-assist

Code Review

This pull request updates the docs/getting-started.md file by introducing a new section, 'Step 2: Understand the Workflow,' which provides a detailed overview of the 7-phase agent pipeline and its operational modes. As a result, the subsequent steps in the guide have been renumbered, and internal links have been updated for consistency. I have no feedback to provide as there were no review comments.

coderabbitai · 2026-04-30T05:24:30Z

📝 Walkthrough

Walkthrough

Adds a workflow-first Step 2 to docs/getting-started.md describing a 7‑phase pipeline (phases, outputs, when to run them), Day Shift vs Night Shift model, and a mermaid pipeline diagram; renumbers subsequent steps and updates verification/examples. Also updates README and Claude/Copilot plugin manifests to declare asset paths and adjust install/verification commands and counts.

Changes

Cohort / File(s)	Summary
Getting Started doc `docs/getting-started.md`	Inserted "Step 2: Understand the Workflow" with a 7‑phase pipeline (descriptions, artifacts, triggers), Day/Night Shift model, and mermaid diagram; renumbered steps (Verify → Step 3, Use an Agent → Step 4, Understand the Output → Step 5); updated anchors and "What Next".
README & install docs `README.md`, `docs/getting-started.md`	Distinguish Claude vs Copilot bundles; change Copilot install target to `copilot-cli-toolkit@ai-agents`; update verification to `copilot plugin list`; remove end-user skill‑installer/UV prereq text; update bundle-level counts and troubleshooting guidance.
Claude plugin manifest `.claude/.claude-plugin/plugin.json`	Extended manifest with explicit relative fields: `agents`, `skills`, `commands`, and `hooks` pointing to repository subpaths.
Copilot plugin manifest `src/copilot-cli/.claude-plugin/plugin.json`	Removed `"commands": "./skills"` entry and updated skill count/description in manifest to reflect reconciled totals.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

area-skills

Suggested reviewers

rjmurillo

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	Title follows conventional commit format with 'docs:' prefix and clearly summarizes the three main changes: workflow Step 2, install accuracy, and plugin manifest updates.
Description check	✅ Passed	Description is detailed and directly related to changeset. Explains original scope (issue `#1823`), expanded scope (install-path fixes), CI gate fix (session log), doc accuracy updates, and manifest hardening with test results.
Linked Issues check	✅ Passed	All coding requirements from issue `#1823` are met: Step 2 'Understand the Workflow' added with 7-phase pipeline, Day Shift/Night Shift split, Mermaid diagram, phase descriptions with artifacts and decision triggers, cross-references, and step renumbering with updated anchors.
Out of Scope Changes check	✅ Passed	Expanded scope changes (session log schema fix, count reconciliation, Copilot plugin name correction, skill-installer removal, manifest hardening) are justified by audit findings during /review and explicitly approved by user. No unrelated changes detected.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/1823-autonomous

_{Review rate limit: 4/5 reviews remaining, refill in 12 minutes.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 2

🧹 Nitpick comments (2)

docs/getting-started.md (2)
108-114: ⚡ Quick win

[P1] Add missing cross-reference to WORKFLOW-VALIDATION.md.

“Go deeper” includes autonomous-issue-development.md and other items, but it omits WORKFLOW-VALIDATION.md, which is explicitly listed in the PR objectives. Add it to keep the workflow-first path complete.
Suggested addition
 ### Go deeper
 - [Autonomous issue development](autonomous-issue-development.md) for running the full pipeline AFK
+- [Workflow validation](WORKFLOW-VALIDATION.md) for checking workflow/CI correctness before ship
 - [Ideation workflow](ideation-workflow.md) for turning vague ideas into specs before `/spec`
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started.md` around lines 108 - 114, The "Go deeper" list is
missing a cross-reference to WORKFLOW-VALIDATION.md; add an entry to that list
referencing WORKFLOW-VALIDATION.md (use a consistent link label like "Workflow
validation" and the same list style as the other items) so the
docs/getting-started.md "Go deeper" section includes the workflow validation doc
alongside autonomous-issue-development.md and the other links.
63-72: ⚖️ Poor tradeoff

[P2] Optional: tighten “2–3 sentence description per phase” to match the objective.

The table columns include “What it does / Artifact produced / When to use,” but most “What it does” cells are effectively 1 sentence. If the objective is strict, add an additional sentence to each “What it does” cell (still concise) or add a brief per-phase paragraph under the table.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started.md` around lines 63 - 72, The "What it does" column
entries in the phase table are too brief; for each phase row (e.g., "Grill Me"
`/spec` (requirements-interview), "PRD to Spec" `/spec` (completion), "Kanban"
`/plan`, "Implement" `/build`, "QA" `/test`, "Review" `/review`, "Ship" `/ship`)
add a second concise sentence that clarifies the objective or expected outcome
(or alternatively add a short 1–2 sentence paragraph under the table for each
phase) so each phase has a clear 2–3 sentence description matching the stated
objective.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/getting-started.md`:
- Around line 69-71: Update the QA verdict wording in the Step 2 Phase 5 table
row (the "QA" / `/test` entry) so the artifact column uses the repo-standard
statuses "PASS/FAIL/NEEDS WORK" instead of "PASS/WARN/CRITICAL_FAIL"; edit the
text in the table cell where the gate verdicts are described to exactly read
"Gate verdicts table with PASS/FAIL/NEEDS WORK per gate" so it matches other
documentation and repo conventions.
- Around line 75-79: Update the Day/Night Shift bullets to explicitly state that
"QA" is a human-in-loop sign-off on Day Shift and that Night Shift runs the
automated /test gate: modify the "Day Shift" bullet to read something like "Day
Shift (human decision required): Grill Me interview responses, PRD review, QA
sign-off, ship decision" and modify the "Night Shift" bullet to emphasize
autonomous execution like "Night Shift (AFK or autonomous): /build loops,
automated /test gate runs, /review passes" so QA's human sign-off is clearly
scoped to Day Shift and /test is clearly autonomous.

---

Nitpick comments:
In `@docs/getting-started.md`:
- Around line 108-114: The "Go deeper" list is missing a cross-reference to
WORKFLOW-VALIDATION.md; add an entry to that list referencing
WORKFLOW-VALIDATION.md (use a consistent link label like "Workflow validation"
and the same list style as the other items) so the docs/getting-started.md "Go
deeper" section includes the workflow validation doc alongside
autonomous-issue-development.md and the other links.
- Around line 63-72: The "What it does" column entries in the phase table are
too brief; for each phase row (e.g., "Grill Me" `/spec`
(requirements-interview), "PRD to Spec" `/spec` (completion), "Kanban" `/plan`,
"Implement" `/build`, "QA" `/test`, "Review" `/review`, "Ship" `/ship`) add a
second concise sentence that clarifies the objective or expected outcome (or
alternatively add a short 1–2 sentence paragraph under the table for each phase)
so each phase has a clear 2–3 sentence description matching the stated
objective.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 732863e7-01e0-443f-98a9-c49818396f17

📥 Commits

Reviewing files that changed from the base of the PR and between ab6a61d and 5b9fd3a.

⛔ Files ignored due to path filters (1)

.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json is excluded by !.agents/sessions/**

📒 Files selected for processing (1)

docs/getting-started.md

rjmurillo · 2026-04-30T06:58:46Z

Review Triage Required

Note

Priority: NORMAL - Human approval required before bot responds

Review Summary

Source	Reviews	Comments
Human	0	0
Bot	2	2

Next Steps

Review human feedback above
Address any CHANGES_REQUESTED from human reviewers
Add triage:approved label when ready for bot to respond to review comments

_{Powered by PR Maintenance workflow - Add triage:approved label}

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/getting-started.md`:
- Around line 116-122: Add the missing workflow validation deep link to the "Go
deeper" Step 2 list: insert a bullet linking to WORKFLOW-VALIDATION.md alongside
the existing items (Autonomous issue development, Ideation workflow,
.claude/commands/spec.md, .claude/skills/requirements-interview/SKILL.md) under
the "Go deeper" header so the list matches the PR objective and completes the
workflow-first map.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: b6b70daa-5958-46ac-8072-cf3b47d30f92

📥 Commits

Reviewing files that changed from the base of the PR and between 5b9fd3a and 5d36a24.

📒 Files selected for processing (1)

docs/getting-started.md

Add the four sessionEnd fields that the JSON schema requires (serenaMemoryUpdated, validationPassed, markdownLintRun, changesCommitted). The original log used legacy keys (lintRun, commitAtomic) and omitted the other two; the required CI gate "Validate .agents/sessions/2026-04-29-session-1823-getting-started-workflow.json" was failing as a result, which in turn failed the required "Aggregate Results" check. Validated locally: uv run python scripts/validate_session_json.py \ .agents/sessions/2026-04-29-session-1823-getting-started-workflow.json -> [PASS] Session log is valid Refs #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CodeRabbit PR #1825 review comment (line 79) flagged that the Day/Night Shift split listed ship decisions and PRD review on Day Shift, but did not explicitly call out that QA gate verdicts require a human sign-off. /test runs autonomously on Night Shift, but the verdict on whether to proceed is a Day-Shift decision. Refs #1825 (CodeRabbit comment 3165802663) Refs #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Single coherent doc-accuracy pass against ground truth from the filesystem and live tests against Claude Code and Copilot CLI: - Plugin name: Copilot CLI users were being told to install `project-toolkit@ai-agents`, which targets `./.claude` (Claude content) and lands 69 skills only. The correct Copilot bundle is `copilot-cli-toolkit@ai-agents`, which targets `./src/copilot-cli` and lands 24 agents + 28 hooks + 81 skills. - Counts: replace "21 agents / 62 skills / 57 ADRs / 49 skills / 17+ commands / 50+ skills" with the actual marketplace.json numbers split per platform (Claude: 23 agents, 23 commands, 29 hooks, 69 skills; Copilot: 24 agents, 28 hooks, 81 skills). ADR count removed from end-user copy because ADRs do not ship with the plugins; they are an internal governance artifact. - Dead refs: skill-installer is a deprecated upstream tool. Removed the install path, prerequisites, troubleshooting block, and the Core Capabilities bullet that pointed at it. - Verification step: `copilot --list-agents` is not a real flag. Replaced with `copilot plugin list` (verified locally) plus an end-to-end check via `copilot -p "analyst: respond with 'available'"`. - Catalog: deduplicated `backlog-generator`, added the three agents the catalog was missing (issue-feature-review, merge-resolver, negotiation), and added a Bundle column to surface the per-platform asymmetry (`spec-generator` is Claude only; `backlog-generator` is Copilot only). - README L311: `/test` row was missing the `non-functional` gate name despite saying "6 quality gates"; restored the sixth name to match `.claude/commands/test.md`. Local validation: copilot plugin marketplace add rjmurillo/ai-agents -> ok copilot plugin install copilot-cli-toolkit@ai-agents -> 81 skills copilot plugin install claude-toolkit@ai-agents -> 69 skills copilot plugin install claude-agents@ai-agents -> ok (agents) copilot plugin install copilot-cli-agents@ai-agents -> 81 skills copilot plugin list -> ok grep skill-installer README.md docs/getting-started.md -> empty grep -- --list-agents README.md docs/getting-started.md -> empty Refs #1825 Refs #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User asked why plugin install differs between bundles. Investigation found two manifest-level bugs in src/copilot-cli/.claude-plugin/plugin.json plus one missing-explicitness gap in .claude/.claude-plugin/plugin.json. src/copilot-cli/.claude-plugin/plugin.json - Drop `"commands": "./skills"`. Copilot CLI has no concept of slash commands, and pointing the `commands` index at the skills directory is nonsense even on Claude semantics. The validator accepted it because it starts with `./`, but no install path consumes it. - Bump skill count in description from 79 to 81 to match the actual count under src/copilot-cli/skills/. .claude/.claude-plugin/plugin.json - Add explicit `agents`, `skills`, `commands`, and `hooks: ./hooks/hooks.json` declarations. The plugin worked before via auto-discovery (Anthropic schema, see PR #1795 / .serena/memories /claude/claude-code-plugin-manifest-schema.md), but explicit paths document bundling intent. Without them, a future reorg could quietly drop content from the install. Did NOT add `hooks` to src/copilot-cli/.claude-plugin/plugin.json because the validator (build/scripts/validate_plugin_manifests.py) checks the referenced hooks.json against the Claude PascalCase event list (PreToolUse, PostToolUse, ...). src/copilot-cli/hooks/hooks.json uses Copilot CLI's camelCase events (preToolUse, postToolUse, userPromptSubmitted, ...), so declaring the field would fail validation. Auto-discovery picks the file up at install time, which is the same path it took before; explicit declaration would need a validator update first. Verification (all locally): uv run python build/scripts/validate_plugin_manifests.py -> All 3 manifest(s) valid uv run pytest tests/test_marketplace_two_plugin.py -v -> 14 passed uv run pytest tests/test_bootstrap.py -v -> 7 passed uv run pytest tests/integration/test_e2e_install.py -v -> 13 passed uv run pytest -k "marketplace or plugin or bootstrap or e2e" -v -> 111 passed, 7703 deselected Per-plugin install content (live, against rjmurillo/ai-agents marketplace): claude-agents : 24 agents at root (no skills/hooks) copilot-cli-agents : 24 agents + 81 skills + hooks dir project-toolkit : 25 agents + 69 skills + hooks dir + commands dir claude-toolkit : 25 agents + 69 skills + hooks dir + commands dir copilot-cli-toolkit : 24 agents + 81 skills + hooks dir Refs #1795 (schema authority + validator) Refs #1825 Refs #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…1823-autonomous

github-actions · 2026-04-30T14:04:24Z

Spec-to-Implementation Validation

Tip

✅ Final Verdict: PASS

What is Spec Validation?

This validation ensures your implementation matches the specifications:

Requirements Traceability: Verifies PR changes map to spec requirements
Implementation Completeness: Checks all requirements are addressed

Validation Summary

Check	Verdict	Status
Requirements Traceability	`PASS`	✅
Implementation Completeness	`PASS`	✅

Spec References

Type	References
Specs	None
Issues	1823

Requirements Traceability Details

Now let me view the README.md and plugin manifest changes to complete the analysis:

Now I have sufficient information to complete the requirements traceability analysis. Let me compile the findings:

Requirements Coverage Matrix

Requirement	Description	Status	Evidence
REQ-001	Add new Step 2: Understand the Workflow section before "Use an Agent"	COVERED	`docs/getting-started.md:65-121` (Step 2 inserted with full workflow content)
REQ-002	Show the 7-phase pipeline (Grill Me, PRD, Kanban, Implement, Review, QA, Ship)	COVERED	`docs/getting-started.md:71-79` (table with all 7 phases)
REQ-003	For each phase: What it does (2-3 sentences)	COVERED	`docs/getting-started.md:71-79` "What it does" column
REQ-004	For each phase: What it produces (artifact name)	COVERED	`docs/getting-started.md:71-79` "Artifact produced" column
REQ-005	For each phase: When to use it (decision trigger)	COVERED	`docs/getting-started.md:71-79` "When to use" column
REQ-006	Mirror Matt's Day Shift / Night Shift model	COVERED	`docs/getting-started.md:81-86` (explicit Day/Night Shift section)
REQ-007	Day Shift: Grill Me, PRD review, QA, ship decisions	COVERED	`docs/getting-started.md:85` lists "Grill Me interview responses, PRD review, QA gate sign-off, ship decision"
REQ-008	Night Shift: /build loops, automated review, test gates	COVERED	`docs/getting-started.md:86` lists "/build loops, /test gate runs, /review passes"
REQ-009	Keep Fastest Start unchanged	COVERED	`docs/getting-started.md:5-28` (Fastest Start preserved, only anchor renamed)
REQ-010	Insert new Step 2 after installation, before command usage	COVERED	`docs/getting-started.md:65` (Step 2 follows Step 1: Install, precedes Step 4: Use an Agent)
REQ-011	Reference existing docs (autonomous-issue-development.md, etc.) for deep dives	COVERED	`docs/getting-started.md:116-121` "Go deeper" section with 4 cross-references
REQ-012	Reduce cognitive load: users see the map before navigating	COVERED	Structure places workflow understanding before agent usage
REQ-013	Align with PR #1812: explain requirements-interview skill in /spec context	COVERED	`docs/getting-started.md:73` explicitly ties "Grill Me" to `/spec (requirements-interview)`

Summary

Total Requirements: 13
Covered: 13 (100%)
Partially Covered: 0 (0%)
Not Covered: 0 (0%)

Expanded Scope Items (Beyond Issue #1823)

The PR description documents expanded scope approved by the user. These are not requirements from Issue #1823 but are implementation choices that passed validation:

Item	Status	Evidence
Session log schema fix	COMPLETE	`.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json` adds required `sessionEnd` fields
Count reconciliation (Claude: 23/23/29/69, Copilot: 24/28/81)	COMPLETE	`README.md:38,42-49`, `docs/getting-started.md:63`
Copilot plugin name fix	COMPLETE	Docs now use `copilot-cli-toolkit@ai-agents` instead of `project-toolkit@ai-agents`
skill-installer removal	COMPLETE	Dead upstream references stripped
`copilot plugin list` verification command	COMPLETE	`docs/getting-started.md:136-139`
Plugin manifest hardening	COMPLETE	`.claude/.claude-plugin/plugin.json` explicit paths; `src/copilot-cli/.claude-plugin/plugin.json` drops invalid `commands`

Gaps

None identified. All 13 requirements from Issue #1823 are covered by the implementation.

[!TIP]
VERDICT: PASS
All 13 requirements from Issue #1823 are covered. The 7-phase pipeline is documented with per-phase details (what it does, what it produces, when to use). Day Shift/Night Shift model is explicit. Fastest Start preserved. Cross-references to deep-dive docs included. Expanded scope items (CI fix, count reconciliation, plugin manifest hardening) passed validation per PR tests.

Implementation Completeness Details

Now I have all the information I need to validate the implementation against the specification in Issue #1823. Let me analyze the acceptance criteria and check implementation completeness.

Acceptance Criteria Checklist

Additional Work (Expanded Scope Per PR Description)

Session log schema compliance - sessionEnd MUST fields added
Count reconciliation across README.md and docs/getting-started.md
Dead skill-installer references removed
Plugin manifest hardening (explicit paths in .claude/.claude-plugin/plugin.json, fixed commands path in Copilot manifest)
CodeRabbit QA sign-off clarification applied (line 85: "QA gate sign-off")

Missing Functionality

None. All acceptance criteria from Issue #1823 are satisfied.

Edge Cases Not Covered

Offline users: No mention of offline workflow limitations (minor, not in spec)
Partial phase execution: Doc mentions "skipping phases is possible but reduces quality gates" (line 67), which addresses this

Implementation Quality

Completeness: 100% of acceptance criteria satisfied
Quality: High. Implementation exceeds spec by:
- Adding the requested Mermaid diagram
- Including explicit artifact file patterns (e.g., REQ-NNN-*.md)
- Providing a "Go deeper" section with 4 cross-references
- Fixing related doc accuracy issues discovered during implementation

[!TIP]
VERDICT: PASS
Implementation satisfies all 11 acceptance criteria from Issue #1823. The 7-phase pipeline table, Day/Night Shift split, Mermaid diagram, and cross-references are present. Expanded scope (count reconciliation, plugin manifest hardening, session log schema fix) adds value without introducing regressions.

Run Details

Property	Value
Run ID	25169744450
Triggered by	`pull_request` on `1825/merge`

_{Powered by AI Spec Validator workflow}

coderabbitai

Warning

CodeRabbit couldn't request changes on this pull request because it doesn't have sufficient GitHub permissions.

Please grant CodeRabbit Pull requests: Read and write permission and re-run the review.

👉 Steps to fix this

Actionable comments posted: 1

♻️ Duplicate comments (2)

docs/getting-started.md (2)

77-77: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

[P1] Use repo-standard QA verdict terms in the phase table.

The QA artifact uses PASS/WARN/CRITICAL_FAIL, but repo docs standardize this as PASS/FAIL/NEEDS WORK.

Proposed fix

-| 5 | QA | `/test` | Six quality gates: functional, non-functional, security, DevOps, DX, observability | Gate verdicts table with PASS/WARN/CRITICAL_FAIL per gate | After `/build` completes a slice |
+| 5 | QA | `/test` | Six quality gates: functional, non-functional, security, DevOps, DX, observability | Gate verdicts table with PASS/FAIL/NEEDS WORK per gate | After `/build` completes a slice |

As per coding guidelines: "Maintain consistent terminology for QA assessment statuses: use PASS/FAIL/NEEDS WORK."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started.md` at line 77, The QA phase table currently uses the
non-standard verdict terms "PASS/WARN/CRITICAL_FAIL"; update the table entry in
docs/getting-started.md (the QA artifact string) to use the repo-standard terms
"PASS/FAIL/NEEDS WORK" instead so the phase row for QA matches other docs and
guidelines.

116-121: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

[P1] Add the missing workflow validation cross-reference in “Go deeper”.

Step 2 still omits WORKFLOW-VALIDATION.md, which leaves the linked issue objective incomplete.

Proposed fix

 ### Go deeper
 
 - [Autonomous issue development](autonomous-issue-development.md) for running the full pipeline AFK
 - [Ideation workflow](ideation-workflow.md) for turning vague ideas into specs before `/spec`
+- [Workflow validation](WORKFLOW-VALIDATION.md) for gate expectations and evidence checks
 - [`.claude/commands/spec.md`](../.claude/commands/spec.md) for the full `/spec` process reference
 - [`.claude/skills/requirements-interview/SKILL.md`](../.claude/skills/requirements-interview/SKILL.md) for Grill Me skill internals

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@docs/getting-started.md` around lines 116 - 121, The "Go deeper" list in
getting-started.md is missing a cross-reference to WORKFLOW-VALIDATION.md;
update the list (the block under the "Go deeper" heading in
docs/getting-started.md) to include a new bullet linking to
WORKFLOW-VALIDATION.md (e.g., "Workflow validation (WORKFLOW-VALIDATION.md)") so
the workflow validation step is referenced alongside the existing items like
"Autonomous issue development" and the `.claude` links.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 161-164: Summary: The install matrix row for "Claude agents only"
incorrectly lists 24 agents instead of 23; update the table entry to match the
rest of the README. Fix: edit the table row containing the string "Claude agents
only" and the install token "claude-agents@ai-agents" to change "24 specialized
agent definitions for Claude Code" to "23 specialized agent definitions for
Claude Code", and verify the count now matches the "Claude full toolkit" and
other Claude references.

---

Duplicate comments:
In `@docs/getting-started.md`:
- Line 77: The QA phase table currently uses the non-standard verdict terms
"PASS/WARN/CRITICAL_FAIL"; update the table entry in docs/getting-started.md
(the QA artifact string) to use the repo-standard terms "PASS/FAIL/NEEDS WORK"
instead so the phase row for QA matches other docs and guidelines.
- Around line 116-121: The "Go deeper" list in getting-started.md is missing a
cross-reference to WORKFLOW-VALIDATION.md; update the list (the block under the
"Go deeper" heading in docs/getting-started.md) to include a new bullet linking
to WORKFLOW-VALIDATION.md (e.g., "Workflow validation (WORKFLOW-VALIDATION.md)")
so the workflow validation step is referenced alongside the existing items like
"Autonomous issue development" and the `.claude` links.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: bcfb3b13-4856-438f-a462-e57e1bb21dc3

📥 Commits

Reviewing files that changed from the base of the PR and between 5d36a24 and ff34f8b.

⛔ Files ignored due to path filters (1)

.agents/sessions/2026-04-29-session-1823-getting-started-workflow.json is excluded by !.agents/sessions/**

📒 Files selected for processing (4)

.claude/.claude-plugin/plugin.json
README.md
docs/getting-started.md
src/copilot-cli/.claude-plugin/plugin.json

CodeRabbit flagged that the install matrix says "24 agents" for the agents-only Claude plugin while the headline and the toolkit row say "23 agents". Both numbers are accurate but reflect different source directories: - claude-agents plugin -> src/claude/ -> 24 agent definitions - claude-toolkit plugin -> .claude/agents/ -> 23 agent definitions The two source dirs are kept in sync where they overlap but each set includes agents the other does not. The headline number (23) reflects the Fastest Start path (full toolkit), which is what most users get. Update the install-matrix descriptions to cite the source directory inline so the asymmetry is visible at the point of confusion. Add a paragraph below the table explaining the gap so future readers do not re-flag it. Refs #1825 (CodeRabbit comment on README.md:164) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

README.md (1)

410-423: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

[P1] Add .claude/ to the repo tree to match install/source docs

The tree omits .claude/, but Lines 163 and 166 describe .claude/ as the Claude full-toolkit source. Keep this section aligned to avoid conflicting guidance.

Proposed fix

 ai-agents/
 ├── src/
 │   ├── vs-code-agents/      # VS Code / GitHub Copilot agents
 │   ├── copilot-cli/         # GitHub Copilot CLI agents
 │   └── claude/              # Claude Code CLI agents
+├── .claude/                 # Claude full-toolkit agents, commands, hooks, skills
 ├── templates/               # Agent template system
 ├── scripts/                 # Validation and utility scripts
 ├── docs/                    # Documentation
 ├── .agents/                 # Agent artifacts (ADRs, plans, etc.)
 ├── .claude-plugin/          # Plugin marketplace manifest (marketplace.json)

As per coding guidelines: “Ensure cross-references between documentation files are accurate and up-to-date.”

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 410 - 423, The repository tree in README.md is
missing the .claude/ directory listed elsewhere; update the tree block to
include a line for ".claude/  # Claude full-toolkit source" so it aligns with
the references on Lines 163 and 166 (and other docs). Locate the tree snippet
containing "ai-agents/ ├── src/" and add the ".claude/" entry in the top-level
list (consistent with ".claude-plugin/" and ".agents/") and ensure the short
description matches the phrasing used in CLAUDE.md and other references.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@README.md`:
- Around line 410-423: The repository tree in README.md is missing the .claude/
directory listed elsewhere; update the tree block to include a line for
".claude/  # Claude full-toolkit source" so it aligns with the references on
Lines 163 and 166 (and other docs). Locate the tree snippet containing
"ai-agents/ ├── src/" and add the ".claude/" entry in the top-level list
(consistent with ".claude-plugin/" and ".agents/") and ensure the short
description matches the phrasing used in CLAUDE.md and other references.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: e7cc61fb-45ed-40d0-a895-1eb656b96f08

📥 Commits

Reviewing files that changed from the base of the PR and between ff34f8b and 73c0c07.

📒 Files selected for processing (1)

README.md

This regression (agents/skills/commands/hooks rejected by Claude Code at install time) has shipped THREE TIMES in five days: - PR #1773 (Apr 26) — original break - PR #1825 (Apr 30) — re-broke as 'plugin manifest hardening' - PR #1835 v1/v2 (Apr 30, this branch) — wrong fixes before landing on the right one (delete the keys) Each iteration cost the same set of investigations and customer outage because the prior fix's reasoning was nowhere a future contributor would find it. ADR-058 fixes that by encoding the rule + the history + the gate references into one canonical document. Wires: - build/scripts/validate_plugin_manifests.py Adds _check_adr_058_forbidden_keys() called from validate_manifest() BEFORE any field-level validation. Error message cites ADR-058 and the PR history so a future contributor lands on the postmortem. - tests/build_scripts/test_validate_plugin_manifests.py Inverts pre-ADR-058 'this key is valid' assertions into 'ADR-058 rejects this key' assertions. Adds 8 new test cases covering each forbidden key + the 'multiple keys produce one consolidated message' contract + the 'minimal manifest still passes' negative case. - scripts/validation/test_plugin_install.sh Header now references ADR-058 as the canonical source of the rule. All 262 tests in tests/build_scripts/ pass. Live install smoke test still passes for the 3 Claude-targeted plugins; Copilot-targeted plugins' separate hook-format issue intentionally out of scope (see PR description).

rjmurillo-bot and others added 2 commits April 29, 2026 22:19

chore(sessions): add session log for issue #1823

5b9fd3a

Refs #1823 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rjmurillo-bot enabled auto-merge (squash) April 30, 2026 05:19

github-actions Bot added documentation Improvements or additions to documentation enhancement New feature or request labels Apr 30, 2026

coderabbitai Bot requested a review from rjmurillo April 30, 2026 05:20

coderabbitai Bot added the area-workflows GitHub Actions workflows label Apr 30, 2026

gemini-code-assist Bot reviewed Apr 30, 2026

View reviewed changes

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread docs/getting-started.md

Comment thread docs/getting-started.md

Merge branch 'main' into feat/1823-autonomous

5d36a24

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread docs/getting-started.md

rjmurillo and others added 7 commits April 30, 2026 06:36

Merge branch 'main' into feat/1823-autonomous

738f4b9

Merge remote-tracking branch 'origin/main' into feat/1823-autonomous

ef42ad5

Merge remote-tracking branch 'origin/feat/1823-autonomous' into feat/…

ff34f8b

…1823-autonomous

github-actions Bot added the needs-split PR has too many commits and should be split label Apr 30, 2026

rjmurillo changed the title ~~docs(getting-started): add workflow-first Step 2 with 7-phase pipeline~~ docs: workflow-first Step 2 + install-path accuracy pass + plugin manifest hardening Apr 30, 2026

coderabbitai Bot added the area-infrastructure Build, CI/CD, configuration label Apr 30, 2026

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

Comment thread README.md Outdated

coderabbitai Bot added the area-skills Skills documentation and patterns label Apr 30, 2026

coderabbitai Bot reviewed Apr 30, 2026

View reviewed changes

coderabbitai Bot approved these changes Apr 30, 2026

View reviewed changes

rjmurillo-bot merged commit 2f38de7 into main Apr 30, 2026
110 of 111 checks passed

rjmurillo-bot deleted the feat/1823-autonomous branch April 30, 2026 14:23

coderabbitai Bot mentioned this pull request May 1, 2026

Plugin install fails: agents field rejected as 'Invalid input' by Claude Code 2.1.122 #1833

Closed

rjmurillo mentioned this pull request May 1, 2026

fix(plugins): omit discovery keys from marketplace manifests (#1833) #1835

Merged

coderabbitai Bot mentioned this pull request May 1, 2026

Copilot-CLI plugins fail to load in Claude Code (camelCase vs PascalCase hooks schema) #1840

Closed

This was referenced May 9, 2026

Child 5: /ship collapses to "did /review pass on this SHA?" #1938

Closed

Epic: lifecycle-gate convergence (CI as backstop, /review as primary, vendor-survivable knowledge homes) #1933

Closed

coderabbitai Bot mentioned this pull request Jun 11, 2026

bug(github-skill): installed plugin run_completion_gate.py cannot import scripts module #2572

Closed

Uh oh!

Conversation

rjmurillo-bot commented Apr 30, 2026 • edited by rjmurillo Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Original (issue #1823)

CI gate fix

Doc accuracy + dead refs (expanded scope)

Plugin manifest hardening (per user request)

Notes

Tests

Test plan

Uh oh!

github-actions Bot commented Apr 30, 2026

PR Validation Report

Description Validation

PR Standards

QA Validation

⚡ Warnings

Uh oh!

github-actions Bot commented Apr 30, 2026

Session Protocol Compliance Report

Compliance Summary

Detailed Validation Results

Uh oh!

github-actions Bot commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Quality Gate Review

Review Summary

Security Review: PR #1825

PR Type Classification

Analysis

1. Secret Detection

2. Plugin Manifest Security

3. Documentation Security

Findings

Recommendations

Verdict

QA Review: PR #1825

PR Type Classification

Test Coverage Assessment

Validation Performed

Quality Concerns

Regression Risk Assessment

Pre-executed Test Results

Analysis: PR #1825

Code Quality Score

Impact Assessment

Findings

Recommendations

Evidence Verification

Verdict

Design Quality Assessment

Architectural Concerns

Breaking Change Assessment

Technical Debt Analysis

ADR Assessment

Recommendations

Verdict

DevOps Review: PR #1825

PR Scope Detection

Pipeline Impact Assessment

CI/CD Quality Checks

Findings

Template Assessment

Plugin Manifest Review

Session Log Schema Fix

Automation Opportunities

Recommendations

Verdict

Strategic Alignment Assessment

Feature Completeness

Impact Analysis

Concerns

Recommendations

Verdict

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

rjmurillo-bot commented Apr 30, 2026 •

edited by rjmurillo

Loading

github-actions Bot commented Apr 30, 2026 •

edited

Loading

coderabbitai Bot commented Apr 30, 2026 •

edited

Loading