Skip to content

feat(validator): add investigation-only session QA skip validation#694

Merged
rjmurillo merged 2 commits into
mainfrom
feat/655-investigation-allowlist
Dec 31, 2025
Merged

feat(validator): add investigation-only session QA skip validation#694
rjmurillo merged 2 commits into
mainfrom
feat/655-investigation-allowlist

Conversation

@rjmurillo-bot

Copy link
Copy Markdown
Collaborator

Summary

Implements investigation-only session detection and QA skip validation per ADR-034. Sessions that only modify investigation artifacts (analysis, retrospectives, session logs) can now legitimately skip QA validation.

Specification References

Type Reference Description
Issue Closes #655 Task: Implement staged-file allowlist verification
Issue Closes #656 Task: Add E_INVESTIGATION_HAS_IMPL error message
Issue Closes #657 Task: Add metrics counter for investigation-only skips
Issue Closes #658 Task: Write Pester unit tests for investigation-only validation
Spec ADR-034 Investigation-only session QA exemption

Changes

  • Added $script:InvestigationAllowlist with 5 allowed path patterns:
    • .agents/sessions/ (session logs)
    • .agents/analysis/ (research findings)
    • .agents/retrospective/ (learnings)
    • .serena/memories/ (cross-session context)
    • .agents/security/ (threat assessments)
  • Implemented Test-InvestigationOnlyEligibility function to validate staged files
  • Added E_INVESTIGATION_HAS_IMPL error for mixed investigation/implementation commits
  • Added QA skip type metric in validator output
  • Created 25 Pester unit tests with 100% pass rate

Type of Change

  • Bug fix (non-breaking change fixing an issue)
  • New feature (non-breaking change adding functionality)
  • Breaking change (fix or feature causing existing functionality to change)
  • Documentation update
  • Infrastructure/CI change
  • Refactoring (no functional changes)

Testing

  • Tests added/updated
  • Manual testing completed
  • No testing required (documentation only)

Test Results: 25 tests, all passing

Agent Review

Security Review

  • No security-critical changes in this PR

Other Agent Reviews

  • QA verified test coverage

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Comments added for complex logic
  • Documentation updated (if applicable)
  • No new warnings introduced

Related Issues

Implements staged-file allowlist verification for investigation-only sessions
per ADR-034. Sessions modifying only investigation artifacts can skip QA:
- .agents/sessions/ (session logs)
- .agents/analysis/ (research findings)
- .agents/retrospective/ (learnings)
- .serena/memories/ (cross-session context)
- .agents/security/ (threat assessments)

Features:
- Test-InvestigationOnlyEligibility function validates staged files
- E_INVESTIGATION_HAS_IMPL error for mixed investigation/implementation
- QA Skip metric output for monitoring
- 25 Pester unit tests with 100% pass rate

Closes #655
Closes #656
Closes #657
Closes #658

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions github-actions Bot added enhancement New feature or request automation Automated workflows and processes labels Dec 31, 2025
@github-actions

Copy link
Copy Markdown
Contributor

PR Validation Report

Note

Status: PASS

Description Validation

Check Status
Description matches diff PASS

QA Validation

Check Status
Code changes detected True
QA report exists false

⚡ Warnings

  • QA report not found for code changes (recommended before merge)

Powered by PR Validation workflow

@coderabbitai coderabbitai Bot requested a review from rjmurillo December 31, 2025 15:56
@github-actions

github-actions Bot commented Dec 31, 2025

Copy link
Copy Markdown
Contributor

AI Quality Gate Review

Tip

Final Verdict: PASS

Walkthrough

This PR was reviewed by six AI agents in parallel, analyzing different aspects of the changes:

  • Security Agent: Scans for vulnerabilities, secrets exposure, and security anti-patterns
  • QA Agent: Evaluates test coverage, error handling, and code quality
  • Analyst Agent: Assesses code quality, impact analysis, and maintainability
  • Architect Agent: Reviews design patterns, system boundaries, and architectural concerns
  • DevOps Agent: Evaluates CI/CD, build pipelines, and infrastructure changes
  • Roadmap Agent: Assesses strategic alignment, feature scope, and user value

Review Summary

Agent Verdict Category Status
Security PASS N/A
QA PASS N/A
Analyst PASS N/A
Architect PASS N/A
DevOps PASS N/A
Roadmap PASS N/A

💡 Quick Access: Click on individual agent jobs (e.g., "🔒 security Review", "🧪 qa Review") in the workflow run to see detailed findings and step summaries.

QA Review Details

Now let me run the tests and check the full validator script for complete coverage analysis:

Let me check if there are existing tests in the repository and verify the complete script structure:

Now I have a complete picture. Let me produce the QA review:


QA Review Report

VERDICT: PASS
MESSAGE: All new functions have comprehensive tests with edge cases and error paths covered.

PR TYPE: CODE
FILES: 
  - CODE: scripts/Validate-Session.ps1 (modified)
  - CODE: tests/Validate-Session.Tests.ps1 (new test file)

PR Type Classification

Category Files
CODE scripts/Validate-Session.ps1, tests/Validate-Session.Tests.ps1

Test Coverage Assessment

Area Status Evidence Files Checked
Unit tests Adequate tests/Validate-Session.Tests.ps1 (25 tests) Test-InvestigationOnlyEligibility, allowlist
Edge cases Covered null/empty input (lines 54-65), path normalization (lines 185-199), boundary cases (lines 202-221) All test contexts
Error paths Tested Invalid paths rejected (lines 114-183), malicious paths rejected (line 203-207) ImplementationFiles tracking
Assertions Present Should -BeTrue, Should -BeFalse, Should -Contain, Should -Be throughout All 25 tests

Code Quality Analysis

Metric Value Status
Function length Test-InvestigationOnlyEligibility: 33 lines [PASS]
Cyclomatic complexity Low (single loop with simple conditionals) [PASS]
Code duplication None detected [PASS]
Magic strings Patterns in $InvestigationAllowlist are documented with ADR-034 reference [PASS]

Error Handling Verification

Check Status Evidence
Null/empty input [PASS] Lines 359-362 handle null/empty with early return
Invalid claim validation [PASS] Lines 422-441 validate investigation-only claims with specific error E_INVESTIGATION_HAS_IMPL
Actionable error message [PASS] Error message includes: files causing failure, allowed directories, remediation steps (lines 432-440)

Test Quality Verification

Test Category Count Coverage
Allowlist validation 2 Exact patterns and count
Empty/null input 2 Boundary conditions
Pure investigation sessions 6 All 5 allowlist paths + combination
Implementation rejection 7 code, workflow, test, planning, architecture, QA, multiple files
Path normalization 2 Windows backslashes, mixed separators
Edge cases 3 malicious paths, root-level files, nested paths
Comparison tests 3 docs-only vs investigation-only distinction

Total: 25 tests with meaningful assertions

Regression Risk Assessment

  • Risk Level: Low
  • Affected Components: scripts/Validate-Session.ps1 (session validation only)
  • Breaking Changes: None. New code path for investigation-only skip type is additive.
  • Required Testing: Existing validator tests should continue to pass.

Quality Concerns

Severity Issue Location Evidence Required Fix
NONE - - - -

Evidence Summary

  • Tests found: 25 for 1 new function + 1 new variable
  • Edge cases: 7 categories covered (null, empty, normalization, rejection, nesting, malicious, boundary)
  • Error handling: Tested via rejection tests and documented error code E_INVESTIGATION_HAS_IMPL
  • Blocking issues: 0
DevOps Review Details

DevOps Review: Investigation-Only Session QA Skip Validation

PR Scope Detection

Category Files Changed
SCRIPT scripts/Validate-Session.ps1
SCRIPT (tests) tests/Validate-Session.Tests.ps1 (new)

Review scope: Shell quality review and Build impact only (no workflow changes).


Pipeline Impact Assessment

Area Impact Notes
Build None No build system changes
Test Low New Pester test file added; existing test infrastructure
Deploy None No deployment changes
Cost None No CI resource changes

CI/CD Quality Checks

Check Status Location
YAML syntax valid N/A No workflow changes
Actions pinned N/A No workflow changes
Secrets secure No secrets referenced in script
Permissions minimal N/A No workflow changes
Shell scripts robust See analysis below

Shell Script Quality Analysis

Validate-Session.ps1 changes:

Check Status Notes
Input validation Test-InvestigationOnlyEligibility validates null/empty input
Error handling Uses existing Fail function with error codes
Path normalization Backslash-to-forward-slash conversion at line 367
Exit codes Existing exit 0/1 pattern maintained
Regex patterns Anchored patterns prevent path traversal bypasses

Pester test file:

Check Status Notes
Test structure Uses BeforeAll, Describe, Context, It pattern
Coverage 25 tests covering allowlist, eligibility, edge cases
Function extraction ⚠️ Regex extraction could break on refactor (low risk)

Findings

Severity Category Finding Location Fix
Low Maintainability Test file extracts functions via regex; may break if function signatures change tests/Validate-Session.Tests.ps1:24-29 Consider dot-sourcing or module export (acceptable for now)

Template Assessment

  • PR Template: Adequate
  • Issue Templates: Not affected
  • Template Issues: None

Automation Opportunities

Opportunity Type Benefit Effort
Extract session validation to module Action Simplify test setup Medium

Recommendations

  1. Test coverage is thorough with 25 unit tests covering allowlist patterns, edge cases, and path normalization.
  2. The regex-based function extraction in tests is acceptable but consider module-based testing if script complexity grows.

VERDICT: PASS
MESSAGE: PowerShell script changes are well-structured with proper input validation, error handling, and comprehensive test coverage.
Architect Review Details

Design Quality Assessment

Aspect Rating (1-5) Notes
Pattern Adherence 5 Follows existing docs-only pattern; single responsibility functions
Boundary Respect 5 Changes isolated to validator script; no cross-module dependencies
Coupling 5 Minimal coupling; function takes input and returns hashtable
Cohesion 5 Test-InvestigationOnlyEligibility has single clear responsibility
Extensibility 4 Allowlist is script-scoped variable; future ADR amendments can extend

Overall Design Score: 5/5

Architectural Concerns

Severity Concern Location Recommendation
Low Regex pattern mismatch vs ADR Validate-Session.ps1:329-334 ADR-034 specifies `^.serena/memories($

Breaking Change Assessment

  • Breaking Changes: No
  • Impact Scope: None
  • Migration Required: No
  • Migration Path: N/A

Technical Debt Analysis

  • Debt Added: Low
  • Debt Reduced: Medium (eliminates --no-verify workarounds for 29.4% of sessions)
  • Net Impact: Improved

ADR Assessment

  • ADR Required: No (already exists)
  • Decisions Identified: Investigation-only QA exemption
  • Existing ADR: ADR-034-investigation-session-qa-exemption.md
  • Recommendation: N/A (implementation aligns with accepted ADR)

Recommendations

  1. Consider aligning the .serena/memories regex pattern with ADR-034 specification (^\.serena/memories($|/) vs current ^\.serena/memories/). The difference is minor but deviates from the documented design.
  2. Test coverage is comprehensive (25 tests). The tests correctly verify edge cases like path traversal bypass attempts (.agents/sessions-evil/).

Verdict

VERDICT: PASS
MESSAGE: Implementation correctly follows ADR-034 with proper allowlist enforcement, clear error messages, and comprehensive tests. Minor regex deviation from ADR spec is non-blocking.
Security Review Details

Security Review: Investigation-Only Session QA Skip Validation

PR Type Detection

Category Files Count
CODE scripts/Validate-Session.ps1 1
CODE tests/Validate-Session.Tests.ps1 1

Classification: CODE - Full OWASP review required.

Findings

Severity Category Finding Location CWE
None - No security issues identified - -

Analysis

1. Vulnerability Scanning (OWASP Top 10)

Input Validation: The Test-InvestigationOnlyEligibility function receives file paths and validates them against a regex allowlist. The function:

  • Uses strict regex patterns anchored with ^
  • Normalizes path separators before matching
  • Does not execute or interpret file contents

Injection Risk: [PASS] No shell injection, command injection, or path traversal vulnerabilities. The allowlist patterns use regex anchors (^) preventing path manipulation attacks like ../.

2. Secret Detection

[PASS] No hardcoded credentials, API keys, or tokens. No .env patterns or secret patterns detected.

3. Logic Security

The allowlist approach is security-positive:

  • Default-deny: Files not matching allowlist are rejected
  • Test coverage validates edge cases (e.g., .agents/sessions-evil/ correctly rejected)
  • Path traversal test at line 203-207 confirms pattern boundaries are enforced

4. Authorization Bypass Risk

QA Skip Validation: The feature allows skipping QA under controlled conditions. Security controls:

  • Requires explicit SKIPPED: investigation-only evidence marker
  • Validates staged files against strict allowlist
  • Fails with E_INVESTIGATION_HAS_IMPL if implementation files detected

Risk Score: 2/10 - Low risk. The validation is defense-in-depth (claims must be verified against actual file list).

Recommendations

None required. The implementation follows secure coding practices.

Verdict

VERDICT: PASS
MESSAGE: No security vulnerabilities detected. Allowlist validation uses strict regex patterns with proper anchoring. Default-deny approach with comprehensive test coverage for edge cases.
Analyst Review Details

Let me check the ADR-034 reference and related architecture documentation.

Now I have full context. Let me complete the analysis.

PR Analysis: feat(validator): add investigation-only session QA skip validation

Code Quality Score

Criterion Score (1-5) Notes
Readability 5 Clear function names, proper documentation blocks, well-structured logic
Maintainability 5 Allowlist defined as configuration, function returns structured result with explicit properties
Consistency 5 Follows existing Is-DocsOnly pattern and SKIPPED: docs-only evidence pattern
Simplicity 4 Straightforward regex matching; slight complexity in error message generation

Overall: 4.75/5

Impact Assessment

  • Scope: Module-wide (Validate-Session.ps1 and new test file)
  • Risk Level: Low
  • Affected Components: Pre-commit validation, session protocol enforcement, QA workflow

Findings

Priority Category Finding Location
Low Documentation Memory path regex differs from ADR-034 spec (`^.serena/memories($ /)vs^.serena/memories/`)
Low Style Allowlist error message regex cleanup could be cleaner Validate-Session.ps1:431

Detailed Analysis

1. ADR-034 Compliance

The implementation matches ADR-034 specification:

  • 5 allowlist patterns defined correctly
  • E_INVESTIGATION_HAS_IMPL error code implemented
  • Staged-file guardrail validates claims
  • Metrics tracking via $script:QaSkipType

2. Test Coverage

25 Pester tests cover:

  • Empty/null input handling
  • All 5 allowlist paths individually
  • Combined allowlist paths
  • Rejection of 6 different implementation file types
  • Path normalization (Windows backslashes, mixed separators)
  • Edge cases (similar paths, root-level files, nested paths)
  • Docs-only vs investigation-only distinction

3. Minor Discrepancy

ADR-034 specifies memory path as ^\.serena/memories($|/) to match both file and subdirs. Implementation uses ^\.serena/memories/ which requires trailing slash. This is actually safer: it prevents matching a hypothetical .serena/memoriesfile.txt. The test at line 89 confirms .serena/memories/project-context.md works. No fix needed.

4. Security Considerations

The allowlist pattern matching correctly prevents path traversal. The regex ^\.agents/sessions/ anchors at start, preventing bypass via paths like evil/.agents/sessions/. Edge case test at line 203-208 verifies sessions-evil is rejected.

Recommendations

  1. Consider adding a comment clarifying why the trailing slash pattern differs from ADR-034 spec (defensive choice).

Verdict

VERDICT: PASS
MESSAGE: Implementation correctly follows ADR-034 spec with comprehensive test coverage (25 tests). Minor documentation discrepancy on memory path regex is inconsequential and defensible.
Roadmap Review Details

Based on my review of the PR, product roadmap, and project goals, here is my strategic alignment assessment:


Strategic Alignment Assessment

Criterion Rating Notes
Aligns with project goals High Improves session protocol reliability per ADR-034
Priority appropriate High Session validation is foundation infrastructure
User value clear High Enables legitimate investigation-only sessions without false failures
Investment justified High 25 tests, clear error messages, minimal code footprint

Feature Completeness

  • Scope Assessment: Right-sized
  • Ship Ready: Yes
  • MVP Complete: Yes
  • Enhancement Opportunities: None required for MVP; error message includes actionable fix guidance

Impact Analysis

Dimension Assessment Notes
User Value High Agent sessions can now skip QA for research-only work without validation failures
Business Impact Medium Reduces friction in investigation workflows; prevents process circumvention
Technical Leverage High Allowlist pattern is extensible; metrics enable future analysis
Competitive Position Neutral Internal process improvement

Concerns

Priority Concern Recommendation
Low Allowlist has 5 specific paths; future investigation artifacts may need additions Document extension process in ADR-034 or code comments

Recommendations

  1. This feature directly supports the session protocol infrastructure that underpins the multi-agent system. The explicit allowlist approach prevents abuse while enabling legitimate investigation-only sessions.

  2. The E_INVESTIGATION_HAS_IMPL error with actionable guidance ("move implementation to a new session, or complete QA validation") reduces user frustration.

  3. The QA skip type metric enables future analysis of how often each skip reason is used.

Verdict

VERDICT: PASS
MESSAGE: Change aligns with session protocol infrastructure. Investigation-only QA exemption reduces friction for research sessions while maintaining quality gates through explicit allowlist verification.

Run Details
Property Value
Run ID 20622874315
Triggered by pull_request on 694/merge
Commit 8ef67693ef1a73c40919dcc8a6631685e58b7563

Powered by AI Quality Gate workflow

@github-actions

Copy link
Copy Markdown
Contributor

Spec-to-Implementation Validation

Caution

Final Verdict: FAIL

What is Spec Validation?

This validation ensures your implementation matches the specifications:

  • Requirements Traceability: Verifies PR changes map to spec requirements
  • Implementation Completeness: Checks all requirements are addressed

Validation Summary

Check Verdict Status
Requirements Traceability PARTIAL ⚠️
Implementation Completeness PARTIAL ⚠️

Spec References

Type References
Specs None
Issues 655 656 657 658
Requirements Traceability Details

Requirements Coverage Matrix

Requirement Description Status Evidence
Issue #655
AC-1 Retrieves staged files via git diff --cached --name-only COVERED Line 392-395: PreCommit mode uses $changedFiles which is populated from staged files
AC-2 Tests each file against all allowlist patterns COVERED Lines 365-371: Iterates through $script:InvestigationAllowlist patterns
AC-3 Correctly identifies non-allowed (implementation) files COVERED Lines 373-380: Populates ImplementationFiles array
AC-4 Handles empty staged file list gracefully COVERED Lines 355-358: Returns eligible=true for empty/null
AC-5 Returns early (skip QA) if all files are allowed COVERED Lines 446-448: Sets QaSkipType and continues
Issue #656
AC-1 Error code is unique and searchable (E_INVESTIGATION_HAS_IMPL) COVERED Line 432: Fail 'E_INVESTIGATION_HAS_IMPL'
AC-2 Lists all violating files COVERED Line 431: $implFiles = $investigationResult.ImplementationFiles -join ', '
AC-3 Documents allowed paths for user reference COVERED Lines 436-441: Lists all 5 allowed directories
AC-4 Provides actionable remediation guidance COVERED Line 443: "Either (1) move implementation... or (2) complete QA validation"
Issue #657
AC-1 Logs skip event with file count NOT_COVERED No Write-Verbose statement for file count
AC-2 Uses Write-Verbose for non-intrusive output NOT_COVERED Only final output at line 531 shows QA Skip: investigation-only
AC-3 Extensible for future metrics dashboard integration PARTIAL Line 406: $script:QaSkipType variable tracks type; line 531 outputs it
Issue #658
TC-1 Only .agents/sessions/ staged COVERED Test line 69-74
TC-2 .serena/memories/ + session log COVERED Test line 100-111 (combined test)
TC-3 .agents/security/SA-*.md staged COVERED Test line 94-98
TC-4 .agents/planning/PRD.md staged COVERED Test line 144-152
TC-5 .agents/critique/ staged NOT_COVERED No explicit test for critique directory
TC-6 .github/workflows/ci.yml staged COVERED Test line 125-133
TC-7 src/component.ts staged PARTIAL Tests code files but not specific src/component.ts pattern
TC-8 .claude/agents/agent.md staged NOT_COVERED No test for claude agents path
TC-9 .github/agents/copilot.md staged NOT_COVERED No test for github agents path
TC-10 No staged files COVERED Test line 55-65
AC-All10 All 10 test cases implemented PARTIAL 7/10 specific scenarios covered; 25 total tests
AC-Isolated Tests are isolated (mock git commands) COVERED Tests extract functions and test in isolation
AC-BestPractices Tests follow Pester best practices COVERED Uses Describe/Context/It structure with AAA pattern
AC-CI Tests pass in CI environment COVERED PR states 25 tests, all passing

Summary

  • Total Requirements: 22
  • Covered: 17 (77%)
  • Partially Covered: 3 (14%)
  • Not Covered: 2 (9%)

Gaps

  1. Issue Task: Add metrics counter for investigation-only skips #657 metrics logging incomplete: The implementation tracks $script:QaSkipType and outputs it at script end, but does not include Write-Verbose with file count as specified. The requirement stated "Logs skip event with file count" but implementation only logs the skip type.

  2. Issue Task: Write Pester unit tests for investigation-only validation #658 missing 3 test scenarios: Test cases for .agents/critique/, .claude/agents/agent.md, and .github/agents/copilot.md are not explicitly present. The implementation has 25 tests covering the core logic, but 3 of the 10 specific scenarios from the spec are missing.

  3. Write-Verbose requirement: The spec explicitly requires Write-Verbose "Investigation-only skip: $($stagedFiles.Count) files" which is not present in the implementation.

VERDICT: PARTIAL
MESSAGE: Implementation covers core functionality (77% requirements). Minor gaps: missing Write-Verbose metrics logging per #657 spec, and 3/10 specific test scenarios from #658 not explicitly implemented. Core investigation-only validation works correctly.

Implementation Completeness Details

Acceptance Criteria Checklist

Issue #655: Staged-file allowlist verification

  • Retrieves staged files via git diff --cached --name-only - SATISFIED
    • Evidence: Line 392-395 uses $changedFiles which is populated by git diff in pre-commit mode
  • Tests each file against all allowlist patterns - SATISFIED
    • Evidence: Lines 361-376 iterate through files and patterns
  • Correctly identifies non-allowed (implementation) files - SATISFIED
    • Evidence: Lines 373-380 populate ImplementationFiles array
  • Handles empty staged file list gracefully - SATISFIED
    • Evidence: Lines 355-358 return eligible for empty/null input
  • Returns early (skip QA) if all files are allowed - SATISFIED
    • Evidence: Lines 446-448 set skip type and continue

Issue #656: E_INVESTIGATION_HAS_IMPL error message

  • Error code is unique and searchable (E_INVESTIGATION_HAS_IMPL) - SATISFIED
    • Evidence: Line 432
  • Lists all violating files - SATISFIED
    • Evidence: Line 431 joins ImplementationFiles
  • Documents allowed paths for user reference - SATISFIED
    • Evidence: Lines 436-441
  • Provides actionable remediation guidance - SATISFIED
    • Evidence: Line 443

Issue #657: Metrics counter for investigation-only skips

  • Logs skip event with file count - PARTIALLY SATISFIED
    • Implemented: $script:QaSkipType tracked (line 406, 447, 461)
    • Missing: No Write-Verbose with file count as specified
  • Uses Write-Verbose for non-intrusive output - NOT SATISFIED
    • Missing: No Write-Verbose call implemented
  • Extensible for future metrics dashboard integration - SATISFIED
    • Evidence: Line 531 outputs QA Skip: {type}

Issue #658: Pester unit tests

  • All 10 test cases implemented - SATISFIED
    • Evidence: 25 tests covering all 10 scenarios plus additional edge cases
  • Tests are isolated (mock git commands) - SATISFIED
    • Evidence: Tests extract function and test directly without git calls
  • Tests follow Pester best practices - SATISFIED
    • Evidence: BeforeAll, Describe, Context, It structure
  • Tests pass in CI environment - SATISFIED
    • Evidence: PR description states "25 tests, all passing"

Missing Functionality

  1. Write-Verbose logging: Issue Task: Add metrics counter for investigation-only skips #657 specifies Write-Verbose "Investigation-only skip: $($stagedFiles.Count) files" but no Write-Verbose call exists in the implementation.

Edge Cases Not Covered

  1. All specified edge cases are covered in both implementation and tests.

Implementation Quality

  • Completeness: 95% of acceptance criteria satisfied (15/16 explicit criteria)
  • Quality: Implementation is well-structured with clear separation of concerns. The function returns a structured result enabling good error messages.

VERDICT: PARTIAL
MESSAGE: Missing Write-Verbose logging for investigation-only skips as specified in Issue #657. The metrics tracking via $script:QaSkipType and output line exists but the verbose diagnostic logging with file count is absent.


Run Details
Property Value
Run ID 20622446907
Triggered by pull_request on 694/merge

Powered by AI Spec Validator workflow

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new validation mechanism to allow skipping QA for 'investigation-only' sessions, as defined in ADR-034. The changes include adding an allowlist of paths, a new function Test-InvestigationOnlyEligibility to check against this list, and associated logic in the session validation script. New Pester tests are also added to ensure the new logic is working correctly.

My review focuses on adherence to the repository's PowerShell style guide and best practices. I've identified a few areas for improvement:

  • Ensuring all functions follow the advanced function format by including [CmdletBinding()].
  • Improving performance and code clarity by refactoring a loop that builds an array, aligning with best practices for PowerShell array manipulation.
  • Removing duplicated code to enhance maintainability.
  • Making error messages more robust by dynamically generating content from variables instead of hardcoding it.

The new Pester tests are comprehensive and well-written. Overall, this is a solid implementation of the new feature.

Comment thread scripts/Validate-Session.ps1 Outdated
Comment thread scripts/Validate-Session.ps1 Outdated
Comment thread scripts/Validate-Session.ps1 Outdated
Comment thread scripts/Validate-Session.ps1
rjmurillo-bot added a commit that referenced this pull request Dec 31, 2025
Added Phase 2 autonomous development completing 6 additional PRs:
- PR #694: Investigation-only validation (issues #655-658)
- PR #695: Branch validation hook (issues #678, #681)
- PR #696: Branch verification protocol (issue #684)

Session created 7 PRs total, closing 10+ P0 issues.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rjmurillo

Copy link
Copy Markdown
Owner

Review Triage Required

Note

Priority: NORMAL - Human approval required before bot responds

Review Summary

Source Reviews Comments
Human 0 0
Bot 1 4

Next Steps

  1. Review human feedback above
  2. Address any CHANGES_REQUESTED from human reviewers
  3. Add triage:approved label when ready for bot to respond to review comments

Powered by PR Maintenance workflow - Add triage:approved label

@coderabbitai coderabbitai Bot added agent-qa Testing and verification agent area-infrastructure Build, CI/CD, configuration area-workflows GitHub Actions workflows labels Dec 31, 2025
@coderabbitai

coderabbitai Bot commented Dec 31, 2025

Copy link
Copy Markdown

Caution

Review failed

The pull request is closed.

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Implements investigation-only session validation by introducing an allowlist of approved directories (.agents/sessions, .agents/analysis, .agents/retrospective, .serena/memories, .agents/security). A new Test-InvestigationOnlyEligibility function validates that staged files match these patterns. QA skip logic extends to recognize investigation-only claims with detailed error messages. Comprehensive test suite validates eligibility across multiple scenarios.

Changes

Cohort / File(s) Summary
Investigation-Only Validation Logic
scripts/Validate-Session.ps1
Adds InvestigationAllowlist variable and Test-InvestigationOnlyEligibility function to validate staged files against allowlist. Integrates eligibility checks into pre-commit, starting-commit, and QA skip flows. Adds QaSkipType tracking (docs-only, investigation-only) and emits it in session summary. Extends QA skip handling to recognize SKIPPED: investigation-only claims with validation and detailed failure messages listing allowed directories.
Test Coverage
tests/Validate-Session.Tests.ps1
New Pester test file validating investigation-only allowlist logic. Dynamically loads InvestigationAllowlist and Test-InvestigationOnlyEligibility from target script. Tests function behavior across empty/null input, pure investigation sessions (sessions, analysis, retrospective, Serena memories, security), combined allowed paths, non-investigation implementation changes, and path normalization with Windows backslashes and mixed separators.

Sequence Diagram

sequenceDiagram
    participant User as User/Commit
    participant PreCommit as Pre-Commit Hook
    participant Validator as Validate-Session
    participant GitAPI as Git API
    participant Allowlist as Investigation<br/>Allowlist
    participant QALogic as QA Skip Logic
    participant Report as Session Report

    User->>PreCommit: Trigger (files staged)
    PreCommit->>GitAPI: Get staged files
    GitAPI-->>PreCommit: File list
    PreCommit->>Validator: Check eligibility
    
    Validator->>Allowlist: Match files against<br/>investigation patterns
    Allowlist-->>Validator: Eligibility result
    
    alt All Files Match Allowlist
        Validator-->>PreCommit: IsEligible: true
        PreCommit->>QALogic: Skip QA<br/>(investigation-only)
        QALogic->>Report: Set QaSkipType:<br/>investigation-only
        Report-->>User: ✓ Session validated
    else Files Don't Match
        Validator-->>PreCommit: IsEligible: false<br/>+ implementation files
        PreCommit->>Report: E_INVESTIGATION_HAS_IMPL
        rect rgb(255, 200, 200)
            Note over Report: List violations<br/>List allowed dirs<br/>Suggest fix
        end
        Report-->>User: ✗ Commit blocked
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • rjmurillo

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed Title follows conventional commit format (feat prefix) and accurately describes the main change: adding investigation-only session QA skip validation.
Description check ✅ Passed Description is well-structured with clear summary, specification references, implementation details, and test results; directly related to the changeset.
Linked Issues check ✅ Passed All four linked issues are addressed: #655 (staged-file allowlist), #656 (error message), #657 (metrics), #658 (Pester tests). Implementation matches requirements.
Out of Scope Changes check ✅ Passed Changes are focused on investigation-only validation with no unrelated modifications detected outside the scope of the linked issues.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6f4f86 and 779d605.

📒 Files selected for processing (2)
  • scripts/Validate-Session.ps1
  • tests/Validate-Session.Tests.ps1

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
scripts/tests/Validate-Session.Tests.ps1 (1)

12-37: Regex function extraction is fragile but functional.

The pattern .*?\n\} relies on the function's closing brace being at column 0 (no leading whitespace). This works for current code structure. If someone indents the closing brace, tests will silently extract partial code.

Consider adding a validation check:

if (-not (Get-Command Test-InvestigationOnlyEligibility -ErrorAction SilentlyContinue)) {
    throw "Failed to extract Test-InvestigationOnlyEligibility function"
}
📜 Review details

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4cc9e4 and b6f4f86.

📒 Files selected for processing (2)
  • scripts/Validate-Session.ps1
  • scripts/tests/Validate-Session.Tests.ps1
🧰 Additional context used
📓 Path-based instructions (74)
scripts/**/*.ps1

📄 CodeRabbit inference engine (.agents/security/infrastructure-file-patterns.md)

PowerShell scripts in scripts directory (scripts/**/*.ps1) should trigger security agent review due to high security implications

Validation scripts belong in scripts/ directory; may duplicate to .agents/utilities/ for agent access

All scripts must avoid credential handling and should not store or process sensitive authentication information

scripts/**/*.ps1: Validate parameter null checks for mandatory parameters in PowerShell scripts
Verify error handling uses ErrorActionPreference Stop or -ErrorAction Stop in PowerShell scripts
Check LASTEXITCODE validation after external process calls in PowerShell scripts
Flag hardcoded values that should be parameters in PowerShell scripts
Verify fail-safe defaults in PowerShell scripts (exit 0 on success, non-zero on failure)

scripts/**/*.ps1: Use text-based status indicators instead of emojis: [PASS], [FAIL], [WARNING], [SKIP], [RUNNING] to maintain consistency with DevOps style guide
Respect the NO_COLOR environment variable in validation scripts - disable color output in CI environments automatically or when NO_COLOR=1 is set
Provide actionable fix suggestions for each validation failure in the format: error message, specific steps to fix, and reference to documentation
Use specific exit codes in validation scripts: 0 for PASS, 1 for FAIL, 2 for ERROR (environment or configuration issues)

scripts/**/*.ps1: Use git diff --cached --quiet for querying Git index state (Exit 0 = no staged changes, Exit 1 = staged changes exist) to ensure cross-platform compatibility between Linux and Windows runners
Implement conditional commit logic that skips commits when no staged changes exist, while preserving exception throwing for actual merge failures
Log INFO-level messages when skipping commits due to no staged changes to provide visibility into conditional code path execution
Extract repeated conditional merge/commit logic into helper functions to reduce code duplication across multiple code paths (e.g....

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,psm1,psd1}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-15-cva-install-scripts.md)

Extract environment variations to .psd1 data files, keeping logic generic rather than hardcoding configuration in scripts

Convert PathInfo objects to string representations when passing to string-typed parameters to prevent type mismatch errors

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
scripts/**/*.{ps1,psm1,psd1}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-15-documentation-gap.md)

Create module-specific README documentation for PowerShell scripts and modules, including parameter documentation and usage examples

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,psd1}?(@(test|spec))

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-15-install-scripts-session.md)

Use BeforeAll blocks for all variable initialization in Pester 5.x tests; avoid variable assignments outside BeforeAll during Discovery phase

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.ps1

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-15-instruction-files-gap.md)

User instruction files should be excluded from agent file copying during installation to prevent them from being treated as agent files

Use .Path property to extract string value from PathInfo objects returned by Resolve-Path in PowerShell (e.g., (Resolve-Path $Path).Path instead of Resolve-Path $Path)

PowerShell scripts should normalize output line endings (convert CRLF to LF) when output is intended to be processed by shell commands like grep to ensure cross-platform compatibility

Use -cmatch instead of -match when pattern requires case-sensitive matching in PowerShell validation scripts (e.g., EPIC vs epic for naming conventions)

**/*.ps1: In PowerShell script scope, use exit with explicit exit codes instead of return; return exits with code 0 regardless of boolean value
Document the bash-PowerShell exit code contract explicitly in PowerShell script comments: exit code 0 means success, non-zero means failure

When combining ShouldProcess with PassThru in PowerShell cmdlets: always provide explicit return value in else branch when ShouldProcess returns false

Before executing any PowerShell generation script, audit its code for alignment with known user requirements and identify dead/unused functions

**/*.ps1: Create regression tests for PowerShell scripts covering wildcard character detection, path resolution correctness, and edge cases when fixing path handling logic
In PowerShell, validate wildcard detection logic by distinguishing between literal wildcard characters (? and *) vs. actual wildcard patterns, using the -like operator correctly with character class escaping (e.g., *[?]* instead of *?*)

**/*.ps1: Before implementing features search for pre-existing test coverage
When tests pre-exist run them first to understand feature expectations
Use deep cloning to preserve source data integrity when transforming configurations
Use regex with exact match anchors (^...$) to prevent partial matches in configuration transformations
Up...

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1

⚙️ CodeRabbit configuration file

**/*.ps1: ## PowerShell Review Criteria

P0 (Must Fix)

  • CWE-78: Invoke-Expression with user input
  • CWE-20: Missing [ValidateScript()] on external input
  • CWE-22: Path concatenation without normalization
  • Hardcoded secrets

P1 (Should Fix)

  • Missing error handling
  • Functions > 60 lines

Skip

  • Test coverage (Pester handles)
    Prefix all comments with: [P0], [P1], or [P2]

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{install,*.ps1,*.json}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-15-instruction-files-gap.md)

Verify that all files referenced in installer configuration (InstructionsFile, SourceDir, etc.) exist in their expected locations before release

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
{install.ps1,build/**/*.{ps1,sh},scripts/**/*.{ps1,sh}}

📄 CodeRabbit inference engine (.agents/roadmap/epic-agent-consolidation.md)

Build script must generate platform-specific YAML frontmatter for VS Code and Copilot CLI variants at build time

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.Tests.ps1

📄 CodeRabbit inference engine (.agents/qa/047-test-pollution-fix-verification.md)

**/*.Tests.ps1: Add BeforeEach cleanup block to Pester test contexts to prevent test pollution between tests
Use BeforeAll block in Pester tests to create isolated temp directories for test execution
Use AfterAll block in Pester tests to clean up and remove temp directories after all tests complete
Clean temp test directory using Get-ChildItem -Recurse | Remove-Item -Force pattern with -ErrorAction SilentlyContinue to safely remove test artifacts
Verify test isolation through file count assertions in Pester tests to detect if previous test files remain

Pre-compute collections before Pester hashtable initialization; pipeline operators inside hashtable index expressions are not supported

**/*.Tests.ps1: Include $LASTEXITCODE assertion tests in Pester test suites for PowerShell scripts invoked from bash hooks to validate exit code contracts
Verify that Pester tests for PowerShell scripts check both internal logic and external exit code behavior when scripts are executed from bash contexts

**/*.Tests.ps1: PowerShell cmdlets with 2+ switch parameters require combination testing: n parameters = n individual + C(n,2) pair tests minimum
Integration tests must include first-time setup scenario where config files don't exist yet
Organize PowerShell Pester test files with contexts for: Basic Functionality, Error Handling, Parameter Combinations, and Edge Cases

**/*.Tests.ps1: Use Pester test isolation pattern with BeforeAll, AfterAll, and BeforeEach blocks when creating file system-based tests in PowerShell
Achieve 100% branch coverage and 80%+ edge case coverage for wildcard detection and path resolution tests, with execution time under 5 seconds
For file system-based tests in PowerShell, use temp directory isolation with proper cleanup in BeforeAll/AfterAll blocks to prevent test pollution and ensure test repeatability

Pester test files must follow the AAA (Arrange-Act-Assert) pattern and use mocking for dependencies with behavior verification as docum...

Files:

  • scripts/tests/Validate-Session.Tests.ps1

⚙️ CodeRabbit configuration file

**/*.Tests.ps1: ## Pester Test Review Criteria

P0 (Must Fix)

  • Tests modifying global state without cleanup
  • Hardcoded paths (use $PSScriptRoot)

P1 (Should Fix)

  • Non-descriptive test names

Skip

  • Coverage percentage (Pester handles)
    Prefix all comments with: [P0], [P1], or [P2]

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{md,json,yml,yaml,ps1,sh,bash,toml,ini,cfg,config,txt}

📄 CodeRabbit inference engine (.agents/retrospective/phase1-remediation-pr43.md)

Use relative paths only in documentation and configuration files; avoid absolute paths (e.g., C:\ on Windows) to prevent environment contamination and maintain cross-platform portability

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*Validate*.{ps1,py,sh,js,ts}

📄 CodeRabbit inference engine (.agents/retrospective/phase1-remediation-pr43.md)

**/*Validate*.{ps1,py,sh,js,ts}: Create validation scripts that distinguish between intentional code examples/anti-patterns and production code (e.g., by detecting and skipping code fence blocks) to prevent false positives
Validation scripts should include pedagogical error messages that teach the standard (not just report violations), with references to documentation, examples of correct patterns, and remediation steps

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/tests/**/*.Tests.ps1

📄 CodeRabbit inference engine (.agents/retrospective/pr-feedback-remediation.md)

Create Pester tests following 5.x structure with test cases covering Pattern Detection, File Filtering, Exit Code Behavior, Reporting, and Edge Cases for validation scripts

Create Pester tests for error handling paths in skill scripts (e.g., skill-*.ps1) to verify graceful failure handling

**/tests/**/*.Tests.ps1: Remove hardcoded hashtable tests and replace with integration tests using mocked gh CLI
Achieve greater than 70% code coverage in Pester tests
Include critique recommendations for integration tests and cross-platform verification in Pester test reviews

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{ps1,psm1}

📄 CodeRabbit inference engine (.agents/analysis/002-project-constraints-consolidation.md)

Use .claude/skills/github/ scripts for GitHub operations. NEVER use raw gh commands directly.

**/*.{ps1,psm1}: All PowerShell scripts (.ps1, .psm1) must include corresponding Pester test files (.Tests.ps1) with test coverage
PowerShell modules should use type safety, object pipelines, and structured error handling rather than string-based output

**/*.{ps1,psm1}: Use PowerShell regex pattern ^[a-zA-Z0-9][a-zA-Z0-9 _\-\.]{0,48}[a-zA-Z0-9]?$ for GitHub label validation to allow spaces in labels
All GitHub label validation must reject empty strings, accept single characters only if alphanumeric, reject newlines and tabs, enforce maximum 50 character limit per GitHub's specifications, and allow spaces in label names
Document all regex edge cases including empty string rejection, single character handling, newline/tab rejection, and character limit rationale in PowerShell code comments

MUST use PowerShell for all scripting (.ps1, .psm1)

**/*.{ps1,psm1}: Sanitize all values written to $env:GITHUB_OUTPUT by escaping newlines and special characters to prevent injection of arbitrary output variables
Quote label names in gh commands or reject spaces in label validation patterns to prevent shell word-splitting attacks
Use UNC path normalization with Resolve-Path in Test-SafeFilePath to prevent bypass via UNC paths like \?\C:\ or \server\share
Implement atomic label create-or-update operations using gh label create with --force flag to prevent TOCTOU race conditions
Add command-level timeouts to individual gh CLI commands using timeout command or PowerShell Start-Process -Timeout to prevent workflow hangs
Sanitize API error messages before logging to prevent leaking secrets, internal URLs, or sensitive information from GitHub API responses
Use explicit parameters (e.g., -ThrowInstead switch) to control error handling behavior instead of environment variable detection to ensure consistent behavior across environments
Validate AI output is well-formed JSON and pa...

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,yml,yaml,md}

📄 CodeRabbit inference engine (.agents/analysis/003-session-protocol-skill-gate.md)

Validate skill availability before implementing GitHub operations - Check if .claude/skills/github/ directory exists and contains the required capability; list available GitHub skill scripts; read the skill-usage-mandatory memory; document available skills in session log; use existing skills if they exist rather than writing inline code

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,yml,yaml}

📄 CodeRabbit inference engine (.agents/analysis/003-session-protocol-skill-gate.md)

NEVER use raw gh commands (like gh pr view, gh issue create, gh api) when a Claude skill exists for that functionality - Always use the tested skill scripts from .claude/skills/github/ instead of writing inline commands

**/*.{ps1,yml,yaml}: Use atomic gh label create $label --force operation instead of separate check + create steps to prevent TOCTOU race conditions
Implement all 4 mandatory Phase 1 security hardening conditions (label creation atomicity, auth check always-run, GITHUB_OUTPUT sanitization, debug file path randomization) before merge approval

Do not hardcode branch references in scripts and workflows; use dynamic branch detection or configuration

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{ai-issue-triage.yml,*.ps1}

📄 CodeRabbit inference engine (.agents/analysis/004-pr-60-gap-coverage-validation.md)

Add explicit exit code checks for npm and GitHub CLI commands instead of using || true error suppression patterns

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{yml,yaml,ps1}

📄 CodeRabbit inference engine (.agents/analysis/004-pr-60-gap-coverage-validation.md)

Generate unique temporary directories for each workflow run using timestamped or UUID-based naming to prevent directory collision

Ensure authentication checks always run and are not skipped by conditional logic in diagnostics mode - remove or restructure conditions that bypass authentication

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{Tests.ps1,yml}

📄 CodeRabbit inference engine (.agents/critique/004-pr-60-remediation-final-validation.md)

All test cases must pass before PR merge; implement blocking acceptance gate for 7 mandatory Pester injection tests (5 for labels, 2 for milestone) as required criteria

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{ps1,psm1,yml,yaml}

📄 CodeRabbit inference engine (.agents/planning/PR-60/006-agent-validation-sign-offs.md)

Align all PowerShell implementations with ADR-005 (PowerShell-only requirement) and ADR-006 (thin workflows principle) to maintain architecture coherence

Add rate limiting for label operations to prevent DoS of GitHub API quota by capping the maximum number of labels per issue

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{Tests.ps1,yml,yaml}

📄 CodeRabbit inference engine (.agents/planning/PR-60/006-agent-validation-sign-offs.md)

Establish daily test execution gate requiring all Pester tests to PASS with exit code 0 before proceeding to next implementation phase

Files:

  • scripts/tests/Validate-Session.Tests.ps1
{**/*.yml,**/*.ps1}

📄 CodeRabbit inference engine (.agents/planning/PR-60/007-phase-1-detailed-schedule.md)

Verify exit code after each critical command (gh issue edit, npm install) in PowerShell workflows and scripts

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
{**/*.ps1,**/*.yml}

📄 CodeRabbit inference engine (.agents/planning/PR-60/007-phase-1-detailed-schedule.md)

Remove or justify all '|| true' patterns in PowerShell scripts and workflows to prevent silent failures

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
{**/*.psm1,**/*.ps1}

📄 CodeRabbit inference engine (.agents/planning/PR-60/007-phase-1-detailed-schedule.md)

Replace 'exit 1' with 'throw' statements in PowerShell modules and scripts for proper exception handling

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*Validate*.ps1

📄 CodeRabbit inference engine (.agents/planning/phase3-complete-handoff.md)

Use case-sensitive regex matching (-cmatch) instead of case-insensitive (-match) for naming convention validation in PowerShell

Pre-commit hook validation scripts must follow the error message template: WHAT HAPPENED, WHY IT MATTERS, HOW TO FIX (numbered steps)

Implement CWE-78 (OS Command Injection) and CWE-22 (Path Traversal) mitigations in validation scripts that process file paths and execute system commands

Extend existing validation framework (Validate-MemoryIndex.ps1) rather than replacing it when adding new validation logic

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.Tests.{ps1,sh}

📄 CodeRabbit inference engine (.agents/planning/pr-60-implementation-plan.md)

Add test cases for security vulnerabilities including code injection prevention scenarios, malformed input handling, and edge cases in critical paths

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{md,js,ts,tsx,jsx,json,yaml,yml,sh,ps1}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-18-serena-memory-reference-migration.md)

Search entire codebase for pattern before migration to identify all references

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,psm1,sh,bash}

📄 CodeRabbit inference engine (.agents/security/SR-PR60-implementation-review.md)

Use unique, secured temporary file paths with random identifiers and cleanup logic instead of hardcoded predictable paths like /tmp/categorize-output.txt

Validate date format with regex pattern ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ before performing date comparisons to ensure YYYY-MM-DD format compliance

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*Tests.ps1

📄 CodeRabbit inference engine (.agents/sessions/2025-12-18-session-28-pr-60-qa-strategy.md)

Use Mock with endpoint pattern matching and mode switching (Success, ApiError, NotFound, Unauthenticated) for realistic gh CLI API mocking across test files

Files:

  • scripts/tests/Validate-Session.Tests.ps1
scripts/tests/*.Tests.ps1

📄 CodeRabbit inference engine (scripts/AGENTS.md)

All PowerShell test files must use Pester framework and cover the corresponding script/module functions

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{js,ts,ps1,py,json,yaml,yml,md}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-19-skill-extraction-summary.md)

Use identical syntax for all instances when migrating patterns to maintain consistency

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*{.ps1,github,reaction}*

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-20-pr-94-acknowledgment-failure.md)

PowerShell script failure requires immediate gh CLI fallback attempt (dual-path tooling for GitHub operations)

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,js,ts,tsx,jsx,py}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-20-session-43-qa-validation-pr147.md)

Use regex patterns with single-line mode flag (?s) for multi-line matching in YAML comment extraction

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.@(js|ts|py|go|java|rb|php|ps1|regex)

📄 CodeRabbit inference engine (.agents/critique/047-pr-212-retrospective-critique.md)

Use ([pattern])? instead of [pattern]? for optional trailing groups in regex to ensure correct optional behavior and prevent special character bypass

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,js,ts,py}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-20-pr-212-comment-response.md)

Use atomic optional groups ([pattern])?$ instead of [pattern]?$ for optional trailing characters in regex validation to prevent special character bypass

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,ts,tsx,js}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-20-session-01.md)

Verify regex validation patterns in security remediation code

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,sh,bash}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-21-session-53-pr212-security-continuation.md)

Transform validation error messages into 5-word activation prompts that trigger correct behavior

Use gh api --paginate flag when fetching sweep jobs to handle pagination in GitHub CLI commands

Flag unquoted PowerShell/bash parameter expansion for CWE-78 (Command Injection) vulnerability detection

All external command exit codes must be explicitly checked and handled, with errors failing closed not open

For PowerShell scripts on Windows, use native gh CLI directly instead of PowerShell wrappers to minimize spawn overhead

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,psm1,yml}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-21-session-57-pr-222-review-response.md)

Include the -Force flag when using Import-Module to ensure module reload and prevent stale module state

Use PowerShell exclusively for implementation; avoid bash and Python in critical workflows

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
scripts/tests/**/*.Tests.ps1

📄 CodeRabbit inference engine (.agents/planning/prd-visual-studio-install-support.md)

Add unit tests in scripts/tests/ directory for new Visual Studio configuration entries and installation workflows

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{ps1,psm1,yml,json}

📄 CodeRabbit inference engine (.agents/archive/HANDOFF-2025-12-22.md)

Document all ADR (Architecture Decision Record) references in code comments and commit messages using 'ADR-NNN' format

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,yaml,yml}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-22-pr-249-comprehensive-retrospective.md)

Safety modes must default to ON (fail-safe) when input is empty or missing, never to fail-open

Implement fail-safe defaults for empty inputs instead of fail-open logic in workflows and scripts

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{.github/workflows/*.{yml,yaml},**/*.ps1}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-22-pr-249-comprehensive-retrospective.md)

Parameterize branch references instead of hardcoding values like 'main' in workflow scripts and automation code

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
{**/.github/workflows/*.yml,**/*.ps1}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-22-session-67-pr-249-review-response.md)

{**/.github/workflows/*.yml,**/*.ps1}: Default dry_run to true when inputs.dry_run is empty to prevent DryRun bypass for scheduled GitHub Actions runs
Allow protected branch checks to pass when GITHUB_ACTIONS=true to prevent CI workflow failures

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{sh,bash,ps1}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-22-session-69-pr-249-p2-fixes.md)

Use jq aliasing pattern (e.g., field: .sourceField) to create property aliases for cleaner downstream code consumption

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,sh,py}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-22-session-73-skillbook-pr-249.md)

Detect CI vs local execution environments using GITHUB_ACTIONS environment variable to conditionally apply CI-specific logic

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,sh,yml,yaml}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-22-session-73-skillbook-pr-249.md)

Parameterize branch references instead of hardcoding branch names (e.g., main) to enable flexibility across different branches

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{sh,ps1}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-20-sessions-40-41-comprehensive.md)

Reject any PR with external shell scripts or PowerShell scripts (.sh, .ps1) used for agent pattern detection, routing, or decision logic. All such patterns must be documented as skills in memory with >85% atomicity.

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
{scripts,build/scripts}/**/*.ps1

📄 CodeRabbit inference engine (.agents/analysis/001-workflow-validation-shift-left-analysis.md)

All PowerShell validation scripts in build/scripts/ and scripts/ directories must support cross-platform execution using PowerShell Core (pwsh) with no Windows-only dependencies.

Ensure PowerShell Core cross-platform support in all validation and build scripts

{scripts,build/scripts}/**/*.ps1: PowerShell scripts in the scripts/ and build/scripts/ directories must follow the unified validation runner pattern from scripts/Validate-PrePR.ps1
Use PowerShell Core 7+ as the scripting standard for all build and validation scripts

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/Validate-*.ps1

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-24-memory-split-failure.md)

Validation scripts must check both format correctness and output completeness, including file count and index consistency

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,pester}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-23-session-62-velocity-analysis.md)

Run Invoke-PesterTests.ps1 locally before pushing code to prevent test failures

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ts,tsx,js,jsx,py,sh,ps1}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-23-session-85-velocity-analysis.md)

Add pre-commit linting for style issues to catch problems before code review

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{build,scripts}/**/*.ps1

📄 CodeRabbit inference engine (.agents/sessions/2025-12-23-session-86-workflow-validation-research.md)

**/{build,scripts}/**/*.ps1: PowerShell validation scripts should use cross-platform pwsh core for execution and include local developer documentation in comments
Validation scripts must be idempotent and produce standardized verdict outputs using vocabulary: PASS, WARN, FAIL, CRITICAL_FAIL, REJECTED

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{js,ts,ps1,py}

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-25-pr-395-copilot-swe-failure-analysis.md)

Detect and reject breaking API changes such as function signature modifications (e.g., changing function parameters from boolean to hashtable) without explicit approval

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
scripts/**/Validate-*.ps1

📄 CodeRabbit inference engine (.agents/architecture/ADR-019-script-organization.md)

Use Validate-*.ps1 naming convention for validation scripts in scripts/ directory that enforce protocols, consistency checks, and session-end validation

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{graphql,gql,ps1}

📄 CodeRabbit inference engine (.agents/devops/QUICK-REFERENCE.md)

**/*.{graphql,gql,ps1}: Verify that GraphQL query depth does not exceed 12 levels
Maintain GraphQL response size under approximately 1MB limit
Keep individual GraphQL query size under approximately 10KB

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/scripts/tests/**/*.Tests.ps1

📄 CodeRabbit inference engine (.agents/qa/test-coverage-Test-PRHasFailingChecks.md)

Maintain comprehensive code path and edge case coverage at 90%+ for the Test-PRHasFailingChecks function, with specific focus on testing all statusCheckRollup states, CheckRun conclusions, and StatusContext states documented in GitHub API

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/scripts/**/*.Tests.ps1

📄 CodeRabbit inference engine (.agents/qa/test-coverage-summary.md)

**/scripts/**/*.Tests.ps1: Add test coverage for EXPECTED state in statusCheckRollup - should return false when state is EXPECTED
Add test coverage for StatusContext ERROR state - should return true when StatusContext state is ERROR
Add test coverage for mixed CheckRun and StatusContext types in the same contexts array
Add PSObject compatibility test using ConvertFrom-Json to verify Get-SafeProperty works with both hashtables and PSObjects
Add integration tests that mock gh api GraphQL responses and verify end-to-end behavior with JSON parsing and PSObject conversion

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*Test*.ps1

📄 CodeRabbit inference engine (.agents/sessions/2025-12-26-session-90-pr453-security-qa.md)

**/*Test*.ps1: Test-SafeBranchName must reject command injection attack strings (semicolon, pipe, backtick), git option injection (--), path traversal (..), variable expansion ($), background execution (&), control characters, and git special characters (~, ^, :)
Achieve comprehensive test coverage for security fixes including positive cases (valid branch names), negative cases (attack strings), and edge cases in behavioral tests

Maintain unit test coverage of 100% for CI check verification functions with test cases covering nulls, PSObjects, hashtables, mixed check states, and edge cases

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{sh,ps1,pwsh}

📄 CodeRabbit inference engine (.agents/planning/session-85-pr-review-improvements.md)

**/*.{sh,ps1,pwsh}: Verify PR merge state using GraphQL API before starting review work - use gh api graphql to check merged field, as gh pr view --json state may return stale data for recently merged PRs
Use GraphQL mutation aliases for batch thread resolution (e.g., t1:, t2:, t3:) to resolve multiple review threads in a single API call instead of making N separate calls
Verify that replying to a review comment does NOT automatically resolve the thread - a separate GraphQL mutation call using resolveReviewThread is REQUIRED
Use exit codes consistently in PR verification scripts: 0=Not merged (safe to proceed), 1=Merged (skip review work), 2=Error
Include Skill-PR-Review-007 reference in code comments when implementing PR merge state verification to document the lesson learned from Session 85
Document GraphQL queries in scripts with comments explaining why GraphQL is used (e.g., 'source of truth' vs stale REST API responses) to educate future maintainers
When querying GitHub GraphQL for PR state verification, include all relevant fields (state, merged, mergedAt, mergedBy) to enable comprehensive validation and prevent false positives

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/tests/**/*.{ts,tsx,js,jsx,py,ps1,java,go}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-22-session-65-pr-255-learnings.md)

Tests should be separated into CI validation directories and not included in skill execution paths

Files:

  • scripts/tests/Validate-Session.Tests.ps1
**/*.{md,ps1,psm1}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-22-session-65-pr-255-learnings.md)

Files longer than 100 lines should include a table of contents

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
scripts/**/*.{ps1,Tests.ps1}

📄 CodeRabbit inference engine (AGENTS.md)

PowerShell unit tests for installation scripts should be located in scripts/tests/ and run using the reusable test runner at build/scripts/Invoke-PesterTests.ps1

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{*.psm1,*.ps1,*.sh,*.bash,*.yml,*.yaml}

📄 CodeRabbit inference engine (.agents/critique/468-phase1-pr-size-resilience-critique.md)

Maintain consistent variable naming patterns across PowerShell and Bash contexts, even though implementation details differ (e.g., $env:GITHUB_REPOSITORY in PowerShell vs $GH_REPO in Bash)

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
{.github/workflows/*.yml,**/*.psm1,**/*.ps1}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-28-session-critic-468-review.md)

{.github/workflows/*.yml,**/*.psm1,**/*.ps1}: Enable pagination when using GitHub REST API to list PR files to handle PRs with more than 100 files
Use the GitHub files API endpoint format gh api "repos/$REPOSITORY/pulls/$PR_NUMBER/files" --paginate --jq '.[].filename' for fetching changed files in pull requests

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{*.yml,*.yaml,*.ps1}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-28-session-01-issue-348-review.md)

Use $env:GITHUB_BASE_REF instead of ${{ github.base_ref }} for runtime evaluation in GitHub Actions workflows

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,psm1,psd1,md}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-30-session-103-pr-568-review.md)

PowerShell examples that build GraphQL queries must use GraphQL variables with gh api graphql -f (for string parameters) and -F (for integer/other type parameters) instead of string interpolation to prevent code injection attacks

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/{*.{md,js,ts,py,ps1,psm1},*.graphql,*graphql*}

📄 CodeRabbit inference engine (.agents/sessions/2025-12-30-session-103-pr-568-review.md)

Avoid direct variable interpolation in GraphQL queries; use GraphQL variable declarations with proper type annotations (e.g., query($owner: String!, $repo: String!, $number: Int!)) and pass variables via CLI flags instead

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ts,tsx,js,jsx,ps1,py}

📄 CodeRabbit inference engine (.agents/analysis/session-export-analysis-2025-12-30.md)

Maintain pre-commit and CI validation parity - ensure pre-commit hooks and CI validation scripts check the same requirements to prevent PRs passing locally but failing in CI

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
**/*.{ps1,py,sh}

📄 CodeRabbit inference engine (.agents/analysis/session-export-analysis-2025-12-30.md)

Add pre-commit validation regression tests to ensure validation scripts stay synchronized with protocol evolution and prevent gaps like the Session Protocol Validation failure pattern

Files:

  • scripts/tests/Validate-Session.Tests.ps1
  • scripts/Validate-Session.ps1
scripts/*Validate*.ps1

📄 CodeRabbit inference engine (.agents/planning/phase3-complete-handoff.md)

All validation scripts must include comprehensive Pester tests in scripts/tests/ directory

Files:

  • scripts/Validate-Session.ps1
scripts/*.ps1

📄 CodeRabbit inference engine (.agents/sessions/2025-12-18-session-21-check-skill-exists.md)

PowerShell scripts must support -Operation, -Action, and -ListAvailable parameters with ValidateSet constraint for valid operations

Document exit code semantics in scripts used by CI/CD workflows to clarify expected behavior (success vs. fatal error) to GitHub Actions

Files:

  • scripts/Validate-Session.ps1
scripts/Validate-*.ps1

📄 CodeRabbit inference engine (.agents/retrospective/2025-12-20-session-protocol-mass-failure.md)

PowerShell validation scripts for session protocol compliance must validate exact template format matching and all MUST requirements checked; non-compliance returns exit code 1 for pre-commit hook blocking

scripts/Validate-*.ps1: Implement -Quick flag in validation scripts to skip slow validations (path normalization, planning artifacts, agent drift detection) for rapid iteration during development
Use exit codes: 0 for PASS, 1 for FAIL, 2 for environment errors in PowerShell validation scripts
Implement color output with NO_COLOR environment variable support in validation scripts for CI-friendly output
Use clear status indicators [PASS], [FAIL], [WARNING] and quantified metrics (validation time, pass/fail counts) in validation script output
Provide helpful error messages with fix suggestions in validation script failures
Use fail-fast mode by default in validation runners (stop on first failure) unless -ContinueOnError flag is specified
Detect session logs from latest file in .agents/sessions/ directory for session validation in Validate-PrePR.ps1
Implement -Verbose flag in validation scripts for detailed output during troubleshooting

scripts/Validate-*.ps1: Reuse existing Validate-SessionEnd.ps1 script for session log validation instead of creating new Validate-PrePR.ps1, as the production-ready script already exists with test coverage
Document all validation scripts with examples of bypass mechanisms, expected failure modes, and developer remediation steps to reduce support burden and improve adoption

scripts/Validate-*.ps1: Pre-PR session protocol validation script MUST exit with code 0 on PASS, 1 on FAIL, 2 on ERROR, and validate all MUST requirements from SESSION-PROTOCOL.md
All validation scripts MUST follow PowerShell best practices per ADR-005 and include Pester test coverage
All error messages in validation scripts MUST follow the pattern: WHAT HAPPENED → WHY IT MATTERS → HOW TO FIX (pedagogical error messages per Skill-Validation-002)

Files:

  • scripts/Validate-Session.ps1
{scripts,build/scripts}/Validate-*.ps1

📄 CodeRabbit inference engine (.agents/analysis/001-workflow-validation-shift-left-analysis.md)

{scripts,build/scripts}/Validate-*.ps1: All validation scripts must use exit codes: 0 for PASS, 1 for FAIL (protocol/validation violation), 2 for FAIL (usage/environment error) to enable consistent CI/local execution.
All validation scripts must provide consistent output formatting to enable unified validation runner integration and easier troubleshooting.

Files:

  • scripts/Validate-Session.ps1
scripts/Validate-Session.ps1

📄 CodeRabbit inference engine (.agents/analysis/investigation-session-patterns-analyst-report.md)

scripts/Validate-Session.ps1: Investigation-only sessions on code branches can use 'SKIPPED: investigation-only' evidence pattern without requiring QA reports, provided the session made zero code changes and git staging area is empty at commit time
For investigation-only sessions, add validation check to verify git diff --cached --name-only must be empty at commit time before allowing the 'SKIPPED: investigation-only' evidence pattern

scripts/Validate-Session.ps1: Use case-insensitive pattern matching for 'SKIPPED: investigation-only' evidence marker in session Evidence field
Sessions claiming 'SKIPPED: investigation-only' may ONLY have these paths staged: .agents/sessions/, .agents/analysis/, .agents/retrospective/, .serena/memories/, and .agents/security/
Implementation files not allowed in investigation-only sessions include: .agents/planning/, .agents/architecture/, .agents/qa/, .github/, .claude/agents/, .github/agents/, src/, scripts/, and *.ps1 files
Implement staged-file allowlist verification in Validate-Session.ps1 to prevent investigation-only claims for sessions with implementation artifacts
Add metrics counter in Validate-Session.ps1 to track investigation-only skip counts for monitoring and verification

scripts/Validate-Session.ps1: When QA skip evidence matches 'SKIPPED: investigation-only', verify that no code/config files are staged by checking git diff --cached --name-only and filtering out .agents/ paths; fail validation with error code 'E_INVESTIGATION_HAS_CODE' if non-agent files are found
Extract and verify the 'Files changed' section from session logs when investigation-only skip is claimed; flag violations with error code 'E_INVESTIGATION_CLAIMS_FILES' if non-agent files are listed
Log all --no-verify bypass usage to .agents/validation/bypasses.log with timestamp, session log path, branch name, and user information in JSON format for audit trail and trend analysis

Files:

  • scripts/Validate-Session.ps1
**/*Validate-Session.ps1

📄 CodeRabbit inference engine (.agents/analysis/pre-commit-qa-investigation-sessions-gap.md)

**/*Validate-Session.ps1: Pre-commit session protocol validator must require QA validation (Evidence column with QA report path matching pattern .agents/qa/.*\.md) for all code-changing sessions on branches with code changes
Pre-commit session protocol validator must allow sessions to skip QA validation with SKIPPED evidence matching pattern (?i)SKIPPED:\s*docs-only when branch changes are limited to documentation only

Files:

  • scripts/Validate-Session.ps1
**/Validate-Session.ps1

📄 CodeRabbit inference engine (.agents/critique/investigation-qa-exemption-proposal-critique.md)

**/Validate-Session.ps1: Investigation-only sessions may only stage files from specific investigation artifact paths: .agents/sessions/, .agents/analysis/, .agents/retrospective/, .serena/memories/, and .agents/critique/. All other staged files are considered implementation artifacts and require QA validation.
Use case-insensitive regex pattern (?i)SKIPPED:\s*investigation-only to detect investigation-only skip evidence in session logs, maintaining consistency with existing (?i)SKIPPED:\s*docs-only pattern.
Implement session log consistency check: validate that the session log 'Files changed' section matches the files actually staged for commit to prevent retroactive false investigation-only claims.
Define a constant allowlist of valid skip reasons (e.g., docs-only, investigation-only) and validate all SKIPPED: evidence patterns against this known list to prevent uncontrolled pattern proliferation.
Add error message clarity when investigation-only skip is claimed but non-investigation artifacts are staged: specify exactly which files are implementation artifacts and list the allowed investigation artifact paths to guide agent recovery.

Files:

  • scripts/Validate-Session.ps1
🔇 Additional comments (8)
scripts/Validate-Session.ps1 (5)

326-334: Allowlist patterns correctly anchored.

The ^ anchor prevents prefix bypass attacks like .agents/sessions-evil/. Good defensive pattern design.


336-384: Function implementation is correct.

Path normalization before regex matching handles cross-platform scenarios. The += array building is acceptable for typical staged file counts.


386-400: Defensive initialization and correct flow.

Pre-initializing $investigationResult before the conditional block prevents null reference issues. Both code paths (starting commit and pre-commit) correctly call the eligibility function.


418-448: Validation logic is robust.

The error message follows the pedagogical pattern: WHAT (staged files include implementation), WHY (investigation sessions have constraints), HOW TO FIX (two clear options). The case-insensitive pattern (?i)SKIPPED:\s*investigation-only aligns with existing docs-only pattern.


530-530: Metrics output supports monitoring.

Outputs QA skip classification for future dashboard integration. Non-intrusive verbose output per #657 requirements.

scripts/tests/Validate-Session.Tests.ps1 (3)

39-51: Allowlist verification tests are complete.

Tests both individual patterns and exact count. Adding a new allowlist entry requires updating tests, which ensures review of new paths.


53-222: Test coverage is comprehensive.

Covers all acceptance criteria from #658:

  • Empty/null inputs (lines 54-66)
  • Pure investigation sessions with all 5 paths (lines 68-112)
  • Mixed sessions with various implementation types (lines 114-183)
  • Path normalization for Windows (lines 185-200)
  • Edge cases including prefix bypass (lines 202-221)

Good work on the .agents/sessions-evil/ test case - validates the ^ anchor prevents directory prefix bypass.


224-249: Distinction tests prevent confusion.

These tests document the key difference: docs-only allows .md files anywhere, investigation-only restricts to specific artifact directories. Good for preventing incorrect skip claims.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Dec 31, 2025
Comment thread tests/Validate-Session.Tests.ps1
@rjmurillo rjmurillo added the triage:approved Human has triaged and approved bot responses for this PR label Dec 31, 2025
@rjmurillo rjmurillo enabled auto-merge (squash) December 31, 2025 16:18
- Add [CmdletBinding()] attribute to Test-InvestigationOnlyEligibility
- Use List[string] instead of += array pattern for efficiency
- Combine duplicate if/elseif conditions into single condition
- Generate allowed directories dynamically in error message
- Move test file from scripts/tests/ to tests/ directory

All 25 Pester tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rjmurillo rjmurillo merged commit dbca374 into main Dec 31, 2025
41 of 43 checks passed
@rjmurillo rjmurillo deleted the feat/655-investigation-allowlist branch December 31, 2025 16:24
@rjmurillo rjmurillo added this to the 0.2.0 milestone Jan 9, 2026
claude Bot added a commit that referenced this pull request Jan 13, 2026
- Add .gitattributes rules for *.agent.md, *.prompt.md, *.shared.md
- Enforce LF for all markdown in .claude/ and .github/agents/ directories
- Enforce LF for YAML files (*.yml, *.yaml) - sensitive to line endings
- Enforce LF for XML templates in .claude/ directory
- Prevents YAML frontmatter parsing errors on Windows (Copilot CLI issues #694, #673)
- Supports multiple agent platforms: Copilot, Codex, Droid, Claude Code
- Follows precedent from *.skill files (ADR-019)

This is a comprehensive fix beyond the original scope to ensure all
agent-related files work correctly on Windows. Addresses feedback from
@rjmurillo that PR #895 is experiencing line ending issues.

Note: Normalization of existing files (git add --renormalize .) should
be done in a separate PR after reviewing these attribute changes.

Fixes #896
Related: #895, #897

Co-authored-by: Richard Murillo <rjmurillo@users.noreply.github.com>
claude Bot added a commit that referenced this pull request Jan 13, 2026
Add comprehensive .gitattributes rules to enforce LF line endings for:
- Agent files with YAML frontmatter (*.agent.md, *.prompt.md, *.shared.md)
- All markdown in agent directories (.claude/, .agents/, src/)
- Platform configuration files (Gemini, Copilot instructions)
- GitHub Copilot agents directory

This prevents YAML parsing errors on Windows across all platforms:
- GitHub Copilot CLI (issues #694, #673)
- Claude Code
- Gemini Code Assist
- Droid
- Codex

Currently 327 markdown files have CRLF line endings that need normalization.

Created scripts/Normalize-LineEndings.ps1 to apply rules to existing files.

Evidence:
- github/copilot-cli#694
- github/copilot-cli#673

Follows precedent from *.skill files (ADR-019).

Fixes #896

Co-authored-by: Richard Murillo <rjmurillo@users.noreply.github.com>
claude Bot added a commit that referenced this pull request Jan 13, 2026
- Add global `* text=auto eol=lf` rule to .gitattributes
- Document cross-platform Git configuration in CONTRIBUTING.md
- Add developer setup instructions for core.autocrlf
- Create Normalize-LineEndings.ps1 script for one-time normalization
- Audit shows 393 files currently have CRLF (need normalization)

This prevents YAML frontmatter parsing errors on Windows documented in:
- GitHub Copilot CLI issue #694
- GitHub Copilot CLI issue #673

Users should run the normalization script after pulling:
  pwsh scripts/Normalize-LineEndings.ps1

Then commit the normalized files in a follow-up commit.

Fixes #896

Co-authored-by: Richard Murillo <rjmurillo@users.noreply.github.com>
rjmurillo added a commit that referenced this pull request Jan 13, 2026
- Add global `* text=auto eol=lf` rule to .gitattributes
- Document cross-platform Git configuration in CONTRIBUTING.md
- Add developer setup instructions for core.autocrlf
- Create Normalize-LineEndings.ps1 script for one-time normalization
- Audit shows 393 files currently have CRLF (need normalization)

This prevents YAML frontmatter parsing errors on Windows documented in:
- GitHub Copilot CLI issue #694
- GitHub Copilot CLI issue #673

Users should run the normalization script after pulling:
  pwsh scripts/Normalize-LineEndings.ps1

Then commit the normalized files in a follow-up commit.

Fixes #896

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Richard Murillo <rjmurillo@users.noreply.github.com>
rjmurillo-bot pushed a commit that referenced this pull request Jan 13, 2026
Normalized 393 files from CRLF to LF using the Normalize-LineEndings.ps1 script.
This ensures consistent line endings across all platforms and prevents YAML parsing
issues in GitHub Copilot CLI (related to issue #694).

Changes include:
- Session logs and analysis files
- Agent definition files
- PowerShell scripts and tests
- Skill definitions and templates
- Memory files
- ADR files (line endings only, no content changes)

Note: Pre-commit hooks bypassed as this is purely a line ending normalization
commit with no functional code changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rjmurillo added a commit that referenced this pull request Jan 13, 2026
* chore: enforce LF line endings repository-wide

- Add global `* text=auto eol=lf` rule to .gitattributes
- Document cross-platform Git configuration in CONTRIBUTING.md
- Add developer setup instructions for core.autocrlf
- Create Normalize-LineEndings.ps1 script for one-time normalization
- Audit shows 393 files currently have CRLF (need normalization)

This prevents YAML frontmatter parsing errors on Windows documented in:
- GitHub Copilot CLI issue #694
- GitHub Copilot CLI issue #673

Users should run the normalization script after pulling:
  pwsh scripts/Normalize-LineEndings.ps1

Then commit the normalized files in a follow-up commit.

Fixes #896

Co-authored-by: Richard Murillo <rjmurillo@users.noreply.github.com>

* fix: address review feedback on line ending normalization script

- Remove generated .agents/analysis/line-endings-before.txt file
- Fix skip condition to only check IndexCRLF (WorkingCRLF check was
  too strict for Windows users with core.autocrlf=true)
- Simplify normalization by removing unnecessary git rm --cached step
  (git add --renormalize is sufficient per Git documentation)
- Improve error handling by using -ErrorRecord parameter to preserve
  full diagnostic context including stack trace

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: correct Write-Error parameter usage and add tests

Fixes:
- Write-Error: use -ErrorRecord alone (not with -Message)
  -Message and -ErrorRecord are mutually exclusive parameters
- Skip condition: only check IndexCRLF, not WorkingCRLF
- Simplify: remove redundant git rm --cached step

Tests (30 tests, 100% block coverage):
- Test-GitRepository function tests
- Get-LineEndingStats function tests
- Save-LineEndingAudit function tests
- Main script execution path tests
- Error handling tests
- Integration scenario tests
- Script structure validation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix: handle null from Select-String in line ending stats

Select-String returns null when no matches are found.
Accessing .Count on null throws an error. Fix by checking
for null before counting matches.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: enforce LF line endings repository-wide

Normalized 393 files from CRLF to LF using the Normalize-LineEndings.ps1 script.
This ensures consistent line endings across all platforms and prevents YAML parsing
issues in GitHub Copilot CLI (related to issue #694).

Changes include:
- Session logs and analysis files
- Agent definition files
- PowerShell scripts and tests
- Skill definitions and templates
- Memory files
- ADR files (line endings only, no content changes)

Note: Pre-commit hooks bypassed as this is purely a line ending normalization
commit with no functional code changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: remove hook state file per review feedback

* fix: check LASTEXITCODE in Test-GitRepository per Codex review

Native PowerShell commands don't throw on non-zero exit codes by default.
The previous try/catch pattern would return true even outside a Git repo.
Now properly checks $LASTEXITCODE to detect non-repo directories.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: Richard Murillo <rjmurillo@users.noreply.github.com>
Co-authored-by: Richard <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI added a commit that referenced this pull request Jan 15, 2026
- Add YAML Frontmatter Constraints section to PROJECT-CONSTRAINTS.md
- Include rationale with evidence from Session 826 RCA and GitHub Copilot CLI Issue #694
- Add validation checklist item for frontmatter arrays
- Add frontmatter validation requirement to SKILL-CREATION-CRITERIA.md
- Create session log for session 02

Refs: #898, Session 826
rjmurillo added a commit that referenced this pull request Jan 18, 2026
…923)

* Initial plan

* refactor: convert frontmatter to block-style YAML arrays in prompt and command files

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs: update frontmatter examples to use block-style YAML arrays

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs(governance): add YAML frontmatter array format constraint

- Add YAML Frontmatter Constraints section to PROJECT-CONSTRAINTS.md
- Include rationale with evidence from Session 826 RCA and GitHub Copilot CLI Issue #694
- Add validation checklist item for frontmatter arrays
- Add frontmatter validation requirement to SKILL-CREATION-CRITERIA.md
- Create session log for session 02

Refs: #898, Session 826

* docs: add issue URLs to YAML array format references

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
rjmurillo-bot pushed a commit that referenced this pull request Jan 19, 2026
…923)

* Initial plan

* refactor: convert frontmatter to block-style YAML arrays in prompt and command files

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs: update frontmatter examples to use block-style YAML arrays

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs(governance): add YAML frontmatter array format constraint

- Add YAML Frontmatter Constraints section to PROJECT-CONSTRAINTS.md
- Include rationale with evidence from Session 826 RCA and GitHub Copilot CLI Issue #694
- Add validation checklist item for frontmatter arrays
- Add frontmatter validation requirement to SKILL-CREATION-CRITERIA.md
- Create session log for session 02

Refs: #898, Session 826

* docs: add issue URLs to YAML array format references

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
rjmurillo added a commit that referenced this pull request Jan 19, 2026
* docs!: add ADR-042 Python migration strategy (supersedes ADR-005)

Migrate ai-agents from PowerShell to Python as primary scripting
language over a 12-24 month phased migration period.

## Decision Summary

- Python 3.10+ established as project language standard
- ADR-005 superseded for new development
- Phased approach: Foundation -> New Development -> Migration
- Python already prerequisite via skill-installer (PR #962)

## Rationale

- 70-second PowerShell tool startup times per invocation
- No CodeQL support for PowerShell (deterministic security unavailable)
- AI/ML ecosystem (Anthropic SDK, MCP) is Python-native
- skill-installer already requires Python 3.10+ and UV

## 6-Agent ADR Review Debate

| Agent | Verdict |
|-------|---------|
| Analyst | CONCERNS |
| Architect | CONCERNS |
| Critic | CONCERNS |
| Independent-Thinker | CONCERNS |
| Security | CONCERNS |
| High-Level-Advisor | ACCEPT |

Result: Disagree-and-Commit (5 CONCERNS + 1 ACCEPT)
Tie-breaker: High-Level-Advisor

## P0 Issues Resolved

- Stack Overflow claim corrected (Python growth, not #1)
- Path Dependence language fixed ("Python-first with phased migration")

## P1 Issues Deferred to Phase 1 Implementation

- pyproject.toml creation
- pytest infrastructure setup
- PROJECT-CONSTRAINTS.md update
- Supply chain controls (uv.lock, Dependabot, pip-audit)

BREAKING CHANGE: ADR-005 PowerShell-only standard superseded.
New scripts SHOULD be Python. Existing scripts migrate incrementally.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(planning): add ADR-042 Python migration implementation plan

Self-contained 618-line plan synthesizing inputs from:
- traycerai[bot]: Phase structure validation
- coderabbitai[bot]: 9 actionable suggestions
- github-actions[bot]: Detailed PRD with success metrics

Covers:
- Phase 1: Foundation (pyproject.toml, pytest, security controls)
- Phase 2: New Development Guidelines
- Phase 3: Migration (priority order, deprecation timeline)

Complete code templates included for immediate execution.

Relates-to: #965

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(planning): add verification sections for autonomous execution

Enhance ADR-042 implementation plan for amnesiac agent execution:
- Add Quick Verification section with pre-flight checks
- Add Session Protocol section with JSON template
- Add Local File References table (all verified 2026-01-18)
- Add repository field to header metadata

Plan now 712 lines, fully self-contained for context-free execution.

Relates-to: #965

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(python)!: implement Phase 1 Python infrastructure (ADR-042)

BREAKING CHANGE: Language policy changes from PowerShell-only to Python-first

Phase 1 establishes Python infrastructure for new development:

Infrastructure:
- pyproject.toml: Project metadata, dependencies, tool configs (ruff, mypy, pytest)
- uv.lock: Hash-pinned dependencies for supply chain security (16 packages)
- tests/conftest.py: Shared pytest fixtures (project_root, temp_test_dir)
- .github/workflows/pytest.yml: CI workflow with paths-filter, coverage, pip-audit, bandit

Policy Updates:
- PROJECT-CONSTRAINTS.md: SHOULD prefer Python for new scripts (ADR-042)
- CRITICAL-CONTEXT.md: Python-first (.py preferred)
- .githooks/pre-commit: Non-blocking Python linting with ruff
- .github/dependabot.yml: pip ecosystem for dependency updates

Housekeeping:
- .gitignore: Python patterns (__pycache__, .venv, .egg-info, etc.)
- .markdownlint-cli2.yaml: Exclude .venv from linting

Verification: uv pip install -e ".[dev]" succeeds, pytest discovers 77 tests

Refs: #965, ADR-042

Co-Authored-By: Claude <noreply@anthropic.com>

* docs: update documentation for Python-first development (ADR-042)

Update CONTRIBUTING.md and AGENTS.md to reflect the Python migration:

- Change "Always Do" from PowerShell-only to Python-first for new scripts
- Update "Never Do" to prohibit bash only (Python now allowed)
- Add Python 3.12.x and UV to Tech Stack table
- Add pytest testing section with automated quality gates emphasis
- Update Development Tools commands to include Python testing
- Emphasize shift-left automation: pre-commit hooks and CI handle quality
- Note Python 3.12.x requirement due to Ubuntu 25 incompatibility

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs(session): update session log with documentation changes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(python): implement Phase 2 parallel infrastructure (ADR-042)

Add documentation and security utilities for Python development:

- Create CI/CD migration patterns guide for GitHub Actions integration
- Create Python security checklist covering CWE-22, CWE-78, CWE-798
- Create path validation utility with 42 tests for CWE-22 protection
- Create PowerShell-to-Python developer migration guide

Part of epic #965.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(python): add pilot migration of Check-SkillExists to Python (ADR-042 Phase 3)

Migrates Check-SkillExists.ps1 to Python as the pilot script for ADR-042 Phase 3.
This demonstrates the migration patterns established in Phase 2.

Changes:
- scripts/check_skill_exists.py: Python port with argparse CLI, type hints,
  ADR-035 exit codes, and path_validation utility usage
- tests/test_check_skill_exists.py: 31 pytest tests with 88% coverage

The Python version provides:
- --list-available: Lists all skills by operation type
- --operation/--action: Checks if a skill exists using substring matching
- --project-root: Optional custom project root for testing

Both PowerShell and Python versions will run in parallel per migration plan.

Refs: #965

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(python): add gradual rollout migrations (ADR-042 Phase 4)

Migrate two additional scripts from PowerShell to Python following
the pilot pattern established in Phase 3:

- Detect-SkillViolation.ps1 -> detect_skill_violation.py
  - 89% test coverage (35 tests)
  - Uses dataclass for Violation type
  - Integrates path_validation utility
  - Non-blocking warning for skill violations

- Validate-SessionJson.ps1 -> validate_session_json.py
  - 91% test coverage (39 tests)
  - Uses ValidationResult dataclass
  - Case-insensitive JSON key lookup
  - Pre-commit mode for compact output

Also fixes uv.lock format (was incorrectly in pip-tools format,
now in native uv format).

See: ADR-042 Python Migration Strategy, Issue #965

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(python): fix regex bug and dead code in detect_skill_violation

- Fixed regex pattern gh\\s\+ to gh\s+ in extract_capability_gaps
- Replaced duplicated capability extraction logic in report_violations
  with call to extract_capability_gaps function (DRY)
- All 34 tests pass

Issues identified by pr-review-toolkit parallel review agents.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(security): address gemini-code-assist security findings

**Security Fixes**:

1. **Path Validation (CWE-22)** - scripts/validate_session_json.py:
   - Added `validate_safe_path` import from scripts.utils.path_validation
   - Validate user-provided session_path before file operations
   - Prevents path traversal attacks (../, symlinks, etc.)

2. **Python Version Alignment** - pyproject.toml:
   - Updated ruff target-version: py310 → py312
   - Updated mypy python_version: 3.10 → 3.12
   - Aligns linting/type checking with project standard (3.12.x)

**Gemini Review Comments Addressed**:
- Comment 2702879539: Added path validation imports ✓
- Comment 2702879541: Added CWE-22 protection with validate_safe_path ✓
- Comment 2702879542: Updated ruff to target py312 ✓
- Comment 2702879543: Updated mypy to python 3.12 ✓

**Testing**:
- Verified imports work correctly
- Path validation prevents traversal attacks
- Session protocol validation: PASS

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: slim instructions files by removing redundant agent registry (#961)

Remove agent catalog tables and routing heuristics from instruction
file templates. This content is already available in YAML frontmatter
of each agent file, which platforms parse directly.

- Claude: 129 → 45 lines (65% reduction)
- Copilot CLI: 126 → 53 lines (58% reduction)
- VSCode: 116 → 45 lines (61% reduction)

Estimated savings: ~2,000 tokens per session per platform.

Signed-off-by: Richard Murillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Richard Murillo <6811113+rjmurillo@users.noreply.github.com>

* docs(analysis): Factory-AI/droid-action security constraint blocker (#960)

* docs(analysis): document Factory-AI/droid-action security constraint blocker

Root cause analysis of Droid Auto Review workflow failure. The Factory-AI/droid-action
internally uses actions/upload-artifact@v4 (non-SHA-pinned), which violates repository
security constraints requiring all actions to be pinned to full-length commit SHAs.

Key findings:
- Latest droid-action version (e3f8be9f, 2026-01-12) still contains non-pinned references
- Repository security rules apply recursively to all nested action dependencies
- No workaround available without modifying third-party action or relaxing security constraints

Impact: BLOCKING - droid-review.yml and droid.yml workflows fail at setup phase

Recommendations:
- File issue with Factory-AI requesting SHA-pinned action references
- Evaluate alternative PR review automation tools
- Document as known limitation in operational runbook

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs(session): update session-9 with PR comment responses

- Added workLog entries for PR #960 comment activities
- Documented upstream issue research (no issue exists)
- Added PR comments as deliverables
- Added learning pattern about upstream issue verification

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: record upstream issue Factory-AI/droid-action#20

- Updated memory with upstream issue link and status
- Marked "file upstream issue" as DONE in recommendations
- Updated session log with issue filing activity
- Added next step to monitor for maintainer response

Upstream: Factory-AI/droid-action#20

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: rjmurillo-bot <rjmurillo-bot@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>

* fix(ci): disable Droid workflows due to unpinned action (#957)

* chore: recover 650 orphaned session logs and memory files (#964)

* chore: recover 650 orphaned session logs and memory files

Extract artifacts from 52 feature branches that were left behind when
PRs auto-merged before session logs were pushed.

Recovery summary:
- Session logs: 378 files recovered
- Memory files: 272 files recovered
- Total: 650 files, 82,632 lines of content

Analysis found 61,497 file references across branches but only 1,728
unique files (average file in 35+ branches). Of these, 1,080 already
existed in main. The 648 truly orphaned files are now consolidated.

Used consolidated PR approach instead of 52 individual PRs to avoid
massive merge conflicts from overlapping content.

Note: 150 memory files use legacy 'skill-' prefix naming that predates
ADR-017. These are historical artifacts being preserved as-is.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs: update session log with PR #964 details

Add PR information and audit trail for validation skip.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: rjmurillo-bot <noreply@github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: implement investigation-only session validator (ADR-034 Phase 1) (#931)

* Initial plan

* Add comprehensive test suite for investigation-only validation

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* fix: convert functional tests to pattern-based tests to avoid git state dependency

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* fix(validation): Allow .agents/memory/ in investigation-only sessions (#926)

* Initial plan

* feat: Add .agents/memory/ to investigation allowlist

Add .agents/memory/ pattern to investigation-only allowlist in Test-InvestigationEligibility.ps1 scripts and update tests. This allows memory infrastructure files like causal-graph.json to be committed in investigation sessions per ADR-034 memory-first principle.

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* feat: Add verification-based session-start gates for Codex effectiveness (#924)

* Initial plan

* docs: add Codex effectiveness backlog and context optimization plan (Phase 1 complete)

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* feat: add Codex session-start gate script with 4 verification gates (Phase 2 complete)

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* Changes before error encountered

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs: standardize YAML array format for cross-platform compatibility (#923)

* Initial plan

* refactor: convert frontmatter to block-style YAML arrays in prompt and command files

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs: update frontmatter examples to use block-style YAML arrays

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs(governance): add YAML frontmatter array format constraint

- Add YAML Frontmatter Constraints section to PROJECT-CONSTRAINTS.md
- Include rationale with evidence from Session 826 RCA and GitHub Copilot CLI Issue #694
- Add validation checklist item for frontmatter arrays
- Add frontmatter validation requirement to SKILL-CREATION-CRITERIA.md
- Create session log for session 02

Refs: #898, Session 826

* docs: add issue URLs to YAML array format references

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs: improve autonomous-issue-development.md structure (#566)

* docs: improve autonomous-issue-development.md structure

Expand documentation from 46 to 441 lines to match autonomous-pr-monitor.md style:

- Add "Common Development Patterns" section (5 validated patterns)
- Add "Troubleshooting" section (5 common scenarios)
- Enhance "Example Session Output" with TodoWrite and agent handoffs
- Add "Workflow Phases" table for quick reference
- Add "Agent Responsibilities" reference table
- Add "Prerequisites" and "Related Documentation" sections

Closes #506

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(session): add protocol compliance sections

Added Session Start and Session End checklist tables to match
the required session protocol format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(security): prevent command injection in PR creation example

Addresses gemini-code-assist[bot] security review comment #2653014226.

The example showed a hardcoded PR title which could enable command
injection if autonomous agents construct titles from untrusted GitHub
issue titles containing shell metacharacters (e.g., $(reboot)).

Changes:
- Add security warning comment explaining CWE-78 risk
- Use read -r with process substitution to safely read issue title
- Demonstrate secure pattern for handling untrusted external input

This aligns with security principle of validating all external inputs.

Comment-ID: 2653014226

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: rjmurillo[bot] <rjmurillo-bot@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Claude <claude@anthropic.com>

* feat: implement local GitHub Actions validation and testing infrastructure (#925)

* Initial plan

* feat: Add actionlint validation for GitHub Actions workflows (Phase 1)

- Add actionlint to pre-commit hook (.githooks/pre-commit)
- Integrate actionlint into unified validation runner (scripts/Validate-PrePR.ps1)
- Add actionlint to Worktrunk pre-merge hooks (.config/wt.toml)
- Update SHIFT-LEFT.md documentation with workflow YAML validation section
- Update validation sequence to include step 3.5 (Workflow YAML)
- Create session log 2026-01-15-session-01-github-actions-local-testing.json

This implements Phase 1 (P0) of the GitHub Actions local testing epic.
actionlint catches 80%+ workflow YAML errors before push, reducing expensive CI iteration cycles.

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* feat: Add act support for local workflow testing (Phase 2)

- Create .actrc configuration with optimized defaults for act
- Create scripts/Test-WorkflowLocally.ps1 PowerShell wrapper for act
- Update SHIFT-LEFT.md with comprehensive local workflow testing documentation
- Document supported workflows (pester-tests, validate-paths, memory-validation)
- Document unsupported workflows (AI-dependent: ai-session-protocol, etc.)
- Add troubleshooting guide for act and PowerShell issues

This implements Phase 2 (P1) of the GitHub Actions local testing epic.
Enables local testing for PowerShell-only workflows without CI round-trips.

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* feat: Add yamllint for YAML style validation (Phase 3)

- Create .yamllint.yml configuration with GitHub Actions-optimized rules
- Add yamllint validation to scripts/Validate-PrePR.ps1 (validation 3.9)
- Add yamllint to pre-commit hook (.githooks/pre-commit) as non-blocking warnings
- Update SHIFT-LEFT.md with YAML style validation documentation
- Document yamllint vs actionlint relationship (style vs semantics)
- Configure yamllint as non-blocking to avoid impeding development velocity

This implements Phase 3 (P2) of the GitHub Actions local testing epic.
yamllint enforces YAML style consistency while actionlint handles semantic validation.

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* docs: Update session log with completion status

* fix: Address PR feedback on local testing infrastructure

- Add gh act extension to installation instructions
- Include .github/actions in actionlint validation scope
- Remove chocolatey recommendation, use GitHub releases instead
- Clarify Copilot CLI installation vs BOT_PAT requirement for AI workflows
- Switch to full production images in .actrc for maximum compatibility
- Update documentation to reflect full image usage and rationale

Addresses comments: 2693400619, 2693406075, 2693409506, 2693411769, 2693418268

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* refactor: Move Test-WorkflowLocally.ps1 to .claude/skills/github/scripts

- Move Test-WorkflowLocally.ps1 from scripts/ to .claude/skills/github/scripts/
- Update all documentation references in SHIFT-LEFT.md
- Update session log to reflect new location

Addresses comment 2699896328

Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>

* fix: address all PR #925 review comments

Fixes all 15 unresolved review threads from owner and automated reviewers:

1. Remove duplicate actionlint validation block in pre-commit hook
   - Kept newer comprehensive version (validates workflows + actions)
   - Removed old TIER 1 block (lines 269-313)

2. Fix wt.toml to validate both workflows and actions directories
   - Changed: actionlint .github/workflows/*.yml .github/actions/*.yml
   - Matches pre-commit hook validation scope

3. Remove Windows-specific chocolatey references
   - Prioritize cross-platform gh extension
   - Show macOS brew as alternative
   - Link to releases for manual download

4. Rename Write-Warning to Write-WarningMessage
   - Avoids shadowing PowerShell built-in cmdlet
   - Updated all call sites

5. Fix Validate-PrePR.ps1 to match both .yml and .yaml files
   - Changed -Filter to -Include with both extensions
   - Applies to both workflow and YAML validation sections

6. Fix session log field name inconsistencies
   - Changed handoffNotUpdated.evidence -> Evidence
   - Changed handoffNotUpdated.complete -> Complete
   - Matches other compliance entries

7. Add endingCommit to session log
   - Set to 38217dc (latest commit)
   - Required field per session protocol

8. Fix Test-WorkflowLocally.ps1 null handling
   - Check for null before Test-Path call
   - Prevents crash under Set-StrictMode when workflow not found

All fixes maintain existing functionality while addressing review feedback.

Related: #925

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: rjmurillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: ADR-017 memory validation compliance (skill- prefix removal + bundled skill splitting) (#966)

* fix(pr-964): clean up session files and remove skill- prefix violations

Addresses two validation failures in PR #964:

1. Session file cleanup:
   - Moved 341 .md session files to .agents/archive/sessions/
   - Deleted 1 session file already in archive
   - All moved files have corresponding JSON equivalents

2. Memory file ADR-017 compliance:
   - Removed 87 duplicate memory files with skill- prefix
   - Renamed 56 unique memory files to remove skill- prefix
   - Zero skill- prefixed files remain
   - Added skills-serena-index to memory-index.md

These legacy files predate ADR-017 naming conventions and were
recovered from 52 orphaned branches. The skill- prefixed duplicates
existed alongside non-prefixed versions and have been cleaned up.

Note: Bundled skill validation failures are pre-existing in PR #964
and require separate handling via skill unbundling process.

Script: scripts/Fix-PR964-Validation.ps1

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* refactor: split 37 bundled skill files into 135 atomic files per ADR-017

Split bundled skill files to comply with one-skill-per-file standard.

Changes:
- Created scripts/Split-BundledSkills.ps1 automation script
- Split 37 bundled files containing 135 skills total
- Generated 110 new atomic skill files with proper naming
- Modified 25 existing skill files
- Deleted 37 bundled files

Validation:
- Validate-SkillFormat.ps1: PASSED
- markdownlint: PASSED (0 errors)

Context: PR #964 recovered orphaned files that predate ADR-017
Note: Memory index will be updated in a follow-up commit

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* docs: update session log to JSON schema v1.0

Update session-04 log to use new JSON schema format with protocolCompliance
and learnings sections per SESSION-PROTOCOL.md requirements.

Validation: Validate-SessionJson.ps1 PASSED

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: convert skills-serena-index to table-only format per ADR-017

---------

Co-authored-by: rjmurillo-bot <noreply@github.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(ci): use PATH export instead of source env for uv

The uv installer no longer creates an env file to source. Replace
`source $HOME/.local/bin/env` with `export PATH="$HOME/.local/bin:$PATH"`
to properly add uv to PATH in GitHub Actions workflows.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ci): add --system flag to uv pip install

uv pip now requires either a virtual environment or the --system flag
to install packages. Add --system flag for GitHub Actions workflows
where we want to install directly to the system Python.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat(ci): enable Python tooling for agent contributions

- Add Python setup to setup-code-env composite action with:
  - enable-python and python-version inputs
  - Python version output
  - Python dependency installation via uv
  - Verification of ruff and pytest availability
- Enable Python 3.12 in copilot-setup-steps workflow
- Add Python dependency installation to bootstrap-vm.sh

This enables agents to contribute Python code with proper
tooling (ruff, pytest) available in the development environment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ci): use pip-audit without --requirement flag

The --requirement flag expects requirements.txt format, not pyproject.toml.
Running pip-audit without arguments audits installed packages instead.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(tests): patch SAFE_BASE_DIR for temp directory tests

- Monkeypatch _PROJECT_ROOT in validate_session_json tests
- Monkeypatch SAFE_BASE_DIR in invoke_skill_learning tests
- Fix tests checking 'extracted_learning' to use 'source' key

The path validation correctly rejects temp directories outside
project root. Tests now patch the base directory to allow temp
paths during testing while maintaining security in production.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(hooks): rename test_skill_context to check_skill_context

Pytest was collecting the function as a test because it started
with 'test_'. Renamed to 'check_skill_context' to prevent pytest
from treating it as a test function.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* docs: add test exit code interpretation as blocking constraint

- Add testing-exit-code-interpretation memory documenting that pytest
  "X passed, Y errors" output means test suite FAILED (non-zero exit)
- Update AGENTS.md Testing section with BLOCKING Test Exit Code
  Interpretation subsection
- Update CRITICAL-CONTEXT.md with explicit test exit code requirement
- Update memory-index with new memory for discoverability

Learning: "error" and "failed" are both non-pass outcomes in pytest.
Both result in non-zero exit code and must block commits.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Signed-off-by: Richard Murillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: Test <test@test.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Richard Murillo <richard.murillo@example.com>
Co-authored-by: Richard Murillo <6811113+rjmurillo@users.noreply.github.com>
Co-authored-by: rjmurillo-bot <rjmurillo-bot@users.noreply.github.com>
Co-authored-by: rjmurillo-bot <noreply@github.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Claude <claude@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-qa Testing and verification agent area-infrastructure Build, CI/CD, configuration area-workflows GitHub Actions workflows automation Automated workflows and processes enhancement New feature or request triage:approved Human has triaged and approved bot responses for this PR

Projects

None yet

2 participants