[Testing] Fix & un-quarantine the 5 pre-existing hook-test failures #528

@atlas-apex

Description

Driver

#526 turned the hook test suite into a CI gate. To land the gate green, 5 tests that were already failing on dev (pre-existing drift, not caused by #526) were quarantined in bin/run-hook-tests.sh. They represent real test debt and should be fixed and un-quarantined so the gate covers them too.
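For orientation, a skip-list of this kind might look roughly like the sketch below. This is an assumption about the general shape only: the actual contents of bin/run-hook-tests.sh are not shown in this issue, and the helper name `is_quarantined` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real bin/run-hook-tests.sh may be structured differently.

# Tests excluded from the gate. This ticket aims to leave the array empty.
QUARANTINE=(
  "test_md_to_pdf_fallback.sh"
  "test_harnessability_scoring.sh"
)

# Return 0 if the given test file name is listed in QUARANTINE, 1 otherwise.
is_quarantined() {
  local name="$1" entry
  for entry in "${QUARANTINE[@]}"; do
    [[ "$entry" == "$name" ]] && return 0
  done
  return 1
}
```

A runner built this way would call `is_quarantined "$(basename "$test_file")"` before executing each test and report the test as skipped rather than passed, so the exclusion stays visible in CI output.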

Scope

For each, fix the root cause, then remove its entry from the QUARANTINE array in bin/run-hook-tests.sh:

  • test_agent_routing_sync_and_drift.sh — "case 2: qa-engineer override" fails; reconcile the test with the committed agent-routing.yaml override behaviour.
  • test_handover_clone_prompt.sh — asserts the pre-restructure clone-prompt spec (literal strings like [y / n / later], "Offer the clone-first"); the /handover SKILL moved the clone to step 1.5 and the follow-up offer to step 8. Update the assertions to the current spec.
  • test_harnessability_scoring.sh — 1 of 14 scoring cases drifted; realign the expected score with the current rubric in the SKILL.
  • test_md_to_pdf_fallback.sh — on a runner WITH npx it runs npx -y md-to-pdf (network npm install + headless chromium), which is heavy/flaky in CI; make it skip unless a converter is actually provisioned, or provision one deterministically.
  • test_token_efficiency_wave1.sh — two doc-hygiene drifts: plan-initiative SKILL.md description is 318 chars (>200 hard cap), and /release-sync exists under .claude/skills/ but is missing from the CLAUDE.md skill table.
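One possible shape for the "skip unless a converter is actually provisioned" option in the test_md_to_pdf_fallback.sh item is sketched below. The variable `MD_TO_PDF_BIN` and both function names are assumptions made for illustration; the repository may signal a provisioned converter in some other way.

```shell
# Hypothetical guard: MD_TO_PDF_BIN is an assumed variable name, not one the repo defines.
# Succeeds only when the variable points at an executable file.
converter_provisioned() {
  [[ -n "${MD_TO_PDF_BIN:-}" && -x "${MD_TO_PDF_BIN}" ]]
}

# Skip explicitly (exit status 0, with a visible message) when no converter is provisioned,
# instead of falling back to a network install through npx.
run_or_skip() {
  if ! converter_provisioned; then
    echo "SKIP: no md-to-pdf converter provisioned"
    return 0
  fi
  echo "RUN: using ${MD_TO_PDF_BIN}"
}
```

Keying the guard on an explicit opt-in rather than on the mere presence of npx would keep the test deterministic on runners that happen to have Node installed.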

Acceptance Criteria

  • Each of the 5 root causes is fixed.
  • Each entry is removed from QUARANTINE in bin/run-hook-tests.sh.
  • bash bin/run-hook-tests.sh is green on Linux CI with an empty quarantine list.
  • No test is silently disabled — the quarantine array ends empty.
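The last two criteria could be enforced mechanically with a small check like the following. It is only a sketch: `assert_quarantine_empty` is a hypothetical helper, and it assumes the runner exposes its skip-list as a bash array named QUARANTINE.

```shell
# Hypothetical helper: takes the quarantine entries as arguments and
# fails if any remain.
assert_quarantine_empty() {
  if (( $# > 0 )); then
    echo "FAIL: $# test(s) still quarantined: $*"
    return 1
  fi
  echo "OK: quarantine list is empty"
}
```

It would be called as `assert_quarantine_empty "${QUARANTINE[@]}"`; an empty array expands to zero arguments, so the check passes only once every entry has been removed.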

Glossary

Term | Definition
Quarantine | The documented skip-list in bin/run-hook-tests.sh for tests excluded from the gate; this ticket drains it to empty.
Pre-existing drift | A test that was already failing on dev before the gate existed: debt the gate surfaced, not caused.

Metadata

Assignees

No one assigned

    Labels

    task (Technical task), testing (Test coverage / test infrastructure)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
