test(#528): fix + un-quarantine agent_routing (test-isolation bug) #530
Merged
Root cause was test isolation, not a config regression: apply-agent-routing.sh resolves the ops root via _lib-ops-root.sh, which inside a Claude Code session honours the session pin ($APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID) and pointed at the REAL fork instead of the mktemp sandbox. As a result the hook rewrote the real .claude/agents/*.md (+ a stray snapshot) and every sandbox assertion failed.

Two fixes:
- export APEXYARD_OPS_DISABLE_PIN=1 at the top of the test so ops-root resolves by walk-up to the sandbox (no-op in headless CI, which has no pin). This alone took the suite from 4/14 to 13/14 and stopped the real-repo mutation.
- case 4 (idempotency) declared endpoint http://localhost:11434 but, unlike cases 9-13, never mocked curl, so the unreachable endpoint was correctly filtered and the endpoint_count assertion tested the environment. Hoisted make_mock_curl above case 4 and mocked the endpoint reachable → 14/14.

Un-quarantined in bin/run-hook-tests.sh. Only test_handover_clone_prompt remains quarantined (it spec-pins a removed clone-prompt design; needs a rewrite).

Refs #528

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced Jun 6, 2026
Summary
Fixes the 4th of 5 quarantined tests from #528 — and the root cause turned out to be a test-isolation bug with a real side-effect, not the config regression I'd flagged.
`apply-agent-routing.sh` resolves the ops root via `_lib-ops-root.sh`, which, inside a real Claude Code session, honours the session pin (`$APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID`) and resolves to the real fork, not the test's `mktemp` sandbox. So the hook rewrote the real `.claude/agents/*.md` (and dropped a stray `.framework-defaults.json`) while every sandbox assertion failed. (It also silently mutated my working copy; restored.)

- `export APEXYARD_OPS_DISABLE_PIN=1` at the top of the test so ops-root resolves by walk-up to the sandbox. No-op in headless CI (no pin). This alone took the suite 4/14 → 13/14 and stopped the real-repo mutation.
- Case 4 (idempotency) declared `endpoint: http://localhost:11434` but, unlike cases 9–13, never mocked `curl`, so the unreachable endpoint was correctly filtered by the hook and the `endpoint_count` assertion was testing the environment, not idempotency. Hoisted `make_mock_curl` above case 4 and mocked the endpoint reachable → 14/14.
- Un-quarantined `agent_routing_sync_and_drift` in `bin/run-hook-tests.sh`.

What remains in #528 (1 of 5)
`test_handover_clone_prompt` stays quarantined, but I've sharpened the reason: it spec-pins a clone-prompt design that no longer exists (a `[y / n / later]` prompt at step 8). The `/handover` SKILL was redesigned to clone-by-default at step 1.5 with a follow-up offer at step 8, so this needs a test rewrite against the current spec, not a string patch. Left for a focused follow-up under #528.

Testing
- `bash .claude/hooks/tests/test_agent_routing_sync_and_drift.sh` → 14 passed, 0 failed.
- No real-repo mutation (`git status .claude/agents/` shows only pre-existing WIP).
- `shellcheck -S error` clean on the test and `bin/run-hook-tests.sh`.
- `bash bin/run-hook-tests.sh` → agent_routing PASS, only handover_clone_prompt SKIP; this PR's `tests` gate on Linux is the check (expected green, now enforcing 64).

Refs #528
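For reviewers unfamiliar with the pin, here is a minimal sketch of the resolution order described in the Summary: the session pin wins unless `APEXYARD_OPS_DISABLE_PIN=1`, in which case resolution falls back to walk-up. This is not the real `_lib-ops-root.sh`. The function name `resolve_ops_root_sketch`, the assumption that the pin file contains an absolute path, and the use of a `.claude/` directory as the walk-up marker are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the real _lib-ops-root.sh.
# Assumptions: the pin file holds an absolute path, and the ops root is the
# nearest ancestor of $PWD that contains a .claude/ directory.
resolve_ops_root_sketch() {
  local dir pin
  # 1. Honour the session pin unless it is explicitly disabled.
  if [ "${APEXYARD_OPS_DISABLE_PIN:-0}" != "1" ] \
     && [ -n "${APEXYARD_OPS_PIN_DIR:-}" ] \
     && [ -n "${CLAUDE_CODE_SESSION_ID:-}" ]; then
    pin="$APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID"
    if [ -f "$pin" ]; then
      cat "$pin"
      return 0
    fi
  fi
  # 2. Otherwise walk up from the current directory.
  dir=$PWD
  while [ "$dir" != "/" ]; do
    if [ -d "$dir/.claude" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    dir=$(dirname "$dir")
  done
  return 1
}

# Demo: a fake "real fork", a sandbox, and a pin pointing at the fake fork.
demo=$(mktemp -d)
mkdir -p "$demo/real/.claude" "$demo/sandbox/.claude/hooks" "$demo/pins"
printf '%s\n' "$demo/real" > "$demo/pins/ops-root-demo-session"
export APEXYARD_OPS_PIN_DIR="$demo/pins" CLAUDE_CODE_SESSION_ID="demo-session"
cd "$demo/sandbox/.claude/hooks"

# With the pin active, the sandbox is bypassed; with it disabled, walk-up wins.
pinned=$(unset APEXYARD_OPS_DISABLE_PIN; resolve_ops_root_sketch)
walked=$(export APEXYARD_OPS_DISABLE_PIN=1; resolve_ops_root_sketch)
echo "pinned: $pinned"
echo "walked: $walked"
```

In the demo, `pinned` resolves to the fake fork even though the working directory is inside the sandbox. That is the shape of the isolation bug this PR fixes. `walked` resolves to the sandbox.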
Glossary
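- Session pin: `_lib-ops-root.sh` resolving the ops root from `$APEXYARD_OPS_PIN_DIR/ops-root-$CLAUDE_CODE_SESSION_ID` inside a live session.
- `APEXYARD_OPS_DISABLE_PIN`: when set to `1`, disables the session pin so the ops root resolves by walk-up.
- Mock `curl`: a fake `curl` on `$PATH` that returns canned fixtures, making endpoint reachability deterministic in tests.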
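The mock-`curl` idea can be sketched as below. This is an illustration, not the test suite's actual `make_mock_curl` helper, which is not shown in this PR. The name `make_mock_curl_sketch`, its arguments, and the canned `{"ok":true}` body are hypothetical. Exit code 7 is real `curl`'s "failed to connect to host" status.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real make_mock_curl helper may differ.
# Writes a fake `curl` into $1 that treats URLs starting with $2 as reachable.
make_mock_curl_sketch() {
  local bindir=$1 reachable=$2
  mkdir -p "$bindir"
  # Unquoted heredoc: $reachable is baked in now, \$ defers expansion
  # to the generated script.
  cat > "$bindir/curl" <<EOF
#!/usr/bin/env bash
# Fake curl: succeed with a canned body only for the reachable endpoint.
for arg in "\$@"; do
  case "\$arg" in
    "$reachable"*) printf '{"ok":true}\n'; exit 0 ;;
  esac
done
exit 7
EOF
  chmod +x "$bindir/curl"
}

# Usage: put the fake first on PATH so it shadows any real curl.
mockbin=$(mktemp -d)
make_mock_curl_sketch "$mockbin" "http://localhost:11434"
PATH="$mockbin:$PATH"

up=$(curl -s http://localhost:11434/)
up_rc=$?
down_rc=0
curl -s http://localhost:9/ >/dev/null || down_rc=$?
echo "reachable rc=$up_rc body=$up; unreachable rc=$down_rc"
```

Because the fake decides reachability from its arguments alone, an assertion like case 4's `endpoint_count` no longer depends on whether anything is listening on the developer's machine.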