fix: harden observer hooks and test discovery#513
Conversation
📝 WalkthroughWalkthroughThis pull request addresses three architectural issues: making the SessionEnd hook non-blocking via async execution with a timeout, implementing glob-based test discovery to automatically detect new tests, and adding a catalog verification script that compares documentation counts against filesystem state. It also moves automated-session guards in observe.sh before project-detection side effects and expands test coverage for these behaviors. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~35 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
1 issue found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tests/hooks/hooks.test.js">
<violation number="1" location="tests/hooks/hooks.test.js:3393">
P2: Use a platform-agnostic assertion for nested test paths; the current hard-coded `nested/deep.test.js` fails on Windows path separators.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| const result = runPatchedRunAll(testRoot); | ||
| assert.strictEqual(result.code, 0, `run-all wrapper should succeed, stderr: ${result.stderr}`); | ||
| assert.ok(result.stdout.includes('Running top.test.js'), 'Should run the top-level test'); | ||
| assert.ok(result.stdout.includes('Running nested/deep.test.js'), 'Should run nested .test.js files'); |
There was a problem hiding this comment.
P2: Use a platform-agnostic assertion for nested test paths; the current hard-coded nested/deep.test.js fails on Windows path separators.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/hooks/hooks.test.js, line 3393:
<comment>Use a platform-agnostic assertion for nested test paths; the current hard-coded `nested/deep.test.js` fails on Windows path separators.</comment>
<file context>
@@ -3320,6 +3375,32 @@ async function runTests() {
+ const result = runPatchedRunAll(testRoot);
+ assert.strictEqual(result.code, 0, `run-all wrapper should succeed, stderr: ${result.stderr}`);
+ assert.ok(result.stdout.includes('Running top.test.js'), 'Should run the top-level test');
+ assert.ok(result.stdout.includes('Running nested/deep.test.js'), 'Should run nested .test.js files');
+ assert.ok(!result.stdout.includes('ignore.js'), 'Should ignore non-.test.js files');
+ assert.ok(result.stdout.includes('Total Tests: 3'), `Should aggregate nested test totals, got: ${result.stdout}`);
</file context>
| assert.ok(result.stdout.includes('Running nested/deep.test.js'), 'Should run nested .test.js files'); | |
| assert.ok(/Running nested[\\/]deep\.test\.js/.test(result.stdout), 'Should run nested .test.js files'); |
Greptile SummaryThis PR hardens three independent areas: the Key changes and findings:
Confidence Score: 3/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[SessionEnd event] --> B[run-with-flags.js\nasync=true, timeout=10]
B --> C[session-end-marker.js\nnon-blocking cleanup]
D[PostToolUse event] --> E[observe.sh post]
E --> F{Automated\nsession guards}
F -->|non-cli entrypoint| G[exit 0\nbefore detect-project.sh]
F -->|ECC_HOOK_PROFILE=minimal| G
F -->|ECC_SKIP_OBSERVE=1| G
F -->|agent_id present| G
F -->|cwd in skip paths| G
F -->|passes all guards| H[source detect-project.sh\ncreates project dirs]
H --> I[Write observation\nto JSONL]
J[tests/run-all.js] --> K[walkFiles testsDir]
K --> L[repo-relative paths]
L --> M{matchesTestGlob\ntests/**/*.test.js}
M -->|match| N[testsDir-relative path]
M -->|no match| O[excluded]
N --> P[spawnSync test file]
Q[scripts/ci/catalog.js] --> R[buildCatalog\nagents/commands/skills]
Q --> S[parseReadmeExpectations]
Q --> T[parseAgentsDocExpectations]
R & S & T --> U[evaluateExpectations]
U --> V{any failures?}
V -->|yes| W[exit 1]
V -->|no| X[exit 0]
Last reviewed commit: 3f68ad8 |
| const wrapperPath = path.join(tempRoot, 'run-all-wrapper.js'); | ||
| const tempTestsDir = path.join(tempRoot, 'tests'); | ||
| let source = fs.readFileSync(path.join(__dirname, '..', 'run-all.js'), 'utf8'); | ||
| source = source.replace('const testsDir = __dirname;', `const testsDir = ${JSON.stringify(tempTestsDir)};`); |
There was a problem hiding this comment.
Fragile source-patching with no guard against replacement failure
source.replace(...) returns the original string unchanged if 'const testsDir = __dirname;' is not found (e.g., after any whitespace/formatting change to run-all.js). In that case the wrapper is written unmodified, __dirname resolves to tempRoot (the wrapper's location), repoRoot becomes the parent of tempRoot, and no files under tests/ match the ^tests\/ regex — so the runner exits 1 with "No test files matched". The test then fails with run-all wrapper should succeed, which is an opaque failure rather than a clear "patching failed" message.
Add an assertion that the replacement actually fired:
| source = source.replace('const testsDir = __dirname;', `const testsDir = ${JSON.stringify(tempTestsDir)};`); | |
| let source = fs.readFileSync(path.join(__dirname, '..', 'run-all.js'), 'utf8'); | |
| const NEEDLE = 'const testsDir = __dirname;'; | |
| assert.ok(source.includes(NEEDLE), `run-all.js no longer contains "${NEEDLE}" — update runPatchedRunAll`); | |
| source = source.replace(NEEDLE, `const testsDir = ${JSON.stringify(tempTestsDir)};`); |
| console.log('## Mismatches\n'); | ||
| for (const mismatch of mismatches) { | ||
| console.log(`- ${formatExpectation(mismatch)}`); | ||
| } |
There was a problem hiding this comment.
renderMarkdown sends mismatches to stdout instead of stderr
In renderText, mismatches are written with console.error (stderr), which lets CI pipelines separate error signals from normal output. renderMarkdown writes the same mismatches with console.log (stdout), breaking that separation for --md mode. Given that the script already exits with code 1 on mismatches, keeping stderr consistent is valuable for tooling that tails stderr.
| console.log('## Mismatches\n'); | |
| for (const mismatch of mismatches) { | |
| console.log(`- ${formatExpectation(mismatch)}`); | |
| } | |
| console.log('## Mismatches\n'); | |
| for (const mismatch of mismatches) { | |
| console.error(`- ${formatExpectation(mismatch)}`); | |
| } |
| "timeout": 10 | ||
| } |
There was a problem hiding this comment.
timeout unit is ambiguous
The timeout: 10 value has no accompanying comment documenting whether the unit is seconds or milliseconds. The existing hook test only asserts > 0, so the contract is unclear from the code alone. If this follows the Claude Code hooks schema (where timeout is in seconds), 10 s is reasonable; if it is milliseconds, session-end-marker.js would almost certainly be killed before finishing. A brief inline comment would remove the ambiguity for future maintainers:
| "timeout": 10 | |
| } | |
| "async": true, | |
| "timeout": 10 |
Consider adding a comment in the adjacent description field or in a nearby doc file clarifying the unit (e.g., "description": "Session end lifecycle marker (non-blocking, 10 s timeout)").
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/ci/catalog.js`:
- Around line 101-115: The parseAgentsDocExpectations function currently sets
skills mode to 'minimum' when the regex captures a '+' (summaryMatch[3]),
allowing "65+" to be treated as a lower bound; change this to always use mode:
'exact' for the skills entry so AGENTS.md counts are enforced exactly. Locate
parseAgentsDocExpectations and the return entry that builds the skills object
(where it uses summaryMatch[2] and summaryMatch[3]) and remove/ignore the
'+'-based branch so the skills entry is created with mode: 'exact' and expected:
Number(summaryMatch[2]) and source: 'AGENTS.md summary'.
In `@tests/hooks/hooks.test.js`:
- Around line 181-188: The test helper currently treats any zero exit as
success; change the assertion after runShellScript(observePath, ...) to also
fail when the observe.sh Python-resolution short-circuits (i.e., when the script
returned 0 because no python/python3 was found). Specifically, update the check
around result.code to assert that result.code === 0 AND that
result.stdout/result.stderr does NOT contain the observe.sh “no python”/“could
not find python” message (or other specific text your observe.sh emits on that
branch), so runShellScript/observePath truly exercised the
interpreter-resolution path before calling assertNoProjectDetectionSideEffects.
In `@tests/run-all.js`:
- Around line 38-43: discoverTestFiles returns OS-specific paths because
path.relative uses platform separators; normalize repoRelativePath to forward
slashes before returning to ensure tests expecting 'nested/deep.test.js' pass on
Windows. In the mapping chain inside discoverTestFiles (after repoRelativePath
is computed or before the final .sort()), replace backslashes with '/' (e.g.,
normalize repoRelativePath by replacing \\ with / or using path.posix-style
join) so matchesTestGlob and downstream code always receive forward-slash paths.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 77ab1845-2761-425d-8806-4f700e657588
📒 Files selected for processing (5)
hooks/hooks.jsonscripts/ci/catalog.jsskills/continuous-learning-v2/hooks/observe.shtests/hooks/hooks.test.jstests/run-all.js
| function parseAgentsDocExpectations(agentsContent) { | ||
| const summaryMatch = agentsContent.match(/providing\s+(\d+)\s+specialized agents,\s+(\d+)(\+)?\s+skills,\s+(\d+)\s+commands/i); | ||
| if (!summaryMatch) { | ||
| throw new Error('AGENTS.md is missing the catalog summary line'); | ||
| } | ||
|
|
||
| const catalog = { | ||
| agents: { count: agents.length, list: agents }, | ||
| commands: { count: commands.length, list: commands }, | ||
| skills: { count: skills.length, list: skills } | ||
| }; | ||
| return [ | ||
| { category: 'agents', mode: 'exact', expected: Number(summaryMatch[1]), source: 'AGENTS.md summary' }, | ||
| { | ||
| category: 'skills', | ||
| mode: summaryMatch[3] ? 'minimum' : 'exact', | ||
| expected: Number(summaryMatch[2]), | ||
| source: 'AGENTS.md summary' | ||
| }, | ||
| { category: 'commands', mode: 'exact', expected: Number(summaryMatch[4]), source: 'AGENTS.md summary' } |
There was a problem hiding this comment.
Use exact AGENTS.md counts here.
Treating 65+ skills as a minimum lets AGENTS.md stay stale indefinitely. Once README.md is corrected to the real value, CI will still go green even if AGENTS.md keeps advertising an outdated lower bound, which defeats the sync check added for #438.
Suggested fix
function parseAgentsDocExpectations(agentsContent) {
const summaryMatch = agentsContent.match(/providing\s+(\d+)\s+specialized agents,\s+(\d+)(\+)?\s+skills,\s+(\d+)\s+commands/i);
if (!summaryMatch) {
throw new Error('AGENTS.md is missing the catalog summary line');
}
+ if (summaryMatch[3]) {
+ throw new Error('AGENTS.md must use exact catalog counts; remove the "+" suffix from skills.');
+ }
return [
{ category: 'agents', mode: 'exact', expected: Number(summaryMatch[1]), source: 'AGENTS.md summary' },
{
category: 'skills',
- mode: summaryMatch[3] ? 'minimum' : 'exact',
+ mode: 'exact',
expected: Number(summaryMatch[2]),
source: 'AGENTS.md summary'
},🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/ci/catalog.js` around lines 101 - 115, The parseAgentsDocExpectations
function currently sets skills mode to 'minimum' when the regex captures a '+'
(summaryMatch[3]), allowing "65+" to be treated as a lower bound; change this to
always use mode: 'exact' for the skills entry so AGENTS.md counts are enforced
exactly. Locate parseAgentsDocExpectations and the return entry that builds the
skills object (where it uses summaryMatch[2] and summaryMatch[3]) and
remove/ignore the '+'-based branch so the skills entry is created with mode:
'exact' and expected: Number(summaryMatch[2]) and source: 'AGENTS.md summary'.
| const result = await runShellScript(observePath, ['post'], payload, { | ||
| HOME: homeDir, | ||
| USERPROFILE: homeDir, | ||
| ...testCase.env | ||
| }, projectDir); | ||
|
|
||
| assert.strictEqual(result.code, 0, `${testCase.name} should exit successfully, stderr: ${result.stderr}`); | ||
| assertNoProjectDetectionSideEffects(homeDir, testCase.name); |
There was a problem hiding this comment.
Make the observe-sh guard helper fail if Python resolution short-circuits first.
This helper currently treats any zero exit as success, but skills/continuous-learning-v2/hooks/observe.sh, Lines 49-53 also return 0 when no python/python3 is available. In that case all five new guard-path tests can pass without exercising the guard ordering you wanted to lock down.
Suggested fix
const result = await runShellScript(observePath, ['post'], payload, {
HOME: homeDir,
USERPROFILE: homeDir,
...testCase.env
}, projectDir);
assert.strictEqual(result.code, 0, `${testCase.name} should exit successfully, stderr: ${result.stderr}`);
+ assert.ok(
+ !result.stderr.includes('No python interpreter found'),
+ `${testCase.name} exited before guard evaluation because observe.sh could not resolve Python`
+ );
assertNoProjectDetectionSideEffects(homeDir, testCase.name);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/hooks/hooks.test.js` around lines 181 - 188, The test helper currently
treats any zero exit as success; change the assertion after
runShellScript(observePath, ...) to also fail when the observe.sh
Python-resolution short-circuits (i.e., when the script returned 0 because no
python/python3 was found). Specifically, update the check around result.code to
assert that result.code === 0 AND that result.stdout/result.stderr does NOT
contain the observe.sh “no python”/“could not find python” message (or other
specific text your observe.sh emits on that branch), so
runShellScript/observePath truly exercised the interpreter-resolution path
before calling assertNoProjectDetectionSideEffects.
| function discoverTestFiles() { | ||
| return walkFiles(testsDir) | ||
| .map(fullPath => path.relative(repoRoot, fullPath)) | ||
| .filter(matchesTestGlob) | ||
| .map(repoRelativePath => path.relative(testsDir, path.join(repoRoot, repoRelativePath))) | ||
| .sort(); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
node <<'NODE'
const path = require('path');
const repoRoot = 'C:\\repo';
const testsDir = 'C:\\repo\\tests';
const fullPath = 'C:\\repo\\tests\\nested\\deep.test.js';
const repoRelativePath = path.win32.relative(repoRoot, fullPath);
const discovered = path.win32.relative(
testsDir,
path.win32.join(repoRoot, repoRelativePath)
);
console.log(JSON.stringify({
repoRelativePath,
discovered,
normalized: discovered.split(path.win32.sep).join('/')
}, null, 2));
NODE
rg -n "path\\.relative\\(testsDir, path\\.join\\(repoRoot, repoRelativePath\\)\\)" tests/run-all.jsRepository: affaan-m/everything-claude-code
Length of output: 305
🏁 Script executed:
rg -n "nested.*deep\.test\.js|nested/deep\.test\.js" --type jsRepository: affaan-m/everything-claude-code
Length of output: 351
🏁 Script executed:
cat -n tests/run-all.js | sed -n '35,50p'Repository: affaan-m/everything-claude-code
Length of output: 597
🏁 Script executed:
rg -n "discoverTestFiles\|nested-discovery" tests/ --type js -A 3Repository: affaan-m/everything-claude-code
Length of output: 57
Normalize discovered test paths to forward slashes for cross-platform consistency.
Line 42 uses path.relative() which returns OS-specific separators. On Windows, this becomes nested\deep.test.js, but the test at tests/hooks/hooks.test.js:3393 expects 'Running nested/deep.test.js' with forward slashes. The test will fail on Windows with the current implementation.
Suggested fix
function discoverTestFiles() {
return walkFiles(testsDir)
.map(fullPath => path.relative(repoRoot, fullPath))
.filter(matchesTestGlob)
.map(repoRelativePath => path.relative(testsDir, path.join(repoRoot, repoRelativePath)))
+ .map(relativePath => relativePath.split(path.sep).join('/'))
.sort();
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| function discoverTestFiles() { | |
| return walkFiles(testsDir) | |
| .map(fullPath => path.relative(repoRoot, fullPath)) | |
| .filter(matchesTestGlob) | |
| .map(repoRelativePath => path.relative(testsDir, path.join(repoRoot, repoRelativePath))) | |
| .sort(); | |
| function discoverTestFiles() { | |
| return walkFiles(testsDir) | |
| .map(fullPath => path.relative(repoRoot, fullPath)) | |
| .filter(matchesTestGlob) | |
| .map(repoRelativePath => path.relative(testsDir, path.join(repoRoot, repoRelativePath))) | |
| .map(relativePath => relativePath.split(path.sep).join('/')) | |
| .sort(); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/run-all.js` around lines 38 - 43, discoverTestFiles returns OS-specific
paths because path.relative uses platform separators; normalize repoRelativePath
to forward slashes before returning to ensure tests expecting
'nested/deep.test.js' pass on Windows. In the mapping chain inside
discoverTestFiles (after repoRelativePath is computed or before the final
.sort()), replace backslashes with '/' (e.g., normalize repoRelativePath by
replacing \\ with / or using path.posix-style join) so matchesTestGlob and
downstream code always receive forward-slash paths.
Summary
SessionEndmarker hook cleanup-safe by running it async with a timeouttests/run-all.jstotests/**/*.test.jsdiscovery and add a direct nested-discovery regression testscripts/ci/catalog.jsto scan agents, commands, and skills and compare repo counts againstREADME.mdandAGENTS.mdTesting
node tests/hooks/hooks.test.jsnode tests/ci/validators.test.jsnode tests/run-all.jsNotes
scripts/ci/catalog.jsis currently reporting real docs drift outside this scope: the repo scans as18agents,48commands, and94skills, whileREADME.mdandAGENTS.mdstill document16 / 40 / 65-style totals.Closes #429
Closes #437
Closes #438
Summary by cubic
Hardened observer hooks and test discovery to prevent cleanup blocking and project side effects. Added a CI catalog verifier to catch docs drift in agent/command/skill counts.
Bug Fixes
SessionEndmarker hook async with a 10s timeout to keep cleanup non-blocking.skills/continuous-learning-v2/hooks/observe.shahead of project detection to avoid side effects.tests/**/*.test.jsdiscovery and add a nested discovery regression test.New Features
scripts/ci/catalog.jsto compare repo catalog counts withREADME.mdandAGENTS.md(--json,--md,--text; exits non-zero on mismatch).Written for commit 3f68ad8. Summary will update on new commits.
Summary by CodeRabbit