test(subagents): consolidate anemic loader and actor tests by Aaronontheweb · Pull Request #654 · netclaw-dev/netclaw

Aaronontheweb · 2026-04-14T02:56:31Z

Summary

Replace nine near-identical LoadAll_skips_X tests — each of which wrote one bad file and just asserted Assert.Empty(results) — with one consolidated LoadAll_loads_valid_agent_and_rejects_invalid_siblings_with_warnings that drops a valid agent alongside eight kinds of invalid siblings in the same directory and verifies (a) the valid agent still loads, (b) each invalid sibling produced a specific warning naming both the file and the reason, and (c) stray .json / .txt files produce no warning at all because the scanner never touches them.
Introduce a minimal ListLogger<T> (≈20 lines, inline in the test file) that captures warning-level formatted messages so tests can assert on log contents.
Delete three BuildUserMessage_* unit tests on SubAgentActor. They pinned a 4-line string interpolation helper whose behavior is already fully covered by the two end-to-end RuntimeContext_is_prefixed_onto_first_user_message and Null_RuntimeContext_leaves_first_user_message_as_raw_task tests. With the unit tests gone, BuildUserMessage has no external caller, so tighten its visibility from internal to private.

Why

The LoadAll_skips_X tests from #652 were a pattern the Netclaw constitution explicitly warns against: "If the test is just asserting that $"{a}/{b}" equals "a/b", delete it. Prefer fewer tests that cover real behavior over many tests that pad coverage." They looked like they pinned nine distinct validation paths but if the loader were rewritten to unconditionally return an empty list for every file, all nine would still pass. That's negative signal dressed up as coverage.

More importantly, the whole point of the format redesign was "fail loud on malformed files" (per CLAUDE.md's no-silent-fallbacks rule), and the old tests didn't actually verify any warning was logged. A silent-skip regression would have sailed through the suite.

The BuildUserMessage_* unit tests were the same smell applied to a string helper. The E2E tests exercise the same helper through the actor pipeline and catch any regression the unit tests would.

Test plan

dotnet test src/Netclaw.Configuration.Tests — 181/181 green (was 189 before; removed 8 redundant tests)
dotnet test src/Netclaw.Actors.Tests --filter FullyQualifiedName~SubAgent — 27/27 green (was 30; removed 3 redundant unit tests)
dotnet slopwatch analyze — 0 issues
GPG signed commit

Notes

The one consolidated test intentionally asserts on substring matches in captured warning messages rather than exact strings, so minor copy edits to the loader's warning text won't break the suite — but a silent skip (no warning at all) or a warning that omits the filename / reason absolutely will.

The loader test suite from #652 had nine near-identical tests that all reduced to `Assert.Empty(results)` after writing one malformed file. If the loader were rewritten to unconditionally return an empty list, all nine would still pass — they looked like they pinned nine distinct validation paths but actually pinned zero and never verified the "fail loud" contract (no assertion that a warning was logged, no check that the scanner actually reached the file). Replace with one consolidated test that drops a valid agent alongside eight kinds of invalid siblings in the same directory, asserts the valid agent still loads, and verifies each invalid sibling produced a specific warning naming both the file and the reason. That exercises both "don't abort the scan on one bad file" and "fail loud" in a single test. Also asserts that stray .json and .txt files in the directory are ignored at the glob layer — they must produce no warning at all because the scanner never touches them. Introduce a minimal `ListLogger<T>` that captures warning-level formatted messages so tests can assert on log contents. ~20 lines, lives in the same test file; if we need it elsewhere we can promote it. Separately, delete three `BuildUserMessage_*` unit tests on `SubAgentActor`. They were textbook examples of what CLAUDE.md says to delete ("if the test is just asserting that $"{a}/{b}" equals "a/b", delete it") — they pinned a 4-line string interpolation helper whose behavior is already covered by the two end-to-end `RuntimeContext_*` tests. With the unit tests gone, `BuildUserMessage` has no external caller, so tighten its visibility from `internal` to `private`. Net: Configuration.Tests 189 → 181, Actors.Tests SubAgent filter 30 → 27. Fewer tests, each covering real behavior. All green; slopwatch clean.

Aaronontheweb added cleanup Code quality improvements and tech debt reduction sessions LLM session actor, turn lifecycle, pipelines labels Apr 14, 2026

Aaronontheweb enabled auto-merge (squash) April 14, 2026 02:58

Aaronontheweb merged commit fe497d2 into dev Apr 14, 2026
4 of 5 checks passed

Aaronontheweb deleted the fix/consolidate-subagent-loader-tests branch April 14, 2026 03:02

Aaronontheweb mentioned this pull request Apr 14, 2026

docs(subagents): rewrite runbook for single-file markdown format #655

Merged

Aaronontheweb added the subagents spawn_agent, SubAgentActor, definition loader, discovery context layer, and related features label Apr 14, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(subagents): consolidate anemic loader and actor tests#654

test(subagents): consolidate anemic loader and actor tests#654
Aaronontheweb merged 1 commit into
devfrom
fix/consolidate-subagent-loader-tests

Aaronontheweb commented Apr 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Aaronontheweb commented Apr 14, 2026

Summary

Why

Test plan

Notes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant