feat(subagents): single-file markdown format with contextual prompt by Aaronontheweb · Pull Request #652 · netclaw-dev/netclaw

Aaronontheweb · 2026-04-14T02:38:57Z

Summary

Adopt the markdown-with-YAML-frontmatter shape used by Claude Code, OpenCode, and Netclaw's own skill system. Each subagent is now a single .md file under ~/.netclaw/agents/ where the frontmatter carries metadata (name, description, tools, modelRole, timeoutSeconds, visibility, emitStructuredFindings) and the body is the system prompt verbatim. Replaces the JSON + .md sidecar pair that was the only two-file artifact type left in Netclaw.
Add an optional Context parameter to spawn_agent so the frontline LLM can pass per-invocation background (workspace details, the user's broader goal, facts the subagent would otherwise have to rediscover) without editing the agent file. Context is prefixed onto the subagent's first user message as a separate Context: block, keeping the agent's static system prompt verbatim and reproducible across calls. When Context is null the protocol is byte-identical to the pre-change shape.
Enrich the [available-subagents] discovery context layer to emit each agent's full description, tool list, and a usage example that includes the new Context field — today's one-line form did not teach the model how or when to delegate.
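
As a rough illustration of the single-file shape described above, the following Python sketch splits an agent file into frontmatter metadata and the system-prompt body. It is not the PR's implementation (the real parser is C# and uses YamlDotNet). The sample file contents, the tool names `web_search` and `web_fetch`, the inline comma-separated `tools` value, and the function name `split_frontmatter` are all assumptions made for the example, and the flat `key: value` parsing stands in for real YAML parsing.

```python
# Hypothetical agent file; field names come from the PR description,
# but the values and the tool names are invented for illustration.
SAMPLE_AGENT = """\
---
name: research-assistant
description: Researches a topic and reports findings.
tools: web_search, web_fetch
timeoutSeconds: 300
---
You are a research assistant. Cite every source you use.
"""


def split_frontmatter(text):
    """Split a markdown agent file into (metadata dict, system prompt body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter: file must start with '---'")
    # Find the closing '---' delimiter after the opening one.
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"), None
    )
    if end is None:
        raise ValueError("unterminated frontmatter: no closing '---'")
    metadata = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Simplification: flat 'key: value' pairs only, not full YAML.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed frontmatter line: {line!r}")
        metadata[key.strip()] = value.strip()
    # Everything after the closing delimiter is the system prompt.
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


metadata, body = split_frontmatter(SAMPLE_AGENT)
```

Here `metadata` holds the frontmatter fields as strings and `body` is the text that would become the subagent's system prompt.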

Why

Netclaw's sub-agents were the only disk-based artifact in the install that still used a two-file JSON+MD pair. Claude Code, OpenCode, AgentSkills.io SKILL.md, and Aaron's own local-ai-preferences/agents/*.md library all use single markdown-with-frontmatter files. Adopting the de facto shape means existing libraries become drop-in portable (modulo tool-name translation, tracked separately) and new sub-agents are half the authoring cost.

The Context parameter addresses a separate but related problem: Phase 0.2 smoke testing earlier in this session confirmed Qwen won't organically delegate on research-worthy prompts, and even when it does, the sub-agent gets a cold start with no workspace/goal awareness. Runtime context lets the parent session specialize a single profile per invocation rather than authoring N tightly-scoped profiles.

Zero-migration observation: the three stock defaults are the only files that have ever existed in ~/.netclaw/agents/ on a fresh install, so the JSON → markdown migration carries no user-content risk.

Key changes

Add YamlDotNet PackageReference to Netclaw.Configuration
New SubAgentMarkdownParser with ExtractFrontmatter/ExtractBody matching the SkillScanner pattern; SubAgentFrontmatter DTO maps 1:1 to SubAgentProfile
Rewrite FileSubAgentDefinitionLoader to scan *.md, fail loud on malformed frontmatter, missing required fields (name/description/tools/body), unknown tool names for user-facing agents, duplicate names across files
Regenerate the three init-wizard seeds (research-assistant, code-analyst, summarizer) in the new format
Thread an optional runtimeContext parameter through SpawnAgentTool.Params → SubAgentSpawner.SpawnAsync → RunSubAgent → SubAgentActor initial user message
BuildUserMessage helper composes Context:\n...\n\nTask:\n... when context is present, raw task otherwise
Enriched discovery layer produces per-agent description/tools/timeout plus a "how to delegate" block with the context field example
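
The message composition described for `BuildUserMessage` can be sketched in a few lines. This is a Python illustration under stated assumptions, not the C# helper itself: the `Context:`/`Task:` layout comes from the description above, while treating a whitespace-only context the same as a missing one is an assumption inferred from the null/whitespace/populated test cases.

```python
from typing import Optional


def build_user_message(task: str, context: Optional[str] = None) -> str:
    """Compose the subagent's first user message.

    Assumption: a whitespace-only context is treated like no context.
    """
    if context is None or not context.strip():
        # No runtime context: the message is the raw task, unchanged.
        return task
    # Context present: prefix it as a separate block ahead of the task.
    return f"Context:\n{context}\n\nTask:\n{task}"


plain = build_user_message("Summarize README.md")
with_context = build_user_message(
    "Summarize README.md",
    context="The user is evaluating this repo for a migration.",
)
```

Because the context lives only in the user message, the system prompt loaded from disk is the same on every call.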

Test plan

13 new/rewritten FileSubAgentDefinitionLoader tests covering full frontmatter parsing, required-field failures, duplicate-name rejection, disallowed-tool rejection, hyphenated+PascalCase visibility variants, empty body, missing frontmatter, non-.md files ignored
5 new SubAgentActor tests: BuildUserMessage unit cases (null/whitespace/populated) plus end-to-end RuntimeContext_is_prefixed_onto_first_user_message and Null_RuntimeContext_leaves_first_user_message_as_raw_task via a FakeChatClient.LastReceivedMessages capture
Netclaw.Configuration.Tests: 189/189 green
Netclaw.Actors.Tests (SubAgent + Session + ToolIndexUpdater filter): 307/307 green
Netclaw.Daemon.Tests: 436/436 green
dotnet slopwatch analyze — 0 issues
Live smoke test against a fresh daemon (post-merge, not in this PR)

Related

Closes feat(subagents): single-file markdown format + contextual prompt + project-scoped discovery #647
Expand subagent delegation guidance #522 — expand delegation guidance in AGENTS.md (adoption gap, runs in parallel and benefits from the enriched discovery layer in this PR)
feat(providers): concurrent multi-model / multi-provider architecture for per-role and per-agent routing #648 — multi-model provider architecture (follow-on; unblocks feat(subagents): per-agent model selection in frontmatter #649)
feat(subagents): per-agent model selection in frontmatter #649 — per-agent model selection in frontmatter (blocked on feat(providers): concurrent multi-model / multi-provider architecture for per-role and per-agent routing #648 and this PR)
SubAgentActor: tool file paths don't contribute to parent session WorkingContext #600 — subagent file paths don't flow into parent WorkingContext (orthogonal)

The loader test suite from #652 had nine near-identical tests that all reduced to `Assert.Empty(results)` after writing one malformed file. If the loader were rewritten to unconditionally return an empty list, all nine would still pass: they looked like they pinned nine distinct validation paths but actually pinned zero and never verified the "fail loud" contract (no assertion that a warning was logged, no check that the scanner actually reached the file).

Replace with one consolidated test that drops a valid agent alongside eight kinds of invalid siblings in the same directory, asserts the valid agent still loads, and verifies each invalid sibling produced a specific warning naming both the file and the reason. That exercises both "don't abort the scan on one bad file" and "fail loud" in a single test. Also asserts that stray .json and .txt files in the directory are ignored at the glob layer: they must produce no warning at all because the scanner never touches them.

Introduce a minimal `ListLogger<T>` that captures warning-level formatted messages so tests can assert on log contents. ~20 lines, lives in the same test file; if we need it elsewhere we can promote it.

Separately, delete three `BuildUserMessage_*` unit tests on `SubAgentActor`. They were textbook examples of what CLAUDE.md says to delete ("if the test is just asserting that $"{a}/{b}" equals "a/b", delete it"): they pinned a 4-line string interpolation helper whose behavior is already covered by the two end-to-end `RuntimeContext_*` tests. With the unit tests gone, `BuildUserMessage` has no external caller, so tighten its visibility from `internal` to `private`.

Net: Configuration.Tests 189 → 181, Actors.Tests SubAgent filter 30 → 27. Fewer tests, each covering real behavior. All green; slopwatch clean.
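
The shape of the consolidated test can be sketched as follows. This is a Python illustration under stated assumptions, not the actual C# loader or test: `load_agents`, the warning wording, and the flat `key: value` frontmatter parsing are hypothetical, and a plain list stands in for the `ListLogger<T>` capture. It shows one valid agent loading while each invalid sibling produces a warning naming the file and the reason, and a stray non-.md file producing no warning because the glob never matches it.

```python
import tempfile
from pathlib import Path

REQUIRED = ("name", "description", "tools")


def load_agents(directory, logged):
    """Return {agent name: system prompt}; append one warning per rejected file."""
    agents = {}
    # Only *.md files are scanned; other extensions are never opened.
    for path in sorted(directory.glob("*.md")):
        parts = path.read_text(encoding="utf-8").split("---\n", 2)
        if len(parts) != 3 or parts[0] != "":
            logged.append(f"{path.name}: missing or malformed frontmatter")
            continue
        fields = {}
        for line in parts[1].splitlines():
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        missing = [f for f in REQUIRED if not fields.get(f)]
        if missing:
            logged.append(f"{path.name}: missing required field(s): {', '.join(missing)}")
            continue
        if not parts[2].strip():
            logged.append(f"{path.name}: empty body")
            continue
        if fields["name"] in agents:
            logged.append(f"{path.name}: duplicate agent name '{fields['name']}'")
            continue
        agents[fields["name"]] = parts[2].strip()
    return agents


valid = "---\nname: good\ndescription: d\ntools: t\n---\nPrompt.\n"
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "good.md").write_text(valid, encoding="utf-8")
    (root / "no-frontmatter.md").write_text("Just a prompt.\n", encoding="utf-8")
    (root / "no-name.md").write_text(valid.replace("name: good\n", ""), encoding="utf-8")
    (root / "empty-body.md").write_text(valid.replace("Prompt.\n", ""), encoding="utf-8")
    (root / "stray.json").write_text("{}", encoding="utf-8")
    logged = []  # stands in for the captured warning-level log messages
    agents = load_agents(root, logged)
```

A loader that unconditionally returned an empty result would fail the "valid agent still loads" check, which is the gap the nine original tests left open.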

PR #652 migrated sub-agent definitions from JSON+MD sidecar pairs to single markdown files with YAML frontmatter, added an optional Context parameter to spawn_agent, and enriched the [available-subagents] discovery context layer to include per-agent description, tools, and an example call. None of that was reflected in the runbook: anyone following the existing "Defining subagents" section would write a .json file the loader rejects.

Rewrite docs/runbooks/subagents.md end to end:

- Discovery section shows the new enriched block (per-agent description + tools + timeout + a how-to-delegate block with the context field example) instead of the old one-line form.
- Invocation section documents all three spawn_agent arguments (agent, task, context) with one null-context and one context-populated example. Clarifies that Context is prefixed onto the subagent's first user message as a Context:/Task: block, and that the agent's system prompt stays verbatim from disk (not mutated by runtime context).
- Defining subagents section replaces the JSON schema + companion-file walkthrough with a single markdown-with-frontmatter template. Documents every frontmatter field with required/default columns: name, description, tools, modelRole, timeoutSeconds, visibility (hyphenated or PascalCase), emitStructuredFindings. Adds a loader-behavior-fail-loud subsection listing every rejection reason the scanner can log.
- Creating a custom agent walkthrough now writes a single .md file with frontmatter instead of a JSON+MD pair.
- Limitations section gains an entry noting that per-agent model selection is not yet supported and is blocked on the multi-model provider follow-on.

Also update IMPLEMENTATION_PLAN.md line 969: the "Subagent Roadmap" entry still listed the JSON file format. Updated minimally to reference the new format and cite both #226 (original disk-based design) and #652 (migration to frontmatter).

Historical references in RELEASE_NOTES.md are left alone; they accurately describe what shipped at #226 and shouldn't be retconned.

The init wizard seeds an AGENTS.md "Subagent Delegation" section that tells the frontline LLM when to delegate to spawn_agent. The current guidance sets the bar too high: "deep web research requiring multiple searches and synthesis" as the threshold excludes the bulk of research tasks that would benefit from delegation. Live smoke testing earlier in the sub-agent work confirmed this: Qwen performed ~50k tokens of in-process research on a clearly delegable task because the guidance didn't reach the delegation trigger.

Rewrite the section end to end, applying the proposal from #522 verbatim with two adjacent additions:

1. A context-window-protection bullet in "When to delegate". Delegation's biggest structural benefit isn't specialization or parallelism; it's keeping raw file/web content out of the main session's context window. The existing guidance never said that out loud.
2. A new "Per-call specialization" paragraph introducing the optional `context` argument that shipped in the single-file format work (#647 / #652). The discovery context layer advertises it, but AGENTS.md was still telling the model to call spawn_agent with only agent + task.

The expanded "When to delegate" list lowers the bar to:

- Research requiring 2+ sources or multiple searches (was: "deep research")
- Parallelizable tasks (multiple independent queries concurrently)
- Work that would otherwise pull large content into this session's context
- Background prep work that doesn't block immediate response
- Code analysis on large files or multiple files
- Summarization of long documents or web pages
- Preliminary passes on topics before diving deep

The expanded "When NOT to delegate" list adds:

- Tasks where coordination overhead outweighs parallelization benefits

Adds a parallelization tip pointing out that independent topics should be spawned as concurrent subagents, not serialized.

Note: this update affects the init-wizard seed template, so new installs pick it up on `netclaw init`. Existing users' AGENTS.md files are not mutated automatically; operators who want the new guidance should copy the new block into their `~/.netclaw/identity/AGENTS.md` manually.

Closes #522

Aaronontheweb added the enhancement (New feature or request) and sessions (LLM session actor, turn lifecycle, pipelines) labels Apr 14, 2026

Aaronontheweb merged commit f503937 into dev Apr 14, 2026
4 of 5 checks passed

Aaronontheweb deleted the feature/subagent-markdown-format branch April 14, 2026 02:48

Aaronontheweb mentioned this pull request Apr 14, 2026

test(subagents): consolidate anemic loader and actor tests #654

Merged

4 tasks

Aaronontheweb mentioned this pull request Apr 14, 2026

docs(subagents): rewrite runbook for single-file markdown format #655

Merged

Aaronontheweb mentioned this pull request Apr 14, 2026

feat(agents): expand subagent delegation guidance in AGENTS.md seed #656

Merged

4 tasks

Aaronontheweb added the subagents (spawn_agent, SubAgentActor, definition loader, discovery context layer, and related features) label Apr 14, 2026

Aaronontheweb mentioned this pull request Apr 14, 2026

feat(skills): declarative subagent routing via metadata.subagent #661

Closed
