feat(subagents): single-file markdown format with contextual prompt #652

Merged: Aaronontheweb merged 1 commit into dev from feature/subagent-markdown-format on Apr 14, 2026

Conversation

@Aaronontheweb (Collaborator)

Summary

  • Adopt the markdown-with-YAML-frontmatter shape used by Claude Code, OpenCode, and Netclaw's own skill system. Each subagent is now a single .md file under ~/.netclaw/agents/ where the frontmatter carries metadata (name, description, tools, modelRole, timeoutSeconds, visibility, emitStructuredFindings) and the body is the system prompt verbatim. Replaces the JSON + .md sidecar pair that was the only two-file artifact type left in Netclaw.
  • Add an optional Context parameter to spawn_agent so the frontline LLM can pass per-invocation background (workspace details, the user's broader goal, facts the subagent would otherwise have to rediscover) without editing the agent file. Context is prefixed onto the subagent's first user message as a separate Context: block, keeping the agent's static system prompt verbatim and reproducible across calls. When Context is null the protocol is byte-identical to the pre-change shape.
  • Enrich the [available-subagents] discovery context layer to emit each agent's full description, tool list, and a usage example that includes the new Context field — today's one-line form did not teach the model how or when to delegate.
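
To make the single-file shape concrete, here is a minimal Python sketch of splitting an agent file into its frontmatter and body. It is not the C# `SubAgentMarkdownParser` from this PR. The sample file contents, the tool names `web_search` and `web_fetch`, and the helper name `split_frontmatter` are illustrative assumptions; only the frontmatter field names come from the summary above. The sketch also trims whitespace around the body, which the real parser may or may not do.

```python
# Hypothetical agent file; only the field names are taken from the PR summary.
SAMPLE_AGENT = """\
---
name: research-assistant
description: Researches a topic and reports findings.
tools: [web_search, web_fetch]
timeoutSeconds: 300
---
You are a research assistant. Cite every source you use.
"""


def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter_yaml, body) for a '---'-delimited markdown file.

    Raises ValueError when the opening or closing delimiter is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter: file must start with '---'")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("unterminated frontmatter: closing '---' not found")


frontmatter, body = split_frontmatter(SAMPLE_AGENT)
```

The frontmatter string would then go to a YAML library (YamlDotNet in this PR), while the body becomes the system prompt.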

Why

Netclaw's sub-agents were the only disk-based artifact in the install that still used a two-file JSON+MD pair. Claude Code, OpenCode, AgentSkills.io SKILL.md, and Aaron's own local-ai-preferences/agents/*.md library all use single markdown-with-frontmatter files. Adopting the de facto shape means existing libraries become drop-in portable (modulo tool-name translation, tracked separately) and new sub-agents are half the authoring cost.

The Context parameter addresses a separate but related problem: Phase 0.2 smoke testing earlier in this session confirmed Qwen won't organically delegate on research-worthy prompts, and even when it does, the sub-agent gets a cold start with no workspace/goal awareness. Runtime context lets the parent session specialize a single profile per invocation rather than authoring N tightly-scoped profiles.

Zero-migration observation: the three stock defaults are the only files that have ever existed in ~/.netclaw/agents/ on a fresh install, so the JSON → markdown migration carries no user-content risk.

Key changes

  • Add YamlDotNet PackageReference to Netclaw.Configuration
  • New SubAgentMarkdownParser with ExtractFrontmatter/ExtractBody matching the SkillScanner pattern; SubAgentFrontmatter DTO maps 1:1 to SubAgentProfile
  • Rewrite FileSubAgentDefinitionLoader to scan *.md and fail loud on malformed frontmatter, missing required fields (name/description/tools/body), unknown tool names for user-facing agents, and duplicate names across files
  • Regenerate the three init-wizard seeds (research-assistant, code-analyst, summarizer) in the new format
  • Thread an optional runtimeContext parameter through SpawnAgentTool.Params → SubAgentSpawner.SpawnAsync → RunSubAgent → SubAgentActor initial user message
  • BuildUserMessage helper composes Context:\n...\n\nTask:\n... when context is present, raw task otherwise
  • Enriched discovery layer produces per-agent description/tools/timeout plus a "how to delegate" block with the context field example
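
The message composition described for `BuildUserMessage` can be sketched in Python as follows. This is a stand-in for the C# helper, not its actual code. The function name `build_user_message` is hypothetical, and treating whitespace-only context the same as null is an assumption inferred from the null/whitespace/populated test cases listed in the test plan.

```python
from typing import Optional


def build_user_message(task: str, context: Optional[str] = None) -> str:
    """Compose the subagent's first user message.

    Assumption: whitespace-only context is treated the same as None.
    """
    if context is None or not context.strip():
        return task  # raw task, identical to the pre-change shape
    return f"Context:\n{context}\n\nTask:\n{task}"
```

Keeping the context in the user message rather than the system prompt is what lets the agent file stay verbatim across calls.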

Test plan

  • 13 new/rewritten FileSubAgentDefinitionLoader tests covering full frontmatter parsing, required-field failures, duplicate-name rejection, disallowed-tool rejection, hyphenated+PascalCase visibility variants, empty body, missing frontmatter, non-.md files ignored
  • 5 new SubAgentActor tests: BuildUserMessage unit cases (null/whitespace/populated) plus end-to-end RuntimeContext_is_prefixed_onto_first_user_message and Null_RuntimeContext_leaves_first_user_message_as_raw_task via a FakeChatClient.LastReceivedMessages capture
  • Netclaw.Configuration.Tests: 189/189 green
  • Netclaw.Actors.Tests (SubAgent + Session + ToolIndexUpdater filter): 307/307 green
  • Netclaw.Daemon.Tests: 436/436 green
  • dotnet slopwatch analyze: 0 issues
  • Live smoke test against a fresh daemon (post-merge, not in this PR)

Related

Closes #647
Aaronontheweb added the enhancement (New feature or request) and sessions (LLM session actor, turn lifecycle, pipelines) labels on Apr 14, 2026
Aaronontheweb merged commit f503937 into dev on Apr 14, 2026
4 of 5 checks passed
Aaronontheweb deleted the feature/subagent-markdown-format branch on April 14, 2026 02:48
Aaronontheweb added a commit that referenced this pull request Apr 14, 2026
The loader test suite from #652 had nine near-identical tests that all reduced
to `Assert.Empty(results)` after writing one malformed file. If the loader
were rewritten to unconditionally return an empty list, all nine would still
pass — they looked like they pinned nine distinct validation paths but
actually pinned zero and never verified the "fail loud" contract (no
assertion that a warning was logged, no check that the scanner actually
reached the file).

Replace with one consolidated test that drops a valid agent alongside eight
kinds of invalid siblings in the same directory, asserts the valid agent
still loads, and verifies each invalid sibling produced a specific warning
naming both the file and the reason. That exercises both "don't abort the
scan on one bad file" and "fail loud" in a single test. Also asserts that
stray .json and .txt files in the directory are ignored at the glob layer —
they must produce no warning at all because the scanner never touches them.
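
The contract that the consolidated test pins (keep scanning past bad files, warn with file name and reason, never touch non-.md files) can be sketched in Python like this. It is an illustration under stated assumptions, not the C# `FileSubAgentDefinitionLoader`: the naive `key: value` parser stands in for a real YAML library, the names `parse_agent` and `load_agents` are hypothetical, and "first file in sorted order wins" for duplicate names is a guess, since the source does not say how duplicates are resolved.

```python
import logging
from pathlib import Path

log = logging.getLogger("subagent_loader_sketch")
REQUIRED_FIELDS = ("name", "description", "tools")


def parse_agent(text: str) -> dict:
    """Naive flat 'key: value' frontmatter parser (stand-in for a YAML library)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing frontmatter")
    closing = [i for i in range(1, len(lines)) if lines[i].strip() == "---"]
    if not closing:
        raise ValueError("unterminated frontmatter")
    end = closing[0]
    fields = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed frontmatter line {line!r}")
        fields[key.strip()] = value.strip()
    for key in REQUIRED_FIELDS:
        if not fields.get(key):
            raise ValueError(f"missing required field '{key}'")
    body = "\n".join(lines[end + 1:]).strip()
    if not body:
        raise ValueError("empty body (no system prompt)")
    fields["prompt"] = body
    return fields


def load_agents(directory: Path) -> list[dict]:
    """Load every valid *.md agent; warn about and skip each invalid one."""
    agents, seen = [], set()
    for path in sorted(directory.glob("*.md")):  # .json/.txt never match the glob
        try:
            agent = parse_agent(path.read_text(encoding="utf-8"))
            if agent["name"] in seen:  # assumption: first file in sorted order wins
                raise ValueError(f"duplicate agent name '{agent['name']}'")
        except ValueError as err:
            log.warning("Skipping %s: %s", path.name, err)
            continue
        seen.add(agent["name"])
        agents.append(agent)
    return agents
```

A loader that unconditionally returned an empty list would fail a test written against this shape, because the valid agent must still appear in the result.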

Introduce a minimal `ListLogger<T>` that captures warning-level formatted
messages so tests can assert on log contents. ~20 lines, lives in the same
test file; if we need it elsewhere we can promote it.
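
For readers unfamiliar with the pattern, a log-capturing test double looks roughly like this in Python. This is an analogue of the idea behind `ListLogger<T>`, not a port of it. The class name `ListHandler` and the logger name are hypothetical, and capturing WARNING and above (rather than WARNING only) is an assumption.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted messages at WARNING level or above for test assertions."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        # getMessage() applies the %-style arguments, giving the formatted text.
        self.messages.append(record.getMessage())


logger = logging.getLogger("list_handler_sketch")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
logger.addHandler(handler)

logger.info("scanned %d files", 3)  # below WARNING, so not captured
logger.warning("Skipping %s: %s", "bad.md", "missing frontmatter")
```

A test can then assert on `handler.messages` to check that a specific file and reason were reported.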

Separately, delete three `BuildUserMessage_*` unit tests on `SubAgentActor`.
They were textbook examples of what CLAUDE.md says to delete ("if the test
is just asserting that $"{a}/{b}" equals "a/b", delete it") — they pinned a
4-line string interpolation helper whose behavior is already covered by the
two end-to-end `RuntimeContext_*` tests. With the unit tests gone,
`BuildUserMessage` has no external caller, so tighten its visibility from
`internal` to `private`.

Net: Configuration.Tests 189 → 181, Actors.Tests SubAgent filter 30 → 27.
Fewer tests, each covering real behavior. All green; slopwatch clean.
Aaronontheweb added a commit that referenced this pull request Apr 14, 2026
PR #652 migrated sub-agent definitions from JSON+MD sidecar pairs to single
markdown files with YAML frontmatter, added an optional Context parameter to
spawn_agent, and enriched the [available-subagents] discovery context layer
to include per-agent description, tools, and an example call. None of that
was reflected in the runbook — anyone following the existing "Defining
subagents" section would write a .json file the loader rejects.

Rewrite docs/runbooks/subagents.md end to end:

- Discovery section shows the new enriched block (per-agent description +
  tools + timeout + a how-to-delegate block with the context field example)
  instead of the old one-line form.
- Invocation section documents all three spawn_agent arguments (agent, task,
  context) with one null-context and one context-populated example. Clarifies
  that Context is prefixed onto the subagent's first user message as a
  Context:/Task: block, and that the agent's system prompt stays verbatim
  from disk (not mutated by runtime context).
- Defining subagents section replaces the JSON schema + companion-file
  walkthrough with a single markdown-with-frontmatter template. Documents
  every frontmatter field with required/default columns: name, description,
  tools, modelRole, timeoutSeconds, visibility (hyphenated or PascalCase),
  emitStructuredFindings. Adds a loader-behavior-fail-loud subsection listing
  every rejection reason the scanner can log.
- Creating a custom agent walkthrough now writes a single .md file with
  frontmatter instead of a JSON+MD pair.
- Limitations section gains an entry noting that per-agent model selection
  is not yet supported and is blocked on the multi-model provider follow-on.
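
As a rough illustration of what an enriched discovery block might contain, the following Python sketch renders per-agent description, tools, and timeout plus a how-to-delegate example. The exact layout Netclaw emits is not shown in this PR, so every formatting choice here, along with the function name `render_discovery` and the sample agent, is a hypothetical assumption.

```python
def render_discovery(agents: list[dict]) -> str:
    """Render a hypothetical [available-subagents] block (layout is assumed)."""
    lines = ["[available-subagents]"]
    for agent in agents:
        lines.append(f"- {agent['name']}: {agent['description']}")
        lines.append(f"  tools: {', '.join(agent['tools'])}")
        lines.append(f"  timeout: {agent['timeoutSeconds']}s")
    lines += [
        "",
        "How to delegate:",
        '  spawn_agent(agent="<name>", task="<what to do>", '
        'context="<optional background>")',
    ]
    return "\n".join(lines)


# Illustrative agent entry; the description and tool name are made up.
block = render_discovery([
    {
        "name": "summarizer",
        "description": "Condenses long documents.",
        "tools": ["read_file"],
        "timeoutSeconds": 120,
    }
])
```

The point of the richer block is that the model sees when and how to delegate, including the optional context argument, rather than just a list of agent names.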

Also update IMPLEMENTATION_PLAN.md line 969 — the "Subagent Roadmap" entry
still listed the JSON file format. Updated minimally to reference the new
format and cite both #226 (original disk-based design) and #652 (migration
to frontmatter).

Historical references in RELEASE_NOTES.md are left alone — they accurately
describe what shipped at #226 and shouldn't be retconned.
Aaronontheweb added a commit that referenced this pull request Apr 14, 2026
The init wizard seeds an AGENTS.md "Subagent Delegation" section that tells
the frontline LLM when to delegate to spawn_agent. The current guidance sets
the bar too high — "deep web research requiring multiple searches and
synthesis" as the threshold excludes the bulk of research tasks that would
benefit from delegation. Live smoke testing earlier in the sub-agent work
confirmed this: Qwen performed ~50k tokens of in-process research on a
clearly delegable task because the guidance didn't reach the delegation
trigger.

Rewrite the section end to end, applying the proposal from #522 verbatim
with two adjacent additions:

1. A context-window-protection bullet in "When to delegate". Delegation's
   biggest structural benefit isn't specialization or parallelism — it's
   keeping raw file/web content out of the main session's context window.
   The existing guidance never said that out loud.
2. A new "Per-call specialization" paragraph introducing the optional
   `context` argument that shipped in the single-file format work (#647 /
   #652). The discovery context layer advertises it, but AGENTS.md was
   still telling the model to call spawn_agent with only agent + task.

The expanded "When to delegate" list lowers the bar to:
- Research requiring 2+ sources or multiple searches (was: "deep research")
- Parallelizable tasks (multiple independent queries concurrently)
- Work that would otherwise pull large content into this session's context
- Background prep work that doesn't block immediate response
- Code analysis on large files or multiple files
- Summarization of long documents or web pages
- Preliminary passes on topics before diving deep

The expanded "When NOT to delegate" list adds:
- Tasks where coordination overhead outweighs parallelization benefits

Adds a parallelization tip pointing out that independent topics should be
spawned as concurrent subagents, not serialized.
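
The difference between concurrent and serialized delegation can be sketched with Python's `asyncio`. The `spawn_agent` coroutine below is a hypothetical stand-in that sleeps instead of running a model, and the topic strings are placeholders; nothing here reflects Netclaw's actual tool-call API.

```python
import asyncio
from typing import Optional


async def spawn_agent(agent: str, task: str, context: Optional[str] = None) -> str:
    """Hypothetical stand-in for a spawn_agent call; sleeps instead of running an LLM."""
    await asyncio.sleep(0.05)
    return f"[{agent}] findings for: {task}"


async def research_all(topics: list[str], context: str) -> list[str]:
    # One subagent per independent topic, all in flight at the same time.
    calls = [spawn_agent("research-assistant", topic, context) for topic in topics]
    return await asyncio.gather(*calls)  # results come back in input order


results = asyncio.run(
    research_all(["topic A", "topic B"], "User is comparing two options")
)
```

With two independent topics the total wait is roughly one subagent's runtime rather than the sum of both, which is the benefit the tip describes.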

Note: this update affects the init-wizard seed template, so new installs
pick it up on `netclaw init`. Existing users' AGENTS.md files are not
mutated automatically — operators who want the new guidance should copy
the new block into their `~/.netclaw/identity/AGENTS.md` manually.

Closes #522
Aaronontheweb added the subagents label (spawn_agent, SubAgentActor, definition loader, discovery context layer, and related features) on Apr 14, 2026

Labels

- enhancement: New feature or request
- sessions: LLM session actor, turn lifecycle, pipelines
- subagents: spawn_agent, SubAgentActor, definition loader, discovery context layer, and related features

Development

Successfully merging this pull request may close these issues.

feat(subagents): single-file markdown format + contextual prompt + project-scoped discovery