subagent example: parallel mode truncates output to 100 chars and drops failed-task diagnostics

### Title

`subagent example: parallel mode truncates output to 100 chars and drops failed-task diagnostics`

### Labels (suggested)

`bug`, `area:examples`, `area:subagent`

### Affected versions

Reproduced on `@earendil-works/pi-coding-agent` **0.74.0**, **0.74.1**, and `main` HEAD (commit `3e5ad67e0f`, 2026-05-07). Both bugs predate 0.74.0 — the file's last logic change is older than the 0.74.0 tag; only a package-scope rename has touched it since.

---

### Summary

Two related defects in `packages/coding-agent/examples/extensions/subagent/index.ts` parallel mode result aggregator (lines 619–632) make the example unusable for orchestrator-style fan-outs:

1. **Output truncated to 100 chars per task** — `output.slice(0, 100)` is the only path back to the parent model. Full text exists in `details` but the parent LLM never sees it, so synthesizing parallel research results is impossible.
2. **Failed-task diagnostics swallowed** — only `getFinalOutput(r.messages)` is consulted. A child that exits non-zero before emitting `message_end` (unknown agent, model error, transport failure, abort) surfaces only `"(no output)"` to the parent — indistinguishable from a successful-but-silent run. The result object's `errorMessage`, `stderr`, and `stopReason` (all populated in such cases) are dropped.

The same file already implements the correct four-step fallback in **single mode** (line 652) and **chain mode** (line 541):

```ts
result.errorMessage || result.stderr || getFinalOutput(result.messages) || "(no output)"
```

This issue proposes mirroring that pattern in parallel mode and lifting the 100-char clamp to a documented byte cap.

---

### Reproduction

**Setup:** any pi 0.74.x install with the subagent example loaded, plus one valid user-level agent producing more than ~100 chars of output (any built-in or example).

Invoke the subagent tool with two parallel tasks — one valid, one with a deliberately invalid agent name:

```jsonc
{
  "tasks": [
    { "agent": "<any-valid-agent>", "task": "produce a short paragraph (300+ chars) on any topic" },
    { "agent": "this-agent-does-not-exist", "task": "anything" }
  ]
}
```

This single invocation triggers both defects: the valid task's output is truncated to 100 chars, and the failed task's `Unknown agent` diagnostic is dropped.

---

### Captured evidence

Real output from the repro above, run against pi 0.74.1.

**Valid task's full output (380 chars; captured from `details`):**

```text
Semantic line breaks place a newline after each complete clause or sentence rather than wrapping at a fixed column width, so source structure mirrors meaning. Because each line is a self-contained unit, edits show up as isolated diff hunks instead of cascading reflow churn, making code review, blame, and conflict resolution dramatically cleaner.
```

**Failed task's available diagnostic (`runSingleAgent` already constructs this at `index.ts:257-265`):**

```text
Unknown agent: "this-agent-does-not-exist". Available agents: [list elided].
```

**What the parent model actually receives under current upstream code** (the `content[].text` returned by the parallel tool):

```text
Parallel: 1/2 succeeded

[docs-expert] completed: Semantic line breaks place a newline after each complete clause or sentence rather than wrapping at...

[this-agent-does-not-exist] failed: (no output)
```

The valid task is clipped at 100 chars mid-sentence; the failed task's diagnostic is gone.

**What the parent model receives with the proposed patch applied** (same invocation, same data):

```text
Parallel: 1/2 succeeded

### [docs-expert] completed

Semantic line breaks place a newline after each complete clause or sentence rather than wrapping at a fixed column width, so source structure mirrors meaning. Because each line is a self-contained unit, edits show up as isolated diff hunks instead of cascading reflow churn, making code review, blame, and conflict resolution dramatically cleaner.

---

### [this-agent-does-not-exist] failed

Unknown agent: "this-agent-does-not-exist". Available agents: [list elided].
```

The valid task arrives intact (well under the 50 KB cap); the failed task's diagnostic — already computed by `runSingleAgent` and stored on `result.errorMessage` — is surfaced instead of dropped.

---

### Why this matters

The subagent example is the natural substrate for orchestrator-style fan-outs (multi-perspective code review, parallel research, cross-domain investigation). Both defects above silently break the orchestrator pattern:

- Defect 1: the parent LLM cannot synthesize parallel outputs it cannot see.
- Defect 2: operator-facing errors (rate-limit hit, unknown agent name, model error, child abort) become invisible — the orchestrator misclassifies a hard failure as a silent success and produces a confident-but-wrong synthesis.

Defect 2 is particularly insidious because the same diagnostic the parallel branch discards is already correctly surfaced by single mode and chain mode in the same file — so any user comparing modes finds the inconsistency surprising rather than expected.

---

### Proposed fix

Single change, ~20 lines, covering both defects in one place. Hoists a named module-scope constant for the cap, mirrors the existing single/chain-mode failure predicate and fallback chain, and tags failed tasks with their `stopReason` so the cause is visible without expanding TUI `details`.

```diff
--- a/packages/coding-agent/examples/extensions/subagent/index.ts
+++ b/packages/coding-agent/examples/extensions/subagent/index.ts
@@ -<existing module-scope constants block>
 const MAX_PARALLEL_TASKS = 8;
 const MAX_CONCURRENCY = 4;
+// Per-task byte cap for the parallel summary returned to the parent model.
+// Matches the DEFAULT_MAX_BYTES (50 KB) convention used by truncated-tool.ts
+// and tool-override.ts elsewhere in the examples. Large enough to carry a
+// structured review or research report; small enough to stay well under the
+// harness tool-result limit even with 8 tasks.
+const PER_TASK_OUTPUT_CAP = 50_000;
@@ -619,11 +625,28 @@
                 const successCount = results.filter((r) => r.exitCode === 0).length;
+                // Parallel summary needs to (a) return enough output for the parent
+                // model to actually synthesize results — a 100-char preview is
+                // unusable — and (b) surface diagnostics for failed children,
+                // mirroring the fallback already used by single mode (~line 652)
+                // and chain mode (~line 541). Without (b), a child that exits
+                // non-zero before emitting an assistant message_end (unknown agent,
+                // model error, transport failure, abort) surfaces only "(no output)"
+                // — indistinguishable from a silent success.
                 const summaries = results.map((r) => {
-                    const output = getFinalOutput(r.messages);
-                    const preview = output.slice(0, 100) + (output.length > 100 ? "..." : "");
-                    return `[${r.agent}] ${r.exitCode === 0 ? "completed" : "failed"}: ${preview || "(no output)"}`;
+                    const isError =
+                        r.exitCode !== 0 || r.stopReason === "error" || r.stopReason === "aborted";
+                    const output = isError
+                        ? r.errorMessage || r.stderr || getFinalOutput(r.messages) || "(no output)"
+                        : getFinalOutput(r.messages) || "(no output)";
+                    const status = isError
+                        ? `failed${r.stopReason && r.stopReason !== "end" ? ` (${r.stopReason})` : ""}`
+                        : "completed";
+                    const body =
+                        output.length > PER_TASK_OUTPUT_CAP
+                            ? `${output.slice(0, PER_TASK_OUTPUT_CAP)}\n\n…[truncated ${output.length - PER_TASK_OUTPUT_CAP} bytes; full output preserved in tool details]`
+                            : output;
+                    return `### [${r.agent}] ${status}\n\n${body}`;
                 });
                 return {
                     content: [
                         {
                             type: "text",
-                            text: `Parallel: ${successCount}/${results.length} succeeded\n\n${summaries.join("\n\n")}`,
+                            text: `Parallel: ${successCount}/${results.length} succeeded\n\n${summaries.join("\n\n---\n\n")}`,
                         },
                     ],
                     details: makeDetails("parallel")(results),
```

Cross-references in the same file that establish the precedent:

- Single-mode fallback: `index.ts:652`
- Chain-mode fallback: `index.ts:541`
- Failure predicate (single mode): `index.ts:678`
- Failure predicate (chain mode): `index.ts:545`
- `SingleResult` shape: `index.ts:142-155` (`errorMessage?`, `stderr`, `stopReason?` all populated on failure paths at `index.ts:338-339`)
- `getFinalOutput` returns `""` on empty messages (`index.ts:163-173`) — this is the state of a child that crashed before any `message_end`, which is why the fallback chain matters.

### Tool-result shape change

The parallel mode `content[].text` shape changes from:

```text
[agent] completed: <≤100-char preview>...
```

To:

```text
### [agent] completed

<full output, up to 50 KB>
```

with `---` separators between tasks. Intentional, model-friendlier format; calling out for any downstream snippets that may assert on the old preview format.

---

### Open questions for maintainer

1. **Test pattern for `examples/extensions/*`?** Happy to add a minimal `node --test` file alongside `index.ts` that constructs synthetic `SingleResult[]` and asserts on the rendered summary, if that fits the conventions.
2. **`PER_TASK_OUTPUT_CAP` configurable?** Should this be wired through the extension's `init` options (alongside `agentScope` / `confirmProjectAgents`), or is a hard-coded module-scope constant preferable for an example?
3. **Harness tool-result `content[].text` cap?** 50 KB was chosen to match the `DEFAULT_MAX_BYTES` precedent in `truncated-tool.ts` and `tool-override.ts`. If the harness applies its own cap to `content[].text`, the value should align (or stay comfortably under).
4. **TUI renderer parallel-branch predicate** (`exitCode > 0` at `index.ts:896-897`, treating `-1` as running) differs from the single/chain-mode predicate this patch mirrors (`exitCode !== 0 || stopReason in {error, aborted}`). Should the renderer be unified in this PR or kept as a follow-up?
5. **PR shape preference if you would like one** — the diff above is a single self-contained change; happy to open it as a PR on request. Two commits (preview-clamp first, diagnostics-fallback second) would split cleanly along the two conceptual fixes, but the line ranges overlap, so one squashed commit may be preferable.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

subagent example: parallel mode truncates output to 100 chars and drops failed-task diagnostics #4710

Title

Labels (suggested)

Affected versions

Summary

Reproduction

Captured evidence

Why this matters

Proposed fix

Tool-result shape change

Open questions for maintainer

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

subagent example: parallel mode truncates output to 100 chars and drops failed-task diagnostics #4710

Description

Title

Labels (suggested)

Affected versions

Summary

Reproduction

Captured evidence

Why this matters

Proposed fix

Tool-result shape change

Open questions for maintainer

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions