
agents: request-by-reference LLM handoff + sibling-tidied providers #1483

Merged: jonastemplestein merged 1 commit into main from provider-skeleton on Jun 10, 2026

@jonastemplestein jonastemplestein commented Jun 10, 2026

Contributor

Steps B and C of the agents roadmap (after #1460 and #1475): the LLM request handoff stops embedding the conversation, and the two LLM request processors become deliberate, tidy siblings sharing pure helpers.

Request-by-reference (no more embedded body)

agent/llm-request-requested used to carry the full chat request. Since the conversation grows with the stream, every request stored a complete copy of it — O(N²) stream growth. The llmRequestId already IS the requested event's offset, so the body is redundant: providers can rebuild it from committed history.
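To make the growth claim concrete, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that every turn appends one message and one request event, and it counts stored messages rather than bytes. Both function names are hypothetical and do not exist in the codebase.

```typescript
// Hypothetical model: turn k appends 1 message, then 1 request event.
// With an embedded body, the request at turn k stores a copy of all k messages.
function storedMessagesEmbedded(turns: number): number {
  let total = 0;
  for (let k = 1; k <= turns; k++) {
    total += 1; // the new message itself
    total += k; // the request's embedded copy of the whole conversation so far
  }
  return total; // turns + turns * (turns + 1) / 2, i.e. quadratic
}

// By reference, a request event stores no messages, so only the messages count.
function storedMessagesByReference(turns: number): number {
  return turns; // linear
}

console.log(storedMessagesEmbedded(100)); // 5150
console.log(storedMessagesByReference(100)); // 100
```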

Before:

// agent processor, on handoff
payload: {
  model,
  runOpts,
  body: buildLlmChatRequest(stateAtRequest), // full conversation, every time
}

After:

// agent processor: just the reference + how to run it
payload: { model: stateAtRequest.llmConfig.model, runOpts: stateAtRequest.llmConfig.runOpts }

// provider, at execution time (both cloudflare-ai and openai-ws):
// Request-by-reference: the requested event carries no body; rebuild the
// chat request from committed history up to the request's own offset.
const body = buildAgentLlmRequestBody({
  events: await this.deps.readStreamEvents(),
  llmRequestId, // === the requested event's offset
});

The rebuild filters history with events.filter((e) => e.offset <= llmRequestId) and reduces the result through the same reduceAgentEvents + buildLlmChatRequest pair the agent itself uses. The model-visible context is therefore reproducible from the stream at any later time, including for crash-recovery retries, which re-derive exactly what the dead incarnation would have sent.
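A minimal sketch of the offset-bounded rebuild, under stated assumptions: the event shapes are simplified stand-ins, and rebuildRequestBody is a hypothetical substitute for the real reduceAgentEvents + buildLlmChatRequest pair, whose actual signatures are not shown in this PR description. It only illustrates that events after the request's offset never reach the model and that the result is deterministic.

```typescript
// Hypothetical, simplified event shapes; the real stream events differ.
type StreamEvent =
  | { offset: number; type: "user-message"; text: string }
  | { offset: number; type: "assistant-message"; text: string }
  | { offset: number; type: "llm-request-requested"; model: string };

type ChatMessage = { role: "user" | "assistant"; content: string };

// Stand-in for the reduce + build pair: keep only history at or before the
// request's own offset, then fold it into a chat request.
function rebuildRequestBody(args: {
  events: StreamEvent[];
  llmRequestId: number; // assumed equal to the requested event's offset
}): { model: string; messages: ChatMessage[] } {
  const visible = args.events.filter((e) => e.offset <= args.llmRequestId);
  const requested = visible.find((e) => e.offset === args.llmRequestId);
  if (!requested || requested.type !== "llm-request-requested") {
    throw new Error(`no requested event at offset ${args.llmRequestId}`);
  }
  const messages: ChatMessage[] = [];
  for (const e of visible) {
    if (e.type === "user-message") messages.push({ role: "user", content: e.text });
    else if (e.type === "assistant-message") messages.push({ role: "assistant", content: e.text });
  }
  return { model: requested.model, messages };
}

const history: StreamEvent[] = [
  { offset: 1, type: "user-message", text: "hi" },
  { offset: 2, type: "llm-request-requested", model: "example-model" },
  { offset: 3, type: "user-message", text: "arrived after the request" },
];

// The offset-3 message is excluded; a retry after a crash rebuilds the same body.
console.log(rebuildRequestBody({ events: history, llmRequestId: 2 }));
```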

Breaking change to the agent/llm-request-requested payload (and cloudflare-ai/llm-request-started, which also embedded the body). No backcompat bridge; prd gets redeployed.

Providers as siblings, not an abstraction

cloudflare-ai and openai-ws were ~500/790-line copy-pasted state machines. Rather than an abstract base class, they're now deliberate siblings: same method names, same control flow, same comments where the logic matches — each keeps its own event types (which scales better as providers diverge) and its own transport (one AI.run() call vs a shared Responses WebSocket). What they share are four stateless functions in llm-request-helpers.ts:

buildAgentLlmRequestBody({ events, llmRequestId });     // the request-by-reference rebuild
isAgentLlmRequestStillCurrent({ events, llmRequestId }); // stale-output guard before agent-visible appends
findDanglingLlmRequestIds({ requests, executedLlmRequestIds }); // crash-recovery candidates
parseLlmRequestRequestedEventAt({ events, llmRequestId });      // typed re-derivation for recovery
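As a rough illustration of what two of these stateless helpers might look like, here is a sketch with deliberately different names. The record and event shapes, and the rules for "still current" and "dangling", are assumptions made for the example; the real helpers in llm-request-helpers.ts may decide these differently.

```typescript
// Hypothetical shapes, for illustration only.
type RequestRecord = {
  llmRequestId: number;
  status: "requested" | "completed" | "failed";
};
type SketchEvent = {
  offset: number;
  type: "llm-request-requested" | "llm-request-cancelled" | "other";
};

// Assumed rule: a request is still current if no later request or
// cancellation has been committed after its offset.
function isStillCurrentSketch(args: { events: SketchEvent[]; llmRequestId: number }): boolean {
  return !args.events.some(
    (e) =>
      e.offset > args.llmRequestId &&
      (e.type === "llm-request-requested" || e.type === "llm-request-cancelled"),
  );
}

// Assumed rule: a request is dangling if it has no terminal status and this
// instance holds no execution claim on it (so a dead incarnation owned it).
function findDanglingSketch(args: {
  requests: RequestRecord[];
  executedLlmRequestIds: ReadonlySet<number>;
}): number[] {
  return args.requests
    .filter((r) => r.status === "requested" && !args.executedLlmRequestIds.has(r.llmRequestId))
    .map((r) => r.llmRequestId);
}
```

Because both functions take plain data and return plain data, each provider can call them from its own state machine without sharing a base class.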

Both implementation files open with the same note: "When you fix something here, check whether the sibling needs the same fix."

Bounded execution-claims set

#executedLlmRequestIds (the instance-scoped set distinguishing "this incarnation is executing it" from "a dead one was") previously only grew. Both siblings now drop a claim when the request's own completed fact reduces back:

case "events.iterate.com/cloudflare-ai/llm-request-completed":
  // The completed fact is durable; this instance can never need to
  // (re-)execute this request again, so drop the claim — this is what
  // keeps the executed set bounded.
  this.#executedLlmRequestIds.delete(event.payload.llmRequestId);
  return;
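The claim lifecycle can be sketched in isolation. ClaimTrackerSketch is a hypothetical class, not the provider's real structure, and it uses a TypeScript private field rather than the #private field of the actual code so the example compiles under older targets.

```typescript
// Hypothetical stand-in for the instance-scoped execution-claims set.
class ClaimTrackerSketch {
  private readonly executedLlmRequestIds = new Set<number>();

  // Returns true if this instance newly claimed the request,
  // false if it was already executing it.
  claim(llmRequestId: number): boolean {
    if (this.executedLlmRequestIds.has(llmRequestId)) return false;
    this.executedLlmRequestIds.add(llmRequestId);
    return true;
  }

  // Called when the request's own completed event reduces back: the durable
  // fact makes the in-memory claim redundant, so drop it.
  onCompleted(llmRequestId: number): void {
    this.executedLlmRequestIds.delete(llmRequestId);
  }

  get size(): number {
    return this.executedLlmRequestIds.size;
  }
}

const tracker = new ClaimTrackerSketch();
for (let id = 1; id <= 1000; id++) {
  tracker.claim(id);
  tracker.onCompleted(id);
}
console.log(tracker.size); // 0
```

Under this scheme the set holds only in-flight requests, so its size tracks concurrency rather than instance lifetime.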

Tests

  • New in both provider suites: rebuilds the chat request from history up to the request's offset — history rows after the requested event's offset are excluded from what the model sees.
  • The agent handoff test now asserts the requested payload is { model, runOpts } with no body.
  • Provider fixtures lost their embedded bodies; conversation content now flows through readStreamEvents history, matching production.

pnpm typecheck && pnpm lint && pnpm format && pnpm test all green.

🤖 Generated with Claude Code


Note

Medium Risk
Breaking event payload shapes for llm-request-requested and provider started events; correctness now depends on history reads and offset-bounded rebuild at execution time, including crash recovery paths.

Overview
Request-by-reference LLM handoff stops embedding the full conversation on agent/llm-request-requested (and drops body from cloudflare-ai/llm-request-started). Handoffs are now { model, runOpts } only; llmRequestId stays the requested event’s offset, and cloudflare-ai / openai-ws rebuild model input at execution via shared llm-request-helpers.ts (buildAgentLlmRequestBody reduces history with offset <= llmRequestId).

The two provider processors are aligned as siblings (shared helpers for rebuild, still-current checks, dangling recovery, typed re-parse) instead of duplicated logic. #executedLlmRequestIds now drops entries when each request’s provider completed event reduces, so long-lived instances don’t grow the claim set forever.

Tests assert handoff payloads have no body and that providers exclude stream rows after the request offset when building chat input.

Reviewed by Cursor Bugbot for commit 83ed554. Bugbot is set up for automated code reviews on this repo.

Environment Config Lease

No active environment config lease.

OS

Status: released
Commit: 83ed554
Preview: https://os.iterate-preview-7.com
Summary: Preview app released.
Workflow run
Updated: 2026-06-10T22:09:52.345Z

agent/llm-request-requested no longer embeds the conversation body —
embedding it stored a full copy of the growing history in every request
(O(N^2) stream growth). The llmRequestId IS the requested event's
offset, so providers rebuild the chat request by reducing committed
history up to that offset; the payload keeps {model, runOpts} for
observability. Breaking change to the event payload; prd gets
redeployed.

The two LLM request processors (cloudflare-ai, openai-ws) are tidied
into deliberate siblings: same method names, same control flow, same
comments where the logic matches, each owning its event types and
transport. Stateless logic they share moves to llm-request-helpers.ts
(buildAgentLlmRequestBody, isAgentLlmRequestStillCurrent,
findDanglingLlmRequestIds, parseLlmRequestRequestedEventAt).

Also bounds the instance-scoped #executedLlmRequestIds set: a claim is
dropped once the request's completed fact reduces back, since the
durable completed-skip makes re-execution impossible from then on.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jonastemplestein jonastemplestein merged commit 085d923 into main Jun 10, 2026
8 checks passed
@jonastemplestein jonastemplestein deleted the provider-skeleton branch June 10, 2026 22:08
jonastemplestein added a commit that referenced this pull request Jun 10, 2026
The last batch of outstanding work from the agents audit (after #1460,
#1475, #1483): the section-3 dead-code sweep and the UI shrink. Net
**−188 lines** (−503/+315).

## One agent setup form

`agents/new.tsx` and `agents/new-preset.tsx` were ~270-line,
~90%-identical forms (provider/model/runOpts/system-prompt/custom-events
fields plus the YAML preview pane). They now share one
`AgentSetupFormPage` component; each route keeps only what genuinely
differs — its path normalization, its preview builder, and its submit
mutation:

```tsx
<AgentSetupFormPage
  title="New Agent"
  pathLabel="Agent path"
  buildPreview={(values) => buildPreviewEvents({ projectId: project.id, values })}
  submitIdleLabel="Create agent"
  isPending={createAgent.isPending}
  onSubmit={({ preview }) => createAgent.mutate(preview)}
  ...
/>
```

The routes drop from ~270 lines each to ~125, and the next form tweak
happens once instead of twice.

## Legacy Slack preset filter deleted

`agent-presets.ts` carried `isLegacyGeneratedSlackOpenAiPreset` — a
content-sniffing filter that suppressed an old auto-generated
`/agents/slack` preset by matching its system-prompt text. Checked prd
before deleting: the iterate project has **zero** stored presets, so the
filter guards nothing. The intentional behavior next to it (Slack agents
never inherit the generic `/agents` preset) stays, with its tests.

## Stale migration headers

Every processor under
`apps/os/src/domains/{agents,slack}/stream-processors/` opened with
"Migrated from `packages/shared/src/stream-processors/...`" — a
directory that no longer exists. Those provenance paragraphs are gone.
Where they carried a live constraint, the constraint survives in its own
words:

```ts
// Appended event types, payload shapes, and idempotency-key derivations
// (`agent/<key>@<sourceOffset>`) are stable wire formats — changing them
// breaks dedup against events already committed to streams.
```
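
A small sketch of why the key derivation is a wire format. Only the `agent/<key>@<sourceOffset>` shape comes from the comment above; `idempotencyKeySketch` and `DedupStreamSketch` are hypothetical names, and the in-memory dedup is an assumed simplification of whatever the stream layer really does.

```typescript
// Hypothetical helper producing the `agent/<key>@<sourceOffset>` shape.
function idempotencyKeySketch(key: string, sourceOffset: number): string {
  return `agent/${key}@${sourceOffset}`;
}

// Toy in-memory stream that drops appends whose key was already committed.
class DedupStreamSketch {
  private readonly seen = new Set<string>();
  readonly events: { idempotencyKey: string; type: string }[] = [];

  // Returns true if the event was committed, false if it was deduplicated.
  append(event: { idempotencyKey: string; type: string }): boolean {
    if (this.seen.has(event.idempotencyKey)) return false;
    this.seen.add(event.idempotencyKey);
    this.events.push(event);
    return true;
  }
}

const stream = new DedupStreamSketch();
const key = idempotencyKeySketch("output", 42); // "agent/output@42"

stream.append({ idempotencyKey: key, type: "agent/output-added" }); // committed
stream.append({ idempotencyKey: key, type: "agent/output-added" }); // deduplicated

// A changed derivation no longer matches the committed key, so a replay of
// the same source event would be appended a second time.
stream.append({ idempotencyKey: `agent:output:42`, type: "agent/output-added" });
console.log(stream.events.length); // 2
```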

## Audit bug status (no code change needed)

- **2.4 zombie `pendingTriggerCount`** — fixed by construction since
#1460: the reconcilers guarantee dangling requests reach a terminal
event, and a queued count only ever represents real user inputs whose
follow-up turn rebuilds from full history.
- **2.3 cancellation check-then-act race** — still a theoretical window;
closing it needs conditional appends (append-if-still-current at the
stream layer), which is its own design, not a cleanup.
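
For illustration, the conditional-append direction mentioned under 2.3 could look roughly like a compare-and-set on the stream head. This is a sketch under assumptions: `ConditionalStreamSketch` and `appendIfHeadIs` are hypothetical, and the real design might condition on something other than the head offset.

```typescript
type Appended = { offset: number; type: string };
type ConditionalResult =
  | { ok: true; offset: number }
  | { ok: false; headOffset: number };

// Toy in-memory stream with a compare-and-set style append.
class ConditionalStreamSketch {
  private readonly events: Appended[] = [];

  get headOffset(): number {
    return this.events.length; // offsets start at 1; 0 means empty
  }

  append(type: string): number {
    const offset = this.events.length + 1;
    this.events.push({ offset, type });
    return offset;
  }

  // Append only if nothing was committed since the caller last looked.
  appendIfHeadIs(expectedHeadOffset: number, type: string): ConditionalResult {
    if (this.headOffset !== expectedHeadOffset) {
      return { ok: false, headOffset: this.headOffset };
    }
    return { ok: true, offset: this.append(type) };
  }
}

const s = new ConditionalStreamSketch();
s.append("agent/llm-request-requested"); // offset 1
const seenHead = s.headOffset; // provider's still-current check happens here
s.append("cancellation"); // hypothetical event type landing inside the window

// The stale output is rejected instead of being appended after the cancel.
console.log(s.appendIfHeadIs(seenHead, "agent/output-added")); // { ok: false, headOffset: 2 }
```

With a plain append, the check and the write are two separate steps, which is the window the bullet describes; folding the check into the append closes it.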

`pnpm typecheck && pnpm lint && pnpm format && pnpm test` all green.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Mostly UI deduplication and comment edits; Slack preset selection is
slightly broader if old auto-generated presets exist in storage, which
the PR assumes is empty.
> 
> **Overview**
> Introduces **`AgentSetupFormPage`** so **New Agent** and **New Agent
Preset** share one form (provider, model, run options, system prompt,
custom events YAML, live preview). Each route only keeps path handling,
its preview builder, and submit logic—roughly halving page size.
> 
> **Removes** the legacy **`isLegacyGeneratedSlackOpenAiPreset`** filter
and its test from `agent-presets.ts`. Slack agents still only match
Slack-scoped presets; stored `/agents/slack` presets are no longer
sniffed and ignored by prompt text.
> 
> **Trims** stale “migrated from `packages/shared`…” headers across
agent and Slack stream-processor modules, leaving short notes where wire
formats and idempotency keys must stay stable.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
c9cf738. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- CLOUDFLARE_PREVIEW -->
## Environment Config Lease
<!-- CLOUDFLARE_PREVIEW_STATE -->
<!--
{
  "apps": {
    "os": {
      "appDisplayName": "OS",
      "appSlug": "os",
      "status": "deployed",
      "updatedAt": "2026-06-10T22:24:45.981Z",
      "headSha": "c9cf738e2dad4877de8286cf370cf50cedd81eeb",
      "message": null,
      "publicUrl": "https://os.iterate-preview-4.com",
"runUrl": "https://github.com/iterate/iterate/actions/runs/27310094398",
      "shortSha": "c9cf738"
    }
  },
  "environmentConfigLease": {
    "dopplerConfig": "preview_4",
    "leasedUntil": 1781133683378,
    "leaseId": "51ccdfaf-05f8-4f1c-a8ed-1b26dec7500b",
    "slug": "preview-4",
    "type": "environment-config-lease"
  }
}
-->
<!-- /CLOUDFLARE_PREVIEW_STATE -->
Lease: `preview-4`
Doppler config: `preview_4`
Type: `environment-config-lease`
Leased until: 2026-06-10T23:21:23.378Z

### OS
Status: deployed
Commit: `c9cf738`
Preview: https://os.iterate-preview-4.com
[Workflow
run](https://github.com/iterate/iterate/actions/runs/27310094398)
Updated: 2026-06-10T22:24:45.981Z
<!-- /CLOUDFLARE_PREVIEW -->

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
jonastemplestein added a commit that referenced this pull request Jun 11, 2026
Post-merge grooming after the agents workstream landed (#1460, #1475,
#1483, #1484). Grooming rules (docs/tasks-grooming.md) say tasks are
deleted when done:

- **Deleted** `tasks/streams-core-processor-host-homogenization.md` —
the plan of record for what shipped in #1460.
- **Deleted** `tasks/agents-system-audit-and-reconciler-design.md` — the
audit knowledge dump; every verified bug and design direction in it is
now either shipped or carried by a live task file.
- **Updated** the two deferred follow-ups
(`streams-core-clock-durable-timers.md`,
`streams-event-kinds-metadata.md`) to drop their `dependsOn`/background
references to the deleted docs, pointing at the merged PRs instead.
- **Added** `tasks/streams-conditional-appends.md` — the one audit
finding that survived everything: the check-then-act window between a
provider's still-current check and its `agent/output-added` append.
Backlog, with the conditional-append direction written down so it isn't
lost with the audit doc.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Documentation-only changes under `tasks/` with no runtime or API
impact.
> 
> **Overview**
> **Grooms the `tasks/` backlog** after agents/streams work landed in
#1460 and related PRs, per `docs/tasks-grooming.md` (delete tasks when
done).
> 
> **Removes** the shipped plan-of-record
(`streams-core-processor-host-homogenization.md`) and the umbrella
audit/knowledge dump (`agents-system-audit-and-reconciler-design.md`),
since their content is either merged or split elsewhere.
> 
> **Refreshes** deferred follow-ups:
`streams-core-clock-durable-timers.md` and
`streams-event-kinds-metadata.md` drop `dependsOn` on deleted tasks and
cite PR #1460 in background instead of dead links.
> 
> **Adds** `streams-conditional-appends.md` (backlog) to capture the
remaining audit item—the LLM output **check-then-act** race—and the
direction (stream-level conditional append / CAS), so it isn’t lost with
the audit doc.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
d9b9d7f. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>