v0.6.3.0 refactor(llm-client): multi-turn ChatMessage[] API (#149) by jayzalowitz · Pull Request #152 · jayzalowitz/skytwin

jayzalowitz · 2026-05-05T19:46:31Z

Summary

Closes #149. `LlmClient.generate` and `generateStream` now accept `string | ChatMessage[]`. Each provider translates the array to its native chat-completion shape. The `User:` / `Assistant:` prompt-flattening workaround in `@skytwin/assistant` is gone — `reply()` and `replyStream()` pass the conversation history directly.

Pure refactor — no new user-visible feature — but unblocks #148 (action-intent routing) and removes a comment-laden workaround that's been load-bearing since assistant phase 1.

Backward-compatible: existing string callers (decision-engine's LLM strategies, every provider integration test, anything outside this monorepo importing `@skytwin/llm-client`) work unchanged.

What landed

`@skytwin/llm-client` public API

```ts
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

LlmClient.generate(prompt: string | ChatMessage[], options?): Promise
LlmClient.generateStream(prompt: string | ChatMessage[], options?): AsyncIterable

// Helpers (also re-exported)
toMessages(input: string | ChatMessage[]): ChatMessage[]
splitSystemAndConversation(messages, fallbackSystem?): { system, conversation }
```
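As a rough illustration of the string-to-array normalization described above, here is a minimal sketch of what `toMessages` might look like. The body is a hypothetical re-implementation based only on the commit message's description (`string -> [{role:user, content}]`); the real helper in `packages/llm-client/src/messages.ts` may differ.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical re-implementation for illustration only.
// A bare string becomes a single user turn; an array passes through untouched.
function toMessages(input: string | ChatMessage[]): ChatMessage[] {
  return typeof input === 'string' ? [{ role: 'user', content: input }] : input;
}

// Legacy string caller: still valid, normalized to one user message.
const fromString = toMessages('Summarize my inbox');

// New multi-turn caller: the history is passed as structured turns.
const fromArray = toMessages([
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello! How can I help?' },
  { role: 'user', content: 'Summarize my inbox' },
]);
```

Because both input shapes converge on `ChatMessage[]`, each provider only needs one translation path.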

Provider translations

| Provider | Before | After |
| --- | --- | --- |
| Anthropic | `messages: [{role: 'user', content: }]` + `system` top-level | Pass-through `messages` array; system-role messages hoisted to the `system` field |
| OpenAI | Hardcoded system + user pair | Pass-through `messages`; falls back to `options.systemPrompt` only when no inline system message |
| Google/Gemini | Fake `user: ` + `model: "Understood."` pair to emulate system | Native `system_instruction` field; assistant role correctly translated to `'model'`. Saves tokens AND removes a drift hazard |
| Ollama | `/api/generate` with `systemPrompt + "\n\n" + prompt` flattened | `/api/chat` with native `messages` array. Both endpoints exist on every modern Ollama server. Response parsing moved from `{response}` to `{message: {content}}` |
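To make the Gemini and Ollama rows concrete, the sketch below shows one way the translations could be written. The function names (`toGeminiBody`, `toOllamaChatBody`, `parseOllamaChat`), the `example-model` name, and the choice to join multiple system messages with a blank line are assumptions for illustration; only the wire-format field names (`contents`, `parts`, `system_instruction`, `messages`, `message.content`) come from the providers' public REST APIs and this PR's description.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

type GeminiContent = { role: 'user' | 'model'; parts: { text: string }[] };
interface GeminiBody {
  contents: GeminiContent[];
  system_instruction?: { parts: { text: string }[] };
}

// Hypothetical translator: system turns go to system_instruction,
// and the assistant role is renamed to Gemini's 'model' role.
function toGeminiBody(messages: ChatMessage[], fallbackSystem?: string): GeminiBody {
  const inline = messages.filter((m) => m.role === 'system').map((m) => m.content);
  // Assumption: several inline system messages are joined with a blank line.
  const system = inline.length > 0 ? inline.join('\n\n') : fallbackSystem;
  const contents = messages
    .filter((m) => m.role !== 'system')
    .map((m): GeminiContent => ({
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    }));
  return system === undefined
    ? { contents }
    : { contents, system_instruction: { parts: [{ text: system }] } };
}

interface OllamaChatBody { model: string; messages: ChatMessage[]; stream: boolean }

// Hypothetical translator for POST /api/chat: the array is sent natively.
// Assumption: the fallback system prompt is prepended only when no inline one exists.
function toOllamaChatBody(model: string, messages: ChatMessage[], fallbackSystem?: string): OllamaChatBody {
  const hasInlineSystem = messages.some((m) => m.role === 'system');
  const withSystem: ChatMessage[] =
    !hasInlineSystem && fallbackSystem !== undefined
      ? [{ role: 'system', content: fallbackSystem }, ...messages]
      : messages;
  return { model, messages: withSystem, stream: false };
}

// /api/chat responds with {message: {content}} rather than /api/generate's {response}.
function parseOllamaChat(json: { message?: { role?: string; content?: string } }): string {
  return json.message?.content ?? '';
}

const geminiBody = toGeminiBody(
  [
    { role: 'system', content: 'Be brief.' },
    { role: 'user', content: 'Hi' },
    { role: 'assistant', content: 'Hello' },
    { role: 'user', content: 'Plan my day' },
  ],
  'Generic fallback',
);
const ollamaBody = toOllamaChatBody('example-model', [{ role: 'user', content: 'Hi' }], 'Generic fallback');
const ollamaText = parseOllamaChat({ message: { role: 'assistant', content: 'Hello' } });
```

Note that no synthetic `Understood.` turn appears in `geminiBody.contents`; the system text travels out of band.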

`AssistantService` cleanup

- `reply()` and `replyStream()` now pass the trimmed `ChatTurn[]` directly to `LlmClient.generate` / `generateStream`. `ChatTurn` and `ChatMessage` are structurally identical, so the change is just dropping the `formatHistoryAsPrompt` call.
- A new private `composeSystemPrompt(enrichment?)` helper is shared between `reply()` and `replyStream()` so the two paths cannot drift on the prepend-context-block step.
- `formatHistoryAsPrompt` stays exported for back-compat (no known external callers; plan to remove on the next major bump).
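The shared-helper idea can be sketched as follows. Everything here is hypothetical scaffolding: the `Enrichment` shape, `BASE_PROMPT`, the `LlmLike` interface, and the `AssistantSketch` class are stand-ins, since the PR text does not show the real `AssistantService` internals. The point is only that both entry points build their message array through the same function.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical shapes; the real Enrichment type and base prompt are not shown in this PR.
interface Enrichment { contextBlock: string }
interface LlmLike {
  generate(prompt: string | ChatMessage[]): Promise<string>;
  generateStream(prompt: string | ChatMessage[]): AsyncIterable<string>;
}

const BASE_PROMPT = 'You are a helpful assistant.';

class AssistantSketch {
  private readonly llm: LlmLike;

  constructor(llm: LlmLike) {
    this.llm = llm;
  }

  // Single source of truth for the prepend-context-block step.
  private composeSystemPrompt(enrichment?: Enrichment): string {
    return enrichment ? `${enrichment.contextBlock}\n\n${BASE_PROMPT}` : BASE_PROMPT;
  }

  // Assumption: the composed prompt is injected as a system turn at the head of the array.
  private buildMessages(history: ChatMessage[], enrichment?: Enrichment): ChatMessage[] {
    return [{ role: 'system', content: this.composeSystemPrompt(enrichment) }, ...history];
  }

  reply(history: ChatMessage[], enrichment?: Enrichment): Promise<string> {
    return this.llm.generate(this.buildMessages(history, enrichment));
  }

  replyStream(history: ChatMessage[], enrichment?: Enrichment): AsyncIterable<string> {
    return this.llm.generateStream(this.buildMessages(history, enrichment));
  }
}
```

With this layout, a change to how context is prepended lands in one place and reaches the sync and streaming paths together.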

Inline-system precedence

`splitSystemAndConversation` makes inline system messages WIN over `options.systemPrompt`. This matters: the assistant package injects its enrichment context (twin profile + memories) as a system turn at the head of the array; the route also passes the default system prompt via `options.systemPrompt`. Without this precedence, generic instructions would silently override the personalized context — exactly the regression Phase 2b was supposed to prevent.
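The precedence rule can be illustrated with a small sketch. This is a hypothetical re-implementation based on the description above, not the code from `messages.ts`; in particular, the return type of `system` and the choice to join multiple inline system messages with a blank line are assumptions.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical re-implementation for illustration only.
function splitSystemAndConversation(
  messages: ChatMessage[],
  fallbackSystem?: string,
): { system: string | undefined; conversation: ChatMessage[] } {
  const inline = messages.filter((m) => m.role === 'system').map((m) => m.content);
  const conversation = messages.filter((m) => m.role !== 'system');
  // Inline system messages win; the fallback applies only when none are present.
  const system = inline.length > 0 ? inline.join('\n\n') : fallbackSystem;
  return { system, conversation };
}

const history: ChatMessage[] = [
  { role: 'system', content: 'Twin profile: prefers short replies.' },
  { role: 'user', content: 'Reschedule my 3pm.' },
];

// Inline system turn present: the generic default is ignored.
const withInline = splitSystemAndConversation(history, 'You are a helpful assistant.');

// No inline system turn: the generic default is used.
const withoutInline = splitSystemAndConversation(history.slice(1), 'You are a helpful assistant.');
```

Here `withInline.system` keeps the personalized context, while `withoutInline.system` falls back to the default passed via `options.systemPrompt`.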

Safety / invariants

- API surface stable: string callers are untouched, and the new array path is opt-in.
- Inline system precedence is asserted in tests for both Anthropic and OpenAI.
- Provider response parsing is verified per-provider in the new tests.
- Decision-engine LLM strategies (LlmCandidateGenerator, LlmSituationStrategy) are NOT migrated. Their PromptBuilder-built strings work as-is, and changing them would conflate this refactor with their own re-shaping. Tracked separately if it ever becomes valuable.

Test plan

- `pnpm build`: 20 packages green
- `pnpm test`: 40 packages green; 28 new tests + 1 updated
- `pnpm lint`: clean
- Manual: configure each provider in turn, send a multi-turn assistant conversation, verify the model responds with conversational coherence (vs. treating each message as standalone)
- Manual: pull the worker log, confirm the LLM request bodies have the right shape per-provider (no fake `Understood.` from Gemini, no `User:` / `Assistant:` flattening from Anthropic)

Phase 2 — last remaining work

#148 (Assistant phase 2c: action-intent routing through @skytwin/decision-engine) is unblocked by this PR. It is the biggest remaining feature and the biggest safety surface (Safety Invariants #1–#7). The sequence ends there for the assistant phase 2 epic.

🤖 Generated with Claude Code

Closes #149. LlmClient.generate and generateStream now accept string | ChatMessage[]. Each provider translates the array to its native chat-completion shape:

- Anthropic: pass-through messages, system messages hoisted to top-level
- OpenAI: pass-through messages, no-duplicate-system guard
- Google/Gemini: native system_instruction (no fake user/model pair), assistant role correctly translated to 'model'
- Ollama: switched from /api/generate (concat prompt) to /api/chat (native messages array); response shape moved from {response} to {message: {content}}

AssistantService.reply and replyStream now pass ChatTurn[] directly to LlmClient; formatHistoryAsPrompt is no longer in the main path. Kept exported as legacy for back-compat. New private composeSystemPrompt helper shared between sync + streaming paths so they cannot drift.

New @skytwin/llm-client public API:

- ChatMessage type
- toMessages helper (string -> [{role:user, content}])
- splitSystemAndConversation (peels system messages, fallback to options.systemPrompt when none inline; inline wins so the assistant context block is never overridden)

Backward-compatible: existing string callers (decision-engine LLM strategies, every provider integration test) work unchanged.

Tests: 28 new (7 helpers + 15 per-provider request-body shape + 1 updated for the assistant history-cap test). Full suite green across 40 packages; lint clean.

Unblocks #148 (action-intent routing): the intent classifier can now look at structured turns instead of regexing a flattened prompt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Copilot

Pull request overview

Refactors @skytwin/llm-client to support native multi-turn chat by allowing LlmClient.generate() / generateStream() to accept string | ChatMessage[], updating each provider to translate message arrays into its native wire format, and updating @skytwin/assistant to pass trimmed conversation history directly (dropping prompt-flattening in the main path).

Changes:

- Extend LlmClient + provider function signatures to accept string | ChatMessage[], with shared helpers (toMessages, splitSystemAndConversation) and new translation tests.
- Update providers (Anthropic/OpenAI/Gemini/Ollama) to use message-array-native request bodies, including system-message precedence handling.
- Update AssistantService.reply() / replyStream() to pass ChatTurn[] directly and centralize system prompt composition.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| VERSION | Bumps version to `0.6.3.0`. |
| CHANGELOG.md | Documents the multi-turn API refactor and provider behavior changes. |
| packages/llm-client/src/types.ts | Adds `ChatMessage` + updates provider function types to accept multi-turn input. |
| packages/llm-client/src/messages.ts | Adds helpers to normalize prompts and split system vs conversation messages. |
| packages/llm-client/src/llm-client.ts | Updates `generate`/`generateStream` signatures and preserves provider-chain behavior. |
| packages/llm-client/src/index.ts | Re-exports `ChatMessage` and message helpers from package root. |
| packages/llm-client/src/providers/anthropic.ts | Hoists system messages to top-level `system` and supports message-array prompts for sync/stream. |
| packages/llm-client/src/providers/openai.ts | Uses native `messages` array with inline-system precedence over `options.systemPrompt`. |
| packages/llm-client/src/providers/google.ts | Translates to Gemini `contents` + `system_instruction` and maps assistant → model role. |
| packages/llm-client/src/providers/ollama.ts | Switches from `/api/generate` to `/api/chat` and updates response parsing. |
| packages/llm-client/src/tests/messages.test.ts | Adds unit tests for `toMessages` and `splitSystemAndConversation`. |
| packages/llm-client/src/tests/provider-multiturn.test.ts | Adds provider request-shape assertions for multi-turn translation. |
| packages/assistant/src/assistant-service.ts | Passes `ChatTurn[]` directly to `LlmClient` and dedupes system prompt composition. |
| packages/assistant/src/tests/assistant-service.test.ts | Updates history-cap test to assert on message arrays rather than flattened prompts. |

+export interface ChatMessage {
+  role: 'system' | 'user' | 'assistant';
+  content: string;
+}


+ * Anthropic takes `system` as a top-level field separate from the
+ * `messages` array, so we split system messages out of the conversation
+ * — see `splitSystemAndConversation`. Adjacent same-role messages are
+ * NOT merged here (Anthropic accepts them) but the API rejects empty
+ * conversations, so we always have at least one message after the split.
+ */


+    const { system, conversation } = splitSystemAndConversation(
+      toMessages(prompt),
+      options.systemPrompt,
+    );
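For context, the result of that split might feed an Anthropic request body roughly as sketched below. The function name `toAnthropicBody`, its options shape, and the `example-model` name are hypothetical; only the top-level `system` field, the `messages` array, and the required `max_tokens` field are taken from the Anthropic Messages API. The empty-conversation guard is an assumption about how the provider might surface the constraint mentioned in the doc comment above.

```typescript
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

interface AnthropicBody {
  model: string;
  max_tokens: number;
  system?: string;
  messages: { role: 'user' | 'assistant'; content: string }[];
}

// Hypothetical translator for illustration; not the code from anthropic.ts.
function toAnthropicBody(
  messages: ChatMessage[],
  opts: { model: string; maxTokens: number; systemPrompt?: string },
): AnthropicBody {
  const inline = messages.filter((m) => m.role === 'system').map((m) => m.content);
  // Inline system messages take precedence over opts.systemPrompt.
  const system = inline.length > 0 ? inline.join('\n\n') : opts.systemPrompt;

  const conversation: AnthropicBody['messages'] = [];
  for (const m of messages) {
    const role = m.role;
    if (role !== 'system') conversation.push({ role, content: m.content });
  }
  // Assumption: fail fast locally, since the API rejects empty conversations.
  if (conversation.length === 0) {
    throw new Error('Anthropic requires at least one non-system message');
  }

  const base = { model: opts.model, max_tokens: opts.maxTokens, messages: conversation };
  return system === undefined ? base : { ...base, system };
}

const anthropicBody = toAnthropicBody(
  [
    { role: 'system', content: 'Twin profile: terse.' },
    { role: 'user', content: 'Hi' },
    { role: 'assistant', content: 'Hello' },
  ],
  { model: 'example-model', maxTokens: 256, systemPrompt: 'Generic default' },
);
```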


+ * native top-level `system` field). When both are present, the array
+ * system messages take precedence — the assistant injects context as a
+ * system turn at the head of the array.


Copilot AI review requested due to automatic review settings May 5, 2026 19:46

Copilot started reviewing on behalf of jayzalowitz May 5, 2026 19:47

Copilot AI reviewed May 5, 2026

jayzalowitz merged commit 334f497 into main May 5, 2026
12 checks passed

jayzalowitz merged 1 commit into main from jayzalowitz/llm-multiturn
