v0.6.3.0 refactor(llm-client): multi-turn ChatMessage[] API (#149) #152
Merged
Closes #149. LlmClient.generate and generateStream now accept string | ChatMessage[]. Each provider translates the array to its native chat-completion shape:
- Anthropic: pass-through messages, system messages hoisted to top-level
- OpenAI: pass-through messages, no-duplicate-system guard
- Google/Gemini: native system_instruction (no fake user/model pair), assistant role correctly translated to 'model'
- Ollama: switched from /api/generate (concat prompt) to /api/chat (native messages array); response shape moved from {response} to {message: {content}}

AssistantService.reply and replyStream now pass ChatTurn[] directly to LlmClient; formatHistoryAsPrompt is no longer in the main path. It is kept exported as legacy for back-compat. A new private composeSystemPrompt helper is shared between the sync and streaming paths so they cannot drift.

New @skytwin/llm-client public API:
- ChatMessage type
- toMessages helper (string -> [{role:user, content}])
- splitSystemAndConversation (peels system messages, fallback to options.systemPrompt when none inline; inline wins so the assistant context block is never overridden)

Backward-compatible: existing string callers (decision-engine LLM strategies, every provider integration test) work unchanged.

Tests: 28 new (7 helpers + 15 per-provider request-body shape + 1 updated for the assistant history-cap test). Full suite green across 40 packages; lint clean.

Unblocks #148 (action-intent routing): the intent classifier can now look at structured turns instead of regexing a flattened prompt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pull request overview
Refactors @skytwin/llm-client to support native multi-turn chat by allowing LlmClient.generate() / generateStream() to accept string | ChatMessage[], updating each provider to translate message arrays into its native wire format, and updating @skytwin/assistant to pass trimmed conversation history directly (dropping prompt-flattening in the main path).
Changes:
- Extend `LlmClient` + provider function signatures to accept `string | ChatMessage[]`, with shared helpers (`toMessages`, `splitSystemAndConversation`) and new translation tests.
- Update providers (Anthropic/OpenAI/Gemini/Ollama) to use message-array-native request bodies, including system-message precedence handling.
- Update `AssistantService.reply()` / `replyStream()` to pass `ChatTurn[]` directly and centralize system prompt composition.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| VERSION | Bumps version to 0.6.3.0. |
| CHANGELOG.md | Documents the multi-turn API refactor and provider behavior changes. |
| packages/llm-client/src/types.ts | Adds ChatMessage + updates provider function types to accept multi-turn input. |
| packages/llm-client/src/messages.ts | Adds helpers to normalize prompts and split system vs conversation messages. |
| packages/llm-client/src/llm-client.ts | Updates generate/generateStream signatures and preserves provider-chain behavior. |
| packages/llm-client/src/index.ts | Re-exports ChatMessage and message helpers from package root. |
| packages/llm-client/src/providers/anthropic.ts | Hoists system messages to top-level system and supports message-array prompts for sync/stream. |
| packages/llm-client/src/providers/openai.ts | Uses native messages array with inline-system precedence over options.systemPrompt. |
| packages/llm-client/src/providers/google.ts | Translates to Gemini contents + system_instruction and maps assistant → model role. |
| packages/llm-client/src/providers/ollama.ts | Switches from /api/generate to /api/chat and updates response parsing. |
| packages/llm-client/src/tests/messages.test.ts | Adds unit tests for toMessages and splitSystemAndConversation. |
| packages/llm-client/src/tests/provider-multiturn.test.ts | Adds provider request-shape assertions for multi-turn translation. |
| packages/assistant/src/assistant-service.ts | Passes ChatTurn[] directly to LlmClient and dedupes system prompt composition. |
| packages/assistant/src/tests/assistant-service.test.ts | Updates history-cap test to assert on message arrays rather than flattened prompts. |
Comment on lines +44 to +47
```ts
export interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}
```
Comment on lines +11 to +16
```ts
 * Anthropic takes `system` as a top-level field separate from the
 * `messages` array, so we split system messages out of the conversation
 * — see `splitSystemAndConversation`. Adjacent same-role messages are
 * NOT merged here (Anthropic accepts them) but the API rejects empty
 * conversations, so we always have at least one message after the split.
 */
```
```ts
const { system, conversation } = splitSystemAndConversation(
  toMessages(prompt),
  options.systemPrompt,
);
```
Comment on lines +40 to +42
```ts
 * native top-level `system` field). When both are present, the array
 * system messages take precedence — the assistant injects context as a
 * system turn at the head of the array.
```
Summary
Closes #149. `LlmClient.generate` and `generateStream` now accept `string | ChatMessage[]`. Each provider translates the array to its native chat-completion shape. The `User:` / `Assistant:` prompt-flattening workaround in `@skytwin/assistant` is gone — `reply()` and `replyStream()` pass the conversation history directly.
Pure refactor — no new user-visible feature — but unblocks #148 (action-intent routing) and removes a comment-laden workaround that's been load-bearing since assistant phase 1.
Backward-compatible: existing string callers (decision-engine's LLM strategies, every provider integration test, anything outside this monorepo importing `@skytwin/llm-client`) work unchanged.
What landed
`@skytwin/llm-client` public API
```ts
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };
LlmClient.generate(prompt: string | ChatMessage[], options?): Promise
LlmClient.generateStream(prompt: string | ChatMessage[], options?): AsyncIterable
// Helpers (also re-exported)
toMessages(input: string | ChatMessage[]): ChatMessage[]
splitSystemAndConversation(messages, fallbackSystem?): { system, conversation }
```
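To illustrate why existing string callers keep working, here is a minimal sketch of what `toMessages` might look like, based only on the description above (`string -> [{role:user, content}]`). The body is an assumption, not the code in `packages/llm-client/src/messages.ts`; in particular, whether the real helper copies an incoming array is not stated in this PR.

```ts
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

// Hypothetical re-implementation for illustration only.
// A bare string becomes a single user turn; an array is returned as-is.
function toMessages(input: string | ChatMessage[]): ChatMessage[] {
  return typeof input === 'string' ? [{ role: 'user', content: input }] : input;
}

// Legacy call style: a single prompt string.
const legacy = toMessages('Summarize my inbox');

// New call style: structured turns, passed through untouched.
const multiTurn = toMessages([
  { role: 'system', content: 'You are a concise assistant.' },
  { role: 'user', content: 'Summarize my inbox' },
  { role: 'assistant', content: 'You have three unread messages.' },
  { role: 'user', content: 'Which one is most urgent?' },
]);

console.log(legacy.length, multiTurn.length);
```

Under this reading, every provider can normalize its input once at the top of the function and then work with `ChatMessage[]` only.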
Provider translations
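As one example of a provider translation, the Gemini mapping described in this PR (native `system_instruction`, `assistant` translated to `model`) could be sketched as follows. `toGeminiBody` and the `GeminiBody` type are hypothetical names for illustration; the actual code in `packages/llm-client/src/providers/google.ts` may be structured differently and will carry additional request fields.

```ts
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

type GeminiPart = { text: string };
type GeminiContent = { role: 'user' | 'model'; parts: GeminiPart[] };
interface GeminiBody {
  system_instruction?: { parts: GeminiPart[] };
  contents: GeminiContent[];
}

// Hypothetical helper: takes the already-split system text and conversation.
function toGeminiBody(system: string | undefined, conversation: ChatMessage[]): GeminiBody {
  const contents = conversation
    // Defensive: system turns belong in system_instruction, not in contents.
    .filter((m) => m.role !== 'system')
    // Gemini names the assistant role 'model'.
    .map((m): GeminiContent => ({
      role: m.role === 'assistant' ? 'model' : 'user',
      parts: [{ text: m.content }],
    }));

  // Only emit system_instruction when there is system text to send.
  return system ? { system_instruction: { parts: [{ text: system }] }, contents } : { contents };
}

const body = toGeminiBody('Be brief.', [
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello!' },
  { role: 'user', content: 'What changed in 0.6.3.0?' },
]);
console.log(JSON.stringify(body, null, 2));
```

Sending the system text through `system_instruction` avoids the fake user/model pair that the PR description says was removed.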
`AssistantService` cleanup
Inline-system precedence
`splitSystemAndConversation` makes inline system messages WIN over `options.systemPrompt`. This matters: the assistant package injects its enrichment context (twin profile + memories) as a system turn at the head of the array; the route also passes the default system prompt via `options.systemPrompt`. Without this precedence, generic instructions would silently override the personalized context — exactly the regression Phase 2b was supposed to prevent.
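The precedence rule can be sketched as below. This is a minimal illustration under stated assumptions, not the shipped helper: the return shape follows the `{ system, conversation }` signature listed above, but joining multiple inline system messages with a blank line is a guess, and the example strings are made up.

```ts
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

interface SplitResult {
  system: string | undefined;
  conversation: ChatMessage[];
}

// Hypothetical re-implementation for illustration only.
function splitSystemAndConversation(
  messages: ChatMessage[],
  fallbackSystem?: string,
): SplitResult {
  const inline = messages.filter((m) => m.role === 'system').map((m) => m.content);
  const conversation = messages.filter((m) => m.role !== 'system');
  // Inline system messages win; the fallback is used only when none are present.
  // Assumption: multiple inline system messages are joined with a blank line.
  const system = inline.length > 0 ? inline.join('\n\n') : fallbackSystem;
  return { system, conversation };
}

// The assistant injects enrichment context as a system turn at the head of the array,
// while the route also supplies a generic default via options.systemPrompt.
const withContext = splitSystemAndConversation(
  [
    { role: 'system', content: 'Twin profile: prefers short answers.' },
    { role: 'user', content: 'Plan my week.' },
  ],
  'You are a helpful assistant.',
);

// A plain string caller has no inline system turn, so the fallback applies.
const withoutContext = splitSystemAndConversation(
  [{ role: 'user', content: 'Plan my week.' }],
  'You are a helpful assistant.',
);

console.log(withContext.system, '|', withoutContext.system);
```

With this ordering, the personalized context reaches the provider whenever it is present, and the generic prompt only fills the gap for callers that supply no system turn.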
Safety / invariants
Test plan
Phase 2 — last remaining work
🤖 Generated with Claude Code