Today's LlmClient.generate(prompt: string, options) takes a single string. The assistant flattens conversation history into that string with User: / Assistant: role labels (packages/assistant/src/assistant-service.ts:formatHistoryAsPrompt), which works but is a workaround.
Scope
- Refactor LlmClient.generate to accept either a string (back-compat) or a messages: ChatMessage[] array (new path).
- Each provider in packages/llm-client/src/providers/{anthropic,openai,google,ollama}.ts translates the messages array into its native chat-completions format.
- Update AssistantService to pass messages directly; delete formatHistoryAsPrompt and the User: / Assistant: workaround.
- Update tests across @skytwin/llm-client and @skytwin/assistant.
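A rough sketch of what the string-or-messages input and one provider-side translation could look like. This is illustrative only: the issue names ChatMessage but does not define it, so the role/content shape, the GenerateInput alias, normalizeInput, and toSplitSystemRequest are all assumed names, not existing code. The translation targets the kind of API that takes the system prompt as a separate field instead of as a message.

```typescript
// Hypothetical shape: the issue does not spell out ChatMessage.
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Assumed input type for generate: legacy string or the new messages array.
type GenerateInput = string | ChatMessage[];

// Back-compat path: a bare string becomes a single user turn.
function normalizeInput(input: GenerateInput): ChatMessage[] {
  return typeof input === "string" ? [{ role: "user", content: input }] : input;
}

// Illustrative request shape for a provider that wants the system prompt
// outside the messages array.
interface SplitSystemRequest {
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
}

function toSplitSystemRequest(input: GenerateInput): SplitSystemRequest {
  const systemParts: string[] = [];
  const turns: SplitSystemRequest["messages"] = [];
  for (const m of normalizeInput(input)) {
    if (m.role === "system") {
      systemParts.push(m.content);
    } else {
      turns.push({ role: m.role, content: m.content });
    }
  }
  // Only set `system` when at least one system message was supplied.
  return systemParts.length > 0
    ? { system: systemParts.join("\n\n"), messages: turns }
    : { messages: turns };
}
```

With something like this, existing callers that pass a string keep working unchanged, and each provider only has to handle the normalized array.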
Out of scope
- Tool-use / function-calling — different refactor (track separately).
- Vision / multi-modal inputs — different refactor.
Why
Native multi-turn is more accurate (the model knows turn boundaries explicitly), more efficient (some providers cache prompts at message-array boundaries), and removes a comment-laden workaround.
Should land before phase 2c (action routing) so action intent detection works on real turn boundaries instead of the flattened string.
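To make the turn-boundary point concrete, here is a small sketch of how a flattened prompt can lose information. flattenHistory is a stand-in written for this example (the real formatHistoryAsPrompt may differ), and the Turn type is assumed.

```typescript
// Assumed minimal turn shape for this illustration.
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Stand-in for the flattening workaround: one labeled line per turn.
function flattenHistory(history: Turn[]): string {
  return history
    .map((t) => `${t.role === "user" ? "User" : "Assistant"}: ${t.content}`)
    .join("\n");
}

// One user turn whose text happens to contain a role label...
const singleTurn: Turn[] = [
  { role: "user", content: "Quote this:\nAssistant: done" },
];

// ...and a genuine two-turn exchange.
const twoTurns: Turn[] = [
  { role: "user", content: "Quote this:" },
  { role: "assistant", content: "done" },
];

// Both histories flatten to the same string, so the boundary is lost.
const ambiguous = flattenHistory(singleTurn) === flattenHistory(twoTurns);
```

Here ambiguous is true: once flattened, nothing distinguishes the two histories, whereas a messages array keeps them distinct.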