What task are you trying to do?
PawWork needs local session exports from #194 to include enough model/API diagnostics to explain failures like an assistant message finishing successfully with token usage but no visible text, tool result, reasoning part, or error. The export should let us distinguish whether the issue happened in the provider API response, the AI SDK stream conversion, or PawWork session persistence without publishing the conversation through the opencode share service.
What do you do today?
Today the local database and logs preserve session/message/part records plus aggregate token usage. In a real session, alibaba-coding-plan-cn/kimi-k2.5 produced finish=stop and output=110 tokens, but the assistant message only had step-start and step-finish parts. There was no saved raw API chunk, no stream event counts, no evidence of whether the output arrived as content or as reasoning_content, and no explicit empty-completion marker. The only richer handoff path is the existing cloud share flow that #194 is replacing, and even that share only contains the already-persisted session parts, not raw LLM stream diagnostics.
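To make the three candidate boundaries concrete, here is a minimal sketch of how presence-only evidence from each layer could point at the failing one. Every name in it (LayerEvidence, locateBoundary, the field names) is hypothetical and not taken from PawWork, opencode, or the AI SDK; it assumes only boolean or count evidence is kept, never content.

```typescript
// Hypothetical evidence gathered at three layers. Only presence flags and
// counts are kept, so no conversation content is stored.
interface LayerEvidence {
  rawHadContent: boolean;          // a provider chunk carried a non-empty `content` field
  rawHadReasoningContent: boolean; // a provider chunk carried `reasoning_content`
  sdkTextDeltas: number;           // text-delta events seen after stream conversion
  persistedTextParts: number;      // text parts saved on the assistant message
}

type Boundary =
  | "provider-response"
  | "sdk-stream-conversion"
  | "session-persistence"
  | "none";

// Walk from the last layer back to the first: the failing boundary is the
// first hop where visible text existed upstream but not downstream.
function locateBoundary(e: LayerEvidence): Boundary {
  if (e.persistedTextParts > 0) return "none";            // text reached storage
  if (e.sdkTextDeltas > 0) return "session-persistence";  // lost while saving parts
  if (e.rawHadContent) return "sdk-stream-conversion";    // lost while converting chunks
  return "provider-response";                             // provider never sent visible text
}
```

In this sketch a run where the provider sent only reasoning_content resolves to "provider-response"; the separate rawHadReasoningContent flag is what would let a reader tell that case apart from a truly empty response.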
What would a good result look like?
Record a lightweight, local, structured diagnostic summary for each assistant run and include it in the local session export from #194. The summary should include provider, model, finish reason, token usage, stream event type counts, whether any text-delta, reasoning-delta, tool call, tool result, or stream error was observed, and an explicit diagnostic flag for finish=stop with no user-visible output. The export should make this readable without requiring raw logs or cloud publishing.
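One possible shape for that summary is sketched below. The type and function names (RunDiagnostics, summarizeRun), the event type strings other than text-delta and reasoning-delta, and the rule for what counts as user-visible output are all assumptions made for illustration, not an existing PawWork or AI SDK API.

```typescript
// Hypothetical shapes; only the event type matters, payloads are never read.
interface StreamEvent {
  type: string; // e.g. "text-delta", "reasoning-delta", "tool-call", "tool-result", "error"
}

interface TokenUsage {
  output: number;
  input?: number;
}

interface RunDiagnostics {
  provider: string;
  model: string;
  finishReason: string;
  usage: TokenUsage;
  eventCounts: Record<string, number>;
  sawText: boolean;
  sawReasoning: boolean;
  sawToolCall: boolean;
  sawToolResult: boolean;
  sawStreamError: boolean;
  emptyCompletion: boolean;
}

function summarizeRun(
  provider: string,
  model: string,
  finishReason: string,
  usage: TokenUsage,
  events: StreamEvent[],
): RunDiagnostics {
  // Count events by type without retaining any event body.
  const eventCounts: Record<string, number> = {};
  for (const event of events) {
    eventCounts[event.type] = (eventCounts[event.type] ?? 0) + 1;
  }
  const saw = (type: string): boolean => (eventCounts[type] ?? 0) > 0;

  const sawText = saw("text-delta");
  const sawToolCall = saw("tool-call");
  const sawToolResult = saw("tool-result");

  return {
    provider,
    model,
    finishReason,
    usage,
    eventCounts,
    sawText,
    sawReasoning: saw("reasoning-delta"),
    sawToolCall,
    sawToolResult,
    sawStreamError: saw("error"),
    // Assumption: "user-visible output" means text, a tool call, or a tool result.
    emptyCompletion:
      finishReason === "stop" && !sawText && !sawToolCall && !sawToolResult,
  };
}

// The session described above, assuming the id splits as provider/model and
// that the only events seen correspond to the step-start and step-finish parts.
const summary = summarizeRun(
  "alibaba-coding-plan-cn",
  "kimi-k2.5",
  "stop",
  { output: 110 },
  [{ type: "step-start" }, { type: "step-finish" }],
);
console.log(summary.emptyCompletion); // true
```

Reasoning is tracked separately rather than folded into the empty-completion rule, so a reasoning-only run can still be told apart from one that produced nothing at all.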
Which audience does this matter to most?
Both
Extra context
This is a follow-up to #194, not a replacement for it. #194 defines the safer product path: local session export instead of publishing to opncd.ai. This issue defines extra diagnostic content that the export should be able to carry. It also belongs under the harness series #195 and is related to #133, but it is narrower than general loop detection: the focus here is LLM stream and empty-completion diagnosability.
Acceptance criteria
Assistant runs persist a lightweight local diagnostic summary that does not upload conversation content by default.
The diagnostic summary records provider/model, finish reason, token usage, and stream event type counts.
The summary records whether visible text, reasoning, tool calls, tool results, and stream errors were observed.
finish=stop with no visible assistant output is explicitly flagged as an empty completion.
Raw prompt text, raw API chunks, and full tool bodies are not recorded by default unless a separate explicit debug mode is introduced.
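The default-off rule for raw content in the criteria above could be enforced at the point where events are recorded. The following is a sketch only: recordEvent, the debugRaw option, and the payloadChars field are hypothetical names, and keeping a payload length is itself an assumption about what is safe to retain.

```typescript
// Hypothetical event and record shapes for illustration.
interface StreamEvent {
  type: string;
  payload?: string; // raw text, chunk, or tool body; never kept by default
}

interface RecordedEvent {
  type: string;
  payloadChars: number; // size only, so empty versus non-empty is still visible
  payload?: string;     // present only when debug mode was explicitly requested
}

interface RecordOptions {
  debugRaw?: boolean; // stands in for the separate explicit debug mode
}

function recordEvent(
  event: StreamEvent,
  options: RecordOptions = {},
): RecordedEvent {
  const recorded: RecordedEvent = {
    type: event.type,
    payloadChars: event.payload?.length ?? 0,
  };
  // Raw content is copied only on explicit opt-in.
  if (options.debugRaw === true && event.payload !== undefined) {
    recorded.payload = event.payload;
  }
  return recorded;
}
```

Called without options, recordEvent({ type: "text-delta", payload: "hello" }) keeps the type and a length of 5 but no payload field, which keeps the exported summary free of conversation content.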
Non-goals
Do not reintroduce cloud session sharing.
Do not store full raw provider responses by default.
Do not fix every empty-completion behavior in this issue; a separate bug can add retry or user-visible fallback once this diagnostic layer identifies the failing boundary.