feat(web): improve streaming chat continuity readability #1617

Merged
Wirasm merged 2 commits into coleam00:dev from knosence:feat/vela-54-stream-status-coalescing on May 26, 2026
knosence (Contributor) commented May 9, 2026

Summary

Describe this PR in 2-5 bullets:

  • Problem: transient system-status rows, tool-call rows, and thinking placeholders were readable individually but still felt flickery/ambiguous during live streaming.
  • Why it matters: users need continuity while a response is still unfolding, especially before assistant text lands or when tools are actively running.
  • What changed: coalesced consecutive system status rows into one evolving line; added clearer collapsed tool-call status/preview treatment; labeled the live thinking placeholder explicitly.
  • What did not change (scope boundary): no backend/SSE protocol changes, no workflow-engine changes, no provider/Rust semantic changes, no new visibility-mode settings yet.

UX Journey

Before

User                      Archon Web Chat
────                      ───────────────
sends message ─────────▶  adds blank thinking dots
                          appends multiple system status rows as they arrive
                          shows collapsed tool rows with little context
                          streams assistant text when available
sees flicker/ambiguity ◀─ especially before text arrives or between tool events

After

User                      Archon Web Chat
────                      ───────────────
sends message ─────────▶  adds [Thinking + dots] placeholder
                          [coalesces] consecutive system status updates into one evolving row
                          shows tool rows with [status chip] + [preview text]
                          streams assistant text when available
sees steadier continuity ◀─ with clearer interim state before/during tool activity

Architecture Diagram

Before

ChatInterface.tsx -> useSSE
ChatInterface.tsx -> chat-message-reducer.ts
ChatInterface.tsx -> MessageList.tsx
MessageList.tsx -> MessageBubble.tsx
MessageList.tsx -> ToolCallCard.tsx

After

[~] ChatInterface.tsx -> useSSE
[~] ChatInterface.tsx ===> [+] system-status-reducer.ts
ChatInterface.tsx -> chat-message-reducer.ts
ChatInterface.tsx -> MessageList.tsx
MessageList.tsx -> [~] MessageBubble.tsx
MessageList.tsx -> [~] ToolCallCard.tsx

Connection inventory (list every module-to-module edge, mark changes):

| From | To | Status | Notes |
|---|---|---|---|
| ChatInterface.tsx | useSSE | unchanged | SSE event source remains the same |
| ChatInterface.tsx | chat-message-reducer.ts | unchanged | text streaming segmentation unchanged |
| ChatInterface.tsx | system-status-reducer.ts | new | system status rows now coalesce before render |
| ChatInterface.tsx | MessageList.tsx | unchanged | message list wiring unchanged |
| MessageList.tsx | MessageBubble.tsx | modified | thinking placeholder presentation clarified |
| MessageList.tsx | ToolCallCard.tsx | modified | collapsed tool readability improved |

Label Snapshot

  • Risk: risk: low
  • Size: size: S
  • Scope: web
  • Module: web:chat

Change Metadata

  • Change type: feature
  • Primary scope: web

Linked Issue

  • Closes #
  • Related #
  • Depends on #
  • Supersedes #
  • Related knosence/vela#54

Validation Evidence (required)

Commands and result summary:

bun --filter @archon/web type-check
bun --filter @archon/web test
  • Evidence provided (test/log/trace/screenshot): local type-check + web package tests passed
  • If any command is intentionally skipped, explain why: full bun run validate was skipped because this PR is a bounded web-only UI slice and the touched surface is covered by package-local validation.

Security Impact (required)

  • New permissions/capabilities? (No)
  • New external network calls? (No)
  • Secrets/tokens handling changed? (No)
  • File system access scope changed? (No)
  • If any Yes, describe risk and mitigation:

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Database migration needed? (No)
  • If yes, exact upgrade steps:

Human Verification (required)

What was personally validated beyond CI:

  • Verified scenarios: consecutive system status updates now collapse into one evolving row; collapsed tool rows show a clearer status/preview; thinking placeholder now has an explicit label.
  • Edge cases checked: type-check still passes; package-local web tests still pass; no protocol/schema changes required.
  • What was not verified: browser-side visual QA in a running dev session.

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: web chat rendering only (ChatInterface, MessageBubble, ToolCallCard)
  • Potential unintended effects: users accustomed to multiple stacked status lines will now see a single evolving line instead.
  • Guardrails/monitoring for early detection: isolated reducer tests for status coalescing, package-local type-check/test coverage.

Rollback Plan (required)

  • Fast rollback command/path: revert commits 5f23b691 and f4c732c4, or revert the PR merge on dev
  • Feature flags or config toggles (if any): none
  • Observable failure symptoms: missing system status row updates, tool rows losing preview/status clarity, thinking placeholder rendering regressions

Risks and Mitigations

List real risks in this PR (or write None).

  • Risk: coalescing system rows could hide a status transition a user wanted to see as a separate line.
    • Mitigation: scope is limited to consecutive system rows only; assistant/tool messages remain separate and ordered.
  • Risk: collapsed tool preview could surface noisy first-line output for some tools.
    • Mitigation: preview is fallback-only when no input summary exists; full expandable output remains unchanged.
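
To make the fallback rule in the second mitigation concrete, here is a minimal sketch of how a collapsed tool header could choose its secondary text. The `ToolCall` shape and the names `inputSummary`, `firstNonEmptyLine`, and `headerSecondaryText` are assumptions for illustration and are not taken from `ToolCallCard.tsx`.

```typescript
// Hypothetical tool-call shape; the real props in ToolCallCard.tsx may differ.
interface ToolCall {
  name: string;
  inputSummary?: string;
  output?: string;
}

// Return the first line that still has content after trimming, if any.
function firstNonEmptyLine(text: string): string | undefined {
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed.length > 0) return trimmed;
  }
  return undefined;
}

// Prefer the input summary; use the output preview only as a fallback.
function headerSecondaryText(tool: ToolCall): string | undefined {
  if (tool.inputSummary !== undefined && tool.inputSummary.trim() !== "") {
    return tool.inputSummary;
  }
  return tool.output === undefined ? undefined : firstNonEmptyLine(tool.output);
}

// Usage: leading blank lines are skipped, and only one line is surfaced.
const preview = headerSecondaryText({
  name: "grep",
  output: "\n\n  src/app.ts:12: match\nsecond line",
});
```

Because only one trimmed line is surfaced, a noisy tool can at worst show a single unhelpful line in the header, while the full output stays behind the expand control.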

Summary by CodeRabbit

Release Notes

  • New Features

    • Added status badges to tool calls (Running, Complete, Done).
    • Added output preview display in tool call headers.
  • Style

    • Improved "Thinking" indicator UI with enhanced spacing and accessibility label.
  • Tests

    • Added comprehensive test coverage for system message coalescing behavior.

coderabbitai Bot commented May 9, 2026

Walkthrough

This PR consolidates consecutive system-status SSE messages into a single evolving chat message via a new applySystemStatus reducer, while enhancing UI display in the message bubble and tool call card header to show thinking states, execution status badges, and output previews.

Changes

System Status Message Consolidation

| Layer | File(s) | Summary |
|---|---|---|
| Reducer Contract | packages/web/src/lib/system-status-reducer.ts | New applySystemStatus function coalesces consecutive system messages by updating the last system message's content/timestamp if present, or appending a new system message with generated id. |
| Reducer Tests | packages/web/src/lib/system-status-reducer.test.ts | Test suite covers message appending, coalescing with id preservation, and history preservation scenarios. |
| Chat Integration | packages/web/src/components/chat/ChatInterface.tsx | Imports and uses applySystemStatus in onSystemStatus handler, delegating message updates to the reducer instead of direct state mutation. |
| Message Bubble UI | packages/web/src/components/chat/MessageBubble.tsx | Updates "Thinking" indicator with screen-reader-only label and refined styling for pulsing dots. |
| Tool Card UI | packages/web/src/components/chat/ToolCallCard.tsx | Adds status badge from execution state and output preview from first non-empty output line, displayed in header secondary text when available. |
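
The reducer contract summarized above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from `system-status-reducer.ts`: the `ChatMessage` shape, the `StatusDeps` object, and the exact parameter order are hypothetical, and only the behavior (update the trailing system row in place, otherwise append with a generated id) comes from the summary.

```typescript
// Hypothetical message shape; the real ChatMessage type may carry more fields.
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "system";
  content: string;
  timestamp: number;
}

// Injected dependencies keep the reducer pure and its tests deterministic.
interface StatusDeps {
  makeId: () => string;
  now: () => number;
}

function applySystemStatus(
  messages: readonly ChatMessage[],
  content: string,
  deps: StatusDeps,
): ChatMessage[] {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "system") {
    // Coalesce: keep the existing row's id, replace its content and timestamp.
    return [...messages.slice(0, -1), { ...last, content, timestamp: deps.now() }];
  }
  // No trailing system row: append a fresh one with a generated id.
  return [
    ...messages,
    { id: deps.makeId(), role: "system", content, timestamp: deps.now() },
  ];
}

// Usage: two consecutive status updates end up as one evolving row.
let n = 0;
const deps: StatusDeps = { makeId: () => `msg-${String(++n)}`, now: () => 1000 };
const a = applySystemStatus([], "Connecting", deps);
const b = applySystemStatus(a, "Running tools", deps);
// b holds a single system row that kept the id "msg-1".
```

Keeping the row's id stable across updates is what would let a keyed list re-render the same element instead of unmounting and remounting it, which is the likely source of the flicker the PR describes.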

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A system that speaks with fewer words,
One voice where many were once heard,
The thinking glows, the tools show their work,
No duplicate messages lurk in the murk!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat(web): improve streaming chat continuity readability' accurately summarizes the main change: improving the user experience during live streaming by making the chat interface more readable and continuous. |
| Description check | ✅ Passed | The PR description comprehensively follows the template with all major sections completed: Summary (problem/why/changes/scope), UX Journey (Before/After), Architecture Diagram, Connection inventory, Labels, Validation Evidence, Security Impact, Compatibility, Human Verification, Side Effects, Rollback Plan, and Risks/Mitigations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/web/src/lib/system-status-reducer.test.ts (1)

5-9: ⚡ Quick win

Reset idCounter before each test to prevent order-dependent failures.

idCounter is module-level state and is never reset. Test 1's assertion id: 'msg-1' is currently correct only because it runs first and no earlier test calls makeId(). Adding any new test that calls makeId() before test 1 will silently produce the wrong ID and break the expectation.

🛠️ Proposed fix
+import { describe, expect, test, beforeEach } from 'bun:test';

 let idCounter = 0;
 function makeId(): string {
   idCounter++;
   return `msg-${String(idCounter)}`;
 }

 describe('applySystemStatus', () => {
+  beforeEach(() => {
+    idCounter = 0;
+  });
+
   test('appends a new system message when previous message is not system', () => {

As per coding guidelines: "keep tests deterministic with no flaky timing or network dependence without guardrails".
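
An alternative to resetting a shared counter is to give each test its own id factory, so no module-level state exists to leak between tests. The sketch below is a hypothetical helper, not something proposed in the review; `createIdFactory` is an invented name.

```typescript
// Each call returns an independent generator with its own private counter.
function createIdFactory(prefix = "msg"): () => string {
  let counter = 0;
  return () => {
    counter += 1;
    return `${prefix}-${String(counter)}`;
  };
}

// Usage: two factories never affect each other's sequence.
const first = createIdFactory();
const second = createIdFactory();
const idsFromFirst = [first(), first()];
const idFromSecond = second();
```

A test that builds its own factory gets `msg-1` on its first call regardless of how many other tests ran before it.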

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/lib/system-status-reducer.test.ts` around lines 5 - 9, Tests
rely on module-level idCounter (used by makeId) but it is never reset, which
makes the expected ids order-dependent; fix by resetting that state before each
test, e.g., import beforeEach from 'bun:test' and add a beforeEach inside the
describe block that sets idCounter = 0, so makeId produces deterministic 'msg-1'
in the first call of every test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/web/src/components/chat/MessageBubble.tsx`:
- Around line 212-215: In MessageBubble (component MessageBubble / JSX block
containing the two spans), remove the redundant hidden assistive span <span
className="sr-only">Thinking</span> so screen readers don't announce "Thinking"
twice; keep the visible <span className="font-medium">Thinking</span> and ensure
the surrounding container (the div with className "flex items-center gap-2 py-1
text-sm text-text-tertiary") remains unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa42cc45-324f-4de5-951d-56a4a39053e6

📥 Commits

Reviewing files that changed from the base of the PR and between f4f2725 and f4c732c.

📒 Files selected for processing (5)
  • packages/web/src/components/chat/ChatInterface.tsx
  • packages/web/src/components/chat/MessageBubble.tsx
  • packages/web/src/components/chat/ToolCallCard.tsx
  • packages/web/src/lib/system-status-reducer.test.ts
  • packages/web/src/lib/system-status-reducer.ts

Comment thread packages/web/src/components/chat/MessageBubble.tsx
Wirasm (Collaborator) commented May 11, 2026

Review Summary

Verdict: ready-to-merge

This PR cleanly extracts consecutive system-status SSE rows into a single evolving chat line via a new applySystemStatus reducer, and enhances the chat UI with a "Thinking" placeholder label, tool-call status badges, and output previews. All checks pass — no blocking issues.

Blocking issues

(none)

Suggested fixes

  • packages/web/src/lib/system-status-reducer.ts:1-6: The docstring opens with "Append a system-status line to the chat", but when the last message is already a system message, the function updates that row in-place rather than appending. Change the first line to: // Apply a system-status update to the message list, coalescing consecutive system rows.

Minor / nice-to-have

  • packages/web/src/components/chat/ToolCallCard.tsx:37-38: statusLabel branches (Running / Complete / Done) have no test coverage. This is display-only logic so it's not blocking, but a component snapshot test covering the three states would close the gap.
  • packages/web/src/lib/system-status-reducer.ts (style): CLAUDE.md asks to reserve multi-paragraph docstring blocks for when "absolutely necessary". The continuity-goal paragraph is worth keeping, but the block could be tightened by folding the first line.

Compliments

  • The applySystemStatus reducer is a clean extraction — pure function, injectable deps (makeId, now), explicit return types, and follows the same pattern as applyOnText exactly. Easy to maintain.
  • The continuity-goal paragraph in the docstring is genuinely valuable — it explains the non-obvious coalescing rationale that isn't apparent from the code alone.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, test-coverage, comment-quality.

Wirasm (Collaborator) commented May 25, 2026

Review Summary

Verdict: ready-to-merge

Three focused UI improvements: a system-status reducer that coalesces consecutive updates to prevent flicker (with solid test coverage), accessible "Thinking" label for screen readers, and a status badge + output preview on tool call cards. The code is clean, well-scoped, and well-documented.

Blocking issues

None.

Suggested fixes

None.

Minor / nice-to-have

  • ToolCallCard.tsx:41: The statusLabel ternary treats output: "" as "Complete". Confirm this is intentional — if an empty string can appear while isRunning is false, the label becomes misleading (Complete with no content). A brief comment on the ternary or tightening to tool.output !== undefined && tool.output !== "" would clarify.

  • system-status-reducer.ts:30: The makeId default uses Date.now() which could theoretically collide in hot-code paths. The caller always passes nextId explicitly so it's never hit, but leaving the default in the signature is slightly misleading. Consider using crypto.randomUUID() as the default or removing the default entirely.
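
The tightened check suggested for the statusLabel ternary could look like the sketch below. The `ToolState` shape is hypothetical, and the mapping of "Done" to "finished without output to show" is an assumption inferred from the review comment, not confirmed by the source.

```typescript
// Hypothetical slice of tool state; field names follow the review comment.
interface ToolState {
  isRunning: boolean;
  output?: string;
}

type StatusLabel = "Running" | "Complete" | "Done";

function statusLabel(tool: ToolState): StatusLabel {
  if (tool.isRunning) return "Running";
  // Treat an empty string like missing output so "Complete" always has content behind it.
  return tool.output !== undefined && tool.output !== "" ? "Complete" : "Done";
}

// Usage: an empty output string no longer reads as "Complete".
const labelForEmptyOutput = statusLabel({ isRunning: false, output: "" });
```

Whether an empty string should count as completed output is a product decision; the sketch only shows how to make that choice explicit in the code.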

Compliments

  • The system-status-reducer JSDoc is excellent — it documents the non-obvious flicker-prevention goal that a future maintainer might otherwise "simplify" away. Well done.
  • The reducer tests cover the three representative cases (non-system predecessor, consecutive coalesce, historical preservation) concisely.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, comment-quality.

Wirasm merged commit c571a93 into coleam00:dev on May 26, 2026 (1 check passed)
Wirasm mentioned this pull request on May 28, 2026
2 participants