feat(web): improve streaming chat continuity readability by knosence · Pull Request #1617 · coleam00/Archon

knosence · 2026-05-09T01:03:17Z

Summary

Describe this PR in 2-5 bullets:

Problem: transient system-status rows, tool-call rows, and thinking placeholders were readable individually but still felt flickery/ambiguous during live streaming.
Why it matters: users need continuity while a response is still unfolding, especially before assistant text lands or when tools are actively running.
What changed: coalesced consecutive system status rows into one evolving line; added clearer collapsed tool-call status/preview treatment; labeled the live thinking placeholder explicitly.
What did not change (scope boundary): no backend/SSE protocol changes, no workflow-engine changes, no provider/Rust semantic changes, no new visibility-mode settings yet.

UX Journey

Before

User                      Archon Web Chat
────                      ───────────────
sends message ─────────▶  adds blank thinking dots
                          appends multiple system status rows as they arrive
                          shows collapsed tool rows with little context
                          streams assistant text when available
sees flicker/ambiguity ◀─ especially before text arrives or between tool events

After

User                      Archon Web Chat
────                      ───────────────
sends message ─────────▶  adds [Thinking + dots] placeholder
                          [coalesces] consecutive system status updates into one evolving row
                          shows tool rows with [status chip] + [preview text]
                          streams assistant text when available
sees steadier continuity ◀─ with clearer interim state before/during tool activity

Architecture Diagram

Before

ChatInterface.tsx -> useSSE
ChatInterface.tsx -> chat-message-reducer.ts
ChatInterface.tsx -> MessageList.tsx
MessageList.tsx -> MessageBubble.tsx
MessageList.tsx -> ToolCallCard.tsx

After

[~] ChatInterface.tsx -> useSSE
[~] ChatInterface.tsx ===> [+] system-status-reducer.ts
ChatInterface.tsx -> chat-message-reducer.ts
ChatInterface.tsx -> MessageList.tsx
MessageList.tsx -> [~] MessageBubble.tsx
MessageList.tsx -> [~] ToolCallCard.tsx

Connection inventory (list every module-to-module edge, mark changes):

From	To	Status	Notes
ChatInterface.tsx	useSSE	unchanged	SSE event source remains the same
ChatInterface.tsx	chat-message-reducer.ts	unchanged	text streaming segmentation unchanged
ChatInterface.tsx	system-status-reducer.ts	new	system status rows now coalesce before render
ChatInterface.tsx	MessageList.tsx	unchanged	message list wiring unchanged
MessageList.tsx	MessageBubble.tsx	modified	thinking placeholder presentation clarified
MessageList.tsx	ToolCallCard.tsx	modified	collapsed tool readability improved

Label Snapshot

Risk: risk: low
Size: size: S
Scope: web
Module: web:chat

Change Metadata

Change type: feature
Primary scope: web

Linked Issue

Closes #
Related #
Depends on #
Supersedes #
Related knosence/vela#54

Validation Evidence (required)

Commands and result summary:

bun --filter @archon/web type-check
bun --filter @archon/web test

Evidence provided (test/log/trace/screenshot): local type-check + web package tests passed
If any command is intentionally skipped, explain why: full bun run validate was skipped because this PR is a bounded web-only UI slice and the touched surface is covered by package-local validation.

Security Impact (required)

New permissions/capabilities? (No)
New external network calls? (No)
Secrets/tokens handling changed? (No)
File system access scope changed? (No)
If any Yes, describe risk and mitigation:

Compatibility / Migration

Backward compatible? (Yes)
Config/env changes? (No)
Database migration needed? (No)
If yes, exact upgrade steps:

Human Verification (required)

What was personally validated beyond CI:

Verified scenarios: consecutive system status updates now collapse into one evolving row; collapsed tool rows show a clearer status/preview; thinking placeholder now has an explicit label.
Edge cases checked: type-check still passes; package-local web tests still pass; no protocol/schema changes required.
What was not verified: browser-side visual QA in a running dev session.

Side Effects / Blast Radius (required)

Affected subsystems/workflows: web chat rendering only (ChatInterface, MessageBubble, ToolCallCard)
Potential unintended effects: users accustomed to multiple stacked status lines will now see a single evolving line instead.
Guardrails/monitoring for early detection: isolated reducer tests for status coalescing, package-local type-check/test coverage.

Rollback Plan (required)

Fast rollback command/path: revert commits 5f23b691 and f4c732c4, or revert the PR merge on dev
Feature flags or config toggles (if any): none
Observable failure symptoms: missing system status row updates, tool rows losing preview/status clarity, thinking placeholder rendering regressions

Risks and Mitigations

List real risks in this PR (or write None).

Risk: coalescing system rows could hide a status transition a user wanted to see as a separate line.
- Mitigation: scope is limited to consecutive system rows only; assistant/tool messages remain separate and ordered.
Risk: collapsed tool preview could surface noisy first-line output for some tools.
- Mitigation: preview is fallback-only when no input summary exists; full expandable output remains unchanged.
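
To make the fallback rule above concrete, here is a minimal sketch of how the collapsed header text could be chosen. The helper names `firstNonEmptyLine` and `headerSecondaryText` and the `inputSummary` parameter are hypothetical and not taken from the PR diff; only the rule itself (input summary first, otherwise the first non-empty output line) comes from the description.

```typescript
// Hypothetical helper: returns the first output line that is not blank,
// or undefined when there is no usable output.
function firstNonEmptyLine(output: string | undefined): string | undefined {
  if (output === undefined) return undefined;
  for (const line of output.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed.length > 0) return trimmed;
  }
  return undefined;
}

// Hypothetical helper: prefer the input summary; fall back to the output
// preview only when no input summary exists.
function headerSecondaryText(
  inputSummary: string | undefined,
  output: string | undefined
): string | undefined {
  const summary = inputSummary?.trim();
  if (summary !== undefined && summary.length > 0) return summary;
  return firstNonEmptyLine(output);
}
```

Under these assumptions a tool with an input summary never shows output in its collapsed header, which is what keeps noisy first lines limited to tools that have nothing else to display.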

Summary by CodeRabbit

Release Notes

New Features
- Added status badges to tool calls (Running, Complete, Done).
- Added output preview display in tool call headers.
Style
- Improved "Thinking" indicator UI with enhanced spacing and accessibility label.
Tests
- Added comprehensive test coverage for system message coalescing behavior.

coderabbitai · 2026-05-09T01:03:31Z

Walkthrough

This PR consolidates consecutive system-status SSE messages into a single evolving chat message via a new applySystemStatus reducer, while enhancing UI display in the message bubble and tool call card header to show thinking states, execution status badges, and output previews.

Changes

System Status Message Consolidation

Layer / File(s)	Summary
Reducer Contract `packages/web/src/lib/system-status-reducer.ts`	New `applySystemStatus` function coalesces consecutive system messages by updating the last system message's content/timestamp if present, or appending a new system message with generated id.
Reducer Tests `packages/web/src/lib/system-status-reducer.test.ts`	Test suite covers message appending, coalescing with id preservation, and history preservation scenarios.
Chat Integration `packages/web/src/components/chat/ChatInterface.tsx`	Imports and uses `applySystemStatus` in `onSystemStatus` handler, delegating message updates to the reducer instead of direct state mutation.
Message Bubble UI `packages/web/src/components/chat/MessageBubble.tsx`	Updates "Thinking" indicator with screen-reader-only label and refined styling for pulsing dots.
Tool Card UI `packages/web/src/components/chat/ToolCallCard.tsx`	Adds status badge from execution state and output preview from first non-empty output line, displayed in header secondary text when available.
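
The coalescing behavior summarized in the table can be sketched roughly as follows. This is an illustration under assumptions, not the code from `system-status-reducer.ts`: the `ChatMessage` shape and the `StatusDeps` object are hypothetical, and only the behavior (update the last system message's content/timestamp, otherwise append with a generated id) is taken from the walkthrough.

```typescript
// Hypothetical message shape; the real type in @archon/web may differ.
interface ChatMessage {
  id: string;
  role: 'user' | 'assistant' | 'system';
  content: string;
  timestamp: number;
}

// Hypothetical injected dependencies, so the reducer stays pure and testable.
interface StatusDeps {
  makeId: () => string;
  now: () => number;
}

function applySystemStatus(
  messages: readonly ChatMessage[],
  content: string,
  deps: StatusDeps
): ChatMessage[] {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === 'system') {
    // Coalesce: keep the existing id, replace content and timestamp.
    return [...messages.slice(0, -1), { ...last, content, timestamp: deps.now() }];
  }
  // Previous message is not a system row (or the list is empty): append.
  return [
    ...messages,
    { id: deps.makeId(), role: 'system', content, timestamp: deps.now() },
  ];
}
```

Because only the trailing system row is ever rewritten, earlier system rows that are separated by assistant or tool messages stay in history untouched.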

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A system that speaks with fewer words,
One voice where many were once heard,
The thinking glows, the tools show their work,
No duplicate messages lurk in the murk!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat(web): improve streaming chat continuity readability' accurately summarizes the main change: improving the user experience during live streaming by making the chat interface more readable and continuous.
Description check	✅ Passed	The PR description comprehensively follows the template with all major sections completed: Summary (problem/why/changes/scope), UX Journey (Before/After), Architecture Diagram, Connection inventory, Labels, Validation Evidence, Security Impact, Compatibility, Human Verification, Side Effects, Rollback Plan, and Risks/Mitigations.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

packages/web/src/lib/system-status-reducer.test.ts (1)
5-9: ⚡ Quick win

Reset idCounter before each test to prevent order-dependent failures.

idCounter is module-level state and is never reset. Test 1's assertion id: 'msg-1' is currently correct only because it runs first and no earlier test calls makeId(). Adding any new test that calls makeId() before test 1 will silently produce the wrong ID and break the expectation.
🛠️ Proposed fix
+import { describe, expect, test, beforeEach } from 'bun:test';

 let idCounter = 0;
 function makeId(): string {
   idCounter++;
   return `msg-${String(idCounter)}`;
 }

 describe('applySystemStatus', () => {
+  beforeEach(() => {
+    idCounter = 0;
+  });
+
   test('appends a new system message when previous message is not system', () => {
As per coding guidelines: "keep tests deterministic with no flaky timing or network dependence without guardrails".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/lib/system-status-reducer.test.ts` around lines 5 - 9, tests
rely on the module-level idCounter (used by makeId), which is never reset, so
assertions on generated ids are order-dependent; fix by importing beforeEach
from 'bun:test' and adding a beforeEach inside the describe block that sets
idCounter = 0, so makeId deterministically returns 'msg-1' on the first call in
every test.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/web/src/components/chat/MessageBubble.tsx`:
- Around line 212-215: In MessageBubble (component MessageBubble / JSX block
containing the two spans), remove the redundant hidden assistive span <span
className="sr-only">Thinking</span> so screen readers don't announce "Thinking"
twice; keep the visible <span className="font-medium">Thinking</span> and ensure
the surrounding container (the div with className "flex items-center gap-2 py-1
text-sm text-text-tertiary") remains unchanged.


📥 Commits

Reviewing files that changed from the base of the PR and between f4f2725 and f4c732c.

📒 Files selected for processing (5)

packages/web/src/components/chat/ChatInterface.tsx
packages/web/src/components/chat/MessageBubble.tsx
packages/web/src/components/chat/ToolCallCard.tsx
packages/web/src/lib/system-status-reducer.test.ts
packages/web/src/lib/system-status-reducer.ts

Wirasm · 2026-05-11T17:00:41Z

Review Summary

Verdict: ready-to-merge

This PR cleanly extracts consecutive system-status SSE rows into a single evolving chat line via a new applySystemStatus reducer, and enhances the chat UI with a "Thinking" placeholder label, tool-call status badges, and output previews. All checks pass — no blocking issues.

Blocking issues

(none)

Suggested fixes

packages/web/src/lib/system-status-reducer.ts:1-6: The docstring opens with "Append a system-status line to the chat", but when the last message is already a system message, the function updates that row in place rather than appending. Change the first line to: // Apply a system-status update to the message list, coalescing consecutive system rows.

Minor / nice-to-have

packages/web/src/components/chat/ToolCallCard.tsx:37-38: statusLabel branches (Running / Complete / Done) have no test coverage. This is display-only logic so it's not blocking, but a component snapshot test covering the three states would close the gap.
packages/web/src/lib/system-status-reducer.ts (style): CLAUDE.md asks to reserve multi-paragraph docstring blocks for when "absolutely necessary". The continuity-goal paragraph is worth keeping, but the block could be tightened by folding the first line.

Compliments

The applySystemStatus reducer is a clean extraction: a pure function with injectable deps (makeId, now) and explicit return types that follows exactly the same pattern as applyOnText. Easy to maintain.
The continuity-goal paragraph in the docstring is genuinely valuable — it explains the non-obvious coalescing rationale that isn't apparent from the code alone.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, test-coverage, comment-quality.

Wirasm · 2026-05-25T19:08:24Z

Review Summary

Verdict: ready-to-merge

Three focused UI improvements: a system-status reducer that coalesces consecutive updates to prevent flicker (with solid test coverage), accessible "Thinking" label for screen readers, and a status badge + output preview on tool call cards. The code is clean, well-scoped, and well-documented.

Blocking issues

None.

Suggested fixes

None.

Minor / nice-to-have

ToolCallCard.tsx:41: The statusLabel ternary treats output: "" as "Complete". Confirm this is intentional — if an empty string can appear while isRunning is false, the label becomes misleading (Complete with no content). A brief comment on the ternary or tightening to tool.output !== undefined && tool.output !== "" would clarify.
system-status-reducer.ts:30: The makeId default uses Date.now(), which could theoretically collide in hot code paths. The caller always passes nextId explicitly, so the default is never hit, but leaving it in the signature is slightly misleading. Consider using crypto.randomUUID() as the default or removing the default entirely.
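
For the ToolCallCard note above, a small sketch may help show the edge case. The `ToolCallState` shape and both function names are hypothetical, and the mapping of states to labels (Running while `isRunning`, Complete when output is present, Done otherwise) is an assumption inferred from the review comments rather than copied from `ToolCallCard.tsx`.

```typescript
// Hypothetical subset of the tool-call props relevant to the badge.
interface ToolCallState {
  isRunning: boolean;
  output?: string;
}

type StatusLabel = 'Running' | 'Complete' | 'Done';

// Assumed current behavior: any defined output, including "", counts as Complete.
function statusLabelLoose(tool: ToolCallState): StatusLabel {
  if (tool.isRunning) return 'Running';
  return tool.output !== undefined ? 'Complete' : 'Done';
}

// Tightened variant suggested in the review: empty output no longer reads as Complete.
function statusLabelStrict(tool: ToolCallState): StatusLabel {
  if (tool.isRunning) return 'Running';
  return tool.output !== undefined && tool.output !== '' ? 'Complete' : 'Done';
}
```

The two variants differ only for a finished tool whose output is the empty string, so whichever one is intended, a one-line comment on the ternary would settle the question for future readers.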

Compliments

The system-status-reducer JSDoc is excellent — it documents the non-obvious flicker-prevention goal that a future maintainer might otherwise "simplify" away. Well done.
The reducer tests cover the three representative cases (non-system predecessor, consecutive coalesce, historical preservation) concisely.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, comment-quality.

knosence added 2 commits May 8, 2026 20:36

fix: coalesce transient chat status updates

5f23b69

feat(web): improve streaming thinking and tool readability

f4c732c

coderabbitai Bot reviewed May 9, 2026

Comment thread packages/web/src/components/chat/MessageBubble.tsx

Wirasm merged commit c571a93 into coleam00:dev May 26, 2026
1 check passed

Wirasm mentioned this pull request May 28, 2026

Release 0.4.0 #1791

Merged

Wirasm merged 2 commits into coleam00:dev from knosence:feat/vela-54-stream-status-coalescing
