
Feature Request: Native Streaming Response Support for webchat Channel (SSE/WebSocket) #41130

@qingyunpor

Description


📌 Summary

Current Issue: OpenClaw's webchat channel does not support native streaming responses. Users must wait for the entire AI response to be generated before seeing any output, resulting in poor user experience (UX), especially for long-form content.

Desired Outcome: Enable real-time streaming of AI responses through webchat using Server-Sent Events (SSE) or WebSocket, matching the capabilities of competing platforms like Moltbot, ChatGPT, and Claude Code.


🎯 Motivation & User Impact

Current State (Non-streaming)

  • High initial latency: Users wait 5-10+ seconds before seeing the first character
  • No progress feedback: Cannot tell if AI is "thinking" or processing
  • Long content is painful: Reading a 500-word response feels like waiting for a movie to finish loading

Target State (Streaming)

  • Fast time to first token (TTFT): Show the first characters within 1-2 seconds
  • Progress visualization: Display intermediate steps/thoughts as they're generated
  • Interruptible: User can send follow-up questions while AI is still generating

Impact Assessment: This would significantly improve user satisfaction and perceived performance, especially for complex tasks like data analysis, code generation, and long-form content creation.


Problem to solve

The webchat channel currently lacks native streaming support, so users wait for complete responses without any intermediate feedback. This is a noticeably worse experience than the Feishu and Discord channels, which already support streaming.

Proposed solution

🔍 Technical Feasibility Analysis

Existing Streaming Support (Proof of Concept)

| Channel | Streaming Status | Notes |
| --- | --- | --- |
| Feishu | ✅ streaming: true | Already working with chunkSize=5 |
| Discord/Slack | ✅ Native support | WebSocket/SSE from third-party APIs |
| Custom Provider | ⚠️ Depends on config | Local LLM (1234) needs SSE handler |

Model Layer Support

| Provider | SSE Compatibility | Configuration Required |
| --- | --- | --- |
| Qwen Portal | ✅ Native support | stream: true + Accept: text/event-stream |
| OpenAI API | ✅ Native support | Standard OpenAI streaming format |
| Ollama | ⚠️ Partial support | OLLAMA_STREAM=true |
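
For reference, a streaming request to an OpenAI-compatible provider could look like the sketch below (TypeScript; the endpoint URL, model name, and env var are illustrative placeholders, not OpenClaw's actual provider client):

// Minimal sketch: request a streamed completion from an OpenAI-compatible
// endpoint. URL, model name, and env var are placeholders.
const response = await fetch("https://api.example.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Accept": "text/event-stream",                 // ask for SSE framing
    "Authorization": `Bearer ${process.env.PROVIDER_API_KEY}`,
  },
  body: JSON.stringify({
    model: "example-model",
    stream: true,                                  // token-by-token streaming
    messages: [{ role: "user", content: "Hello" }],
  }),
});

With stream: true, OpenAI-compatible endpoints emit data: lines and finish with a data: [DONE] sentinel instead of returning one JSON body.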

Architecture Analysis

The current architecture already supports streaming for Feishu and other channels. Extending this to webchat is feasible with minimal changes:

  1. Gateway layer: Add SSE/WebSocket handler for webchat channel
  2. Message framing: Split large responses into chunks (a minimal chunker is sketched after this list)
  3. Client-side rendering: Incremental DOM updates (no full reload)
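
To make step 2 concrete, here is a minimal framing sketch (the function name is hypothetical; chunking is by characters here for simplicity, whereas the proposed streamChunkSize config counts bytes):

// Hypothetical framing helper: split a completed model response into
// fixed-size chunks for SSE delivery.
function frameResponse(text: string, streamChunkSize = 500): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += streamChunkSize) {
    chunks.push(text.slice(i, i + streamChunkSize));
  }
  return chunks;
}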

Estimated Effort:

  • Phase 1 (SSE support): 2-3 days
  • Phase 2 (WebSocket + UI enhancements): 1-2 weeks
  • Phase 3 (Full hybrid mode with progress visualization): 4-6 weeks

🛠️ Proposed Implementation

Core Features (MVP)

1. SSE Streaming Handler

Enable streaming responses from model providers:

{
  "agents": {
    "defaults": {
      "streamingEnabled": true,              // ⚙️ Enable/disable streaming
      "streamChunkSize": 500,                // 📏 Chunk size (bytes)
      "streamTimeoutMs": 60000               // ⏱️ Timeout threshold
    }
  },
  "channels": {
    "webchat": {
      "streaming": true,                     // ✅ Enable streaming for webchat
      "protocol": "websocket"                // 🔌 Transport protocol (SSE/WS)
    }
  }
}
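
A gateway-side sketch of what the SSE handler could look like, assuming an Express-style server; the route path and the getModelStream() iterator are placeholders, not existing OpenClaw APIs:

import express from "express";

// Placeholder for whatever async iterator the model layer exposes.
declare function getModelStream(prompt: string): AsyncIterable<string>;

const app = express();

app.get("/webchat/stream", async (req, res) => {
  // Standard SSE response headers.
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  try {
    for await (const chunk of getModelStream(String(req.query.prompt ?? ""))) {
      // One SSE event per chunk; the client appends each delta on arrival.
      res.write(`data: ${JSON.stringify({ delta: chunk })}\n\n`);
    }
    res.write("data: [DONE]\n\n");               // end-of-stream sentinel
  } catch {
    res.write("event: error\ndata: {}\n\n");     // client falls back to non-streaming
  } finally {
    res.end();
  }
});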

2. Client-Side Incremental Rendering

  • EventSource or WebSocket connection to gateway
  • Incremental DOM updates (avoid full re-renders; sketched below)
  • Typing animation / loading indicators during generation
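
A browser-side sketch of the above (the element id and endpoint are illustrative):

// Subscribe to the hypothetical SSE endpoint and append deltas to the
// message bubble instead of re-rendering the whole transcript.
const output = document.getElementById("assistant-message")!;
const source = new EventSource("/webchat/stream?prompt=hello");

source.onmessage = (event) => {
  if (event.data === "[DONE]") {       // end-of-stream sentinel from the gateway
    source.close();
    return;
  }
  const { delta } = JSON.parse(event.data);
  output.append(delta);                // incremental DOM update, no full reload
};

source.onerror = () => {
  source.close();                      // hand off to the non-streaming fallback
};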

Advanced Features (Post-MVP)

3. Progress Visualization (Hybrid Mode)

Display intermediate steps/thoughts as they're generated:

[🔄 Step 1] Analyzing user request...
   → Data loaded: sales_2026_q1.xlsx
   
[📊 Preliminary findings]
- Q1 revenue: ¥2.8M (+8.5% YoY)
- East China region contributes 42% of total

[⏳ Generating detailed report...]
   [✅ Chart generated (3/5)]
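
One way to carry these intermediate steps over the same SSE connection is with named event types, so the client can render progress cards separately from answer text. A sketch, continuing the gateway and client snippets above (event names, payloads, and renderProgressCard() are illustrative):

// Gateway side: distinguish progress updates from answer deltas.
res.write(`event: progress\ndata: ${JSON.stringify({ step: 1, label: "Analyzing user request..." })}\n\n`);
res.write(`event: delta\ndata: ${JSON.stringify({ text: "Q1 revenue: ¥2.8M" })}\n\n`);

// Client side: route each event type to its own renderer.
source.addEventListener("progress", (e) => renderProgressCard(JSON.parse((e as MessageEvent).data)));
source.addEventListener("delta", (e) => output.append(JSON.parse((e as MessageEvent).data).text));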

Benefits:

  • Transparent AI reasoning process
  • Reduced user anxiety during long waits
  • Educational value (understanding how AI works)

4. Interruptible Generation

Allow users to send follow-up messages while AI is still generating, enabling conversational flow without waiting for completion.
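
On the client this could be as small as the following sketch (sendMessage() stands in for the webchat send call):

// Sketch: interrupt the current stream, then send the follow-up turn.
declare function sendMessage(text: string): Promise<void>;

function interruptAndSend(source: EventSource, followUp: string): void {
  source.close();               // stop consuming the current stream
  void sendMessage(followUp);   // start the next turn immediately
}

The matching gateway-side piece would be listening for the request's close event and aborting the upstream provider call, so tokens are not generated for a stream nobody is reading.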


📋 Acceptance Criteria

MVP Success Criteria

  1. Webchat displays streaming indicators during response generation (e.g., "Generating...", typing animation)
  2. Text appears incrementally as tokens are generated (not all at once)
  3. Works with Qwen Portal and OpenAI-compatible APIs out of the box
  4. Configurable toggle to enable/disable streaming (streamingEnabled flag)
  5. Graceful degradation: Falls back to non-streaming if SSE fails (see the sketch after this list)
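
For criterion 5, the client-side shape could be as simple as the sketch below (fetchComplete() and renderMessage() are hypothetical helpers for the existing non-streaming path):

// Graceful degradation sketch: try SSE first, fall back to one blocking
// request if the stream errors out.
declare function fetchComplete(prompt: string): Promise<string>;
declare function renderMessage(text: string): void;

function requestWithFallback(prompt: string): void {
  const source = new EventSource(`/webchat/stream?prompt=${encodeURIComponent(prompt)}`);
  // onmessage handling as in the earlier client sketch...
  source.onerror = async () => {
    source.close();
    renderMessage(await fetchComplete(prompt));  // non-streaming path
  };
}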

Advanced Success Criteria (Optional for Phase 2)

  1. ⚠️ Progress visualization cards showing intermediate steps
  2. ⚠️ Interruptible generation with follow-up message support
  3. ⚠️ WebSocket protocol alternative to SSE for more flexible control

🐛 Known Issues & Workarounds

Current Workaround (Manual Testing)

Users can test streaming manually by:

  1. Configuring streamingEnabled: true in openclaw.json
  2. Using Feishu channel (already supports streaming) to verify model SSE compatibility
  3. Attempting webchat (currently falls back to non-streaming)

Limitation: No streaming in webchat despite gateway configuration changes.


📊 Expected Impact Metrics

| Metric | Before | After (Streaming) | Improvement |
| --- | --- | --- | --- |
| Time to first token (TTFT) | 5-10s | 1-2s | ⬇️ 80% reduction |
| User-perceived latency | High | Low/Medium | ⬆️ Significantly better |
| Interruptibility | ❌ No | ✅ Yes | New capability |
| Long content readability | Poor (waiting) | Good (reading while generating) | ⬆️ Much better |

💬 Community Feedback & Discussion


Community Interest Level:

  • High interest among power users who frequently use webchat
  • Multiple requests in GitHub issues about "waiting too long for responses"
  • Competitive pressure from other AI platforms that already support streaming

🚀 Implementation Roadmap

Phase 1: Core SSE Support (2-3 days)

  • Add streamingEnabled config option
  • Implement SSE handler in gateway
  • Basic client-side EventSource connection
  • Test with Qwen Portal and OpenAI API

Phase 2: UI Enhancements (1-2 weeks)

  • Incremental text rendering engine
  • Loading indicators / typing animation
  • Configurable chunk size and timeout settings
  • Graceful error handling

Phase 3: Advanced Features (4-6 weeks, optional)

  • Progress visualization cards
  • Hybrid streaming mode (show steps + final answer)
  • Interruptible generation support
  • WebSocket protocol alternative

📎 Additional Notes

Security & Privacy Considerations

  • Streaming should not leak sensitive data through intermediate tokens
  • Ensure SSE/WebSocket connections use proper authentication (token-based; sketched after this list)
  • Rate limiting to prevent abuse of streaming connections
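
Continuing the earlier Express-style sketch, the token check could sit in front of the streaming route (verifyToken() is a placeholder for OpenClaw's actual auth layer):

// Reject unauthenticated clients before the SSE stream is opened.
declare function verifyToken(token: string): boolean;

app.use("/webchat/stream", (req, res, next) => {
  const token = req.header("Authorization")?.replace(/^Bearer /, "");
  if (!token || !verifyToken(token)) {
    res.status(401).end();
    return;
  }
  next();
});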

Browser Compatibility

  • Chrome/Firefox/Edge: Full support for EventSource and WebSocket
  • Safari: Requires polyfill for older versions
  • Mobile browsers: To be verified on iOS Safari and Android Chrome

Testing Strategy

  1. Unit tests for SSE handler parsing (example after this list)
  2. Integration tests with Qwen Portal sandbox
  3. E2E tests simulating long-form generation (500+ tokens)
  4. Stress testing with concurrent streaming connections
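
As an illustration of item 1, the line parser could be unit-tested in isolation (parseSseLine() is a hypothetical unit, not existing OpenClaw code):

import { strict as assert } from "node:assert";

// Hypothetical unit under test: extract the payload from one SSE data line.
function parseSseLine(line: string): string | null {
  return line.startsWith("data: ") ? line.slice("data: ".length) : null;
}

assert.equal(parseSseLine('data: {"delta":"Hi"}'), '{"delta":"Hi"}');
assert.equal(parseSseLine(": keep-alive comment"), null);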

🙋 Questions for Core Team

  1. Priority Assessment: How critical is webchat streaming compared to other pending features?
  2. Architecture Preference: SSE vs WebSocket - which protocol do you prefer for webchat?
  3. Model Provider Constraints: Are there specific models we should prioritize for streaming support initially?
  4. Timeline Expectation: What's the realistic timeline for MVP delivery (2-6 weeks)?

📝 Tags

enhancement streaming webchat SSE UX improvement high-priority


Created by: OpenClaw Community Contributor
Date: March 9, 2026
Version: v1.0 (Initial Draft)

Alternatives considered

No response

Impact

Affected users: All webchat users
Severity: Medium (UX degradation, not system failure)
Frequency: Always (every request)
Consequence: Users wait 5-10+ seconds without feedback, causing frustration and perceived slowness

Evidence/examples

No response

Additional information

No response
