
Feature Request: Native Streaming Response Support for webchat Channel (SSE/WebSocket) #41130

@qingyunpor

Description


📌 Summary

Current Issue: OpenClaw's webchat channel does not support native streaming responses. Users must wait for the entire AI response to be generated before seeing any output, resulting in poor user experience (UX), especially for long-form content.

Desired Outcome: Enable real-time streaming of AI responses through webchat using Server-Sent Events (SSE) or WebSocket, matching the capabilities of competing platforms like Moltbot, ChatGPT, and Claude Code.


🎯 Motivation & User Impact

Current State (Non-streaming)

  • High initial latency: Users wait 5-10+ seconds before seeing the first character
  • No progress feedback: Cannot tell if AI is "thinking" or processing
  • Long content is painful: Reading a 500-word response feels like waiting for a movie to finish loading

Target State (Streaming)

  • Fast time to first token (TTFT): Show the first characters within 1-2 seconds
  • Progress visualization: Display intermediate steps/thoughts as they're generated
  • Interruptible: User can send follow-up questions while AI is still generating

Impact Assessment: This would significantly improve user satisfaction and perceived performance, especially for complex tasks like data analysis, code generation, and long-form content creation.


Problem to solve

The webchat channel currently lacks native streaming support, so users wait for complete responses without any intermediate feedback. This is a noticeably worse experience than the Feishu and Discord channels, which already support streaming.

Proposed solution

🔍 Technical Feasibility Analysis

Existing Streaming Support (Proof of Concept)

| Channel | Streaming Status | Notes |
| --- | --- | --- |
| Feishu | ✅ streaming: true | Already working with chunkSize=5 |
| Discord/Slack | ✅ Native support | WebSocket/SSE from third-party APIs |
| Custom Provider | ⚠️ Depends on config | Local LLM (1234) needs SSE handler |

Model Layer Support

| Provider | SSE Compatibility | Configuration Required |
| --- | --- | --- |
| Qwen Portal | ✅ Native support | stream: true + Accept: text/event-stream |
| OpenAI API | ✅ Native support | Standard OpenAI streaming format |
| Ollama | ⚠️ Partial support | OLLAMA_STREAM=true |
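
For reference, a streaming request to an OpenAI-compatible provider could look like the sketch below (TypeScript; the endpoint URL, model name, and env var are illustrative placeholders, not OpenClaw's actual provider client):

// Minimal sketch: request a streamed completion from an OpenAI-compatible
// endpoint. URL, model name, and env var are placeholders.
const response = await fetch("https://api.example.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Accept": "text/event-stream",                 // ask for SSE framing
    "Authorization": `Bearer ${process.env.PROVIDER_API_KEY}`,
  },
  body: JSON.stringify({
    model: "example-model",
    stream: true,                                  // token-by-token streaming
    messages: [{ role: "user", content: "Hello" }],
  }),
});

With stream: true, OpenAI-compatible endpoints emit data: lines and finish with a data: [DONE] sentinel instead of returning one JSON body.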

Architecture Analysis

The current architecture already supports streaming for Feishu and other channels. Extending this to webchat is feasible with minimal changes:

  1. Gateway layer: Add SSE/WebSocket handler for webchat channel
  2. Message framing: Split large responses into chunks (a minimal chunker is sketched after this list)
  3. Client-side rendering: Incremental DOM updates (no full reload)
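
To make step 2 concrete, here is a minimal framing sketch (the function name is hypothetical; chunking is by characters here for simplicity, whereas the proposed streamChunkSize config counts bytes):

// Hypothetical framing helper: split a completed model response into
// fixed-size chunks for SSE delivery.
function frameResponse(text: string, streamChunkSize = 500): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += streamChunkSize) {
    chunks.push(text.slice(i, i + streamChunkSize));
  }
  return chunks;
}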

Estimated Effort:

  • Phase 1 (SSE support): 2-3 days
  • Phase 2 (WebSocket + UI enhancements): 1-2 weeks
  • Phase 3 (Full hybrid mode with progress visualization): 4-6 weeks

🛠️ Proposed Implementation

Core Features (MVP)

1. SSE Streaming Handler

Enable streaming responses from model providers:

{
  "agents": {
    "defaults": {
      "streamingEnabled": true,              // ⚙️ Enable/disable streaming
      "streamChunkSize": 500,                // 📏 Chunk size (bytes)
      "streamTimeoutMs": 60000               // ⏱️ Timeout threshold
    }
  },
  "channels": {
    "webchat": {
      "streaming": true,                     // ✅ Enable streaming for webchat
      "protocol": "websocket"                // 🔌 Transport protocol (SSE/WS)
    }
  }
}
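
A gateway-side sketch of what the SSE handler could look like, assuming an Express-style server; the route path and the getModelStream() iterator are placeholders, not existing OpenClaw APIs:

import express from "express";

// Placeholder for whatever async iterator the model layer exposes.
declare function getModelStream(prompt: string): AsyncIterable<string>;

const app = express();

app.get("/webchat/stream", async (req, res) => {
  // Standard SSE response headers.
  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  try {
    for await (const chunk of getModelStream(String(req.query.prompt ?? ""))) {
      // One SSE event per chunk; the client appends each delta on arrival.
      res.write(`data: ${JSON.stringify({ delta: chunk })}\n\n`);
    }
    res.write("data: [DONE]\n\n");               // end-of-stream sentinel
  } catch {
    res.write("event: error\ndata: {}\n\n");     // client falls back to non-streaming
  } finally {
    res.end();
  }
});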

2. Client-Side Incremental Rendering

  • EventSource or WebSocket connection to gateway
  • Incremental DOM updates (avoid full re-renders; sketched below)
  • Typing animation / loading indicators during generation
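
A browser-side sketch of the above (the element id and endpoint are illustrative):

// Subscribe to the hypothetical SSE endpoint and append deltas to the
// message bubble instead of re-rendering the whole transcript.
const output = document.getElementById("assistant-message")!;
const source = new EventSource("/webchat/stream?prompt=hello");

source.onmessage = (event) => {
  if (event.data === "[DONE]") {       // end-of-stream sentinel from the gateway
    source.close();
    return;
  }
  const { delta } = JSON.parse(event.data);
  output.append(delta);                // incremental DOM update, no full reload
};

source.onerror = () => {
  source.close();                      // hand off to the non-streaming fallback
};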

Advanced Features (Post-MVP)

3. Progress Visualization (Hybrid Mode)

Display intermediate steps/thoughts as they're generated:

[🔄 Step 1] Analyzing user request...
   → Data loaded: sales_2026_q1.xlsx
   
[📊 Preliminary findings]
- Q1 revenue: ¥2.8M (+8.5% YoY)
- East China region contributes 42% of total

[⏳ Generating detailed report...]
   [✅ Chart generated (3/5)]
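
One way to carry these intermediate steps over the same SSE connection is with named event types, so the client can render progress cards separately from answer text. A sketch, continuing the gateway and client snippets above (event names, payloads, and renderProgressCard() are illustrative):

// Gateway side: distinguish progress updates from answer deltas.
res.write(`event: progress\ndata: ${JSON.stringify({ step: 1, label: "Analyzing user request..." })}\n\n`);
res.write(`event: delta\ndata: ${JSON.stringify({ text: "Q1 revenue: ¥2.8M" })}\n\n`);

// Client side: route each event type to its own renderer.
source.addEventListener("progress", (e) => renderProgressCard(JSON.parse((e as MessageEvent).data)));
source.addEventListener("delta", (e) => output.append(JSON.parse((e as MessageEvent).data).text));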

Benefits:

  • Transparent AI reasoning process
  • Reduced user anxiety during long waits
  • Educational value (understanding how AI works)

4. Interruptible Generation

Allow users to send follow-up messages while AI is still generating, enabling conversational flow without waiting for completion.
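
On the client this could be as small as the following sketch (sendMessage() stands in for the webchat send call):

// Sketch: interrupt the current stream, then send the follow-up turn.
declare function sendMessage(text: string): Promise<void>;

function interruptAndSend(source: EventSource, followUp: string): void {
  source.close();               // stop consuming the current stream
  void sendMessage(followUp);   // start the next turn immediately
}

The matching gateway-side piece would be listening for the request's close event and aborting the upstream provider call, so tokens are not generated for a stream nobody is reading.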


📋 Acceptance Criteria

MVP Success Criteria

  1. Webchat displays streaming indicators during response generation (e.g., "Generating...", typing animation)
  2. Text appears incrementally as tokens are generated (not all at once)
  3. Works with Qwen Portal and OpenAI-compatible APIs out of the box
  4. Configurable toggle to enable/disable streaming (streamingEnabled flag)
  5. Graceful degradation: Falls back to non-streaming if SSE fails (see the sketch after this list)
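
For criterion 5, the client-side shape could be as simple as the sketch below (fetchComplete() and renderMessage() are hypothetical helpers for the existing non-streaming path):

// Graceful degradation sketch: try SSE first, fall back to one blocking
// request if the stream errors out.
declare function fetchComplete(prompt: string): Promise<string>;
declare function renderMessage(text: string): void;

function requestWithFallback(prompt: string): void {
  const source = new EventSource(`/webchat/stream?prompt=${encodeURIComponent(prompt)}`);
  // onmessage handling as in the earlier client sketch...
  source.onerror = async () => {
    source.close();
    renderMessage(await fetchComplete(prompt));  // non-streaming path
  };
}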

Advanced Success Criteria (Optional for Phase 2)

  1. ⚠️ Progress visualization cards showing intermediate steps
  2. ⚠️ Interruptible generation with follow-up message support
  3. ⚠️ WebSocket protocol alternative to SSE for more flexible control

🐛 Known Issues & Workarounds

Current Workaround (Manual Testing)

Users can test streaming manually by:

  1. Configuring streamingEnabled: true in openclaw.json
  2. Using Feishu channel (already supports streaming) to verify model SSE compatibility
  3. Attempting webchat (currently falls back to non-streaming)

Limitation: No streaming in webchat despite gateway configuration changes.


📊 Expected Impact Metrics

| Metric | Before | After (Streaming) | Improvement |
| --- | --- | --- | --- |
| Time to first token (TTFT) | 5-10s | 1-2s | ⬇️ 80% reduction |
| User-perceived latency | High | Low/Medium | ⬆️ Significantly better |
| Interruptibility | ❌ No | ✅ Yes | New capability |
| Long content readability | Poor (waiting) | Good (reading while generating) | ⬆️ Much better |

💬 Community Feedback & Discussion


Community Interest Level:

  • High interest among power users who frequently use webchat
  • Multiple requests in GitHub issues about "waiting too long for responses"
  • Competitive pressure from other AI platforms that already support streaming

🚀 Implementation Roadmap

Phase 1: Core SSE Support (2-3 days)

  • Add streamingEnabled config option
  • Implement SSE handler in gateway
  • Basic client-side EventSource connection
  • Test with Qwen Portal and OpenAI API

Phase 2: UI Enhancements (1-2 weeks)

  • Incremental text rendering engine
  • Loading indicators / typing animation
  • Configurable chunk size and timeout settings
  • Graceful error handling

Phase 3: Advanced Features (4-6 weeks, optional)

  • Progress visualization cards
  • Hybrid streaming mode (show steps + final answer)
  • Interruptible generation support
  • WebSocket protocol alternative

📎 Additional Notes

Security & Privacy Considerations

  • Streaming should not leak sensitive data through intermediate tokens
  • Ensure SSE/WebSocket connections use proper authentication (token-based; sketched after this list)
  • Rate limiting to prevent abuse of streaming connections
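
Continuing the earlier Express-style sketch, the token check could sit in front of the streaming route (verifyToken() is a placeholder for OpenClaw's actual auth layer):

// Reject unauthenticated clients before the SSE stream is opened.
declare function verifyToken(token: string): boolean;

app.use("/webchat/stream", (req, res, next) => {
  const token = req.header("Authorization")?.replace(/^Bearer /, "");
  if (!token || !verifyToken(token)) {
    res.status(401).end();
    return;
  }
  next();
});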

Browser Compatibility

  • Chrome/Firefox/Edge: Full support for EventSource and WebSocket
  • Safari: Requires polyfill for older versions
  • Mobile browsers: To be verified on iOS Safari and Android Chrome

Testing Strategy

  1. Unit tests for SSE handler parsing (example after this list)
  2. Integration tests with Qwen Portal sandbox
  3. E2E tests simulating long-form generation (500+ tokens)
  4. Stress testing with concurrent streaming connections
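
As an illustration of item 1, the line parser could be unit-tested in isolation (parseSseLine() is a hypothetical unit, not existing OpenClaw code):

import { strict as assert } from "node:assert";

// Hypothetical unit under test: extract the payload from one SSE data line.
function parseSseLine(line: string): string | null {
  return line.startsWith("data: ") ? line.slice("data: ".length) : null;
}

assert.equal(parseSseLine('data: {"delta":"Hi"}'), '{"delta":"Hi"}');
assert.equal(parseSseLine(": keep-alive comment"), null);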

🙋 Questions for Core Team

  1. Priority Assessment: How critical is webchat streaming compared to other pending features?
  2. Architecture Preference: SSE vs WebSocket - which protocol do you prefer for webchat?
  3. Model Provider Constraints: Are there specific models we should prioritize for streaming support initially?
  4. Timeline Expectation: What's the realistic timeline for MVP delivery (2-6 weeks)?

📝 Tags

enhancement streaming webchat SSE UX improvement high-priority


Created by: OpenClaw Community Contributor
Date: March 9, 2026
Version: v1.0 (Initial Draft)

Alternatives considered

No response

Impact

Affected users: All webchat users
Severity: Medium (UX degradation, not system failure)
Frequency: Always (every request)
Consequence: Users wait 5-10+ seconds without feedback, causing frustration and perceived slowness

Evidence/examples

No response

Additional information

No response
