# About EchoCode 🎤

## The Inspiration 💡

**EchoCode** was born from a simple question:  
> What if developers could simply *speak* their coding ideas and watch them materialize in real time?

Modern voice assistants have transformed how we interact with technology, but not how we *build* it.  
We built a **voice-powered coding companion** that:

- Understands natural language coding requests  
- Searches the web for documentation via **Tavily API**  
- Generates algorithms and pseudocode on demand  
- Provides real-time feedback through an intuitive interface  

---

## What Was Learned 📚

### 🧠 Real-Time Audio Processing
- Captured audio at **16 kHz PCM** with **Voice Activity Detection (VAD)**  
- Found optimal frame size: **40 ms frames** for the best latency–quality balance  
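
The framing and VAD described above can be sketched roughly as follows. The 16 kHz sample rate and 40 ms frame length come from this write-up; the RMS energy test, the default threshold, and all function names are assumptions for illustration, not EchoCode's actual implementation.

```typescript
// Sketch only: energy-based VAD over fixed-size PCM frames.
// The RMS method and every name here are assumptions.
const SAMPLE_RATE_HZ = 16000;
const FRAME_MS = 40;
const FRAME_SAMPLES = (SAMPLE_RATE_HZ * FRAME_MS) / 1000; // 640 samples per frame

/** Split normalized PCM samples (-1..1) into full frames; a trailing partial frame is dropped. */
function splitIntoFrames(
  samples: Float32Array,
  frameSize: number = FRAME_SAMPLES
): Float32Array[] {
  const frames: Float32Array[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += frameSize) {
    frames.push(samples.subarray(start, start + frameSize));
  }
  return frames;
}

/** Root-mean-square energy of one frame (0 for an empty frame). */
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const sample = frame[i] ?? 0;
    sumSquares += sample * sample;
  }
  return Math.sqrt(sumSquares / frame.length);
}

/** Treat a frame as speech when its RMS energy exceeds the threshold (assumed default). */
function isSpeech(frame: Float32Array, threshold: number = 0.01): boolean {
  return rms(frame) > threshold;
}
```

At 16 kHz, 100 ms of audio (1,600 samples) yields two full 40 ms frames, with the remaining 320 samples waiting for the next chunk in a real pipeline.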

### ⚙️ Event-Driven Architecture
- Built a **type-safe event system** with *9 distinct event types*  
- Implemented **WebSocket** bidirectional communication  
- Designed a robust **Gateway layer** for event routing  
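
A type-safe event system of this kind is commonly modeled as a discriminated union. The sketch below shows the idea with three placeholder events; the write-up does not list the nine real event types, so the event names, payload fields, and function names here are all hypothetical.

```typescript
// Hypothetical events: placeholders, not EchoCode's actual event contracts.
type AgentEvent =
  | { type: "transcript.final"; text: string }
  | { type: "plan.step"; stepIndex: number; description: string }
  | { type: "error"; message: string };

/** Validate a raw WebSocket text message; returns null for anything that is not a known event. */
function parseEvent(raw: string): AgentEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const record = data as Record<string, unknown>;
  const text = record["text"];
  const stepIndex = record["stepIndex"];
  const description = record["description"];
  const message = record["message"];
  switch (record["type"]) {
    case "transcript.final":
      return typeof text === "string" ? { type: "transcript.final", text } : null;
    case "plan.step":
      return typeof stepIndex === "number" && typeof description === "string"
        ? { type: "plan.step", stepIndex, description }
        : null;
    case "error":
      return typeof message === "string" ? { type: "error", message } : null;
    default:
      return null;
  }
}

/** Exhaustive handling: adding a new event type without a case fails to compile. */
function describeEvent(event: AgentEvent): string {
  switch (event.type) {
    case "transcript.final":
      return `heard: ${event.text}`;
    case "plan.step":
      return `step ${event.stepIndex}: ${event.description}`;
    case "error":
      return `error: ${event.message}`;
    default: {
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch is what makes the union "strict": the compiler rejects the function as soon as a variant is added but not handled.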

### 🤖 AI Integration & Safety
- Integrated **OpenAI Whisper** for speech-to-text with **< 2 s latency**  
- Built resilient API clients with exponential backoff:  
  \[
  \text{delay}_n = \min\left(\text{base} \times 2^{n-1} + \text{jitter},\ 10000\right)
  \]
- Implemented safety policies for protected paths and diff-size validation  
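
The backoff formula above maps directly onto a small pure function. This is a sketch under assumptions: the 10000 cap is taken from the formula and treated as milliseconds, while the 500 ms base, the jitter range, and the function name are hypothetical.

```typescript
/**
 * Delay before retry number `attempt` (1-based):
 * min(base * 2^(attempt - 1) + jitter, cap).
 * Assumed defaults: 500 ms base, jitter drawn from [0, base), cap read as milliseconds.
 */
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 10000,
  jitterMs: number = Math.random() * baseMs
): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  return Math.min(baseMs * 2 ** (attempt - 1) + jitterMs, maxMs);
}
```

Passing `jitterMs` explicitly keeps the function deterministic in tests; with zero jitter and a 500 ms base, attempts 1, 2, and 3 wait 500, 1000, and 2000 ms, and later attempts are capped at 10000 ms.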

---

## How It Was Built 🔧

### 🏗️ Architecture

| Layer | Stack |
|-------|-------|
| **UI Layer** | TypeScript + Web Audio API + WebSocket |
| **Gateway** | Express + WebSocket (ports 3000/3001) |
| **Agent** | Custom Runtime + Planner + Skills (port 3002) |
| **External** | OpenAI, Tavily, WandB |


### 🧩 Key Technologies

- **Frontend:** TypeScript, Web Audio API, 7 custom UI components  
- **Gateway:** Express.js + `ws` WebSocket library  
- **Agent:** Custom runtime with Planner, Skills Registry, Safety Policy Enforcer  
- **APIs:** OpenAI Whisper / GPT-3.5, Tavily Search, Weights & Biases  

### 🔬 Implementation Highlights

- **Audio Pipeline:** 40 ms frames (640 samples @ 16 kHz) with VAD threshold \(\theta = 0.01\)  
- **Retry Logic:** Exponential backoff + jitter for API resilience  
- **Planner:** Keyword-based intent analysis generating up to 4 execution steps  
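
To illustrate the keyword-based planner, here is a minimal sketch that maps a transcript to at most four ordered steps. The step kinds, keyword lists, and function name are hypothetical; only the keyword matching and the four-step cap come from the description above.

```typescript
// Hypothetical step kinds and keywords, chosen only for illustration.
type StepKind = "search_docs" | "generate_algorithm" | "generate_pseudocode" | "respond";

interface PlanStep {
  stepIndex: number;
  kind: StepKind;
}

const MAX_STEPS = 4;

const KEYWORD_RULES: ReadonlyArray<{ kind: StepKind; keywords: readonly string[] }> = [
  { kind: "search_docs", keywords: ["docs", "documentation", "search", "look up"] },
  { kind: "generate_algorithm", keywords: ["algorithm", "sort", "implement"] },
  { kind: "generate_pseudocode", keywords: ["pseudocode", "outline"] },
];

/** Build an ordered plan: one step per matching rule, then a final "respond" step. */
function planFromTranscript(transcript: string): PlanStep[] {
  const text = transcript.toLowerCase();
  const kinds: StepKind[] = KEYWORD_RULES
    .filter((rule) => rule.keywords.some((keyword) => text.includes(keyword)))
    .map((rule) => rule.kind);
  kinds.push("respond"); // always finish by reporting back to the user
  return kinds
    .slice(0, MAX_STEPS)
    .map((kind, stepIndex) => ({ stepIndex, kind }));
}
```

Plain substring matching is cheap and predictable but brittle (for example, "sort" also matches "resort"), which is one reason an LLM-based classifier is a natural next step.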

---

## Challenges We Faced ⚡

### 1️⃣ Audio Quality vs Latency Trade-off
**Solution:** Settled on **40 ms frames** providing ~80–120 ms end-to-end latency with high recognition accuracy.

### 2️⃣ WebSocket Message Size Limits
**Solution:** Frame validation (≤ 65,536 bytes), queue management (≤ 50 frames), and automatic frame dropping with logging.
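
A bounded queue along these lines could look like the sketch below. The two limits come from the solution above; the class name, the choice to drop the oldest frame when full, and the log messages are assumptions.

```typescript
const MAX_FRAME_BYTES = 65536;
const MAX_QUEUED_FRAMES = 50;

/** Bounded audio-frame queue (sketch): rejects invalid frames, drops the oldest when full. */
class FrameQueue {
  private readonly frames: Uint8Array[] = [];
  private dropped = 0;
  private readonly log: (message: string) => void;

  constructor(log: (message: string) => void = () => {}) {
    this.log = log;
  }

  /** Returns false when the frame itself is rejected (empty or oversized). */
  enqueue(frame: Uint8Array): boolean {
    if (frame.byteLength === 0 || frame.byteLength > MAX_FRAME_BYTES) {
      this.recordDrop(`rejected frame of ${frame.byteLength} bytes`);
      return false;
    }
    if (this.frames.length >= MAX_QUEUED_FRAMES) {
      this.frames.shift(); // assumed policy: stale audio is worth less than fresh audio
      this.recordDrop("queue full, dropped oldest frame");
    }
    this.frames.push(frame);
    return true;
  }

  dequeue(): Uint8Array | undefined {
    return this.frames.shift();
  }

  get size(): number {
    return this.frames.length;
  }

  get droppedCount(): number {
    return this.dropped;
  }

  private recordDrop(reason: string): void {
    this.dropped++;
    this.log(reason);
  }
}
```

Dropping the oldest frame favors low latency over completeness; a transcription-first design might instead reject the newest frame and apply backpressure to the sender.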

### 3️⃣ Race Conditions in Event Flow
**Solution:** Added `stepIndex` for ordering, implemented event buffering in Gateway, and made UI components handle out-of-order events gracefully.
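
One way to buffer events by `stepIndex` is a small reorder buffer that holds early arrivals until the gap before them is filled. This is a sketch, assuming step indices start at 0 and are contiguous within a turn; the class and field names other than `stepIndex` are hypothetical.

```typescript
interface StepEvent {
  stepIndex: number;
  payload: string;
}

/** Releases events strictly in stepIndex order, buffering any that arrive early. */
class StepReorderBuffer {
  private nextIndex = 0;
  private readonly pending = new Map<number, StepEvent>();

  /** Accept an event in any order; return the events that are now deliverable, in order. */
  push(event: StepEvent): StepEvent[] {
    if (event.stepIndex < this.nextIndex || this.pending.has(event.stepIndex)) {
      return []; // duplicate or already delivered
    }
    this.pending.set(event.stepIndex, event);

    const ready: StepEvent[] = [];
    let next = this.pending.get(this.nextIndex);
    while (next !== undefined) {
      this.pending.delete(this.nextIndex);
      ready.push(next);
      this.nextIndex++;
      next = this.pending.get(this.nextIndex);
    }
    return ready;
  }
}
```

For example, if step 1 arrives before step 0, the first `push` returns an empty array and the second returns both events in order. A production version would likely also need a timeout so that one lost event cannot stall the stream indefinitely.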

### 4️⃣ API Rate Limiting
**Solution:** HTTP 429 detection, exponential backoff (3 attempts), graceful degradation, and clear user error messages.

### 5️⃣ Type Safety Across Workspaces
**Solution:** Created `@voice-ag/shared` package with shared contracts, used TypeScript project references, and strict union event typing.

---

## Performance Metrics 📊

| Metric | Value |
|--------|-------|
| **Average Turn Duration** | 4.2 s |
| **STT Latency (Whisper)** | 1.8 s |
| **Tavily Search Time** | 2.1 s |
| **UI Event Latency** | < 100 ms |
| **Audio Frame Rate** | 25 fps |
| **WebSocket Uptime** | 99.7 % |

---

## What’s Next? 🚀

1. **LLM-Based Intent Classification** for deeper understanding  
2. **Multi-Turn Conversations** with context retention  
3. **Code Execution** in sandboxed environments  
4. **IDE Integration** (VS Code extension)  
5. **Voice Feedback** with TTS responses  

---
