Inspiration

Having worked in business analytics environments, our team knows first-hand how many tickets are raised and escalations caused when call-centre assistants can't resolve issues in time. This isn't the assistants' fault: what do you do when you have a plethora of unorganized information at your disposal, but it's inaccessible the moment you need it?

To streamline issue resolution and increase customer satisfaction, our team jumped at the idea of building a real-time voice transcription agent that uses LLMs and RAG to generate timely, context-relevant suggestions.


What it does

  1. Audio Transcription:
    Uses Amazon Transcribe to display live transcripts as you speak.
  2. Context Storage:
    Stores context in a RAG pipeline backed by ChromaDB; users can upload PDFs containing tone guidelines, common issues, and more.
  3. Real-time Advice Generation:
    Leverages Ollama to generate advice on the fly, grounded in both the live transcript and the uploaded PDFs.
  4. Customer Satisfaction Score:
    Runs TextBlob sentiment analysis on the transcript and maps the result to a 1–10 satisfaction score.
  5. Summarization CSV:
    Produces a concise summary (plus extracted customer name) and auto-downloads it as a CSV for later analysis.
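As a minimal sketch of the scoring step: TextBlob reports a polarity in [-1.0, 1.0], and a linear rescale turns that into the 1–10 score. The helper below takes the polarity directly so the mapping itself is visible; in the app it would come from `TextBlob(transcript).sentiment.polarity`.

```python
# Sketch: map TextBlob polarity (-1.0..1.0) onto a 1-10 satisfaction score.
# The polarity is passed in directly here; in the app it comes from
# TextBlob(transcript).sentiment.polarity.

def satisfaction_score(polarity: float) -> int:
    """Linearly rescale polarity in [-1, 1] to an integer score in [1, 10]."""
    clamped = max(-1.0, min(1.0, polarity))        # guard against out-of-range input
    return round((clamped + 1.0) / 2.0 * 9.0) + 1  # -1 -> 1, 0 -> 5, +1 -> 10
```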

How we built it

1) Frontend

  • Vite + React for a lightning-fast, hot-reload dev experience
  • WebSocket-based audio streaming using the browser’s MediaStream & ScriptProcessorNode
  • Custom 10 s polling loop to fetch advice without spamming the LLM
  • File-upload widget to ingest PDFs for on-demand vectorization
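In the browser, the `ScriptProcessorNode` callback hands us float samples that have to be repacked as 16-bit PCM before they go over the WebSocket (the format Amazon Transcribe streaming expects). That conversion happens in JavaScript in our frontend; the sketch below shows the same transform in Python for illustration.

```python
import struct

def float32_to_pcm16(samples: list[float]) -> bytes:
    """Clamp float audio samples to [-1, 1] and pack as little-endian 16-bit PCM."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))               # clamp to the valid range
        ints.append(int(s * 32767))              # scale to the int16 range
    return struct.pack(f"<{len(ints)}h", *ints)  # '<h' = little-endian int16
```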

2) Backend

  • FastAPI (async) for both WebSocket and REST endpoints
  • Amazon Transcribe Streaming client for low-latency STT
  • ChromaDB vector store for transcripts & PDF embeddings
  • Ollama for on-prem LLM inference (advice, summarization, name extraction)
  • TextBlob for sentiment analysis → 1–10 satisfaction score
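ChromaDB handles embedding storage and nearest-neighbour search for us, but conceptually the retrieval step just ranks stored chunks by cosine similarity to the query embedding. A stdlib-only sketch of that idea (the 2-D vectors here are toy stand-ins, not real embeddings):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored chunks most similar to the query."""
    ranked = sorted(store, key=lambda cid: cosine(query, store[cid]), reverse=True)
    return ranked[:k]
```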

3) Deployment & Infra

  • Single EC2 instance (g4ad.xlarge) hosting FastAPI, ChromaDB, and Ollama
  • Dockerized services behind an Nginx reverse proxy for HTTPS & CORS
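The part of the Nginx config that tripped us up was keeping the transcription WebSocket alive through the proxy hop. A minimal sketch (domain and ports are placeholders, not our exact config):

```nginx
# Sketch: FastAPI on :8000 behind Nginx TLS; placeholder domain.
server {
    listen 443 ssl;
    server_name example.com;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        # Required so the audio WebSocket survives the proxy:
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```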

Challenges we ran into

  1. Real-time streaming:
    Coordinating audio buffers, WebSocket delivery, and Amazon Transcribe's partial vs. final transcripts.
  2. Prompt engineering:
    Constraining Ollama to “only return customer name” or a strict 5-line summary without extra chatter.
  3. Compute limits:
    GPU constraints on our EC2 tier meant longer inference times and fewer concurrent sessions.
  4. CORS & networking:
    Debugging cross-origin issues between React (port 5173) and FastAPI (port 8000).
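One pattern that helped with the "only return the customer name" problem was pairing a blunt prompt with defensive post-processing, since the model occasionally adds chatter anyway. The prompt wording and helper below are illustrative, not our exact shipped strings:

```python
# Illustrative prompt template and cleanup helper (not the exact shipped strings).
NAME_PROMPT = (
    "From the call transcript below, output ONLY the customer's full name. "
    "No punctuation, no explanation, no quotes. If no name is mentioned, "
    "output exactly: UNKNOWN\n\nTranscript:\n{transcript}"
)

def clean_name(raw: str) -> str:
    """Defensively strip the extra chatter an LLM sometimes adds anyway."""
    first_line = raw.strip().splitlines()[0] if raw.strip() else "UNKNOWN"
    return first_line.strip(" \"'.")  # drop stray quotes/periods at the edges
```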

Accomplishments & learnings

  1. Learning about AWS:
    Our first foray into EC2—launching instances, evaluating costs, configuring inbound/outbound rules, and setting up security groups.
  2. Frontend Development:
    Mastered end-to-end audio capture, buffering, WebSocket streaming at 16 kHz, and built a responsive React UI with live transcripts, 10 s advice updates, and robust cleanup/timer management.
  3. Backend Development:
    Stitched together FastAPI, ChromaDB, and Ollama for real-time PDF ingestion, embedding generation, and vector search. Learned to parse messy docs, craft low-hallucination prompts, and serve streaming advice plus downloadable CSVs efficiently.

What’s next for SOUNDAdvice

  1. Analytics dashboard:
    Visualize sentiment trends and historical issues per customer.
  2. Model upgrades:
    Integrate larger LLaMA 3 or Mistral models once GPU resources allow.
  3. Multi-language support:
    Extend beyond English to serve low-resource languages.
  4. CRM integration:
    Auto-log summaries & tickets into Salesforce, Zendesk, etc.
