Sonaly: Audio Analysis for Businesses and Contact Centers

Inspiration

Sonaly was born from the need to automate quality assurance in customer interactions, particularly in contact centers where service consistency is crucial. The project uses AI to address the challenge of manually reviewing countless hours of customer service calls.

What It Does

Sonaly is an advanced audio analysis platform that:

  • Processes and transcribes customer service calls using AI models
  • Analyzes conversations against predefined checklists
  • Detects inappropriate language or prohibited phrases
  • Evaluates tone and emotional content
  • Measures audio quality
  • Provides compliance summaries
  • Automatically anonymizes sensitive information (PII)
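
To make two of these steps concrete, here is a minimal sketch of prohibited-phrase detection and PII redaction on an already transcribed call. It is only an illustration: the function names, the phrase list and the regular expressions are assumptions, and the patterns are a simple rule-based stand-in for the AI-based anonymization that Sonaly describes.

```typescript
// Hypothetical helpers for illustration; not taken from Sonaly's codebase.

interface PhraseHit {
  phrase: string;
  index: number; // character offset in the lowercased transcript
}

// Case-insensitive search for every occurrence of each prohibited phrase.
function findProhibitedPhrases(transcript: string, phrases: string[]): PhraseHit[] {
  const lower = transcript.toLowerCase();
  const hits: PhraseHit[] = [];
  for (const phrase of phrases) {
    const needle = phrase.toLowerCase();
    if (needle.length === 0) continue; // skip empty entries to avoid an endless loop
    let from = 0;
    while (true) {
      const index = lower.indexOf(needle, from);
      if (index === -1) break;
      hits.push({ phrase, index });
      from = index + needle.length;
    }
  }
  return hits.sort((a, b) => a.index - b.index);
}

// Assumed patterns; real PII detection needs far broader coverage.
// CARD runs before PHONE because the phone pattern would also match card numbers.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "EMAIL", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "CARD", pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
  { label: "PHONE", pattern: /\+?\d[\d\s().-]{7,}\d/g },
];

// Replace each match with a bracketed label such as [EMAIL].
function redactPii(text: string): string {
  return PII_PATTERNS.reduce(
    (acc, { label, pattern }) => acc.replace(pattern, `[${label}]`),
    text,
  );
}
```

Under these assumptions, redactPii("My email is ana@example.com") returns "My email is [EMAIL]", so the text stored for analysis no longer contains the address.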

How We Built It

Sonaly uses a modular architecture with:

  • Core Engine: Node.js processing pipeline
  • AI Integration: Multiple models and providers (OpenAI, gpt-oss, HuggingFace, among others)
  • Database: Configurable (PostgreSQL, MongoDB, etc.)
  • Cloud Integration: AWS Lambda and S3
  • Multilingual Support: Over 100 languages with gpt-oss
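
As a rough illustration of what "modular" can mean here, the sketch below models the pipeline as a list of interchangeable stages that each take a call record and return an updated one. Every name in it (CallRecord, Stage, runPipeline, the stub stages and the storage key) is hypothetical. The stages are synchronous stubs so the example stays short; a real pipeline would call models and storage asynchronously.

```typescript
// Illustrative types and stages; not taken from Sonaly's codebase.
interface CallRecord {
  audioKey: string;     // assumed: an object key in storage
  transcript?: string;  // filled in by the transcription stage
  findings: string[];   // issues raised by later stages
}

type Stage = (record: CallRecord) => CallRecord;

// Run the stages in order, passing each stage's output to the next.
function runPipeline(record: CallRecord, stages: Stage[]): CallRecord {
  return stages.reduce((current, stage) => stage(current), record);
}

// Stub stage: a real one would call a speech-to-text model.
const transcribe: Stage = (record) => ({
  ...record,
  transcript: "hello, thank you for calling",
});

// Stub stage: adds a finding when the transcript lacks a greeting.
const checkGreeting: Stage = (record) => {
  const greeted = /\b(hello|hi|good (morning|afternoon))\b/i.test(record.transcript ?? "");
  return greeted
    ? record
    : { ...record, findings: [...record.findings, "missing greeting"] };
};

const analyzed = runPipeline(
  { audioKey: "calls/0001.wav", findings: [] },
  [transcribe, checkGreeting],
);
```

Because each stage has the same shape, one could swap the transcription model or add a new check without touching the rest of the pipeline.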

Challenges

  1. Accuracy on audio recorded in noisy environments.
  2. Language nuances and dialects. Here, the gpt-oss models adapt well to many languages and dialects.
  3. Real-time processing, so that audio can be analyzed continuously.
  4. Data privacy: using AI to anonymize users' sensitive data during the analysis.

Key Achievements

  1. Modular architecture for easy integration
  2. Scalability to process thousands of audio hours
  3. Multilingual support with high accuracy
  4. Privacy-first approach with robust PII detection
  5. Open-source foundation to take full advantage of gpt-oss models (20b and 120b) through the HuggingFace Inference SDK.
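
The sketch below shows one way a checklist review could be phrased as a chat-style request for a gpt-oss model. It only builds the request object and does not call any SDK. The interfaces, the function name and the prompt wording are assumptions for illustration. The default model ID is the HuggingFace identifier of the 20b model, though the exact model and prompt Sonaly uses are not stated here.

```typescript
// Hypothetical request shape, modeled on common chat-completion APIs.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Build a request asking the model to grade a transcript against a checklist.
function buildChecklistRequest(
  transcript: string,
  checklist: string[],
  model = "openai/gpt-oss-20b", // assumed default; the 120b variant is the alternative
): ChatRequest {
  // Number the items so the model can refer to them unambiguously.
  const numbered = checklist.map((item, i) => `${i + 1}. ${item}`).join("\n");
  return {
    model,
    messages: [
      {
        role: "system",
        content:
          "You are a contact-center QA reviewer. For each checklist item, answer PASS or FAIL " +
          "with a one-sentence justification quoted from the transcript. " +
          "If the transcript is unclear, answer UNCLEAR instead of guessing.",
      },
      { role: "user", content: `Checklist:\n${numbered}\n\nTranscript:\n${transcript}` },
    ],
  };
}
```

The resulting object could then be handed to whichever chat-completion client is in use, whether a hosted inference endpoint or a locally served model.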

What We Learned

  1. AI model trade-offs: gpt-oss relies heavily on a separate model for transcription, so we had to refine the prompt until it was very detailed. It took several iterations to arrive at a well-defined prompt.
  2. Impact of audio quality: Audio files must meet a minimum quality level. Otherwise even gpt-oss-120b starts to hallucinate when it encounters low-quality data.
  3. Cultural sensitivity in analysis
  4. Regulatory compliance: Sonaly should be usable in local environments. When the audio cannot leave the company's servers, the models can be run locally.
  5. Performance optimization
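
The audio-quality lesson suggests rejecting poor recordings before they reach the models. Below is a minimal sketch of such a gate, assuming mono PCM samples normalized to the range -1 to 1. The function name, the report fields and both thresholds are illustrative assumptions, not Sonaly's actual criteria.

```typescript
// Hypothetical quality gate; thresholds are placeholders to be tuned on real data.
interface QualityReport {
  rms: number;          // root-mean-square level of the samples
  clippedRatio: number; // fraction of samples at or near full scale
  ok: boolean;          // true when both checks pass
}

function assessQuality(
  samples: ArrayLike<number>,
  minRms = 0.01,          // below this the recording is treated as near-silent
  maxClippedRatio = 0.01, // above this the recording is treated as distorted
): QualityReport {
  if (samples.length === 0) return { rms: 0, clippedRatio: 0, ok: false };
  let sumSquares = 0;
  let clipped = 0;
  for (let i = 0; i < samples.length; i++) {
    const s = samples[i];
    sumSquares += s * s;
    if (Math.abs(s) >= 0.999) clipped++;
  }
  const rms = Math.sqrt(sumSquares / samples.length);
  const clippedRatio = clipped / samples.length;
  return { rms, clippedRatio, ok: rms >= minRms && clippedRatio <= maxClippedRatio };
}
```

A recording that fails the gate could be flagged for manual review instead of being sent on for analysis, which avoids asking the model to interpret a transcript that is mostly noise.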

Next Steps for Sonaly

Technical Improvements

  • Real-time analysis
  • More granular emotion detection
  • Advanced noise reduction
  • Customizable analysis templates

Business Opportunities

  • Contact Center Optimization
    • Real-time agent assistance
    • Automated coaching
  • Compliance Monitoring
    • Automatic violation detection
    • Regulatory reporting
  • Customer Sentiment Analysis
    • Trend identification
    • Early dissatisfaction detection
  • Quality Assurance
    • Automated agent scoring
    • Improvement opportunity identification

Open Source Model Advantages

  • Cost Efficiency: Lower operational costs
  • Customization: Industry-specific fine-tuning
  • Privacy: On-premise deployment options
  • Transparency: Full model visibility
  • Community Support: Continuous improvements

Business Benefits

  • Enhanced Customer Experience
  • Cost Reduction by replacing manual review
  • Regulatory Compliance assurance
  • Actionable Insights for decision-making
  • Scalability without proportional cost increases

Sonaly is particularly valuable in industries with strict regulatory requirements such as finance, healthcare, and telecommunications, where maintaining service quality and compliance is critical.
