SpeechRecognition

Welcome to SpeechRecognition

SpeechRecognition is a Python library for speech recognition that supports several engines and APIs, both online and offline. With over 8,900 stars on GitHub, it is a widely used choice among Python developers who need speech-to-text functionality.

Key Features

Multiple Engines

Support for 10+ speech recognition engines including Google, Whisper, Azure, IBM, and more

Real-time Input

Capture and process audio from microphones in real-time with PyAudio integration

Audio Files

Process audio files in multiple formats: WAV, AIFF, and FLAC

Offline Recognition

Work offline with Sphinx, Vosk, and local Whisper models

Background Listening

Continuously listen and process audio in the background

Noise Adjustment

Automatically adjust for ambient noise levels

Multi-language

Support for multiple languages across different recognition engines

Simple API

Clean, intuitive Python API that’s easy to learn and use
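
As a rough sketch of how the background-listening and noise-adjustment features can fit together, the snippet below calibrates the microphone once and then hands each captured phrase to a callback on a worker thread. `make_callback` and `demo` are hypothetical helper names, not part of the library, and the choice of `recognize_google` as the default engine is an assumption; `adjust_for_ambient_noise` and `listen_in_background` are methods of `Recognizer`.

```python
import queue
import time


def make_callback(results, method_name="recognize_google"):
    """Build a callback(recognizer, audio) that stores transcripts or errors in `results`.

    Hypothetical helper: `results` is a queue.Queue shared with the main thread.
    """
    def callback(recognizer, audio):
        try:
            results.put(getattr(recognizer, method_name)(audio))
        except Exception as exc:  # e.g. sr.UnknownValueError or sr.RequestError
            results.put(exc)
    return callback


def demo(seconds=10):
    """Listen in the background for `seconds`, printing whatever arrives."""
    # Imported here so make_callback has no hard dependency on the library.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    microphone = sr.Microphone()
    with microphone as source:
        # Sample the room for about a second to set the energy threshold.
        recognizer.adjust_for_ambient_noise(source, duration=1)

    results = queue.Queue()
    # listen_in_background returns a function that stops the worker thread.
    stop_listening = recognizer.listen_in_background(microphone, make_callback(results))
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            try:
                print(results.get(timeout=0.5))
            except queue.Empty:
                pass
    finally:
        stop_listening(wait_for_stop=False)
```

Passing results through a queue keeps the callback short and leaves printing or further processing to the main thread. Calling `demo()` requires a working microphone and PyAudio.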

Supported Recognition Engines

SpeechRecognition provides a unified interface to multiple speech recognition services:
  • Google Speech Recognition - Free, no API key required
  • Google Cloud Speech API - Enterprise-grade recognition
  • OpenAI Whisper - State-of-the-art offline recognition
  • OpenAI Whisper API - Cloud-based Whisper models
  • Groq Whisper API - Fast Whisper inference
  • Microsoft Azure Speech - Enterprise speech services
  • Wit.ai - Natural language processing
  • IBM Speech to Text - Watson-powered recognition
  • Houndify API - Fast voice AI
  • CMU Sphinx - Offline recognition (PocketSphinx)
  • Vosk API - Offline recognition with multiple language models
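
Because the engines above are exposed as `recognize_*` methods on the same `Recognizer` object, one way to use the unified interface is to try several engines in order. The following is a minimal sketch under that assumption; `recognize_with_fallback` is a hypothetical helper, not a library function, and the broad `except` is a deliberate simplification.

```python
def recognize_with_fallback(recognizer, audio, method_names):
    """Try each named recognize_* method in order.

    Returns (method_name, text) for the first engine that succeeds and
    raises RuntimeError if every engine fails or is unavailable.
    """
    errors = {}
    for name in method_names:
        method = getattr(recognizer, name, None)
        if method is None:
            errors[name] = "method not available"
            continue
        try:
            return name, method(audio)
        except Exception as exc:  # unintelligible audio, network errors, missing optional packages
            errors[name] = repr(exc)
    raise RuntimeError(f"all engines failed: {errors}")
```

With a real recognizer this might be called as `recognize_with_fallback(r, audio, ["recognize_sphinx", "recognize_google"])`, which prefers the offline engine and falls back to the online one.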

Quick Example

Here’s a simple example of recognizing speech from a microphone:
import speech_recognition as sr

# Create recognizer instance
r = sr.Recognizer()

# Use microphone as audio source
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# Recognize speech using Google Speech Recognition
try:
    text = r.recognize_google(audio)
    print(f"You said: {text}")
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print(f"Error: {e}")
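
The same recognizer can also read from an audio file instead of a microphone. The sketch below is illustrative only: `transcribe_file` is a hypothetical wrapper, and it assumes the free Google engine is an acceptable default. `sr.AudioFile` and `Recognizer.record` are the library's file-based counterparts to `sr.Microphone` and `Recognizer.listen`.

```python
def transcribe_file(path, language="en-US", offset=None, duration=None):
    """Transcribe a WAV, AIFF, or FLAC file; return the text, or None if unintelligible."""
    # Lazy import: the library is only needed when the function runs.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # record() reads the whole file unless offset/duration (in seconds) narrow it.
        audio = recognizer.record(source, duration=duration, offset=offset)
    try:
        return recognizer.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The engine could not make sense of the audio.
        return None
```

For example, `transcribe_file("recording.wav")` would transcribe a file with that (made-up) name in the current directory; a `sr.RequestError` from a failed network call is left to propagate to the caller.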

Getting Started

Installation

Install SpeechRecognition and its dependencies

Quickstart

Get up and running with your first speech recognition application

Core Concepts

Learn about audio sources, recognizers, and audio data

API Reference

Explore the complete API documentation

Community and Support

License

SpeechRecognition is made available under the 3-clause BSD license. See the LICENSE.txt file for more information.