Introduction

Welcome to SpeechRecognition

SpeechRecognition is a Python library for performing speech recognition, with support for several engines and APIs, both online and offline. With over 8,900 stars on GitHub, it is a widely used choice for Python developers who need speech-to-text functionality.

Key Features

Multiple Engines

Support for 10+ speech recognition engines including Google, Whisper, Azure, IBM, and more

Real-time Input

Capture and process audio from microphones in real-time with PyAudio integration

Audio Files

Process audio files in multiple formats: WAV, AIFF, and FLAC

Offline Recognition

Work offline with Sphinx, Vosk, and local Whisper models

Background Listening

Continuously listen and process audio in the background

Noise Adjustment

Automatically adjust for ambient noise levels

Multi-language

Support for multiple languages across different recognition engines

Simple API

Clean, intuitive Python API that’s easy to learn and use
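
As a rough illustration of the audio file feature above, here is a minimal sketch that transcribes a single file. `sr.AudioFile`, `Recognizer.record`, and `recognize_google` are part of the library. The helper names `is_supported_audio` and `transcribe_file`, and the exact set of file extensions, are assumptions made for this example only.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats listed above (WAV, AIFF, FLAC).
SUPPORTED_EXTENSIONS = {".wav", ".aiff", ".aif", ".flac"}


def is_supported_audio(path):
    """Return True if the file extension looks like WAV, AIFF, or FLAC."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe_file(path):
    """Transcribe one audio file with the free Google engine (needs network access)."""
    # Hypothetical guard: reject unsupported formats before touching the library.
    if not is_supported_audio(path):
        raise ValueError(f"Unsupported audio format: {path}")

    # Imported lazily so the format check works even without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(path)) as source:
        # record() reads the whole file into an AudioData object.
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)
```

Calling `transcribe_file("speech.wav")` would return the recognized text, or raise `sr.UnknownValueError` or `sr.RequestError` in the same way as the microphone example.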

Supported Recognition Engines

SpeechRecognition provides a unified interface to multiple speech recognition services:

Google Speech Recognition - Free, no API key required
Google Cloud Speech API - Enterprise-grade recognition
OpenAI Whisper - State-of-the-art offline recognition
OpenAI Whisper API - Cloud-based Whisper models
Groq Whisper API - Fast Whisper inference
Microsoft Azure Speech - Enterprise speech services
Wit.ai - Natural language processing
IBM Speech to Text - Watson-powered recognition
Houndify API - Fast voice AI
CMU Sphinx - Offline recognition (PocketSphinx)
Vosk API - Offline recognition with multiple language models
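
Because every engine is exposed as a method on the same recognizer object, one possible pattern is to try an online engine first and fall back to an offline one. The sketch below is an assumption about how that could be wired up, not a feature of the library itself: `recognize_with_fallback` is a hypothetical helper, and the caller supplies the method names and the exception types to treat as recoverable.

```python
def recognize_with_fallback(recognizer, audio, method_names, recoverable):
    """Try each recognizer method in order.

    Returns (method_name, text) for the first engine that succeeds.
    `recoverable` is a tuple of exception types that trigger the next engine.
    """
    last_error = None
    for name in method_names:
        recognize = getattr(recognizer, name)  # e.g. recognizer.recognize_google
        try:
            return name, recognize(audio)
        except recoverable as exc:
            last_error = exc  # remember the failure and move to the next engine
    raise RuntimeError(f"All engines failed: {method_names}") from last_error


# Possible usage with the real library (requires PocketSphinx for the offline engine):
#   import speech_recognition as sr
#   name, text = recognize_with_fallback(
#       r, audio,
#       ["recognize_google", "recognize_sphinx"],
#       (sr.UnknownValueError, sr.RequestError),
#   )
```

Passing the exception types in keeps the helper independent of any one engine; other engines may need extra arguments such as API keys, which this sketch does not handle.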

Quick Example

Here’s a simple example of recognizing speech from a microphone:

import speech_recognition as sr

# Create recognizer instance
r = sr.Recognizer()

# Use microphone as audio source
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# Recognize speech using Google Speech Recognition
try:
    text = r.recognize_google(audio)
    print(f"You said: {text}")
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print(f"Error: {e}")
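
The same flow can also run continuously using the background listening and noise adjustment features. The sketch below uses `adjust_for_ambient_noise` and `listen_in_background` from the library. The helper names `make_callback` and `start_background_listener`, and the choice to record `None` for failed phrases, are assumptions for illustration.

```python
def make_callback(transcripts, recoverable):
    """Build a callback that appends each recognized phrase to `transcripts`.

    Phrases that fail with one of the `recoverable` exception types are stored as None.
    """
    def callback(recognizer, audio):
        try:
            transcripts.append(recognizer.recognize_google(audio))
        except recoverable:
            transcripts.append(None)
    return callback


def start_background_listener(transcripts):
    """Start listening on the default microphone; returns a function that stops it."""
    # Imported lazily; microphone input also requires PyAudio.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    microphone = sr.Microphone()
    with microphone as source:
        # Sample the room briefly so the energy threshold matches ambient noise.
        recognizer.adjust_for_ambient_noise(source)

    callback = make_callback(transcripts, (sr.UnknownValueError, sr.RequestError))
    # The callback is invoked from a background thread for each detected phrase.
    return recognizer.listen_in_background(microphone, callback)
```

A caller would keep the returned function and invoke it, for example `stop_listening(wait_for_stop=False)`, when listening should end.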

Getting Started

Installation

Install SpeechRecognition and its dependencies

Quickstart

Get up and running with your first speech recognition application

Core Concepts

Learn about audio sources, recognizers, and audio data

API Reference

Explore the complete API documentation

Community and Support

GitHub Repository: Uberi/speech_recognition
PyPI Package: SpeechRecognition
Issue Tracker: Report bugs and request features

License

SpeechRecognition is made available under the 3-clause BSD license. See the LICENSE.txt file for more information.
