Welcome to the Esperanto provider guide. This page helps you choose the right AI provider for your needs.
| Provider | LLM | Embedding | Reranking | Speech-to-Text | Text-to-Speech | JSON Mode |
|---|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| OpenAI-Compatible | ✅ | ✅ | ❌ | ✅ | ✅ | Endpoint-dependent |
| Anthropic | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Google (GenAI) | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Vertex AI | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ |
| Azure OpenAI | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| Groq | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ |
| Ollama | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Mistral | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| DeepSeek | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Perplexity | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| xAI | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| DashScope (Qwen) | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| MiniMax | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| OpenRouter | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Transformers | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Jina | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Voyage | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ElevenLabs | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |
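If you need to reason about the matrix programmatically (for example, "which providers cover both embedding and reranking?"), it can be encoded as plain data. The dictionary below is an illustrative copy of a few rows from the table above, not something exposed by Esperanto itself.

```python
# Capability matrix (subset) copied from the table above, as plain data.
CAPABILITIES = {
    "OpenAI": {"llm", "embedding", "stt", "tts", "json_mode"},
    "Anthropic": {"llm", "json_mode"},
    "Groq": {"llm", "stt", "json_mode"},
    "Ollama": {"llm", "embedding"},
    "Jina": {"embedding", "reranking"},
    "Voyage": {"embedding", "reranking"},
    "Transformers": {"embedding", "reranking"},
    "ElevenLabs": {"stt", "tts"},
}

def providers_with(*needed: str) -> list[str]:
    """Return providers supporting every requested capability."""
    return sorted(p for p, caps in CAPABILITIES.items() if set(needed) <= caps)

print(providers_with("embedding", "reranking"))  # → ['Jina', 'Transformers', 'Voyage']
```

This kind of lookup is handy when building a fallback chain: pick the first provider from the result for which you have credentials configured.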
All-in-One Providers:
- OpenAI: LLM + Embedding + STT + TTS (industry standard)
- Azure OpenAI: Same as OpenAI + enterprise compliance
- Google GenAI: LLM + Embedding + TTS (competitive pricing)
- OpenAI-Compatible: Use different endpoints for different capabilities
Partial Multi-Modal:
- Groq: LLM + STT (fastest inference)
- Vertex AI: LLM + Embedding + TTS (Google Cloud)
- ElevenLabs: STT + TTS (premium voice quality)
Fully Local:
- Ollama: LLM + Embedding (simple setup, no API costs)
- Transformers: Embedding + Reranking (100+ models, offline)
- OpenAI-Compatible: Connect to local LM Studio, vLLM, or Ollama
Self-Hosted Options:
- Azure OpenAI: Private cloud deployment
- Vertex AI: Google Cloud with VPC/security controls
Enterprise-Ready:
- Azure OpenAI: SLA, compliance, private endpoints, regional control
- Vertex AI: Google Cloud security, IAM, audit logs
- OpenAI: Enterprise tier available
- Mistral: European data residency
Most Cost-Effective:
- Ollama: Free (local deployment)
- Transformers: Free (local deployment)
- OpenAI-Compatible: Free (local deployment)
- DeepSeek: Low API costs, strong performance
- OpenRouter: Compare prices across providers, some free models
Good Value:
- Google GenAI: Competitive pricing, generous free tier
- Groq: Fast + affordable
Best Overall Quality:
- OpenAI: GPT-4o, o1, o3, o4 (industry standard)
- Anthropic: Claude 3.5 (excellent reasoning, long context)
Best for Reasoning:
- Anthropic: Claude 3.5 Sonnet, Opus (complex reasoning)
- DeepSeek: deepseek-reasoner (step-by-step reasoning)
- OpenAI: o1, o3, o4 series (advanced reasoning)
Fastest Inference:
- Groq: Fastest inference available, also affordable
Best for Code:
Real-Time Information:
- Perplexity: Web search integration, citations
- xAI: Real-time knowledge
Multiple Models Access:
- OpenRouter: 100+ models from various providers
Local Deployment:
- Ollama: Llama, Mistral, Qwen, etc.
- OpenAI-Compatible: LM Studio, vLLM
Best Overall:
- OpenAI: text-embedding-3-large, text-embedding-3-small (industry standard)
- Voyage: voyage-3 (specialized retrieval, 32K context)
Advanced Features:
- Jina: Native task types, late chunking, dimension control, multilingual
- Google: Native task type support, 8 task types
Local/Privacy:
- Transformers: 100+ HuggingFace models, completely offline
- Ollama: Local models with simple setup
Domain-Specific:
- Voyage: voyage-code-2 (code), voyage-law-2 (legal), voyage-finance-2 (finance)
Multilingual:
- Jina: jina-embeddings-v3 (multilingual excellence)
- Google: text-multilingual-embedding-002
- Mistral: mistral-embed (multilingual)
Enterprise:
- Azure: text-embedding-3-large, text-embedding-3-small (private cloud)
- Vertex AI: text-embedding-004 (Google Cloud)
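Whichever embedding provider you choose, the result is a vector you compare yourself, and cosine similarity is the usual metric. A minimal stdlib sketch (the vectors below are toy values, not real model output):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine([1.0, 0.0], [1.0, 0.0]), 3))  # → 1.0
```

Because similarity is computed on your side, you can swap embedding providers without changing downstream retrieval code, as long as you re-embed your corpus with the new model.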
All Reranking Providers:
- Jina: Multilingual (100+ languages), production-ready
- Voyage: rerank-2, rerank-1 (high accuracy)
- Transformers: Universal support (any CrossEncoder model), local/offline
Best for Multilingual:
- Jina: jina-reranker-v2-base-multilingual
Best for Privacy:
- Transformers: Completely local processing
Best for Accuracy:
- Voyage: Specialized for retrieval tasks
Best Overall:
- OpenAI: whisper-1 (industry standard)
Enterprise:
- Azure: whisper (private cloud, compliance)
Premium Quality:
- ElevenLabs: Advanced speech recognition
Local Deployment:
- OpenAI-Compatible: faster-whisper, local Whisper
Best Voice Quality:
- ElevenLabs: Premium voices, voice cloning, emotional control
Best Overall:
- OpenAI: tts-1, tts-1-hd (natural voices, good quality)
Most Voices:
- Google: 30+ unique voices with personalities
Enterprise:
- Azure OpenAI: Private cloud deployment, compliance
Local Deployment:
- OpenAI-Compatible: Connect to local TTS endpoints
| Provider | Streaming | JSON Mode | Long Context | Max Context |
|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | 128K-200K |
| Anthropic | ✅ | ✅ | ✅ | 200K |
| Google | ✅ | ✅ | ✅ | 2M (Gemini 1.5) |
| Azure | ✅ | ✅ | ✅ | 128K-200K |
| Groq | ✅ | ✅ | ❌ | 8K-32K |
| Mistral | ✅ | ✅ | ✅ | 128K |
| DeepSeek | ✅ | ✅ | ✅ | 64K |
| Perplexity | ✅ | ✅ | ❌ | 32K |
| xAI | ✅ | ❌ | ✅ | 128K |
| DashScope | ✅ | ✅ | ✅ | 1M (qwen-max-longcontext) |
| MiniMax | ✅ | ✅ | ✅ | 204K |
| OpenRouter | ✅ | ✅ | ✅ | Varies |
| Ollama | ✅ | ❌ | ✅ | Model-dependent |
| OpenAI-Compatible | ✅ | Endpoint-dependent | Endpoint-dependent | Endpoint-dependent |
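The feature table above can drive provider selection in code. The snippet below is an illustrative filter over a few rows copied from the table (it is plain data, not queried from Esperanto):

```python
# LLM feature flags (subset) copied from the table above.
LLM_FEATURES = {
    "OpenAI": {"streaming": True, "json_mode": True, "long_context": True},
    "Anthropic": {"streaming": True, "json_mode": True, "long_context": True},
    "Groq": {"streaming": True, "json_mode": True, "long_context": False},
    "xAI": {"streaming": True, "json_mode": False, "long_context": True},
    "DeepSeek": {"streaming": True, "json_mode": True, "long_context": True},
    "Ollama": {"streaming": True, "json_mode": False, "long_context": True},
}

def candidates(**required: bool) -> list[str]:
    """Providers whose feature flags match every requirement."""
    return sorted(
        name for name, feats in LLM_FEATURES.items()
        if all(feats.get(flag) == value for flag, value in required.items())
    )

print(candidates(json_mode=True, long_context=True))  # → ['Anthropic', 'DeepSeek', 'OpenAI']
```

For example, a structured-output pipeline over long documents would restrict itself to `candidates(json_mode=True, long_context=True)` before consulting pricing.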
| Provider | Task Types | Late Chunking | Output Dimensions | Max Input |
|---|---|---|---|---|
| OpenAI | Emulated | ❌ | Some models | 8K tokens |
| Google | Native (8 types) | ❌ | ❌ | 3K tokens |
| Jina | Native | Native | ✅ (64-1024) | 8K tokens |
| Voyage | ❌ | ❌ | ❌ | 4K-32K tokens |
| Azure | Emulated | ❌ | Some models | 8K tokens |
| Mistral | ❌ | ❌ | ❌ | 8K tokens |
| Transformers | Emulated | Emulated | ❌ | Model-dependent |
| Ollama | ❌ | ❌ | ❌ | Model-dependent |
| Vertex AI | ❌ | ❌ | ❌ | 3K tokens |
| OpenAI-Compatible | ❌ | ❌ | ❌ | Endpoint-dependent |
- Choose a provider based on your requirements (see the selection guide above)
- Read the provider page for detailed setup instructions
- Get API credentials (if required)
- Set environment variables from `.env.example`
- Start coding with the examples in each provider guide

See the root `.env.example` file for all available environment variables:

```bash
# Copy and customize
cp .env.example .env
# Edit with your API keys
nano .env
```

For detailed configuration, see the Configuration Guide.
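Before instantiating providers, it can save debugging time to verify that the environment variables you rely on are actually set. A minimal sketch — the variable names below (e.g. `OPENAI_API_KEY`) are typical examples; check each provider guide for the exact names Esperanto expects:

```python
import os

def missing_keys(required: list[str]) -> list[str]:
    """Return the names of required environment variables that are unset or empty."""
    return [name for name in required if not os.environ.get(name)]

# Adjust this list to the providers you chose above.
missing = missing_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
if missing:
    print(f"Set these before running: {', '.join(missing)}")
```

Running this once at startup turns a confusing mid-request authentication error into an immediate, readable message.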
```python
from esperanto.factory import AIFactory

# All capabilities from one provider
llm = AIFactory.create_language("openai", "gpt-4")
embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")
transcriber = AIFactory.create_speech_to_text("openai", "whisper-1")
speaker = AIFactory.create_text_to_speech("openai", "tts-1")
```

```python
from esperanto.factory import AIFactory

# Best-in-class for each capability
llm = AIFactory.create_language("anthropic", "claude-3-5-sonnet-20241022")
embedder = AIFactory.create_embedding("jina", "jina-embeddings-v3")
reranker = AIFactory.create_reranker("voyage", "rerank-2")
speaker = AIFactory.create_text_to_speech("elevenlabs", "eleven_multilingual_v2")
```

```python
from esperanto.factory import AIFactory

# Local models for privacy, cloud for specialized tasks
local_embedder = AIFactory.create_embedding("transformers", "BAAI/bge-large-en-v1.5")
cloud_llm = AIFactory.create_language("openai", "gpt-4")

# Or all local
local_llm = AIFactory.create_language("ollama", "llama3.2")
local_embedder = AIFactory.create_embedding("ollama", "nomic-embed-text")
```

```python
from esperanto.factory import AIFactory

# Free local models
llm = AIFactory.create_language("ollama", "llama3.2")
embedder = AIFactory.create_embedding("transformers", "BAAI/bge-base-en-v1.5")
reranker = AIFactory.create_reranker("transformers", "BAAI/bge-reranker-base")

# Or cost-effective cloud
cheap_llm = AIFactory.create_language("deepseek", "deepseek-chat")
```

Require API keys, pay-per-use:
- OpenAI
- Anthropic
- Google GenAI
- Groq
- Mistral
- DeepSeek
- DashScope (Qwen)
- MiniMax
- Perplexity
- xAI
- OpenRouter
- Jina
- Voyage
- ElevenLabs
Enterprise features, private deployment:
- Azure OpenAI
- Vertex AI
No API costs, privacy-focused:
- Ollama
- Transformers
- OpenAI-Compatible (with local endpoints)
- Quick Start: docs/quickstart.md
- Configuration: docs/configuration.md
- Capabilities: docs/capabilities/
- Advanced Topics: docs/advanced/