Provider Guide

Welcome to the Esperanto provider guide. This page helps you choose the right AI provider for your needs.

Provider Support Matrix

| Provider | LLM | Embedding | Reranking | Speech-to-Text | Text-to-Speech | JSON Mode |
|---|---|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| OpenAI-Compatible | ✅ | ✅ | ❌ | ❌ | ❌ | ⚠️* |
| Anthropic | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Google (GenAI) | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Vertex AI | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ |
| Groq | ✅ | ❌ | ❌ | ✅ | ❌ | ✅ |
| Ollama | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| Mistral | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ |
| DeepSeek | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Perplexity | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| xAI | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| DashScope (Qwen) | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| MiniMax | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| OpenRouter | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Transformers | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Jina | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Voyage | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ElevenLabs | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ |

*⚠️ OpenAI-Compatible: JSON mode support depends on the specific endpoint implementation
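
When wiring this matrix into code, it can help to express it as plain data and filter programmatically. A hand-maintained sketch covering a few rows from the matrix above (this is illustrative data, not an Esperanto API):

```python
# Subset of the support matrix: provider -> supported capabilities.
CAPABILITIES = {
    "openai": {"llm", "embedding", "stt", "tts"},
    "groq": {"llm", "stt"},
    "jina": {"embedding", "reranking"},
    "ollama": {"llm", "embedding"},
    "elevenlabs": {"stt", "tts"},
}

def providers_for(capability: str) -> list[str]:
    """All providers offering a given capability, alphabetically."""
    return sorted(p for p, caps in CAPABILITIES.items() if capability in caps)

print(providers_for("stt"))  # → ['elevenlabs', 'groq', 'openai']
```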

Quick Selection Guide

By Use Case

Need Multi-Modal Capabilities?

All-in-One Providers:

  • OpenAI: LLM + Embedding + STT + TTS (industry standard)
  • Azure OpenAI: Same as OpenAI + enterprise compliance
  • Google GenAI: LLM + Embedding + TTS (competitive pricing)
  • OpenAI-Compatible: Use different endpoints for different capabilities

Partial Multi-Modal:

  • Groq: LLM + STT (fastest inference)
  • Vertex AI: LLM + Embedding + TTS (Google Cloud)
  • ElevenLabs: STT + TTS (premium voice quality)

Need Privacy/Local Deployment?

Fully Local:

  • Ollama: LLM + Embedding (simple setup, no API costs)
  • Transformers: Embedding + Reranking (100+ models, offline)
  • OpenAI-Compatible: Connect to local LM Studio, vLLM, or Ollama

Self-Hosted Options:

  • OpenAI-Compatible: Point at self-hosted vLLM, LM Studio, or similar servers

Need Enterprise Features?

Enterprise-Ready:

  • Azure OpenAI: SLA, compliance, private endpoints, regional control
  • Vertex AI: Google Cloud security, IAM, audit logs
  • OpenAI: Enterprise tier available
  • Mistral: European data residency

Need Cost Optimization?

Most Cost-Effective:

  • Ollama / Transformers: Free local models, zero API costs
  • DeepSeek: deepseek-chat at very low per-token pricing

Good Value:

  • Google GenAI: Competitive pricing across LLM, Embedding, and TTS

By Capability

Language Models (LLM)

Best Overall Quality:

  • OpenAI: GPT-4o, o1, o3, o4 (industry standard)
  • Anthropic: Claude 3.5 (excellent reasoning, long context)

Best for Reasoning:

  • Anthropic: Claude 3.5 Sonnet, Opus (complex reasoning)
  • DeepSeek: deepseek-reasoner (step-by-step reasoning)
  • OpenAI: o1, o3, o4 series (advanced reasoning)

Fastest Inference:

  • Groq: LPU architecture, ultra-fast responses
  • OpenAI: GPT-4o-mini, GPT-3.5 Turbo

Best for Code:

  • Anthropic: Claude 3.5 Sonnet (strong code generation and review)
  • DeepSeek: Capable coding models at very low cost

Real-Time Information:

  • Perplexity: Web search integration, citations
  • xAI: Real-time knowledge

Multiple Models Access:

  • OpenRouter: One API key for models from many providers

Local Deployment:

  • Ollama: Simple local setup, no API costs
  • OpenAI-Compatible: LM Studio, vLLM, and other local servers

Embeddings

Best Overall:

  • OpenAI: text-embedding-3-large, text-embedding-3-small (industry standard)
  • Voyage: voyage-3 (specialized retrieval, 32K context)

Advanced Features:

  • Jina: Native task types, late chunking, dimension control, multilingual
  • Google: Native task type support, 8 task types

Local/Privacy:

  • Transformers: 100+ HuggingFace models, completely offline
  • Ollama: Local models with simple setup

Domain-Specific:

  • Voyage: voyage-code-2 (code), voyage-law-2 (legal), voyage-finance-2 (finance)

Multilingual:

  • Jina: jina-embeddings-v3 (multilingual excellence)
  • Google: text-multilingual-embedding-002
  • Mistral: mistral-embed (multilingual)

Enterprise:

  • Azure: text-embedding-3-large, text-embedding-3-small (private cloud)
  • Vertex AI: text-embedding-004 (Google Cloud)
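
Whichever embedding provider you pick, downstream retrieval usually reduces to nearest-neighbor search over vectors. A minimal stdlib-only sketch of cosine-similarity ranking (the vectors here are toy values, not real model output):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query: list[float], docs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k document ids most similar to the query vector."""
    return sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)[:k]

# Toy 3-dimensional "embeddings" for illustration only.
docs = {"a": [1.0, 0.0, 0.0], "b": [0.9, 0.1, 0.0], "c": [0.0, 1.0, 0.0]}
print(top_k([1.0, 0.05, 0.0], docs))  # → ['a', 'b']
```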

Reranking

All Reranking Providers:

  • Jina: Multilingual (100+ languages), production-ready
  • Voyage: rerank-2, rerank-1 (high accuracy)
  • Transformers: Universal support (any CrossEncoder model), local/offline

Best for Multilingual:

  • Jina: jina-reranker-v2-base-multilingual

Best for Privacy:

  • Transformers: Local CrossEncoder models, completely offline

Best for Accuracy:

  • Voyage: Specialized for retrieval tasks
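
Conceptually, every reranker above takes a query plus candidate documents and returns them reordered by relevance score. A toy sketch of that interface, using naive word overlap in place of a real cross-encoder score:

```python
def overlap_score(query: str, doc: str) -> float:
    # Toy relevance: fraction of query words that appear in the document.
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def rerank(query: str, docs: list[str]) -> list[tuple[str, float]]:
    # Score every candidate, then sort best-first -- the shape of result
    # a Jina, Voyage, or local CrossEncoder reranker produces.
    scored = [(doc, overlap_score(query, doc)) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

results = rerank("fast local inference", ["local inference with ollama", "cloud pricing tiers"])
print(results[0][0])  # → 'local inference with ollama'
```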

Speech-to-Text

Best Overall:

  • OpenAI: Whisper-1 (accurate, multilingual)
  • Groq: whisper-large-v3 (fastest transcription)

Enterprise:

  • Azure: whisper (private cloud, compliance)

Premium Quality:

  • ElevenLabs: Premium transcription quality

Local Deployment:

Text-to-Speech

Best Voice Quality:

  • ElevenLabs: Premium voices, voice cloning, emotional control

Best Overall:

  • OpenAI: tts-1, tts-1-hd (natural voices, good quality)

Most Voices:

  • Google: 30+ unique voices with personalities

Enterprise:

  • Azure: Custom neural voices, private cloud
  • Vertex AI: Multi-speaker support, Google Cloud

Local Deployment:

Feature Comparison

LLM Features

| Provider | Streaming | JSON Mode | Long Context | Max Context |
|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | 128K-200K |
| Anthropic | ✅ | ✅ | ✅ | 200K |
| Google | ✅ | ✅ | ✅ | 2M (Gemini 1.5) |
| Azure | ✅ | ✅ | ✅ | 128K-200K |
| Groq | ✅ | ✅ | ❌ | 8K-32K |
| Mistral | ✅ | ✅ | ✅ | 128K |
| DeepSeek | ✅ | ✅ | ❌ | 64K |
| Perplexity | ✅ | ✅ | ❌ | 32K |
| xAI | ✅ | ✅ | ✅ | 128K |
| DashScope | ✅ | ✅ | ✅ | 1M (qwen-max-longcontext) |
| MiniMax | ✅ | ✅ | ✅ | 204K |
| OpenRouter | ✅ | ✅ | Varies | Varies |
| Ollama | ✅ | ✅ | Model-dependent | Model-dependent |
| OpenAI-Compatible | ⚠️ | ⚠️ | Endpoint-dependent | Endpoint-dependent |

Embedding Features

| Provider | Task Types | Late Chunking | Output Dimensions | Max Input |
|---|---|---|---|---|
| OpenAI | Emulated | ❌ | Some models | 8K tokens |
| Google | Native (8 types) | ❌ | ✅ | 3K tokens |
| Jina | Native | Native | ✅ (64-1024) | 8K tokens |
| Voyage | Native | ❌ | ❌ | 4K-32K tokens |
| Azure | Emulated | ❌ | Some models | 8K tokens |
| Mistral | ❌ | ❌ | ❌ | 8K tokens |
| Transformers | Emulated | Emulated | ❌ | Model-dependent |
| Ollama | ❌ | ❌ | ❌ | Model-dependent |
| Vertex AI | Native | ❌ | ✅ | 3K tokens |
| OpenAI-Compatible | Endpoint-dependent | Endpoint-dependent | Endpoint-dependent | Endpoint-dependent |

Getting Started

  1. Choose a provider based on your requirements (see selection guide above)
  2. Read the provider page for detailed setup instructions
  3. Get API credentials (if required)
  4. Set environment variables from .env.example
  5. Start coding with the examples in each provider guide

Environment Setup

See the root .env.example file for all available environment variables:

```bash
# Copy and customize
cp .env.example .env

# Edit with your API keys
nano .env
```

For detailed configuration, see Configuration Guide.

Common Patterns

Single Provider Setup

```python
from esperanto.factory import AIFactory

# All capabilities from one provider
llm = AIFactory.create_language("openai", "gpt-4")
embedder = AIFactory.create_embedding("openai", "text-embedding-3-small")
transcriber = AIFactory.create_speech_to_text("openai", "whisper-1")
speaker = AIFactory.create_text_to_speech("openai", "tts-1")
```

Multi-Provider Setup

```python
from esperanto.factory import AIFactory

# Best-in-class for each capability
llm = AIFactory.create_language("anthropic", "claude-3-5-sonnet-20241022")
embedder = AIFactory.create_embedding("jina", "jina-embeddings-v3")
reranker = AIFactory.create_reranker("voyage", "rerank-2")
speaker = AIFactory.create_text_to_speech("elevenlabs", "eleven_multilingual_v2")
```

Local/Cloud Hybrid

```python
from esperanto.factory import AIFactory

# Local models for privacy, cloud for specialized tasks
local_embedder = AIFactory.create_embedding("transformers", "BAAI/bge-large-en-v1.5")
cloud_llm = AIFactory.create_language("openai", "gpt-4")

# Or all local
local_llm = AIFactory.create_language("ollama", "llama3.2")
local_embedder = AIFactory.create_embedding("ollama", "nomic-embed-text")
```
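
The hybrid split above can be captured in a tiny routing helper: decide per request whether data may leave the machine, then pick the provider/model pair accordingly. The provider strings match the factory calls used above; the helper itself is hypothetical:

```python
def pick_backend(contains_private_data: bool) -> tuple[str, str]:
    """Route sensitive text to local models; send the rest to the cloud."""
    if contains_private_data:
        return ("ollama", "llama3.2")  # local: nothing leaves the host
    return ("openai", "gpt-4")         # cloud: higher quality for public data

provider, model = pick_backend(contains_private_data=True)
print(provider)  # → 'ollama'
# llm = AIFactory.create_language(provider, model)
```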

Cost-Optimized Setup

```python
from esperanto.factory import AIFactory

# Free local models
llm = AIFactory.create_language("ollama", "llama3.2")
embedder = AIFactory.create_embedding("transformers", "BAAI/bge-base-en-v1.5")
reranker = AIFactory.create_reranker("transformers", "BAAI/bge-reranker-base")

# Or cost-effective cloud
cheap_llm = AIFactory.create_language("deepseek", "deepseek-chat")
```

Provider Categories

Cloud API Providers

Require API keys, pay-per-use:

  • OpenAI, Anthropic, Google (GenAI), Groq, Mistral, DeepSeek, Perplexity, xAI, DashScope (Qwen), MiniMax, OpenRouter, Jina, Voyage, ElevenLabs

Cloud Enterprise Providers

Enterprise features, private deployment:

  • Azure OpenAI, Vertex AI

Local/Self-Hosted Providers

No API costs, privacy-focused:

  • Ollama, Transformers, OpenAI-Compatible (pointed at local endpoints)

Next Steps

  • Choose a provider from the selection guide above and read its provider page
  • Set up credentials following Environment Setup, then start from the Common Patterns examples

See Also

  • Configuration Guide