This repository was archived by the owner on Apr 30, 2026. It is now read-only.

feat: GeminiEmbedding rate-limit handling#2237

Merged
marcusschiesser merged 3 commits into
run-llama:mainfrom
jeremybmerrill:feat/respect-google-gemini-rate-limits
Dec 2, 2025

Conversation

@jeremybmerrill

Contributor

Google Gemini rate-limits embeddings to 3000 vectors per minute. Right now, if we hit that limit while generating embeddings via LlamaIndex (e.g. with VectorStoreIndex.init or similar methods), the call just errors out, with no ability to wait or restart.

This proposed addition retries any embed call that fails with a rate-limit error, waiting 5s between attempts, up to 20 times. That up-to-100s wait gets you out of the per-minute limit, so the limit is handled seamlessly for applications using LlamaIndex.
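
For readers who want the shape of that behavior, here is a minimal sketch of a fixed-delay retry wrapper. It is not the code merged in this PR: `retryOnRateLimit`, `isRateLimitError`, and the `RetryOptions` fields are hypothetical names, and the check for HTTP status 429 is an assumption about how the underlying client reports rate limiting.

```typescript
// Hypothetical helper for illustration; not the implementation in this PR.
type RetryOptions = {
  maxRetries: number; // retries allowed after the first attempt
  delayMs: number; // fixed wait between attempts
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
};

// Assumption: a rate-limit failure carries HTTP status 429, either as a
// `status` field or somewhere in the error message. Check what your client
// actually throws before relying on this.
function isRateLimitError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; message?: unknown };
  if (e.status === 429) return true;
  return typeof e.message === "string" && e.message.includes("429");
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function retryOnRateLimit<T>(
  fn: () => Promise<T>,
  { maxRetries, delayMs, sleep = defaultSleep }: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Errors that are not rate limits, and the final failure, propagate.
      if (!isRateLimitError(err) || attempt >= maxRetries) throw err;
      await sleep(delayMs);
    }
  }
}
```

With `maxRetries: 20` and `delayMs: 5000` this matches the numbers above: at most 20 waits of 5s each, so up to 100s in total. Passing a no-op `sleep` and a fake function that throws a 429-style error a couple of times before succeeding would exercise the loop without hitting the Gemini API.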

I've added an example file that fails on the current main branch of LlamaIndexTS but succeeds with this patch. You can run it with ts-node examples/models/gemini/embedding_ratelimits.ts. (I didn't really know how to write a proper jest test for this without either actually hitting the Gemini API or faking the way the Google AI library throws errors. Rather than tightly couple the test to the current behavior of the Google AI library, I wrote an example that does hit the Gemini API.)

@changeset-bot

changeset-bot Bot commented Nov 6, 2025


🦋 Changeset detected

Latest commit: 8ad2add

The changes in this PR will be included in the next version bump.


@pkg-pr-new

pkg-pr-new Bot commented Nov 6, 2025


@llamaindex/autotool

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/autotool@2237

@llamaindex/community

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/community@2237

@llamaindex/core

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/core@2237

@llamaindex/env

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/env@2237

@llamaindex/experimental

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/experimental@2237

llamaindex

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/llamaindex@2237

@llamaindex/node-parser

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/node-parser@2237

@llamaindex/readers

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/readers@2237

@llamaindex/tools

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/tools@2237

@llamaindex/wasm-tools

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/wasm-tools@2237

@llamaindex/workflow

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/workflow@2237

@llamaindex/anthropic

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/anthropic@2237

@llamaindex/assemblyai

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/assemblyai@2237

@llamaindex/aws

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/aws@2237

@llamaindex/clip

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/clip@2237

@llamaindex/cohere

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/cohere@2237

@llamaindex/deepinfra

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/deepinfra@2237

@llamaindex/deepseek

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/deepseek@2237

@llamaindex/discord

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/discord@2237

@llamaindex/excel

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/excel@2237

@llamaindex/fireworks

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/fireworks@2237

@llamaindex/google

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/google@2237

@llamaindex/groq

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/groq@2237

@llamaindex/huggingface

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/huggingface@2237

@llamaindex/jinaai

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/jinaai@2237

@llamaindex/mistral

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/mistral@2237

@llamaindex/mixedbread

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/mixedbread@2237

@llamaindex/notion

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/notion@2237

@llamaindex/ollama

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/ollama@2237

@llamaindex/openai

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/openai@2237

@llamaindex/ovhcloud

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/ovhcloud@2237

@llamaindex/perplexity

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/perplexity@2237

@llamaindex/portkey-ai

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/portkey-ai@2237

@llamaindex/replicate

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/replicate@2237

@llamaindex/together

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/together@2237

@llamaindex/vercel

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/vercel@2237

@llamaindex/vllm

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/vllm@2237

@llamaindex/voyage-ai

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/voyage-ai@2237

@llamaindex/xai

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/xai@2237

@llamaindex/astra

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/astra@2237

@llamaindex/azure

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/azure@2237

@llamaindex/chroma

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/chroma@2237

@llamaindex/elastic-search

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/elastic-search@2237

@llamaindex/firestore

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/firestore@2237

@llamaindex/milvus

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/milvus@2237

@llamaindex/mongodb

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/mongodb@2237

@llamaindex/pinecone

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/pinecone@2237

@llamaindex/postgres

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/postgres@2237

@llamaindex/qdrant

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/qdrant@2237

@llamaindex/supabase

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/supabase@2237

@llamaindex/upstash

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/upstash@2237

@llamaindex/weaviate

npm i https://pkg.pr.new/run-llama/LlamaIndexTS/@llamaindex/weaviate@2237

commit: 8ad2add

@marcusschiesser

Contributor

This PR should not include the postgres change

@jeremybmerrill jeremybmerrill force-pushed the feat/respect-google-gemini-rate-limits branch from 8cf9b48 to 6345e76 Compare November 25, 2025 15:06
@jeremybmerrill

Contributor Author

> This PR should not include the postgres change

Aw crap, how'd I do that.

I've re-pushed a version of this branch that doesn't include the Postgres change.

@marcusschiesser

Contributor

Thanks @jeremybmerrill

@marcusschiesser marcusschiesser merged commit 020928c into run-llama:main Dec 2, 2025
19 checks passed

2 participants