feat: add Google/Gemini embedding provider support #589

Closed
ZaynJarvis wants to merge 1 commit into volcengine:main from ZaynJarvis:feat/google-embedding-provider

@ZaynJarvis
Collaborator

Summary

Add Google/Gemini as a new embedding provider, supporting the Gemini Embedding 2 model (gemini-embedding-2-preview).

Changes

  • New file: openviking/models/embedder/google_embedders.py, containing GoogleDenseEmbedder, which uses Gemini's OpenAI-compatible endpoint
  • Config: Added google as a valid provider in embedding_config.py with factory wiring
  • Docs: Updated both zh and en configuration guides with Gemini examples and model table

Design

  • Uses Gemini's OpenAI-compatible endpoint (generativelanguage.googleapis.com/v1beta/openai/) — no new SDK dependency needed
  • Supports Matryoshka (MRL) dimension reduction (768, 1024, 1536, 2048, 3072)
  • Supports task types (RETRIEVAL_QUERY, RETRIEVAL_DOCUMENT, SEMANTIC_SIMILARITY, etc.)
  • Default dimension: 3072
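As a rough illustration of the design points above, the sketch below builds an OpenAI-style embeddings request against the endpoint named in this PR and checks the requested dimension against the listed Matryoshka sizes. It is not the PR's implementation: `build_embedding_request` is a hypothetical helper, and it assumes the endpoint honors the OpenAI `dimensions` field. Task types are left out because the source does not show how they are passed.

```python
import json

# Values taken from the PR description.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"
SUPPORTED_DIMENSIONS = (768, 1024, 1536, 2048, 3072)
DEFAULT_DIMENSION = 3072


def build_embedding_request(api_key, model, texts, dimension=DEFAULT_DIMENSION):
    """Return (url, headers, body) for an OpenAI-style embeddings call.

    Hypothetical helper for illustration only.
    """
    if dimension not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f"unsupported dimension {dimension}; expected one of {SUPPORTED_DIMENSIONS}"
        )
    url = BASE_URL + "embeddings"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # "dimensions" is the OpenAI embeddings field for output size;
    # assumed (not confirmed by the PR text) to be accepted by this endpoint.
    body = json.dumps({"model": model, "input": list(texts), "dimensions": dimension})
    return url, headers, body
```

The returned triple can be handed to any HTTP client; keeping the request construction separate from the network call makes the dimension check easy to test offline.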

Configuration Example

{
  "embedding": {
    "dense": {
      "provider": "google",
      "api_key": "your-gemini-api-key",
      "model": "gemini-embedding-2-preview",
      "dimension": 3072
    }
  }
}
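One way the "factory wiring" mentioned under Changes could consume this configuration is sketched below. The registry, the `create_dense_embedder` function, and the constructor fields of the stand-in class are all assumptions made for illustration; the actual code in embedding_config.py is not shown in this PR page.

```python
from dataclasses import dataclass


@dataclass
class GoogleDenseEmbedder:
    """Stand-in for the PR's class; the real constructor signature is not shown."""
    api_key: str
    model: str
    dimension: int = 3072  # default dimension stated in the PR


# Hypothetical registry mapping provider names to embedder classes.
DENSE_PROVIDERS = {"google": GoogleDenseEmbedder}


def create_dense_embedder(config):
    """Build a dense embedder from a config dict shaped like the example above."""
    dense = config["embedding"]["dense"]
    provider = dense["provider"]
    if provider not in DENSE_PROVIDERS:
        raise ValueError(f"unknown embedding provider: {provider!r}")
    embedder_cls = DENSE_PROVIDERS[provider]
    return embedder_cls(
        api_key=dense["api_key"],
        model=dense["model"],
        dimension=dense.get("dimension", 3072),
    )
```

Under these assumptions, passing the parsed JSON configuration to `create_dense_embedder` yields a `GoogleDenseEmbedder` with the configured model and dimension, and an unregistered provider name fails early with a `ValueError`.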

@ZaynJarvis
Collaborator Author

Closes #566

@MaojiaSheng
Collaborator

#607

@ZaynJarvis ZaynJarvis closed this Mar 16, 2026
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 16, 2026
@ZaynJarvis
Collaborator Author

Closing as a duplicate of #607.

@ZaynJarvis
Collaborator Author

Reopening, since #607 was reverted and introduced additional complexity around multimodal support.

Multimodal will be supported in the future.

@ZaynJarvis ZaynJarvis reopened this Mar 17, 2026
- Add GoogleDenseEmbedder with new API pattern
- Support query_param/document_param with is_query parameter
- Add key=value parsing for multiple parameters
- Integrate with embedding config factory
- Update imports and __init__.py

Usage:
- Simple: query_param='RETRIEVAL_QUERY', document_param='RETRIEVAL_DOCUMENT'
- Enhanced: query_param='task_type=RETRIEVAL_QUERY,output_dimensionality=1024'

Supports Google's OpenAI-compatible endpoint with task_type and Matryoshka dimensions.
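The two usage forms in the commit message (a bare task type, or a comma-separated list of key=value pairs) could be parsed roughly as follows. This is a minimal sketch, not the code from the commit: `parse_param` and `params_for` are hypothetical names, a bare value is assumed to mean `task_type`, and values are kept as strings.

```python
def parse_param(value, default_key="task_type"):
    """Parse 'RETRIEVAL_QUERY' or 'task_type=RETRIEVAL_QUERY,output_dimensionality=1024'.

    Hypothetical helper; a bare value is assumed to be the task type.
    """
    if not value:
        return {}
    if "=" not in value:
        # Simple form: the whole string is the task type.
        return {default_key: value.strip()}
    params = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing or doubled commas
        key, sep, val = item.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"expected key=value, got {item!r}")
        params[key.strip()] = val.strip()  # values stay as strings
    return params


def params_for(is_query, query_param, document_param):
    """Pick the query or document parameter string based on is_query, then parse it."""
    return parse_param(query_param if is_query else document_param)
```

With this sketch, `params_for(True, "task_type=RETRIEVAL_QUERY,output_dimensionality=1024", "RETRIEVAL_DOCUMENT")` returns both keys for a query, while passing `is_query=False` falls back to the document task type.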