Skip to content

feat: add Ollama integration example (closes #118)#238

Merged
LarFii merged 3 commits intoHKUDS:mainfrom
jwchmodx:feat/ollama-integration-example
Apr 7, 2026
Merged

feat: add Ollama integration example (closes #118)#238
LarFii merged 3 commits intoHKUDS:mainfrom
jwchmodx:feat/ollama-integration-example

Conversation

@jwchmodx
Copy link
Copy Markdown
Contributor

@jwchmodx jwchmodx commented Apr 3, 2026

Problem

Ollama does not expose the /v1/embeddings endpoint that the existing
openai_embed helper targets. Its embedding API is /api/embed and must
be called via the native ollama Python client:

# Ollama-native (works for all models)
response = await ollama.AsyncClient(host=host).embed(model=model, input=texts)

# OpenAI-compat shim — only works if Ollama's /v1/embeddings is enabled,
# which is NOT the default for most models
embeddings = await openai_embed(texts, model=model, base_url=f"{host}/v1")

As reported in #118, this caused silent embedding failures when users
followed the existing example and pointed it at an Ollama host.

Solution

Add examples/ollama_integration_example.py modelled after the existing
lmstudio_integration_example.py. Key differences:

LM Studio Ollama
LLM endpoint /v1/chat/completions /v1/chat/completions ✅ same
Embedding endpoint /v1/embeddings /api/embed (native client)
Auth arbitrary key "ollama" (ignored)

The example includes:

  • Connection check — lists available models, warns if required ones are missing with the correct ollama pull command
  • Embedding sanity-check — calls ollama.AsyncClient.embed() and validates the vector dimension against OLLAMA_EMBEDDING_DIM
  • Chat sanity-check — one-shot prompt via openai_complete_if_cache
  • RAG init + sample query — inserts a text snippet and runs a hybrid query

Environment variables (all optional, defaults shown):

OLLAMA_HOST=http://localhost:11434
OLLAMA_LLM_MODEL=llama3.2
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
OLLAMA_EMBEDDING_DIM=768

Quick start

ollama pull llama3.2
ollama pull nomic-embed-text
pip install ollama raganything
python examples/ollama_integration_example.py

Checklist

  • No changes to library code — example file only
  • ruff check --ignore=E402 + ruff format --check pass (E402 is ignored in pre-commit config)
  • Follows same structure as lmstudio_integration_example.py
  • No new required dependencies (ollama is already an optional extra)

jwchmodx added 3 commits April 3, 2026 12:16
HKUDS#159)

Reasoning models (DeepSeek-R1, Qwen2.5-think, etc.) wrap their
chain-of-thought in <think>…</think> blocks before emitting the
final answer.  When _robust_json_parse fails to extract a valid JSON
object from the response, the four modal-processor parse methods
(_parse_response, _parse_table_response, _parse_equation_response,
_parse_generic_response) were returning the **raw** LLM response as
the fallback caption and summary.  This caused internal model
reasoning to be stored in the knowledge graph instead of the actual
content description.

Fix: add a static helper `BaseModalProcessor._strip_thinking_tags`
that removes <think>/<thinking> blocks (case-insensitive, multiline)
and apply it in every fallback branch so only the final-answer text
is stored or returned.

The helper is tested in tests/test_strip_thinking_tags.py with 13
unit tests covering: tag variants, multiline blocks, multiple blocks,
case-insensitivity, and the full fallback path for all four
processor classes.
HKUDS#230)

On systems where only 'soffice' is on PATH (common on macOS), the
existing fallback loop logged a WARNING for the 'libreoffice' candidate
before successfully converting via 'soffice'.  This caused users to see:

  WARNING: LibreOffice command 'libreoffice' not found
  INFO:    Successfully converted file.pptx to PDF using soffice

…and conclude that something was broken, even though the conversion
succeeded.

Fix: log FileNotFoundError at DEBUG level for any non-final candidate
so that routine 'libreoffice' → 'soffice' fallback stays silent in
normal logs.  The WARNING is preserved only when the last candidate in
the list is not found (meaning no usable LibreOffice binary exists at
all and the conversion is about to fail).
Ollama uses a different embedding API (/api/embed via the native
ollama Python client) compared to the OpenAI-compatible /v1/embeddings
endpoint assumed by the existing openai_embed helper.  Pointing that
helper at an Ollama host causes embedding failures for most models.

Add examples/ollama_integration_example.py that:
- Uses openai_complete_if_cache against Ollama's /v1 chat endpoint
  (works out of the box — Ollama exposes OpenAI-compatible chat)
- Calls ollama.AsyncClient.embed() for embeddings so every model in
  the Ollama registry is supported without extra configuration
- Follows the same structure as lmstudio_integration_example.py:
  connection check, embedding sanity-check, chat sanity-check,
  RAG init, sample insert + query
- Supports OLLAMA_HOST / OLLAMA_LLM_MODEL / OLLAMA_EMBEDDING_MODEL /
  OLLAMA_EMBEDDING_DIM env vars with sensible defaults
  (llama3.2 + nomic-embed-text / 768-dim)
@LarFii LarFii merged commit 8b622b8 into HKUDS:main Apr 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants