fix(search): clamp inf/nan scores from vector search to prevent JSON serialization failure by a1461750564 · Pull Request #824 · volcengine/OpenViking

a1461750564 · 2026-03-20T12:28:25Z

Problem

/api/v1/search/find and /api/v1/search/search return 500 INTERNAL with:

ValueError: Out of range float values are not JSON compliant: inf

This happens when local vector search returns inf scores (e.g., zero vectors, embedding overflow), which get passed through to the API response. Starlette's json.dumps(allow_nan=False) rejects inf/nan.
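The failure mode can be reproduced in isolation with the standard library. This is a minimal sketch, not the server's actual code path; the `payload` shape is a made-up stand-in for a search response.

```python
import json
import math

# Hypothetical response payload containing a non-finite score.
payload = {"memories": [{"score": math.inf}]}

# By default json.dumps emits the non-standard token Infinity.
print(json.dumps(payload))

# With allow_nan=False (the strict mode described above) it raises ValueError.
error_message = None
try:
    json.dumps(payload, allow_nan=False)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The same error is raised for `math.nan` and `-math.inf`, so any guard has to cover all three values.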

Root Cause

In hierarchical_retriever.py, _convert_to_matched_contexts() reads _score from vector search results and blends it with hotness score. When the source score is inf, the blended final_score is also inf, causing JSON serialization to fail.
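To illustrate why a single bad source score poisons the blended result, here is a sketch using a hypothetical linear blend. The `blend` function and its `alpha` weight are assumptions for illustration; the real formula in `hierarchical_retriever.py` may differ.

```python
import math


def blend(semantic_score: float, hotness: float, alpha: float = 0.2) -> float:
    """Hypothetical linear blend of a semantic score and a hotness score."""
    return (1.0 - alpha) * semantic_score + alpha * hotness


finite_result = blend(0.8, 0.5)      # finite inputs give a finite result
inf_result = blend(math.inf, 0.5)    # inf survives the weighted sum
nan_result = blend(math.nan, 0.5)    # nan survives as well

print(math.isfinite(finite_result), math.isinf(inf_result), math.isnan(nan_result))
```

Any weighted sum with a positive weight on the semantic term behaves this way, which is why the check has to happen before or after the blend rather than relying on the blend to dampen the value.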

Fix

Two changes:

openviking/retrieve/hierarchical_retriever.py: Clamp semantic_score and final_score to 0.0 when math.isfinite() returns False
openviking/server/routers/search.py: Add _sanitize_floats() recursive sanitizer as defense-in-depth on find and search endpoints
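The two layers could look roughly like the sketch below. It is not the merged code: `clamp_score` and `sanitize_floats` are illustrative names, and the choice to map tuples to lists is an assumption made because JSON has no tuple type.

```python
import json
import math
from typing import Any


def clamp_score(value: float, default: float = 0.0) -> float:
    """Retriever-level guard: replace inf/nan with a neutral default."""
    return value if math.isfinite(value) else default


def sanitize_floats(obj: Any) -> Any:
    """API-level guard: recursively replace non-finite floats with 0.0."""
    if isinstance(obj, float):
        return obj if math.isfinite(obj) else 0.0
    if isinstance(obj, dict):
        return {key: sanitize_floats(val) for key, val in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_floats(item) for item in obj]
    return obj  # ints, strings, None, bools pass through unchanged


result = {"memories": [{"score": math.inf}, {"score": 0.0802}], "total": 2}
encoded = json.dumps(sanitize_floats(result), allow_nan=False)
print(encoded)
```

Clamping to 0.0 pushes affected items to the bottom of the ranking instead of dropping them; whether that is the right ranking behavior depends on how scores are consumed downstream.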

Environment

OpenViking: v0.2.9
Local vectorDB (sqlite-vec)
Local embedding: Ollama qwen3-embedding:0.6b (1024d)

Reproduction

curl -X POST http://localhost:1933/api/v1/search/find   -H 'Content-Type: application/json'   -d '{"query": "test", "limit": 3}'
# Before: {"status":"error","error":{"message":"Out of range float values are not JSON compliant: inf"}}
# After:  {"status":"ok","result":{"memories":[{"score":0.0802,...}]}}

…serialization failure

When local vector search returns inf scores (e.g., zero vectors or embedding overflow), the hierarchical retriever passes them through to the API response. FastAPI/Starlette's JSON encoder rejects inf/nan with:

ValueError: Out of range float values are not JSON compliant: inf

Fix:

1. hierarchical_retriever.py: clamp semantic_score and final_score to 0.0 when math.isfinite() returns False
2. search.py: add _sanitize_floats() as a defense-in-depth layer on the find and search endpoints

Closes #inf-score

CLAassistant · 2026-03-20T12:28:43Z

All committers have signed the CLA.

github-actions · 2026-03-20T12:29:15Z

Failed to generate code suggestions for PR

qin-ctx · 2026-03-21T03:47:56Z

Thanks for this PR! The fix itself looks reasonable, but we would like to dig deeper into the root cause to determine whether additional safeguards are needed at the embedding pipeline level. A few questions:

Discovery: How did you encounter this issue? Was it sporadic during normal usage, or consistently reproducible with specific operations?
Embedding model: You mentioned using Ollama qwen3-embedding:0.6b (1024d). Have you always used this model, or did you switch from a different embedding model/provider at some point? If you switched, did you rebuild the vector index afterward?
Source of inf: Were you able to identify which specific resources/memories produced inf scores? For example, do any of them have all-zero embedding vectors, or could there be a dimension mismatch between the embeddings and the index?
Reproduction scope: Does every query trigger this, or only specific ones? Can you reproduce it with a fresh database and newly ingested content?

This context will help us understand whether the inf originates from the embedding model itself, a dimension mismatch after switching models, or a bug in the vector ingestion pipeline.

a1461750564 · 2026-03-21T06:15:49Z

Thanks for the review! Here are answers to your questions:

1. Discovery:
It was sporadic during normal usage — not every query triggered it. I first noticed it when doing memory search operations on committed session memories. It seemed to happen intermittently rather than consistently.

2. Embedding model:
I have been using qwen3-embedding:0.6b consistently since setting up OpenViking. The vector index was built from scratch with this model. No switching occurred.

3. Source of inf:
I was not able to pinpoint the exact resources producing inf scores. My suspicion is that very short text content (like single-word memories or extremely short session summaries) might produce embeddings that are close to zero, leading to anomalous similarity scores during vector search. I did not check the raw embedding vectors in sqlite-vec to confirm this.
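One way to test that suspicion would be to scan the stored embeddings for degenerate vectors. The sketch below assumes the vectors have already been exported into a plain dict of id to float list; how to read them out of sqlite-vec is not shown, and `find_suspect_embeddings` and the `eps` threshold are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def find_suspect_embeddings(
    embeddings: Dict[str, Sequence[float]], eps: float = 1e-6
) -> List[Tuple[str, str]]:
    """Return (id, reason) pairs for vectors likely to yield anomalous scores."""
    suspects: List[Tuple[str, str]] = []
    for key, vec in embeddings.items():
        if not all(math.isfinite(x) for x in vec):
            suspects.append((key, "non-finite component"))
            continue
        norm = math.sqrt(sum(x * x for x in vec))
        if norm < eps:
            suspects.append((key, "near-zero norm"))
    return suspects


# Made-up sample data; real embeddings would be 1024-dimensional.
sample = {
    "ok": [0.6, 0.8],
    "zero": [0.0, 0.0],
    "overflow": [float("inf"), 0.1],
}
suspects = find_suspect_embeddings(sample)
print(suspects)
```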

4. Reproduction scope:
Not every query triggers this. I would estimate it happens in roughly 1 in 20-30 search queries. It does not seem related to a fresh vs. existing database — both seem susceptible if the query matches certain content patterns.

Additional context:
My setup uses Ollama with qwen3-embedding:0.6b for embedding and qwen2.5:7b for summarization, all running locally on an RTX 4060. The inf scores appear to originate in the blend scoring step in hierarchical_retriever.py rather than at the raw vector retrieval level.

Would it help if I tried to capture a specific inf-producing query and its associated vector data from sqlite-vec?

qin-ctx · 2026-03-21T06:25:27Z

Thanks for the detailed answers! This is very helpful.

Since the model was never switched and the issue is sporadic with short text content, the most likely root cause is near-zero embedding vectors leading to inf during cosine similarity computation. Your two-layer defense approach (clamping at retriever + sanitizing at API) is a solid fix for the immediate problem.
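For reference, a cosine similarity with a guarded denominator might look like the sketch below. It is an illustration of the hypothesis above, not the computation sqlite-vec actually performs; `safe_cosine` and `eps` are assumed names. Note that plain Python raises `ZeroDivisionError` on division by zero, whereas C or NumPy-style floating-point arithmetic typically yields inf or nan, which is the case being guarded against here.

```python
import math
from typing import Sequence


def safe_cosine(a: Sequence[float], b: Sequence[float], eps: float = 1e-12) -> float:
    """Cosine similarity that returns 0.0 when the result would be undefined."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero or near-zero norm makes the ratio undefined or unstable.
    if not math.isfinite(dot) or not math.isfinite(denom) or denom < eps:
        return 0.0
    return dot / denom


identical = safe_cosine([1.0, 0.0], [1.0, 0.0])
orthogonal = safe_cosine([1.0, 0.0], [0.0, 1.0])
degenerate = safe_cosine([0.0, 0.0], [1.0, 0.0])
print(identical, orthogonal, degenerate)
```

A guard at this level would complement, not replace, the retriever and API clamps, since it only covers similarity computed in application code.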

Let's go ahead and merge this. If you're able to reproduce the issue and capture the specific query along with its associated vector data from sqlite-vec, that would be great for a follow-up investigation into the embedding pipeline — but it's not a blocker for this PR.

Thanks for the contribution!

…serialization failure (volcengine#824)

When local vector search returns inf scores (e.g., zero vectors or embedding overflow), the hierarchical retriever passes them through to the API response. FastAPI/Starlette's JSON encoder rejects inf/nan with:

ValueError: Out of range float values are not JSON compliant: inf

Fix:

1. hierarchical_retriever.py: clamp semantic_score and final_score to 0.0 when math.isfinite() returns False
2. search.py: add _sanitize_floats() as a defense-in-depth layer on the find and search endpoints

Closes #inf-score

Co-authored-by: a1461750564 <a1461750564@users.noreply.github.com>

github-project-automation bot added this to OpenViking project Mar 20, 2026

github-project-automation bot moved this to Backlog in OpenViking project Mar 20, 2026

qin-ctx self-assigned this Mar 21, 2026

qin-ctx approved these changes Mar 21, 2026

qin-ctx merged commit d1f8423 into volcengine:main Mar 21, 2026
1 of 2 checks passed

github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 21, 2026

qin-ctx merged 1 commit into volcengine:main from a1461750564:fix/inf-score-json-serde
