
fix(search): clamp inf/nan scores from vector search to prevent JSON serialization failure #824

Merged
qin-ctx merged 1 commit into volcengine:main from a1461750564:fix/inf-score-json-serde
Mar 21, 2026


@a1461750564
Contributor

Problem

/api/v1/search/find and /api/v1/search/search return 500 INTERNAL with:

ValueError: Out of range float values are not JSON compliant: inf

This happens when local vector search returns inf scores (e.g., zero vectors, embedding overflow), which get passed through to the API response. Starlette's json.dumps(allow_nan=False) rejects inf/nan.
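The failure mode can be reproduced with the standard library alone. This is a minimal sketch, not code from the PR; the helper name `is_strict_json_safe` is hypothetical and only illustrates how `allow_nan=False` treats non-finite floats.

```python
import json
import math


def is_strict_json_safe(obj) -> bool:
    """Return True if obj serializes under strict JSON rules (no inf/nan)."""
    try:
        json.dumps(obj, allow_nan=False)
    except ValueError:
        # Raised for inf, -inf and nan when allow_nan=False.
        return False
    return True


# A finite score serializes; an inf or nan score does not.
print(is_strict_json_safe({"score": 0.08}))
print(is_strict_json_safe({"score": math.inf}))
print(is_strict_json_safe({"score": math.nan}))

# With the default allow_nan=True, Python emits the non-standard token
# "Infinity" instead of raising, which strict JSON parsers may reject.
print(json.dumps({"score": math.inf}))
```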

Root Cause

In hierarchical_retriever.py, _convert_to_matched_contexts() reads _score from vector search results and blends it with hotness score. When the source score is inf, the blended final_score is also inf, causing JSON serialization to fail.
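To illustrate why a non-finite source score survives blending, here is a small sketch. The linear formula and the `alpha` weight are assumptions for illustration; the actual blend in `_convert_to_matched_contexts()` may differ.

```python
import math


def blend(semantic_score: float, hotness_score: float, alpha: float = 0.2) -> float:
    """Hypothetical linear blend of a semantic score and a hotness score."""
    return (1.0 - alpha) * semantic_score + alpha * hotness_score


# A finite input gives a finite result.
print(blend(0.8, 0.5))

# inf times any positive weight is still inf, so the blend is inf.
print(blend(math.inf, 0.5))

# A zero weight does not help: 0.0 * inf is nan under IEEE 754.
print(blend(math.inf, 0.5, alpha=1.0))
```

Under this assumed formula, no choice of weights turns a non-finite semantic score into a finite result, which is why the score has to be checked before or after blending.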

Fix

Two changes:

  1. openviking/retrieve/hierarchical_retriever.py: Clamp semantic_score and final_score to 0.0 when math.isfinite() returns False

  2. openviking/server/routers/search.py: Add _sanitize_floats() recursive sanitizer as defense-in-depth on find and search endpoints
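The two layers could look roughly like the sketch below. It is not the merged code: `clamp_score` and `sanitize_floats` are hypothetical names, and the choice to convert tuples to lists is an assumption made here because JSON has no tuple type.

```python
import json
import math
from typing import Any


def clamp_score(score: float, default: float = 0.0) -> float:
    """Retriever-level guard: replace inf/nan with a finite default."""
    return score if math.isfinite(score) else default


def sanitize_floats(obj: Any, default: float = 0.0) -> Any:
    """API-level guard: recursively replace non-finite floats in a payload."""
    if isinstance(obj, float):
        return obj if math.isfinite(obj) else default
    if isinstance(obj, dict):
        return {key: sanitize_floats(value, default) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_floats(value, default) for value in obj]
    # Strings, ints, bools and None pass through unchanged.
    return obj


response = {"memories": [{"score": math.inf}, {"score": 0.0802}, {"score": math.nan}]}
clean = sanitize_floats(response)

# The sanitized payload now serializes under strict JSON rules.
print(json.dumps(clean, allow_nan=False))
```

Clamping to 0.0 pushes affected results to the bottom of the ranking instead of dropping them. That is a trade-off: the request succeeds, but the underlying bad vector stays hidden unless it is also logged.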

Environment

  • OpenViking: v0.2.9
  • Local vectorDB (sqlite-vec)
  • Local embedding: Ollama qwen3-embedding:0.6b (1024d)

Reproduction

curl -X POST http://localhost:1933/api/v1/search/find \
  -H 'Content-Type: application/json' \
  -d '{"query": "test", "limit": 3}'
# Before: {"status":"error","error":{"message":"Out of range float values are not JSON compliant: inf"}}
# After:  {"status":"ok","result":{"memories":[{"score":0.0802,...}]}}

…serialization failure

When local vector search returns inf scores (e.g., zero vectors or
embedding overflow), the hierarchical retriever passes them through to
the API response. FastAPI/Starlette's JSON encoder rejects inf/nan with:
  ValueError: Out of range float values are not JSON compliant: inf

Fix:
1. hierarchical_retriever.py: clamp semantic_score and final_score to 0.0
   when math.isfinite() returns False
2. search.py: add _sanitize_floats() as a defense-in-depth layer on the
   find and search endpoints

Closes #inf-score
@CLAassistant

CLAassistant commented Mar 20, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

Failed to generate code suggestions for PR

@qin-ctx
Collaborator

qin-ctx commented Mar 21, 2026

Thanks for this PR! The fix itself looks reasonable, but we would like to dig deeper into the root cause to determine whether additional safeguards are needed at the embedding pipeline level. A few questions:

  1. Discovery: How did you encounter this issue? Was it sporadic during normal usage, or consistently reproducible with specific operations?
  2. Embedding model: You mentioned using Ollama qwen3-embedding:0.6b (1024d). Have you always used this model, or did you switch from a different embedding model/provider at some point? If you switched, did you rebuild the vector index afterward?
  3. Source of inf: Were you able to identify which specific resources/memories produced inf scores? For example, do any of them have all-zero embedding vectors, or could there be a dimension mismatch between the embeddings and the index?
  4. Reproduction scope: Does every query trigger this, or only specific ones? Can you reproduce it with a fresh database and newly ingested content?

This context will help us understand whether the inf originates from the embedding model itself, a dimension mismatch after switching models, or a bug in the vector ingestion pipeline.

@qin-ctx qin-ctx self-assigned this Mar 21, 2026
@a1461750564
Contributor Author

Thanks for the review! Here are answers to your questions:

1. Discovery:
It was sporadic during normal usage — not every query triggered it. I first noticed it when doing memory search operations on committed session memories. It seemed to happen intermittently rather than consistently.

2. Embedding model:
I have been using qwen3-embedding:0.6b consistently since setting up OpenViking. The vector index was built from scratch with this model. No switching occurred.

3. Source of inf:
I was not able to pinpoint the exact resources producing inf scores. My suspicion is that very short text content (like single-word memories or extremely short session summaries) might produce embeddings that are close to zero, leading to anomalous similarity scores during vector search. I did not check the raw embedding vectors in sqlite-vec to confirm this.

4. Reproduction scope:
Not every query triggers this. I would estimate it happens in roughly 1 in 20-30 search queries. It does not seem related to a fresh vs. existing database — both seem susceptible if the query matches certain content patterns.

Additional context:
My setup uses Ollama with qwen3-embedding:0.6b for embedding and qwen2.5:7b for summarization, all running locally on an RTX 4060. The inf scores appear to originate in the blend scoring step in hierarchical_retriever.py rather than at the raw vector retrieval level.

Would it help if I tried to capture a specific inf-producing query and its associated vector data from sqlite-vec?

@qin-ctx
Collaborator

qin-ctx commented Mar 21, 2026

Thanks for the detailed answers! This is very helpful.

Since the model was never switched and the issue is sporadic with short text content, the most likely root cause is near-zero embedding vectors leading to inf during cosine similarity computation. Your two-layer defense approach (clamping at retriever + sanitizing at API) is a solid fix for the immediate problem.
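As a sketch of the suspected mechanism: cosine similarity divides by the product of the two vector norms, so a zero or near-zero norm makes the division ill-defined. The guarded version below is an illustration only. The function name, the `eps` threshold, and the choice to return 0.0 are assumptions, and the thread does not establish how sqlite-vec computes the score internally.

```python
import math
from typing import Sequence


def safe_cosine_similarity(a: Sequence[float], b: Sequence[float], eps: float = 1e-12) -> float:
    """Cosine similarity that returns 0.0 instead of inf/nan for degenerate input."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    denom = norm_a * norm_b
    # A zero, near-zero, or non-finite denominator has no meaningful angle.
    if not math.isfinite(denom) or denom < eps:
        return 0.0
    score = dot / denom
    return score if math.isfinite(score) else 0.0


print(safe_cosine_similarity([1.0, 0.0], [1.0, 0.0]))       # identical direction
print(safe_cosine_similarity([0.0, 0.0], [1.0, 2.0]))       # zero vector, guarded
print(safe_cosine_similarity([1e-200, 0.0], [1e-200, 0.0])) # norm product underflows, guarded
```

In pure Python an unguarded division by a zero norm raises `ZeroDivisionError`; native or vectorized implementations typically produce inf or nan silently instead, which would be consistent with the sporadic scores reported here.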

Let's go ahead and merge this. If you're able to reproduce the issue and capture the specific query along with its associated vector data from sqlite-vec, that would be great for a follow-up investigation into the embedding pipeline — but it's not a blocker for this PR.
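For the follow-up investigation, a check along these lines could flag stored embeddings worth inspecting. It is a sketch under assumptions: vectors are taken to be already loaded into a plain mapping of ID to float list (reading them out of sqlite-vec is not shown), and `find_suspect_vectors`, its `eps` threshold, and the sample IDs are hypothetical.

```python
import math
from typing import Mapping, Optional, Sequence


def find_suspect_vectors(
    vectors: Mapping[str, Sequence[float]],
    expected_dim: Optional[int] = None,
    eps: float = 1e-6,
) -> dict[str, str]:
    """Map each suspicious vector ID to a short reason string."""
    suspects: dict[str, str] = {}
    for key, vec in vectors.items():
        if expected_dim is not None and len(vec) != expected_dim:
            suspects[key] = f"dimension {len(vec)} != {expected_dim}"
        elif not all(math.isfinite(x) for x in vec):
            suspects[key] = "non-finite component"
        elif math.sqrt(sum(x * x for x in vec)) < eps:
            suspects[key] = "near-zero norm"
    return suspects


# Toy 3-dimensional data; real embeddings in this setup would be 1024-dimensional.
sample = {
    "mem-ok": [0.1, 0.2, 0.3],
    "mem-zero": [0.0, 0.0, 0.0],
    "mem-inf": [0.1, math.inf, 0.3],
    "mem-short": [0.1, 0.2],
}
print(find_suspect_vectors(sample, expected_dim=3))
```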

Thanks for the contribution!

@qin-ctx qin-ctx merged commit d1f8423 into volcengine:main Mar 21, 2026
1 of 2 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 21, 2026
zeattacker pushed a commit to zeattacker/OpenViking that referenced this pull request Mar 21, 2026
…serialization failure (volcengine#824)

Co-authored-by: a1461750564 <a1461750564@users.noreply.github.com>