[Bug]: Memory commit triggers oversized embedding input → unhandled exception hangs uvicorn #686

@laofahai

Description

Summary

After memory extraction via session.commit(), the semantic processor generates .overview.md for the parent memory directory. When this overview text exceeds the embedding model's context length, OpenAIDenseEmbedder.embed() raises an unhandled RuntimeError. This exception appears to block the uvicorn event loop, causing the entire HTTP server to become unresponsive (process alive, port open, but all endpoints hang).

Environment

  • OpenViking: installed via pipx (latest as of 2026-03-17)
  • OS: macOS arm64 (Darwin 25.3.0)
  • Embedding: Ollama nomic-embed-text (8192 token context, 768 dim) via OpenAI-compatible API
  • VLM: Bailian qwen3-max
  • Mode: local, bound to 127.0.0.1:1933
  • Integration: OpenClaw memory-openviking plugin

Steps to Reproduce

  1. Accumulate enough memories in a directory (e.g. viking://user/default/memories/preferences)
  2. Trigger a new memory extraction (e.g. via session commit from OpenClaw)
  3. Memory extractor writes the new memory file successfully
  4. Semantic processor runs on the parent directory (recursive=False)
  5. Generated .overview.md aggregates all file summaries → text exceeds embedding model's token limit
  6. Embedding queue calls OpenAIDenseEmbedder.embed() with the oversized text
  7. Ollama returns HTTP 400: the input length exceeds the context length
  8. embed() raises RuntimeError; collection_schemas.py:on_dequeue propagates the exception
  9. Server hangs: all HTTP endpoints stop responding, curl times out

Relevant Logs

INFO - Processing semantic generation for: viking://user/default/memories/preferences (recursive=False)
WARNING - Candidate data is None for label index 4 (label: ...), skipping.
INFO - Created memory file: viking://user/default/memories/preferences/mem_04c2ef28-...md
INFO - Enqueued memory for vectorization

stderr:

openai.BadRequestError: Error code: 400 - {'error': {'message': 'the input length exceeds the context length', ...}}
RuntimeError: OpenAI API error: Error code: 400 - ...

After this error, no further log output appears and all HTTP requests time out.

Root Cause Analysis

Two issues combine:

  1. No input truncation guard in OpenAIDenseEmbedder.embed() (openai_embedders.py): text is passed directly to the API without any length check. When the embedding model has a limited context window, oversized input causes a hard API error.

  2. Unhandled exception in embedding queue blocks uvicorn: the RuntimeError from the embedder propagates through collection_schemas.py:on_dequeue and appears to block or crash the async event loop, making the entire server unresponsive.
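For issue 1, a minimal truncation guard could look like the sketch below. The constant and helper name are assumptions for illustration; the real fix would live inside OpenAIDenseEmbedder.embed() in openai_embedders.py, and the 24000-char limit mirrors the workaround described later in this report.

```python
# Sketch of an input-length guard for embedding calls (names are hypothetical;
# the actual fix belongs in OpenAIDenseEmbedder.embed in openai_embedders.py).
MAX_EMBED_CHARS = 24000  # ~6000-8000 tokens for typical tokenizers


def truncate_for_embedding(text: str, max_chars: int = MAX_EMBED_CHARS) -> str:
    """Clamp embedding input so the provider never sees an oversized request."""
    return text if len(text) <= max_chars else text[:max_chars]
```

A character-based cap is crude; a token-aware truncation using the model's tokenizer, or chunking the overview and averaging the chunk embeddings, would be more precise.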

Expected Behavior

  • Embedding input should be truncated (or chunked) before being sent to the provider
  • Embedding failures should be caught gracefully without blocking the HTTP server
  • The server should remain responsive even if individual vectorization tasks fail
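The second and third expectations amount to isolating each vectorization task behind a defensive wrapper, so a failed embed is logged and dropped instead of escaping into the event loop. A sketch with hypothetical names (the real callback is collection_schemas.py:on_dequeue):

```python
# Sketch: never let a single vectorization failure escape the queue worker.
# `task` and `embed_fn` are stand-ins for OpenViking's queue item and embedder.
import asyncio
import logging

logger = logging.getLogger("embedding_queue")


async def safe_on_dequeue(task, embed_fn):
    """Run one vectorization task; log and swallow any failure."""
    try:
        return await embed_fn(task)
    except Exception:
        # logger.exception records the full traceback for later diagnosis
        logger.exception("Vectorization failed for task %r; dropping it", task)
        return None
```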

Workaround

As a temporary fix, OpenAIDenseEmbedder.embed() was monkey-patched to truncate its input to 24000 chars (~6000-8000 tokens) before calling the API. The server remains stable with the patch applied.
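The patch pattern is sketched below. Since the real import path is OpenViking-internal, a stand-in class simulates the failure mode (the provider's HTTP 400 surfacing as RuntimeError); only the last three statements reflect the actual workaround.

```python
# Stand-in for the real OpenAIDenseEmbedder (openai_embedders.py) to make the
# sketch self-contained; it simulates the HTTP-400-as-RuntimeError failure.
class OpenAIDenseEmbedder:
    CONTEXT_LIMIT = 24000

    def embed(self, text: str) -> list[float]:
        if len(text) > self.CONTEXT_LIMIT:
            raise RuntimeError("the input length exceeds the context length")
        return [0.0] * 768  # pretend 768-dim embedding


# The workaround: wrap the original method so input is truncated to
# 24000 chars (~6000-8000 tokens) before the API is ever called.
_original_embed = OpenAIDenseEmbedder.embed


def _patched_embed(self, text: str) -> list[float]:
    return _original_embed(self, text[:24000])


OpenAIDenseEmbedder.embed = _patched_embed
```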
