Reranker context size too small for longer document chunks #314

@prateekjain

Problem

When using qmd query on a collection with longer document chunks (meeting transcripts), the Qwen3 reranker crashes with:

Error: The input lengths of some of the given documents exceed the context size.
Try to increase the context size to at least 2339 or use another model that supports longer contexts.

The hardcoded RERANK_CONTEXT_SIZE = 2048 in dist/llm.js:464 is too small whenever the chunk, the query, and the prompt template overhead together exceed 2048 tokens. In the failing case above, the reranker needed at least 2339.
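
To make the arithmetic concrete, here is a minimal sketch of the budget check. The helper name and the per-part token counts are made up for illustration (I have not measured the actual split). Only the total of 2339 and the 2048 limit come from the error message and llm.js.

```javascript
// Hypothetical helper for illustration; not part of QMD.
function rerankBudget({ documentTokens, queryTokens, templateTokens, contextSize }) {
  // Everything sent to the reranker has to fit into one context.
  const required = documentTokens + queryTokens + templateTokens;
  return {
    required,
    fits: required <= contextSize,
    overflow: Math.max(0, required - contextSize),
  };
}

// ASSUMPTION: the individual counts below are invented. Only their sum (2339)
// matches the number reported in the error message.
const counts = { documentTokens: 2200, queryTokens: 39, templateTokens: 100 };

const atDefault = rerankBudget({ ...counts, contextSize: 2048 });
const atPatched = rerankBudget({ ...counts, contextSize: 4096 });

console.log(atDefault); // { required: 2339, fits: false, overflow: 291 }
console.log(atPatched); // { required: 2339, fits: true, overflow: 0 }
```

So the failing document overflows the current limit by 291 tokens, and 4096 leaves plenty of headroom for it.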

Environment

  • QMD version: latest (npm @tobilu/qmd)
  • Collection: ~1072 markdown files (meeting transcripts with YAML frontmatter)
  • Embedding chunk size: 900 tokens (default)
  • Node: v23.5.0, macOS, Apple M1 Pro

Suggested Fix

Either:

  1. Increase the default RERANK_CONTEXT_SIZE to 4096 (memory goes from ~960MB to ~1.8GB per context, still very manageable)
  2. Make it configurable via CLI flag or config file (e.g., --rerank-context-size)
  3. Gracefully handle oversized chunks — skip or truncate documents that exceed the context size instead of crashing
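
A rough sketch of what options 2 and 3 could look like together. Every name here (resolveContextSize, fitDocuments, the whitespace tokenizer) is hypothetical and not taken from QMD's code. A real version would count tokens with the reranker model's own tokenizer and would hook into wherever llm.js builds the rerank input.

```javascript
// All names below are hypothetical; none of this is QMD's actual API.
const DEFAULT_RERANK_CONTEXT_SIZE = 2048;

// Option 2: read the context size from a CLI flag, falling back to the default
// when the flag is missing or its value is not a positive integer.
function resolveContextSize(argv, fallback = DEFAULT_RERANK_CONTEXT_SIZE) {
  const i = argv.indexOf("--rerank-context-size");
  if (i === -1) return fallback;
  const value = Number.parseInt(argv[i + 1], 10);
  return Number.isInteger(value) && value > 0 ? value : fallback;
}

// ASSUMPTION: stand-in tokenizer that splits on whitespace. Real token counts
// from the model tokenizer will differ.
const tokenize = (text) => text.split(/\s+/).filter(Boolean);

// Option 3: truncate (or skip) documents that would overflow the context
// instead of letting the reranker throw.
function fitDocuments(docs, query, { contextSize, templateTokens = 0, mode = "truncate" }) {
  // Tokens left for the document after the query and template are accounted for.
  const budget = contextSize - tokenize(query).length - templateTokens;
  const kept = [];
  const skipped = [];
  docs.forEach((doc, index) => {
    const tokens = tokenize(doc);
    if (tokens.length <= budget) {
      kept.push({ index, text: doc, truncated: false });
    } else if (mode === "truncate" && budget > 0) {
      // Note: re-joining with single spaces normalizes the original whitespace.
      kept.push({ index, text: tokens.slice(0, budget).join(" "), truncated: true });
    } else {
      skipped.push(index);
    }
  });
  return { kept, skipped };
}

// Usage: one short document and one that is too long even for 4096.
const contextSize = resolveContextSize(["query", "standup notes", "--rerank-context-size", "4096"]);
const docs = ["short note", "word ".repeat(5000)];
const truncated = fitDocuments(docs, "standup notes", { contextSize, templateTokens: 100 });
const skippedRun = fitDocuments(docs, "standup notes", { contextSize, templateTokens: 100, mode: "skip" });

console.log(truncated.kept.map((d) => d.truncated)); // [ false, true ]
console.log(skippedRun.skipped); // [ 1 ]
```

Truncating keeps the document in the ranking at the cost of scoring only its beginning. Skipping drops it from reranking entirely, so the caller would need to decide how to place skipped documents in the final results.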

Workaround

I'm patching llm.js locally:

// line 464
static RERANK_CONTEXT_SIZE = 4096;  // was 2048

This resolves the crash with no noticeable performance impact on an M1 Pro with 16GB RAM.
