Problem
When using qmd query on a collection with longer document chunks (meeting transcripts), the Qwen3 reranker crashes with:
Error: The input lengths of some of the given documents exceed the context size.
Try to increase the context size to at least 2339 or use another model that supports longer contexts.
The hardcoded RERANK_CONTEXT_SIZE = 2048 in dist/llm.js:464 is too small for any document where chunk + query + template overhead exceeds 2048 tokens (in the case above the reranker needed at least 2339).
Environment
- QMD version: latest (npm @tobilu/qmd)
- Collection: ~1072 markdown files (meeting transcripts with YAML frontmatter)
- Embedding chunk size: 900 tokens (default)
- Node: v23.5.0, macOS, Apple M1 Pro
Suggested Fix
Either:
- Increase the default RERANK_CONTEXT_SIZE to 4096 (memory goes from ~960MB to ~1.8GB per context, still very manageable)
- Make it configurable via CLI flag or config file (e.g., --rerank-context-size)
- Gracefully handle oversized chunks — skip or truncate documents that exceed the context size instead of crashing
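To illustrate the third option, here is a minimal sketch of how oversized documents could be truncated or skipped before they reach the reranker. Everything in it is an assumption for illustration: `fitDocumentsToContext`, the `templateOverhead` value, and the injected `tokenize`/`detokenize` functions are hypothetical names, not QMD's actual code, and the whitespace tokenizer is only a stand-in for the model's real one.

```javascript
// Hypothetical helper, not part of QMD: make every document fit the rerank context.
// `tokenize` and `detokenize` are injected so the model's own tokenizer can be used.
function fitDocumentsToContext(query, documents, options) {
  const { contextSize, templateOverhead, tokenize, detokenize, mode = "truncate" } = options;

  // Tokens left for one document once the query and the prompt template are counted.
  const budget = contextSize - templateOverhead - tokenize(query).length;
  if (budget <= 0) {
    throw new Error("Query and template alone do not fit in the rerank context");
  }

  const kept = [];
  const skipped = [];
  documents.forEach((text, index) => {
    const tokens = tokenize(text);
    if (tokens.length <= budget) {
      kept.push({ index, text, truncated: false });
    } else if (mode === "truncate") {
      // Keep only the first `budget` tokens of an oversized document.
      kept.push({ index, text: detokenize(tokens.slice(0, budget)), truncated: true });
    } else {
      // mode === "skip": leave the document out and report its index.
      skipped.push(index);
    }
  });
  return { kept, skipped };
}

// Stand-in whitespace tokenizer for demonstration only.
const tokenize = (text) => text.split(/\s+/).filter(Boolean);
const detokenize = (tokens) => tokens.join(" ");
```

Returning the original index with each kept document would let the caller map rerank scores back to the search results, and the `truncated` flag makes it possible to warn the user rather than fail silently.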
Workaround
I'm patching llm.js locally:
// line 464
static RERANK_CONTEXT_SIZE = 4096; // was 2048
This resolves the crash with no noticeable performance impact on an M1 Pro with 16GB RAM.
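A local patch like this is likely to be lost whenever the package is reinstalled or upgraded, so an override that does not require editing the built file would be more durable. A rough sketch of what that could look like, assuming a hypothetical environment variable named `QMD_RERANK_CONTEXT_SIZE` (QMD does not currently read such a variable as far as I know; the name and the `resolveRerankContextSize` function are made up for illustration):

```javascript
// Hypothetical: QMD_RERANK_CONTEXT_SIZE is an illustrative name, not an existing QMD setting.
const DEFAULT_RERANK_CONTEXT_SIZE = 2048;

function resolveRerankContextSize(env = process.env) {
  const raw = env.QMD_RERANK_CONTEXT_SIZE;
  // Fall back to the current default when the variable is unset or empty.
  if (raw === undefined || raw === "") {
    return DEFAULT_RERANK_CONTEXT_SIZE;
  }
  const parsed = Number(raw);
  // Reject anything that is not a positive whole number.
  if (!Number.isInteger(parsed) || parsed <= 0) {
    throw new Error(`Invalid QMD_RERANK_CONTEXT_SIZE: ${raw}`);
  }
  return parsed;
}
```

The static field could then be initialized from `resolveRerankContextSize()` instead of the literal 2048, which keeps the existing default for everyone who does not set the variable.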