fix: increase RERANK_CONTEXT_SIZE default 2048→4096, configurable via env var, fix template overhead underestimate by builderjarvis · Pull Request #453 · tobi/qmd

builderjarvis · 2026-03-22T03:59:57Z

Problem

qmd query crashes on longer documents (session transcripts, CJK text, large markdown files) with:

Error: The input lengths of some of the given documents exceed the context size.
Try to increase the context size to at least 2207 or use another model that supports longer contexts.

This affects multiple reported issues: #91, #290, #291, #314

Root cause

Two compounding issues:

RERANK_CONTEXT_SIZE = 2048 is too small for documents longer than ~1600 tokens. The Qwen3 reranker template overhead is higher than estimated, so even after truncation, some chunks still exceed the context window.
RERANK_TEMPLATE_OVERHEAD = 200 underestimates the actual Qwen3 chat template overhead. Measured at ~350 tokens on real queries; truncation budgets based on 200 allow documents through that still overflow the context.
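The arithmetic behind the overflow can be sketched as follows. This is an illustration only, not the qmd source: the helper names `documentTokenBudget` and `actualInputLength` and the 10-token query length are assumptions, and the 350-token overhead is the approximate measurement quoted above.

```typescript
// Hypothetical helper: document tokens allowed after reserving room for
// the (assumed) template overhead and the query.
function documentTokenBudget(
  contextSize: number,
  assumedOverhead: number,
  queryTokens: number,
): number {
  return Math.max(0, contextSize - assumedOverhead - queryTokens);
}

// Hypothetical helper: tokens the reranker really receives once the
// actual template overhead is added around the truncated document.
function actualInputLength(
  docTokens: number,
  actualOverhead: number,
  queryTokens: number,
): number {
  return docTokens + actualOverhead + queryTokens;
}

const queryTokens = 10; // assumed query length, for illustration only
const oldContextSize = 2048; // old RERANK_CONTEXT_SIZE
const oldAssumedOverhead = 200; // old RERANK_TEMPLATE_OVERHEAD
const measuredOverhead = 350; // approximate figure from the PR description

// Truncation lets 2048 - 200 - 10 = 1838 document tokens through...
const oldBudget = documentTokenBudget(oldContextSize, oldAssumedOverhead, queryTokens);
// ...but the real input is 1838 + 350 + 10 = 2198 tokens, above 2048.
const oldActual = actualInputLength(oldBudget, measuredOverhead, queryTokens);

console.log({ oldBudget, oldActual, overflows: oldActual > oldContextSize });
```

Under these assumed numbers the truncated document still overshoots the window by roughly the gap between the assumed and the measured overhead, which is in line with the "at least 2207" figure in the error message.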

Fix

Bump RERANK_CONTEXT_SIZE default from 2048 → 4096
Make it overridable via QMD_RERANK_CONTEXT_SIZE env var for users with tighter memory budgets or very long documents
Bump RERANK_TEMPLATE_OVERHEAD from 200 → 512 so the truncation budget covers the measured ~350-token template overhead with headroom

The 4096 default comfortably fits real-world long documents while staying well below the 40,960-token auto size.
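A minimal sketch of how such an override could be resolved, assuming nothing about the actual qmd implementation. The helper names `resolveRerankContextSize` and `truncationBudget` are hypothetical, and the fallback rule for invalid values is an assumption; only the env var name and the 4096 and 512 constants come from this PR.

```typescript
const DEFAULT_RERANK_CONTEXT_SIZE = 4096; // new default from this PR
const RERANK_TEMPLATE_OVERHEAD = 512; // new overhead estimate from this PR

// Hypothetical helper: read QMD_RERANK_CONTEXT_SIZE from an env-like map,
// falling back to the default when it is missing or unusable.
function resolveRerankContextSize(
  env: Record<string, string | undefined>,
): number {
  const raw = env["QMD_RERANK_CONTEXT_SIZE"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_RERANK_CONTEXT_SIZE;
  }
  const parsed = Number.parseInt(raw, 10);
  // Assumption: a value that is not a number, or that leaves no room for
  // any document tokens after the template overhead, is ignored.
  if (!Number.isFinite(parsed) || parsed <= RERANK_TEMPLATE_OVERHEAD) {
    return DEFAULT_RERANK_CONTEXT_SIZE;
  }
  return parsed;
}

// Tokens left for query plus document once the template overhead is reserved.
function truncationBudget(contextSize: number): number {
  return contextSize - RERANK_TEMPLATE_OVERHEAD;
}

const contextSize = resolveRerankContextSize({ QMD_RERANK_CONTEXT_SIZE: "8192" });
console.log(contextSize, truncationBudget(contextSize));
```

In a Node.js process the map passed in would typically be `process.env`. Taking it as a parameter keeps the sketch testable without touching the real environment.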

…e via QMD_RERANK_CONTEXT_SIZE env var, fix RERANK_TEMPLATE_OVERHEAD underestimate 200→512

Default 2048 was too small for longer documents (session transcripts, CJK text, large markdown files). After truncation the Qwen3 reranker template adds more overhead than the original 200-token estimate, causing node-llama-cpp to throw 'input lengths exceed context size'.

Fixes: tobi#91 tobi#290 tobi#291 tobi#314

Merges dev-upstream-fixes (cherry-picked PRs tobi#462, tobi#463, tobi#455, tobi#418, tobi#456, tobi#442, tobi#453) into dev. Resolved mcp/server.ts bind conflict: keep 0.0.0.0 for Docker container accessibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

tobi merged commit 616776e into tobi:main Mar 28, 2026

jaylfc added a commit to jaylfc/qmd that referenced this pull request Apr 5, 2026

Merge pull request tobi#453 from builderjarvis/fix/rerank-context-size

4c67b50

jaylfc added a commit to jaylfc/qmd that referenced this pull request Apr 5, 2026

Merge pull request tobi#453 from builderjarvis/fix/rerank-context-size

1ec9f57

tanarchytan referenced this pull request in tanarchytan/lotl Apr 8, 2026

Merge pull request #453 from builderjarvis/fix/rerank-context-size

ed6fb7a

lucndm pushed a commit to lucndm/qmd that referenced this pull request Jun 7, 2026

Merge pull request tobi#453 from builderjarvis/fix/rerank-context-size

a0577a0

tobi merged 1 commit into tobi:main from builderjarvis:fix/rerank-context-size
