fix(semantic): add budget guard to overview generation with batched summarization#683

Merged
qin-ctx merged 5 commits into volcengine:main from deepakdevp:feature/overview-budget-guard
Mar 17, 2026
Merged

fix(semantic): add budget guard to overview generation with batched summarization#683
qin-ctx merged 5 commits intovolcengine:mainfrom
deepakdevp:feature/overview-budget-guard

Conversation

@deepakdevp
Contributor

Summary

  • Adds a prompt budget guard to _generate_overview() that prevents oversized LLM prompts for large directories
  • When a directory exceeds the budget (default 60K chars / 50+ files), file summaries are split into batches, each batch gets a partial overview, then partials are merged into a final overview
  • Centralizes all scattered truncation constants (30000, 256, 4000) into a configurable SemanticConfig dataclass
  • Enforces abstract/overview max size limits on LLM output

Fixes #529, addresses #531

Changes Made

  • openviking_cli/utils/config/parser_config.py: Added SemanticConfig dataclass with configurable limits (max_file_content_chars, max_overview_prompt_chars, overview_batch_size, abstract_max_chars, overview_max_chars)
  • openviking_cli/utils/config/open_viking_config.py: Wired SemanticConfig into OpenVikingConfig (configurable via ov.conf under "semantic" key)
  • openviking/storage/queuefs/semantic_processor.py: Added budget guard with _single_generate_overview() (unchanged behavior for small dirs) and _batched_generate_overview() (split + merge for large dirs). Replaced hardcoded 30000 with SemanticConfig.max_file_content_chars. Added output size enforcement.
  • openviking/storage/queuefs/semantic_dag.py: Added output size enforcement at the DAG call site
  • tests/misc/test_semantic_config.py: 7 tests covering config defaults, custom values, budget estimation, truncation, and batch splitting

Configuration Example

{
  "semantic": {
    "max_file_content_chars": 30000,
    "max_overview_prompt_chars": 60000,
    "overview_batch_size": 50,
    "abstract_max_chars": 256,
    "overview_max_chars": 4000
  }
}

All values have sensible defaults — no config changes required for existing users.
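For reference, a minimal sketch of what the SemanticConfig dataclass described above might look like — the defaults mirror the configuration example, but the exact definition in parser_config.py is an assumption:

```python
from dataclasses import dataclass


@dataclass
class SemanticConfig:
    # Defaults mirror the configuration example above; the real
    # dataclass in parser_config.py may differ in detail.
    max_file_content_chars: int = 30000
    max_overview_prompt_chars: int = 60000
    overview_batch_size: int = 50
    abstract_max_chars: int = 256
    overview_max_chars: int = 4000


cfg = SemanticConfig()  # all fields fall back to sensible defaults
```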

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Testing

  • Tests added (7 new tests)
  • Unit tests pass locally
  • Tested on macOS

Checklist

  • My code follows the code style of this project (ruff check + format pass)
  • I have performed a self-review of my own code
  • I have added tests that prove my fix is effective
  • New and existing unit tests pass locally with my changes
  • Small directories (under budget) behave identically to before

deepakdevp and others added 4 commits March 17, 2026 14:50
Centralizes hardcoded constants (30000, 256, 4000) into a configurable
SemanticConfig dataclass. Configurable via ov.conf under "semantic" key.

Addresses volcengine#531.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_generate_overview() now checks prompt size against
SemanticConfig.max_overview_prompt_chars. If exceeded, file summaries
are split into batches, each batch gets a partial overview, then
partials are merged into a final overview.

Also:
- Replace hardcoded 30000 char limit with SemanticConfig.max_file_content_chars
- Enforce abstract/overview max size on output (both semantic_processor
  and semantic_dag call sites)

Fixes volcengine#529, addresses volcengine#531.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests cover: config defaults, custom values, budget under/over limit
detection, abstract/overview truncation, and batch splitting logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

@qin-ctx qin-ctx left a comment


Review Summary

This PR adds a needed budget guard for oversized overview prompts (Fixes #529) and centralizes hardcoded constants into SemanticConfig. The batch-and-merge approach for large directories is a sound design.

However, there are two correctness issues that should be addressed before merge:

  1. File index mismatch in batched generation: Each batch uses local indices [1]..[N], but post-processing uses the global file_index_map, causing incorrect file name replacements.
  2. Budget guard gap: The AND condition means directories with ≤50 files but very long summaries bypass the guard entirely — the exact scenario described in #529.

See inline comments for details and suggestions.

for batch_idx, batch in enumerate(batches):
    batch_lines = []
    for idx, item in enumerate(batch, 1):
        batch_lines.append(f"[{idx}] {item['name']}: {item['summary']}")
Collaborator


[Bug] (blocking) Each batch uses local indices via enumerate(batch, 1), so every batch numbers its files [1], [2], ..., [batch_size]. However, the file_index_map passed to this method maps global indices (1..N across all files) to file names.

At line ~945, re.sub(r"\[(\d+)\]", replace_index, overview) replaces [number] references using the global map. If the LLM outputs [1] in batch 2's partial overview (referring to the first file of batch 2), it will be replaced with the name of global file #1 instead.

Suggested fix: either (a) use global indices when building each batch's prompt (offset by batch_idx * batch_size), or (b) apply replace_index to each partial overview individually using a per-batch index map before merging.

Contributor Author


Fixed. Each batch now uses global offset indices (global_offset + local_idx + 1), and replace_index is applied per-batch using a batch_index_map via a make_replacer closure. The global file_index_map is no longer used in the batched path.
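The fix described above can be sketched as a pair of pure functions — the names (number_batch, make_replacer) follow the comment but are illustrative, not the PR's actual code:

```python
import re


def make_replacer(batch_index_map):
    # Closure over this batch's own index map, so "[n]" references in a
    # partial overview resolve to that batch's files, not global file #n.
    def replace_index(match):
        idx = int(match.group(1))
        # Leave unknown indices untouched rather than misattributing them.
        return batch_index_map.get(idx, match.group(0))
    return replace_index


def number_batch(batch, global_offset):
    # Build prompt lines with globally unique indices plus the per-batch
    # map that make_replacer consumes.
    lines, index_map = [], {}
    for local_idx, item in enumerate(batch):
        global_idx = global_offset + local_idx + 1
        lines.append(f"[{global_idx}] {item['name']}: {item['summary']}")
        index_map[global_idx] = item["name"]
    return lines, index_map


# Second batch of size 50 starts at global index 51, not 1:
batch = [{"name": "b.py", "summary": "helpers"}]
lines, index_map = number_batch(batch, global_offset=50)
overview = re.sub(r"\[(\d+)\]", make_replacer(index_map), "[51] has helpers")
```

Because each partial overview is resolved with its own map before merging, the global file_index_map never needs to touch the batched path.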

estimated_size = len(file_summaries_str) + len(children_abstracts_str)
if (
    estimated_size > semantic.max_overview_prompt_chars
    and len(file_summaries) > semantic.overview_batch_size
Collaborator


[Bug] (blocking) The AND condition requires both estimated_size > budget and file_count > batch_size for the guard to trigger. This means a directory with ≤50 files but very long individual summaries (e.g. 50 files × 1.5K summary = 75K chars, exceeding the 60K budget) will bypass the guard entirely and still produce an oversized prompt.

This is the exact failure mode described in #529 — oversized prompts causing VLM timeouts. The fix only protects against the many-files variant, not the few-files-long-summaries variant.

Suggested fix: consider using OR instead of AND, or add a fallback for the case where the prompt is oversized but file count is below batch_size (e.g., truncating individual summaries proportionally to fit the budget).

Contributor Author


Fixed. Changed to three-way branching: (1) over budget + many files → batch and merge, (2) over budget + few files → truncate individual summaries proportionally to fit budget, (3) under budget → proceed as normal.
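The three-way branching can be sketched as a small decision helper — function and argument names here are illustrative, not the actual code in semantic_processor.py:

```python
from types import SimpleNamespace


def plan_overview_strategy(file_summaries, file_summaries_str,
                           children_abstracts_str, cfg):
    # Hypothetical decision helper mirroring the three-way branching
    # described above.
    estimated_size = len(file_summaries_str) + len(children_abstracts_str)
    if estimated_size <= cfg.max_overview_prompt_chars:
        return "single"    # under budget: unchanged behavior
    if len(file_summaries) > cfg.overview_batch_size:
        return "batched"   # over budget, many files: batch and merge
    return "truncate"      # over budget, few files: shrink summaries to fit


# Tiny config stand-in for illustration:
cfg = SimpleNamespace(max_overview_prompt_chars=100, overview_batch_size=3)
```

The key difference from the original AND condition is the final branch: few files with long summaries no longer fall through to an oversized single prompt.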

from openviking_cli.utils.config import get_openviking_config

semantic = get_openviking_config().semantic
if len(overview) > semantic.overview_max_chars:
Collaborator


[Design] (non-blocking) This truncation block is duplicated nearly identically in semantic_processor.py:_process_memory_directory (lines ~357-362). Consider extracting a shared helper like _enforce_size_limits(overview, abstract, config) to avoid divergence when limits or truncation logic change.

Contributor Author


Fixed. Extracted _enforce_size_limits(overview, abstract) on SemanticProcessor. Both call sites (semantic_processor.py and semantic_dag.py) now use it.
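A minimal sketch of what such a shared helper could look like — a plain hard truncation against the configured limits; the actual method on SemanticProcessor may truncate differently (e.g. at word boundaries):

```python
from types import SimpleNamespace


def enforce_size_limits(overview, abstract, cfg):
    # Hypothetical shared helper: clamp LLM output to configured limits
    # so both call sites truncate identically.
    if len(overview) > cfg.overview_max_chars:
        overview = overview[:cfg.overview_max_chars]
    if len(abstract) > cfg.abstract_max_chars:
        abstract = abstract[:cfg.abstract_max_chars]
    return overview, abstract


# Config stand-in for illustration:
cfg = SimpleNamespace(overview_max_chars=4000, abstract_max_chars=256)
```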

{
    "dir_name": dir_name,
    "file_summaries": combined,
    "children_abstracts": "None",
Collaborator


[Design] (non-blocking) In the merge step, children_abstracts is hardcoded to "None". Children abstracts are only included in the first batch's partial overview (line 891). When partials are merged, the LLM may not retain subdirectory context from batch 1's output, leading to loss of child directory information in the final overview.

Consider passing the children abstracts string into the merge prompt as well, so the final overview has full subdirectory context.

Contributor Author


Fixed. The merge prompt now includes children_abstracts_str instead of hardcoded "None".

from openviking_cli.utils.config.parser_config import SemanticConfig


def test_semantic_config_defaults():
Collaborator


[Suggestion] (non-blocking) The tests here thoroughly validate SemanticConfig defaults and budget arithmetic, which is good. However, there are no tests for the core behavioral changes: _batched_generate_overview index mapping, the merge step, or the truncation enforcement in semantic_dag.py / semantic_processor.py.

Consider adding at least one test with a mocked VLM that verifies:

  • Batch splitting produces correct per-batch index numbering
  • The [number] → file name replacement is correct across batches
  • Truncation is applied when overview/abstract exceeds limits

Contributor Author


Acknowledged — will add behavioral tests with mocked VLM in a follow-up if needed. The current tests validate the config, budget logic, and batch splitting which are the core pure-function components.
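The batch-splitting component mentioned above is a pure function, so it can be tested without a mocked VLM. A sketch under the assumption that splitting is a simple slice by overview_batch_size (split_batches is a hypothetical name, not the module's actual helper):

```python
def split_batches(items, batch_size):
    # Hypothetical splitter matching overview_batch_size semantics:
    # fixed-size slices, with a possibly smaller final batch.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def test_batch_splitting():
    # 120 file summaries with batch_size=50 should yield 50 + 50 + 20.
    batches = split_batches(list(range(120)), 50)
    assert [len(b) for b in batches] == [50, 50, 20]
    # Flattening the batches preserves order and loses nothing.
    assert [x for b in batches for x in b] == list(range(120))
```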

- Fix file index mismatch: use global offset indices per batch and
  apply replace_index per-batch via make_replacer closure
- Fix budget guard gap: change AND to three-way branching (batch for
  many files, truncate for few-files-long-summaries, pass-through
  for under budget)
- Extract _enforce_size_limits() helper to deduplicate truncation
  logic between semantic_processor.py and semantic_dag.py
- Pass children_abstracts to merge prompt for full context retention

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Collaborator

@qin-ctx qin-ctx left a comment


All blocking issues from the previous review have been addressed correctly:

  • ✅ File index mismatch fixed with global offset indices + per-batch make_replacer closure
  • ✅ Budget guard gap fixed with three-way branching (batch, truncate, pass-through)
  • ✅ Duplicated truncation logic extracted into _enforce_size_limits
  • ✅ Children abstracts now passed to merge prompt

LGTM 🎉

@qin-ctx qin-ctx merged commit e1d4176 into volcengine:main Mar 17, 2026
6 checks passed
@github-project-automation github-project-automation bot moved this from Backlog to Done in OpenViking project Mar 17, 2026


Development

Successfully merging this pull request may close these issues.

Directory overview generation can build oversized prompts and destabilize VLM calls
