
fix: remove misdirected SemanticMsg enqueue causing O(n²) reprocessing (#505) #5

Open

chethanuk wants to merge 1 commit into main from fix/505-quadratic-semantic-reprocessing

Conversation

@chethanuk (Owner) commented Mar 19, 2026

Summary

  • Removed misdirected SemanticMsg enqueue in session.py (lines 309-324): After _flush_semantic_operations() already enqueued correct messages with change-tracking dicts, commit_async() was enqueuing a second SemanticMsg targeting the session URI (not a memory directory) with changes=None
  • Hardened cache loading in semantic_processor.py: _process_memory_directory() now always loads cached summaries from .overview.md regardless of msg.changes value (defense-in-depth against other changes=None callers like lock_manager.py and summarizer.py)
  • Added 2 regression tests verifying both the enqueue removal and the cache guard

Problem

On every commit, session.py:309-324 enqueued a SemanticMsg with changes=None after the compressor had already correctly enqueued messages with proper change-tracking dicts. This misdirected message bypassed the cache guard in _process_memory_directory() (which only loaded cached summaries when msg.changes was truthy), causing unconditional VLM API calls for every memory file — O(n²) reprocessing. At 500 memories this wasted 100K+ tokens/day.
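To make the quadratic growth concrete, here is a toy cost model (the function, its parameters, and the one-memory-per-commit assumption are illustrative, not code from this repository):

```python
def total_vlm_calls(commits: int, new_per_commit: int, cache_works: bool) -> int:
    """Total per-file VLM summary calls over a session's lifetime.

    Each commit adds `new_per_commit` memory files. With the cache guard
    working, only new files are summarized; with the misdirected
    changes=None message, every stored file is re-summarized on each commit.
    """
    calls, stored = 0, 0
    for _ in range(commits):
        stored += new_per_commit
        calls += new_per_commit if cache_works else stored
    return calls

print(total_vlm_calls(500, 1, cache_works=False))  # 125250 = 500*501/2, O(n²)
print(total_vlm_calls(500, 1, cache_works=True))   # 500, O(n)
```

With the guard bypassed, lifetime VLM calls grow as n(n+1)/2; with it working, they grow linearly in the number of memories.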

Fix

  1. session.py: Removed 16-line block that enqueued the misdirected SemanticMsg — the compressor's _flush_semantic_operations() already handles this correctly
  2. semantic_processor.py: Removed if msg.changes: guard on cache loading so .overview.md is always consulted

Testing

  • 2 regression tests added (test_fix_505_duplicate_semantic_enqueue.py)
    • test_no_misdirected_semantic_enqueue_after_flush — verifies commit_async() doesn't enqueue after compressor flush
    • test_process_memory_directory_loads_cache_when_changes_none — verifies cache is loaded even when msg.changes is None
  • Both tests confirmed RED before fix, GREEN after fix (TDD cycle)
  • Full test suite: no regressions (pre-existing config errors unrelated)
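The second test's core assertion can be sketched standalone with `unittest.mock.AsyncMock`; the skip logic is inlined here rather than importing the real processor, and every name is illustrative:

```python
import asyncio
from unittest.mock import AsyncMock

async def vlm_calls_with_warm_cache(changes):
    vlm = AsyncMock()               # stand-in for _generate_single_file_summary
    cached = {"a.md": "summary-a"}  # as if loaded from .overview.md
    for path in ["a.md"]:
        changed = changes is not None and path in changes
        if path in cached and not changed:
            continue  # cache hit: no VLM call
        await vlm(path)
    return vlm.await_count

# changes=None must not bypass the cache:
assert asyncio.run(vlm_calls_with_warm_cache(None)) == 0
# while an explicit change still triggers exactly one summary call:
assert asyncio.run(vlm_calls_with_warm_cache({"a.md": "edited"})) == 1
```

Before the fix, the first assertion is exactly the case that failed: a changes=None message sent every cached file back through the VLM.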

Impact

~98.7% token reduction at 500 memories (100K+ tokens/day saved)

Closes volcengine#505

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical performance issue where semantic processing was being triggered redundantly on every commit, leading to an O(n²) reprocessing problem and significant token waste. The fix addresses a misdirected SemanticMsg enqueue in session.py that bypassed cache guards, and hardens the cache loading logic in semantic_processor.py to ensure .overview.md is always consulted. This change drastically reduces unnecessary VLM API calls, resulting in substantial token savings and improved system efficiency.

Highlights

  • Redundant Semantic Message Enqueue Removed: Eliminated a misdirected SemanticMsg enqueue operation in session.py that was causing duplicate processing and unnecessary VLM API calls.
  • Hardened Semantic Processor Cache Loading: Strengthened cache loading in semantic_processor.py to always read .overview.md for existing summaries, even when no explicit changes are present, preventing O(n²) reprocessing.
  • Regression Tests Added: Introduced two new regression tests to verify the removal of the duplicate enqueue and the robustness of the cache loading mechanism, ensuring the fix is effective.

@coderabbitai

coderabbitai bot commented Mar 19, 2026

No actionable comments were generated in the recent review. 🎉


📥 Commits

Reviewing files that changed from the base of the PR and between 0e43e40 and 0fd6da0.

📒 Files selected for processing (3)
  • openviking/session/session.py
  • openviking/storage/queuefs/semantic_processor.py
  • tests/session/test_fix_505_duplicate_semantic_enqueue.py
💤 Files with no reviewable changes (1)
  • openviking/session/session.py
✅ Files skipped from review due to trivial changes (1)
  • tests/session/test_fix_505_duplicate_semantic_enqueue.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • openviking/storage/queuefs/semantic_processor.py

📝 Walkthrough

Walkthrough

Session commit no longer enqueues a semantic processing message after AGFS and relation writes; the semantic processor now always attempts to load an existing /.overview.md for cached summaries regardless of msg.changes; two regression tests verify no duplicate semantic enqueue and cached-summary behavior when changes is None.

Changes

Cohort / File(s): Summary

  • Session commit simplification (openviking/session/session.py):
    Removed creation/enqueue of a SemanticMsg after _write_to_agfs_async() and _write_relations_async(); commit now marks the redo_log task done, updates active counts, and returns results without an additional semantic enqueue.
  • Semantic processor caching (openviking/storage/queuefs/semantic_processor.py):
    Adjusted _process_memory_directory to always attempt reading/parsing f"{dir_uri}/.overview.md" (unconditional), allowing cached summaries to be loaded even when msg.changes is absent.
  • Regression tests (tests/session/test_fix_505_duplicate_semantic_enqueue.py):
    Added two async tests: one ensures Session.commit_async() does not enqueue a semantic message after flush; the other verifies _process_memory_directory uses cached .overview.md when changes is None, avoiding per-file VLM calls.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Session
    participant AGFS as AGFS (archive)
    participant Relations as RelationWriter
    participant Redo as RedoLog
    participant SemanticQueue as SemanticQueue

    Session->>AGFS: _write_to_agfs_async(messages)
    Session->>Relations: _write_relations_async()
    Session->>Redo: mark_done(task_id)
    Session->>Redo: update_active_counts()
    Note right of Session: Commit returns result/stats
    %% No SemanticQueue enqueue in new flow
```

```mermaid
sequenceDiagram
    participant SemanticProcessor
    participant FS as QueueFS
    participant VLM as VLM/Generator

    SemanticProcessor->>FS: ls(dir)
    SemanticProcessor->>FS: read_file(dir + '/.overview.md')
    alt overview present
        SemanticProcessor->>SemanticProcessor: parse_overview_md() (use cached summaries)
    else overview missing
        FS->>SemanticProcessor: list files
        SemanticProcessor->>VLM: _generate_single_file_summary(file) (per-file)
        SemanticProcessor->>SemanticProcessor: aggregate -> write .overview.md
    end
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A hop, a skip, one enqueue gone—
No double queues to wander on.
Cached notes tucked safe and neat,
Fewer calls, a lighter beat.
🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title accurately and specifically describes the main change, removing a misdirected SemanticMsg enqueue that was causing O(n²) reprocessing, and references the issue number. It directly correlates with the primary changes in session.py and the testing added.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.



@github-actions

Failed to generate code suggestions for PR


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively addresses the O(n²) reprocessing issue by removing the redundant SemanticMsg enqueue in session.py and hardening the cache loading logic in semantic_processor.py. The addition of two regression tests, which were confirmed RED before the fix and GREEN after, provides excellent coverage and confidence in the solution. The changes significantly improve efficiency by reducing unnecessary VLM API calls, leading to substantial token savings.


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
tests/session/test_fix_505_duplicate_semantic_enqueue.py (1)

66-127: Test correctly validates cache loading when changes is None.

The test structure effectively exercises the fix. Consider adding an explicit assertion that read_file was called for the cached overview to strengthen verification:

💡 Optional: Add explicit verification that cache was read

```diff
     await processor._process_memory_directory(msg)
 
+    # Verify .overview.md was read (cache loading occurred)
+    mock_fs.read_file.assert_any_call(
+        "viking://test/memories/dir1/.overview.md", ctx=processor._current_ctx
+    )
+
     # With cache loaded, NO VLM calls should be made for unchanged files
     assert mock_generate_summary.await_count == 0, (
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/session/test_fix_505_duplicate_semantic_enqueue.py` around lines 66 -
127, Add an explicit assertion that the cached overview was read to strengthen
the test: after calling SemanticProcessor._process_memory_directory in
test_process_memory_directory_loads_cache_when_changes_none(), assert that
mock_fs.read_file was awaited exactly once (e.g., check
mock_fs.read_file.await_count == 1 or use AsyncMock's
assert_awaited/assert_awaited_once) so the test verifies the cache read behavior
in addition to no VLM calls.
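On the review's `await_count` vs `assert_any_call` point: AsyncMock records awaits separately from plain calls, and both suggested styles work. A standalone illustration (the URI matches the suggestion above; the loader function is hypothetical):

```python
import asyncio
from unittest.mock import AsyncMock

read_file = AsyncMock(return_value="## cached overview")

async def load_overview(dir_uri: str) -> str:
    # Hypothetical stand-in for the processor's cache read.
    return await read_file(f"{dir_uri}/.overview.md")

asyncio.run(load_overview("viking://test/memories/dir1"))

# Both styles verify the cache read; await_count additionally pins "exactly once".
read_file.assert_any_await("viking://test/memories/dir1/.overview.md")
assert read_file.await_count == 1
```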


📥 Commits

Reviewing files that changed from the base of the PR and between 9d59d6b and 0e43e40.

📒 Files selected for processing (3)
  • openviking/session/session.py
  • openviking/storage/queuefs/semantic_processor.py
  • tests/session/test_fix_505_duplicate_semantic_enqueue.py
💤 Files with no reviewable changes (1)
  • openviking/session/session.py

fix: remove misdirected SemanticMsg enqueue causing O(n²) reprocessing (volcengine#505)

On every commit, session.py enqueued a SemanticMsg targeting the session
URI (not a memory directory) with changes=None after the compressor had
already enqueued correct messages with change-tracking dicts. This
misdirected message bypassed the cache guard in _process_memory_directory()
(which only loaded cached summaries when msg.changes was truthy), causing
unconditional VLM API calls for every memory file — O(n²) reprocessing.

Fix:
- Remove the misdirected enqueue block in session.py (lines 309-324)
- Harden cache loading in semantic_processor.py to always consult cached
  summaries regardless of msg.changes value (defense-in-depth)

Impact: ~98.7% token reduction at 500 memories (100K+ tokens/day saved)
@chethanuk force-pushed the fix/505-quadratic-semantic-reprocessing branch from 0e43e40 to 0fd6da0 on March 19, 2026 at 11:20


Development

Successfully merging this pull request may close these issues.

Memory extraction triggers O(n²) semantic reprocessing — token cost grows quadratically with memory count
