fix: pass entity_chunks_storage and relation_chunks_storage to merge_nodes_and_edges (closes #241) #260
Closed
Abdeltoto wants to merge 1 commit into
…nodes_and_edges (closes HKUDS#241)

During multimodal ingestion, three call sites of `lightrag.operate.merge_nodes_and_edges` were missing the `entity_chunks_storage` and `relation_chunks_storage` arguments (and additionally `full_entities_storage` / `full_relations_storage` in `BaseModalProcessor._process_chunk_for_extraction`). Because these parameters default to `None`, calls succeeded silently but entity-to-chunk and relation-to-chunk mappings for multimodal entities were never persisted to `kv_store_entity_chunks.json` / `kv_store_relation_chunks.json`, degrading retrieval quality for image and table content.

Forward all four storage instances from the wrapped LightRAG instance, matching the way LightRAG itself invokes the function during text ingestion.

Made-with: Cursor
This was referenced Apr 22, 2026
Contributor
Author
Closing this PR in favor of #247 (@sjhddh) and #250 (@peterCheng123321), which were both opened a few hours before mine and which I hadn't noticed when I pushed this. Apologies for the duplicate noise. For maintainers triaging the three: #250 is the most complete of the three because it also forwards … Reviews left on both. Thanks for the great work, folks.
Summary
Closes #241.
Three call sites of `lightrag.operate.merge_nodes_and_edges` in the multimodal ingestion path were not forwarding the chunk-tracking storages, and one of them was also dropping the document-level entity / relation storages:

- `raganything/processor.py`, `_process_multimodal_content_individual`: missing `entity_chunks_storage`, `relation_chunks_storage`
- `raganything/processor.py`, `_batch_merge_lightrag_style_type_aware`: missing `entity_chunks_storage`, `relation_chunks_storage`
- `raganything/modalprocessors.py`, `BaseModalProcessor._process_chunk_for_extraction`: missing `full_entities_storage`, `full_relations_storage`, `entity_chunks_storage`, `relation_chunks_storage`

Because all four parameters default to `None`, the calls succeeded silently, but entity-to-chunk and relation-to-chunk mappings for multimodal entities (images, tables, equations) were never persisted to `kv_store_entity_chunks.json` / `kv_store_relation_chunks.json`. Text-only ingestion was unaffected because LightRAG itself populates those mappings during its own pipeline.

What this PR does

Forwards `self.lightrag.full_entities`, `self.lightrag.full_relations`, `self.lightrag.entity_chunks`, and `self.lightrag.relation_chunks` to `merge_nodes_and_edges`, mirroring the way LightRAG invokes the function during its native text ingestion path. The diff matches the suggested fix in #241 exactly.

Verification

- `lightrag.operate.merge_nodes_and_edges` signature (parameters `entity_chunks_storage`, `relation_chunks_storage`, `full_entities_storage`, `full_relations_storage` are all optional `BaseKVStorage` slots; passing them is strictly additive).
- `LightRAG` exposes the four storages as instance attributes (`self.full_entities`, `self.full_relations`, `self.entity_chunks`, `self.relation_chunks`) at construction time in `lightrag/lightrag.py`.
- `BaseModalProcessor.__init__` already retains a `self.lightrag` reference, so the new forwarding lines are safe with no constructor change.

Backward compatibility

Test plan

- `ruff format` and `ruff check --ignore=E402` pass on the touched files.
- `kv_store_entity_chunks.json` / `kv_store_relation_chunks.json` now contain entries whose IDs match the multimodal chunks.

Many thanks to @ashah1992 for the precise diagnosis in #241.
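For readers unfamiliar with the failure mode described in the Summary, the following is a minimal sketch of how an optional storage argument that defaults to `None` can drop mappings without raising. Everything here is hypothetical and for illustration only: `DictKV` and `merge_sketch` are toy stand-ins, not the real `BaseKVStorage` or `lightrag.operate.merge_nodes_and_edges` (which is asynchronous and takes many more arguments), and the chunk ID and entity names are made up.

```python
from typing import Dict, List, Optional


class DictKV:
    """Hypothetical in-memory stand-in for a key-value storage (assumption)."""

    def __init__(self) -> None:
        self.data: Dict[str, List[str]] = {}

    def upsert(self, records: Dict[str, List[str]]) -> None:
        self.data.update(records)


def merge_sketch(
    entities: List[str],
    chunk_id: str,
    entity_chunks_storage: Optional[DictKV] = None,
) -> List[str]:
    """Toy merge step (assumption): dedupe entities, optionally track chunks."""
    merged = sorted(set(entities))
    # When the storage is not forwarded, this branch is skipped silently:
    # the merge still "succeeds", but no entity-to-chunk mapping is recorded.
    if entity_chunks_storage is not None:
        for name in merged:
            existing = entity_chunks_storage.data.get(name, [])
            if chunk_id not in existing:
                entity_chunks_storage.upsert({name: existing + [chunk_id]})
    return merged


storage = DictKV()
entities = ["Figure 1", "Table 2"]

# Call site before the fix: storage exists but is never passed in.
merged_without = merge_sketch(entities, "chunk-img-001")
mappings_before_fix = dict(storage.data)

# Call site after the fix: the storage is forwarded explicitly.
merged_with = merge_sketch(
    entities, "chunk-img-001", entity_chunks_storage=storage
)
mappings_after_fix = dict(storage.data)
```

Both calls return the same merged list, which is why the omission is easy to miss: only the side effect on the storage differs.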