💫 EraRAG: Efficient and Incremental Retrieval-Augmented Generation for Growing Corpora

If you like our project, please give us a star ⭐ on GitHub for the latest update.

EraRAG is a novel hierarchical graph construction framework that supports dynamic updates through localized selective re-partitioning, enabling efficient and scalable retrieval with strong static accuracy and stable performance under corpus changes.

💫 Key Features

Accuracy: Achieves state-of-the-art performance across diverse question types, including multi-hop and long-document QA tasks.
Efficiency: Significantly reduces both graph construction time and token consumption compared to existing RAG baselines.
Incremental Updates: Supports fast and efficient integration of new documents without requiring global tree reconstruction, enabling dynamic corpus adaptation.
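The idea of a localized update can be illustrated with a toy sketch. This is not EraRAG's algorithm: the `Bucket` class, the `insert_localized` function, the `max_size` threshold and the split-in-half rule are all hypothetical, chosen only to show how an insertion can touch a small part of a structure while leaving the rest untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """A hypothetical group of chunks that shares one summary node."""
    items: list = field(default_factory=list)
    dirty: bool = True  # True when the bucket's summary must be (re)computed


def insert_localized(buckets, index, item, max_size=4):
    """Insert `item` into buckets[index]; split only that bucket on overflow.

    Returns the buckets whose summaries need recomputation. All other
    buckets are left exactly as they were.
    """
    target = buckets[index]
    target.items.append(item)
    target.dirty = True
    touched = [target]
    if len(target.items) > max_size:
        # Split the overflowing bucket in half; neighbours are not modified.
        mid = len(target.items) // 2
        new_bucket = Bucket(items=target.items[mid:])
        target.items = target.items[:mid]
        buckets.insert(index + 1, new_bucket)
        touched.append(new_bucket)
    return touched
```

In this toy version the caller picks the target bucket; how a real system assigns a new chunk to a group is outside the scope of the sketch.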

🚀 Getting Started

EraRAG and the controlled baselines are built on the unified framework proposed by In-depth study of graphrag. A requirements.txt file is included to help you get started. To run EraRAG, use the following command:

python main.py -opt <Method>.yaml -dataset_name <Datasetname> -external_tree <External tree path> -root <rootname> -query <whether to query>
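When launching runs from a script, it can help to assemble the command as an argument list. The sketch below uses only the flags shown in the command above. The helper name `build_erarag_command` and every value in the usage example are placeholders, and the accepted format of the `-query` value is an assumption, since it is not documented here.

```python
import shlex
import sys


def build_erarag_command(opt, dataset_name, external_tree, root, query,
                         python=sys.executable):
    """Return the main.py invocation as an argument list.

    Only the flags from the README command are used; the values are
    supplied by the caller and passed through unchanged.
    """
    return [
        python, "main.py",
        "-opt", opt,
        "-dataset_name", dataset_name,
        "-external_tree", external_tree,
        "-root", root,
        "-query", str(query),
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    cmd = build_erarag_command(
        opt="Dynamic.yaml",
        dataset_name="my_dataset",
        external_tree="path/to/tree",
        root="my_root",
        query=True,
    )
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the run.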

By default, EraRAG treats the input corpus as a new corpus and forces a global reconstruction. To insert into an existing tree, set the key parameters in Dynamic.yaml as follows.

force: False
add: True
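If you switch between full rebuilds and insertions often, the two keys can be toggled programmatically. The sketch below is a minimal, standard-library-only helper. It assumes that `force` and `add` appear as flat top-level `key: value` lines in Dynamic.yaml, which may not match the actual file layout; the function name `set_flat_keys` is hypothetical.

```python
import re


def set_flat_keys(yaml_text, updates):
    """Return `yaml_text` with top-level `key: value` lines replaced.

    Keys listed in `updates` that are not found are appended at the end.
    Any inline comment on a replaced line is dropped.
    """
    remaining = dict(updates)
    out = []
    for line in yaml_text.splitlines():
        # Match only unindented keys, i.e. top-level entries.
        m = re.match(r"^([A-Za-z_][\w-]*)\s*:", line)
        if m and m.group(1) in remaining:
            key = m.group(1)
            out.append(f"{key}: {remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{k}: {v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = "force: True\nadd: False\n"
    print(set_flat_keys(original, {"force": False, "add": True}), end="")
```

For nested or more complex configuration files, a real YAML parser is the safer choice; this line-based approach is only meant for the flat case assumed above.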

🧰 Experimental Settings

We have incorporated several baseline methods and benchmark datasets:

Baseline	Paper	Code
GraphRAG	From Local to Global: A Graph RAG Approach to Query-Focused Summarization	graphrag
LightRAG	LightRAG: Simple and Fast Retrieval-Augmented Generation	LightRAG
HippoRAG	HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models	HippoRAG
RAPTOR	RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval	raptor

⚙️ Experimental Results

Our proposed EraRAG framework achieves strong retrieval performance compared with state-of-the-art graph-based RAG frameworks.

Thanks to the proposed selective reconstruction mechanism, EraRAG performs fast insertions on evolving corpora and outperforms the baselines in both time and token cost.

Acknowledgements

We acknowledge these excellent works for providing open-source code: GraphRAG, RAPTOR, LightRAG, HippoRAG, In-depth study of graphrag.
