NeuroCausal-RAG: Causality-aware RAG with multi-hop retrieval, contradiction detection, and temporal reasoning. 382 tests, 11K+ LOC. Deployed 2 months before CC-RAG (UIUC).

Causality-Aware Retrieval-Augmented Generation
Find what keyword search can't — by understanding why things are connected.

Problem • Solution • Quick Start • Search Modes • API • Benchmarks

Research Context: In June 2025, researchers at the University of Illinois Urbana-Champaign published "CC-RAG: Structured Multi-Hop Reasoning via Theme-Based Causal Graphs" — a breakthrough paper that brought causal reasoning into RAG systems. The academic world was excited: RAG could finally "understand and connect," not just "find and fetch."

We had already been doing this for two months. NeuroCausal RAG v5.0 was deployed to production in April 2025. The causal engine, multi-hop retrieval, and chain injection were already running in real enterprise environments.

This is how we work: we build for real clients first, battle-test in production, then open-source. Our personal AI system, JARVIS, has been running for 5 years and operating as an autonomous agent for 3 years, before platforms like OpenClaw existed. We plan to open-source that too.

Read more: Our 2025 AI R&D: NeuroCausal RAG, DSGMv2, and 100+ SaaS Projects

The Problem

Classic RAG systems retrieve documents by surface similarity to the query, whether keyword overlap or embedding distance. Search for "stress" and you get documents about "stress."

But real-world knowledge doesn't work that way:

Stress → Cortisol rises → Sleep disrupted → Attention drops → Workplace accident risk increases

If your system can't see this chain, it misses critical connections.

The academic world noticed this in June 2025 when UIUC researchers published CC-RAG, introducing causal graphs into RAG.

We deployed this to production in April 2025. Two months earlier.

The Solution

NeuroCausal RAG builds a causal knowledge graph on top of your documents and retrieves information by understanding why things are connected — not just what words they share.

Aspect	Classic RAG	NeuroCausal RAG
Retrieval	Keyword or vector similarity	Cause-effect relationships
Search for "stress"	Documents about stress	+ Cortisol, sleep, workplace accidents
Hops	Single (1-hop)	N-degree (multi-hop)
Scoring	Vector distance	Hybrid: Similarity + Causal + PageRank
Memory	None	Persistent feedback loop
Contradictions	Ignored	Detected and flagged

Architecture

NeuroCausal RAG v6.0
├── Core Layer
│   ├── Causal Knowledge Graph (NetworkX / Neo4j)
│   ├── Multilingual Embeddings (Sentence-BERT)
│   └── Vector Index (BruteForce / FAISS / Milvus)
│
├── Search & Retrieval
│   ├── Hybrid Retriever (Similarity + Causal + Importance)
│   ├── Multi-Hop Search (N-hop path finding + bridge docs)
│   ├── Search Optimizer (6 adaptive modes)
│   └── Query Decomposer (complex → sub-queries)
│
├── Reasoning
│   ├── Contradiction Detector
│   ├── Temporal Reasoner
│   └── Entity Linker (alias resolution)
│
├── Learning
│   ├── Causal Discovery (semantic + NLI + funnel)
│   ├── Feedback Loop (RLHF)
│   └── Persistent Memory (SQLite)
│
├── Agentic RAG
│   └── LangGraph Self-Correcting Agent
│
└── API & UI
    ├── FastAPI REST API
    └── Streamlit Dashboard

Scoring Formula

Final Score = α × Similarity + β × Causal + γ × Importance

Multi-Hop Decay: hop_score = base_score × (0.7 ^ hop_distance)
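
The two formulas above can be sketched in a few lines of plain Python. This is an illustration only, not the library's implementation: the `Weights` class and the `final_score` and `hop_score` function names are hypothetical, and the component scores in the usage note are made up. It assumes each component score lies in [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical container for the alpha, beta, gamma weights."""
    alpha: float  # weight on embedding similarity
    beta: float   # weight on the causal score
    gamma: float  # weight on importance (e.g. PageRank)


def final_score(similarity: float, causal: float, importance: float, w: Weights) -> float:
    """Final Score = alpha * Similarity + beta * Causal + gamma * Importance."""
    return w.alpha * similarity + w.beta * causal + w.gamma * importance


def hop_score(base_score: float, hop_distance: int, decay: float = 0.7) -> float:
    """Multi-hop decay: base_score * (decay ** hop_distance), decay = 0.7 by default."""
    if hop_distance < 0:
        raise ValueError("hop_distance must be non-negative")
    return base_score * decay ** hop_distance
```

With the BALANCED weights (0.5, 0.3, 0.2) and illustrative component scores of 0.8, 0.6, and 0.4, `final_score` gives roughly 0.66. A document found two hops away keeps 0.7² ≈ 49% of its base score, so distant links still surface but rank below direct ones.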

Quick Start

Installation

git clone https://github.com/ertugrulakben/NeuroCausal-RAG.git
cd NeuroCausal-RAG
pip install -r requirements.txt

Basic Usage

from neurocausal_rag import NeuroCausalRAG

rag = NeuroCausalRAG()

# Add documents
rag.add_document("cement", "Cement production is responsible for 8% of global CO2 emissions.")
rag.add_document("co2", "CO2 is the primary greenhouse gas driving climate change.")
rag.add_document("warming", "Global warming causes sea level rise and extreme weather.")

# Add causal links
rag.add_causal_link("cement", "co2", "causes")
rag.add_causal_link("co2", "warming", "causes")

# Search — finds cement even though query doesn't mention it
results = rag.search("What causes global warming?")
# → Returns: co2, warming, AND cement (via causal chain)

Multi-Hop Search

from neurocausal_rag.search import create_multi_hop_retriever

# `graph` and `embedding` are assumed to be an already-built causal graph
# and embedding model; they are not created in this snippet.
retriever = create_multi_hop_retriever(graph, embedding, max_hops=3)
results = retriever.search("How does cement affect sea levels?")

# Discovered chain:
# Cement Production → CO2 Emissions → Global Warming → Sea Level Rise
explanation = retriever.explain_connection("cement", "warming")
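
To make the idea of N-hop path finding concrete, here is a minimal standard-library sketch of a hop-limited breadth-first search over "causes" edges. It is not how `create_multi_hop_retriever` works internally (the project uses NetworkX / Neo4j); the `CAUSES` adjacency dict, the `find_chain` function, and the `sea_level` node ID are hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical adjacency list of "causes" edges mirroring the chain above.
CAUSES: Dict[str, List[str]] = {
    "cement": ["co2"],
    "co2": ["warming"],
    "warming": ["sea_level"],
}


def find_chain(
    graph: Dict[str, List[str]], source: str, target: str, max_hops: int = 3
) -> Optional[List[str]]:
    """Return the shortest causal chain from source to target using at most
    max_hops edges, or None if no such chain exists."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        if len(path) - 1 >= max_hops:
            continue  # hop budget spent on this path, do not expand further
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None
```

`find_chain(CAUSES, "cement", "sea_level", max_hops=3)` returns `["cement", "co2", "warming", "sea_level"]`, while the same call with `max_hops=2` returns `None` because the chain needs three edges. Breadth-first order guarantees the first chain found is the shortest one.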

Docker

docker-compose up -d
# API: http://localhost:8000
# UI: http://localhost:8501

Search Modes

Six preset modes set the α, β, and γ weights of the scoring formula for different retrieval strategies:

Mode	α (Similarity)	β (Causal)	γ (Importance)	Best For
BALANCED	0.5	0.3	0.2	General purpose
ENCYCLOPEDIA	0.7	0.2	0.1	Factual queries
DETECTIVE	0.3	0.5	0.2	Cause-effect investigation
HUB	0.3	0.2	0.5	Finding central documents
EXPLORER	0.4	0.3	0.3	Open-ended research
FACT_CHECKER	0.6	0.3	0.1	Verification tasks

from neurocausal_rag.search import create_optimizer

# `graph` and `embedding` are assumed to exist already (see Multi-Hop Search)
optimizer = create_optimizer(graph, embedding)
results = optimizer.search("Why did the bridge collapse?", mode="DETECTIVE")
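
The effect of switching modes is easiest to see on a toy example. The sketch below copies the weights from the table and ranks two candidates whose component scores are invented for illustration. `MODE_WEIGHTS`, `rank`, and the candidate IDs are hypothetical names and not part of the `neurocausal_rag` API.

```python
from typing import Dict, List, Tuple

# (alpha, beta, gamma) per mode, copied from the table above.
MODE_WEIGHTS: Dict[str, Tuple[float, float, float]] = {
    "BALANCED": (0.5, 0.3, 0.2),
    "ENCYCLOPEDIA": (0.7, 0.2, 0.1),
    "DETECTIVE": (0.3, 0.5, 0.2),
    "HUB": (0.3, 0.2, 0.5),
    "EXPLORER": (0.4, 0.3, 0.3),
    "FACT_CHECKER": (0.6, 0.3, 0.1),
}

Scores = Tuple[float, float, float]  # (similarity, causal, importance)


def rank(candidates: Dict[str, Scores], mode: str) -> List[Tuple[str, float]]:
    """Score every candidate with the mode's weights and sort best-first."""
    alpha, beta, gamma = MODE_WEIGHTS[mode]
    scored = [
        (doc_id, alpha * sim + beta * causal + gamma * imp)
        for doc_id, (sim, causal, imp) in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up component scores, for illustration only.
candidates: Dict[str, Scores] = {
    "report": (0.9, 0.1, 0.3),      # textually close, weak causal link
    "root_cause": (0.4, 0.9, 0.3),  # less similar, strong causal link
}
```

Under ENCYCLOPEDIA, `report` ranks first (about 0.68 vs 0.49). Under DETECTIVE the order flips and `root_cause` wins (about 0.63 vs 0.38), because the causal weight β rises from 0.2 to 0.5. Every preset's weights sum to 1, so scores stay comparable across modes.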

API

Full REST API via FastAPI:

uvicorn neurocausal_rag.api.app:create_app --factory --host 0.0.0.0 --port 8000

Endpoints

Method	Endpoint	Description
`POST`	`/api/v1/search`	Search with causal reasoning
`POST`	`/api/v1/documents`	Add documents
`GET`	`/api/v1/documents`	List documents
`POST`	`/api/v1/documents/links`	Add causal links
`POST`	`/api/v1/agent/query`	Agentic RAG query
`POST`	`/api/v1/feedback`	Submit feedback
`POST`	`/api/v1/discovery`	Auto-discover causal links
`GET`	`/api/v1/graph/stats`	Graph statistics
`POST`	`/api/v1/graph/chain`	Get causal chain
`GET`	`/api/v1/health`	Health check

curl -X POST http://localhost:8000/api/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query": "What causes global warming?", "top_k": 5, "mode": "DETECTIVE"}'
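
The same request can be sent from Python using only the standard library. This is a sketch under assumptions: the endpoint path and JSON fields come from the curl example above, but the function names `build_search_request` and `search` are hypothetical, and the response schema is not documented here, so the result is returned as raw decoded JSON.

```python
import json
import urllib.request


def build_search_request(
    query: str,
    top_k: int = 5,
    mode: str = "DETECTIVE",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST request for /api/v1/search without sending it."""
    payload = {"query": query, "top_k": top_k, "mode": mode}
    return urllib.request.Request(
        url=f"{base_url}/api/v1/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, **kwargs):
    """Send the request and return the decoded JSON body.

    Requires the API server to be running; the shape of the returned
    object is an open assumption and should be checked against the API.
    """
    request = build_search_request(query, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the Docker or uvicorn step running, `search("What causes global warming?", top_k=5, mode="DETECTIVE")` sends the same body as the curl command.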

Benchmarks

Case Study: The Invisible Connection

Query: "How do greenhouse gases cause global warming?"

Metric	Classic RAG	NeuroCausal RAG
Search Time	37 ms	22 ms
Documents Found	Greenhouse effect, Gases	+ Cement Production
Causal Score	0.00	1.00
Multi-Hop	None	3-hop chain

Discovered chain:

Cement Production → CO2 Emissions → Greenhouse Gas → Global Warming

The word "cement" appears nowhere in the query — but the causal chain reveals the connection.

vs. CC-RAG (UIUC, June 2025)

Feature	CC-RAG (June 2025)	NeuroCausal RAG (April 2025)
Causal Graph	DAG structure	NetworkX + Neo4j
Multi-Hop	Theme-based chaining	N-hop + bridge documents
Bidirectional Search	Yes	Yes
Memory System	No	Persistent (SQLite)
Query Decomposition	No	Sub-query system
Contradiction Detection	No	Yes
Temporal Reasoning	No	Yes
Entity Linking	No	Yes (alias resolution)
Enterprise Ready	Academic	Production deployed
Released	June 2025 (paper published)	April 2025 (production deployment)

Testing

pytest tests/ -v

# With coverage
pytest tests/ --cov=neurocausal_rag --cov-report=html

Test Distribution (v6.1)
├── Core (graph, node, edge): 35 tests
├── Search (retriever, multi_hop, optimizer, decomposer): 66 tests
├── Learning (discovery, entity, temporal, contradiction): 42 tests
├── Memory: 24 tests
├── Integration: 20 tests
├── API Routes: 58 tests
├── Config Validation: 70 tests
├── LLM Client: 34 tests
└── Imports & Exports: 33 tests
─────────────────────────────
Total: 382 tests, 0 failures

Project Structure

neurocausal_rag/
├── core/           # Graph engine, nodes, edges
├── embedding/      # Sentence-BERT multilingual
├── search/         # Retriever, multi-hop, optimizer, decomposer
├── learning/       # Causal discovery, feedback, pipeline
├── entity/         # Entity linking, NER
├── reasoning/      # Contradiction detection, temporal reasoning
├── memory/         # Persistent memory store
├── agents/         # LangGraph agentic RAG
├── api/            # FastAPI REST endpoints
├── llm/            # LLM client (OpenAI)
├── visualization/  # Graph visualization (PyVis)
└── ui/             # Streamlit components

Configuration

cp .env.example .env
# Set your API keys in .env

Roadmap

Built By

Ertugrul Akben — AI & Systems Strategist

License

MIT

Because knowing "what" is not enough — you need to know "why."
