Demonstrates ArcadeDB's multi-model capabilities by implementing a Graph RAG (Retrieval-Augmented Generation) system that unifies three retrieval signals in a single database:
- Graph traversal — multi-hop entity bridging via knowledge graph relationships
- Vector similarity — semantic chunk retrieval using embeddings
- Full-text indexing — keyword-based content lookup

Prerequisites:

- Docker and Docker Compose
- curl and jq
- Java 21+ and Maven 3.x (for the Java demos)

Start the services and load the sample data:

    docker compose up -d
    ./setup.sh

This creates the GraphRAG database, applies the schema, and inserts sample data.

Run the example queries:

    ./queries/queries.sh

Build and run the Java demo:

    cd java
    mvn package -q
    java -jar target/graph-rag.jar

Build and run the LangChain4j demos:

    cd langchain4j
    mvn package -q
    # Embedding store: ingest + similarity search
    java -jar target/graph-rag-langchain4j.jar
    # Content retriever: semantic search + graph expansion
    java -cp target/graph-rag-langchain4j.jar com.arcadedb.examples.GraphRAGContentRetriever

The schema defines the following types:

| Type | Kind | Key properties |
|---|---|---|
| Chunk | Vertex | content, source, chunkIndex, embedding |
| Entity | Vertex | name |
| Person | Vertex (extends Entity) | name |
| Concept | Vertex (extends Entity) | name |
| Organization | Vertex (extends Entity) | name |
| MENTIONS | Edge | Chunk → Entity |
| RELATES_TO | Edge | Entity → Entity |
| WORKS_AT | Edge | Person → Organization |
| AUTHORED | Edge | Person → Chunk |

The example queries cover five retrieval patterns:

| # | Pattern | Language | Signal type |
|---|---|---|---|
| 1 | Hybrid Vector + Graph | SQL | Vector + Graph |
| 2 | Multi-Hop Entity Bridge | Cypher | Graph |
| 3 | Latest Chunk Per Document | Cypher | Graph |
| 4 | Composite Scoring | SQL | Vector + Graph |
| 5 | Agentic RAG Steps | Mixed | Multi-signal |
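
To illustrate how pattern 4 can blend the two signals, here is a minimal in-memory sketch in plain Java. It is not the repository's SQL query: the `Candidate` record, the normalization by the largest entity count, and the `vectorWeight` parameter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class CompositeScoringSketch {
    // Hypothetical candidate: a chunk id, its vector similarity to the query,
    // and the number of query-related entities the chunk mentions.
    record Candidate(String chunkId, double vectorSimilarity, int sharedEntities) {}

    // Weighted blend of the vector signal and a graph signal.
    // The graph signal is the entity count normalized to [0, 1].
    static double score(Candidate c, int maxSharedEntities, double vectorWeight) {
        double graphSignal = maxSharedEntities == 0
                ? 0.0
                : (double) c.sharedEntities() / maxSharedEntities;
        return vectorWeight * c.vectorSimilarity() + (1.0 - vectorWeight) * graphSignal;
    }

    // Returns a new list sorted by descending composite score.
    static List<Candidate> rank(List<Candidate> candidates, double vectorWeight) {
        int max = candidates.stream().mapToInt(Candidate::sharedEntities).max().orElse(0);
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((Candidate c) -> score(c, max, vectorWeight))
                .reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up values, not taken from the sample dataset.
        List<Candidate> ranked = rank(List.of(
                new Candidate("chunk-a", 0.90, 0),
                new Candidate("chunk-b", 0.70, 2),
                new Candidate("chunk-c", 0.40, 1)), 0.6);
        ranked.forEach(c -> System.out.println(c.chunkId()));
    }
}
```

With a vector weight of 0.6, `chunk-b` (0.82) outranks `chunk-a` (0.54) even though `chunk-a` has the higher raw similarity, which is the effect a composite score is meant to produce.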

The sample data contains:

- 8 chunks from 4 internal documents with 4D embeddings
- 11 entities (4 persons, 4 concepts, 3 organizations)
- ~25 edges (MENTIONS, RELATES_TO, WORKS_AT, AUTHORED)
- Multi-hop design: querying "Vector Search" bridges to GraphRAG docs via shared entity mentions
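
The multi-hop idea in the last bullet can be sketched in memory as follows. This is an illustration of bridging through shared MENTIONS targets, not the Cypher query shipped with the demo; the chunk ids and entity names in `main` are hypothetical stand-ins for the sample data.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class EntityBridgeSketch {
    // mentions maps a chunk id to the entity names it mentions (MENTIONS edges).
    // Returns chunks that do not mention startEntity themselves but share at
    // least one entity with a chunk that does.
    static Set<String> bridgedChunks(Map<String, Set<String>> mentions, String startEntity) {
        Set<String> direct = new TreeSet<>();
        Set<String> bridgeEntities = new TreeSet<>();
        // Hop 1: chunks mentioning the start entity, plus their co-mentioned entities.
        for (Map.Entry<String, Set<String>> e : mentions.entrySet()) {
            if (e.getValue().contains(startEntity)) {
                direct.add(e.getKey());
                bridgeEntities.addAll(e.getValue());
            }
        }
        bridgeEntities.remove(startEntity);
        // Hop 2: other chunks mentioning any of the bridge entities.
        Set<String> bridged = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : mentions.entrySet()) {
            if (direct.contains(e.getKey())) {
                continue;
            }
            for (String entity : e.getValue()) {
                if (bridgeEntities.contains(entity)) {
                    bridged.add(e.getKey());
                    break;
                }
            }
        }
        return bridged;
    }

    public static void main(String[] args) {
        // Hypothetical data shaped like the Chunk -> Entity edges described above.
        Map<String, Set<String>> mentions = Map.of(
                "vector-doc#0", Set.of("Vector Search", "Embeddings"),
                "graphrag-doc#0", Set.of("Embeddings", "GraphRAG"),
                "hr-doc#0", Set.of("Onboarding"));
        System.out.println(bridgedChunks(mentions, "Vector Search"));
    }
}
```

Here `graphrag-doc#0` never mentions "Vector Search" but is reached through the shared "Embeddings" entity, while `hr-doc#0` stays out of the result.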

The langchain4j/ directory contains two standalone examples that use LangChain4j with ArcadeDB over the Neo4j Bolt protocol:

- GraphRAGEmbeddingStore — ingests text chunks with real 384D embeddings (AllMiniLmL6V2), stores via Cypher over Bolt, and performs similarity search using LangChain4j's cosine similarity
- GraphRAGContentRetriever — re-embeds the sample Chunk data with 384D vectors, runs semantic search, then enriches results with graph context via Cypher traversal (entities mentioned by top matches)

No external API keys are required, because the embedding model runs in-process.
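
For readers unfamiliar with the metric both examples rank by, cosine similarity can be written in a few lines of plain Java. This is a sketch of the formula only; it does not call LangChain4j, and the class and method names are made up for illustration.

```java
final class CosineSimilaritySketch {
    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Returns 0.0 when either vector has zero length, to avoid dividing by zero.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 4D vectors, matching the dimensionality of the sample embeddings.
        float[] query = {1f, 0f, 0f, 0f};
        float[] same = {2f, 0f, 0f, 0f};
        float[] orthogonal = {0f, 1f, 0f, 0f};
        System.out.println(cosine(query, same));
        System.out.println(cosine(query, orthogonal));
    }
}
```

The score depends only on direction, so vectors that differ in length but point the same way score 1.0.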

This use case targets ArcadeDB 26.3.1. Vector similarity queries call vectorNeighbors('IndexName[property]', vector, k) against an LSM_VECTOR index. The Bolt protocol (port 7687) enables Neo4j driver compatibility.
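
As a small illustration of the call shape described above, the following helper assembles a vectorNeighbors expression as a string. It only formats text and does not talk to a database. The index name `Chunk` and property `embedding` used in `main` are assumptions based on the schema table, not names confirmed by the setup script, and the helper performs no escaping of its inputs.

```java
import java.util.Locale;
import java.util.StringJoiner;

final class VectorNeighborsSketch {
    // Builds: vectorNeighbors('IndexName[property]', [v1, v2, ...], k)
    // indexName and property are inserted verbatim (no escaping).
    static String expression(String indexName, String property, float[] vector, int k) {
        StringJoiner values = new StringJoiner(", ", "[", "]");
        for (float v : vector) {
            // Locale.ROOT keeps '.' as the decimal separator on every machine.
            values.add(String.format(Locale.ROOT, "%.4f", v));
        }
        return String.format(Locale.ROOT, "vectorNeighbors('%s[%s]', %s, %d)",
                indexName, property, values, k);
    }

    public static void main(String[] args) {
        // Hypothetical index and property names; toy 4D query vector.
        System.out.println(expression("Chunk", "embedding",
                new float[]{0.1f, 0.2f, 0.3f, 0.4f}, 3));
    }
}
```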