This page provides an introduction to ArcadeDB, explaining its purpose as a multi-model database management system, its key architectural principles, and how its major components work together. For detailed information on specific subsystems, see the relevant sub-pages: Architecture covers the modular design, Database Engine details the core storage and query processing, Server Infrastructure describes the HTTP API and clustering, Client Interfaces explains connection options, and Development and Operations covers building and deploying ArcadeDB.
ArcadeDB is a Multi-Model Database Management System that unifies multiple data paradigms (graph, document, key-value, time-series, vector embeddings, and geospatial) into a single, high-performance engine. Created by Luca Garulli (founder of OrientDB) and written from scratch in Java 21+, ArcadeDB is designed for extreme performance through "Low Level Java" programming techniques that minimize garbage collection overhead and maximize mechanical sympathy.
The system is fully transactional with ACID guarantees, supporting MVCC isolation, Write-Ahead Logging (WAL), and distributed replication with leader-based quorum consensus.
Sources: README.md67-76 README.md74-86 pom.xml25-32
ArcadeDB stores all data in a unified record-based format, allowing vertices, documents, key-value pairs, time-series data, and vector embeddings to coexist in the same database and participate in the same queries. This eliminates the impedance mismatch typically encountered when integrating disparate database technologies.
| Data Model | Description | Primary Use Cases |
|---|---|---|
| Graph | Native vertices and edges with bidirectional links | Social networks, fraud detection, knowledge graphs |
| Document | Schemaless JSON documents with type definitions | Content management, catalogs, flexible schemas |
| Key/Value | Fast exact-match lookups by primary key | Caching, session storage, configuration |
| Time Series | Columnar storage with compression | IoT sensor data, metrics, monitoring |
| Vector | Similarity search with HNSW indexing | AI/ML embeddings, semantic search |
| Geospatial | Spatial indexing and proximity queries | Location services, routing, GIS |
Sources: README.md77-86
ArcadeDB natively understands five query languages, all executing against the same underlying engine:
| Language | Description | Entry Point |
|---|---|---|
| SQL | Extended OrientDB SQL with graph extensions | com.arcadedb.query.sql.parser (ANTLR4-generated) |
| Cypher | OpenCypher with native and legacy execution paths | com.arcadedb.query.opencypher package |
| Gremlin | Apache TinkerPop 3.7.x graph traversal | com.arcadedb.gremlin.ArcadeGraph |
| GraphQL | Schema-first GraphQL queries | com.arcadedb.graphql package |
| MongoDB QL | Subset of MongoDB query operators | com.arcadedb.mongodbw wire protocol translator |
Sources: README.md88-94 engine/pom.xml73-95 gremlin/pom.xml102-138
ArcadeDB is structured as a Maven multi-module project with 22 modules organized into logical layers:
Core modules (pom.xml126-148):
- arcadedb-engine: Storage engine, schema, transactions, SQL/Cypher parsers
- arcadedb-network: RemoteDatabase HTTP client, binary protocol
- arcadedb-server: HttpServer (Netty/Undertow), ServerSecurity, ReplicatedDatabase
- arcadedb-console: Interactive/batch Console CLI

Wire protocol modules (shaded JARs for plugin isolation):
- arcadedb-mongodbw: MongoDB wire protocol adapter
- arcadedb-postgresw: PostgreSQL wire protocol (JDBC-compatible)
- arcadedb-redisw: Redis RESP protocol subset
- arcadedb-bolt: Neo4j Bolt protocol for Cypher queries
- arcadedb-grpcw: gRPC service implementation

UI and monitoring:
- arcadedb-studio: React-based web UI with Cytoscape.js graph visualization
- arcadedb-metrics: Prometheus-compatible metrics endpoint

Sources: pom.xml126-148 server/pom.xml70-81 mongodbw/pom.xml31-33 postgresw/pom.xml32-34 gremlin/pom.xml32-34 package/pom.xml145-261
ArcadeDB supports three primary deployment architectures, each backed by a different implementation of the Database interface:
Embedded deployment uses LocalDatabase (engine/src/main/java/com/arcadedb/database/LocalDatabase.java) for direct file access with zero network overhead. The application directly instantiates DatabaseFactory and manages the database lifecycle.
Remote deployment uses RemoteDatabase (network/src/main/java/com/arcadedb/remote/RemoteDatabase.java) as an HTTP/JSON client that communicates with HttpServer (server/src/main/java/com/arcadedb/server/http/HttpServer.java). The server wraps databases in ServerDatabase to prevent unsafe operations like drop() or close() from remote clients.
Replicated deployment uses ReplicatedDatabase (server/src/main/java/com/arcadedb/server/ReplicatedDatabase.java) which wraps LocalDatabase and coordinates with HAServer (server/src/main/java/com/arcadedb/server/ha/HAServer.java) for leader election and quorum-based replication.
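To make the remote deployment path more concrete, the sketch below builds (but does not send) an HTTP/JSON request of the kind a remote client would issue to HttpServer. It uses only the JDK's java.net.http classes. The `/api/v1/command/{database}` path, the JSON field names, and the class and method names are assumptions for illustration and are not taken from this page; the database name `mydb` is hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustrative helper only; this is not ArcadeDB's RemoteDatabase client. */
class ArcadeHttpSketch {

  /** Builds a POST request carrying a command in the given query language. */
  static HttpRequest commandRequest(String host, int port, String database,
                                    String user, String password,
                                    String language, String command) {
    // HTTP Basic authentication: base64("user:password")
    String credentials = Base64.getEncoder()
        .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));

    // Assumed JSON payload shape: {"language": "...", "command": "..."}
    String body = "{\"language\":\"" + escape(language)
        + "\",\"command\":\"" + escape(command) + "\"}";

    // Assumed endpoint path; check the server's HTTP API documentation
    URI uri = URI.create("http://" + host + ":" + port + "/api/v1/command/" + database);

    return HttpRequest.newBuilder(uri)
        .header("Authorization", "Basic " + credentials)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();
  }

  /** Escapes backslashes and double quotes so the text is safe inside a JSON string. */
  static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }
}
```

With a server running locally, the request could be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Passing a different `language` value would, in principle, route the same call to another of the supported query languages.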
Sources: network/src/test/java/com/arcadedb/remote/RemoteDatabaseTest.java41-51 Diagram 3 from system architecture
All data in ArcadeDB, regardless of model (graph vertex, document, time-series point), is stored as a record identified by a Record ID (RID) in the format #bucket:position. Records are grouped into buckets (files on disk), which are the unit of parallelism for queries and transactions.
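As a small illustration of the #bucket:position format, the following sketch parses and prints such identifiers. `RecordId` is a hypothetical stand-in written for this page, not ArcadeDB's own RID class, and the field types (an int bucket id and a long position) are assumptions.

```java
/** Hypothetical stand-in for illustration; not ArcadeDB's RID class. */
record RecordId(int bucketId, long position) {

  /** Rejects negative components, which cannot address a bucket or a slot in it. */
  RecordId {
    if (bucketId < 0 || position < 0)
      throw new IllegalArgumentException("RID components must be non-negative");
  }

  /** Parses the textual form "#bucket:position", for example "#12:345". */
  static RecordId parse(String text) {
    if (text == null || !text.startsWith("#"))
      throw new IllegalArgumentException("RID must start with '#': " + text);
    int colon = text.indexOf(':');
    // Need at least one digit between '#' and ':' and one after ':'
    if (colon < 2 || colon == text.length() - 1)
      throw new IllegalArgumentException("RID must look like #bucket:position: " + text);
    try {
      int bucketId = Integer.parseInt(text.substring(1, colon));
      long position = Long.parseLong(text.substring(colon + 1));
      return new RecordId(bucketId, position);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("RID components must be numeric: " + text, e);
    }
  }

  /** Renders the identifier back into its "#bucket:position" form. */
  @Override
  public String toString() {
    return "#" + bucketId + ":" + position;
  }
}
```

Because the bucket id is part of the identifier, a record's storage file can be located from the RID alone, which fits the description of buckets as the unit of parallelism.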
Key classes:
- RID: Record identifier (engine/src/main/java/com/arcadedb/database/RID.java)
- MutableDocument, ImmutableVertex, etc.: Record implementations
- Bucket: Physical storage file containing records
- PageManager: Page cache and async I/O coordinator

ArcadeDB uses a two-phase commit protocol with MVCC (Multi-Version Concurrency Control):
Phase 1 (commit1stPhase()) acquires locks, validates that no other transaction has modified the same pages (optimistic locking), and prepares a WAL entry.
Phase 2 (commit2ndPhase()) writes to the WAL, updates in-memory page cache, queues pages for asynchronous disk flush, and releases locks. This ensures durability while maintaining high throughput through async I/O.
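The two phases can be illustrated with a toy model of optimistic page-version validation. This is a simplified sketch and not ArcadeDB's implementation: `ToyPageStore`, `Tx`, and `touch` are hypothetical names, the WAL is a plain in-memory list, a single monitor stands in for the page locks, and the asynchronous disk flush is not modeled. Only the method names `commit1stPhase` and `commit2ndPhase` echo the text above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of optimistic page-version validation; all names are hypothetical. */
class ToyPageStore {
  private final Map<Integer, Integer> pageVersions = new HashMap<>(); // pageId -> committed version
  private final List<String> wal = new ArrayList<>();                 // stand-in for the write-ahead log

  /** A transaction remembers the version of each page at the time it first touched it. */
  class Tx {
    private final Map<Integer, Integer> versionsSeen = new TreeMap<>();

    void touch(int pageId) {
      versionsSeen.putIfAbsent(pageId, currentVersion(pageId));
    }

    /** Phase 1: validate that no touched page has been committed by another transaction. */
    boolean commit1stPhase() {
      for (Map.Entry<Integer, Integer> e : versionsSeen.entrySet())
        if (currentVersion(e.getKey()) != e.getValue())
          return false; // someone else modified the page: optimistic check fails
      return true;
    }

    /** Phase 2: append to the log first, then publish the new page versions. */
    void commit2ndPhase() {
      wal.add("commit pages " + versionsSeen.keySet());
      for (Map.Entry<Integer, Integer> e : versionsSeen.entrySet())
        pageVersions.put(e.getKey(), e.getValue() + 1);
    }

    /** Runs both phases while holding the store's monitor, which stands in for page locks. */
    boolean commit() {
      synchronized (ToyPageStore.this) {
        if (!commit1stPhase())
          return false;
        commit2ndPhase();
        return true;
      }
    }
  }

  Tx begin() {
    return new Tx();
  }

  int currentVersion(int pageId) {
    return pageVersions.getOrDefault(pageId, 0);
  }

  List<String> wal() {
    return wal;
  }
}
```

If two transactions touch the same page and both try to commit, the first succeeds and bumps the page version, and the second fails validation in phase 1 without writing anything to the log. A real caller would typically retry the failed transaction.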
Sources: Diagram 4 from system architecture, engine/pom.xml1-237
Each query language has a dedicated parser that produces an execution plan:
| Language | Parser Technology | Output Format | Execution |
|---|---|---|---|
| SQL | ANTLR4 grammar (engine/src/main/antlr4/com/arcadedb/query/sql/parser/SQLGrammar.g4) | SelectStatement AST | Step-based pipeline |
| Cypher (native) | ANTLR4 OpenCypher grammar | PhysicalPlan (cost-based optimizer) | Operator tree |
| Cypher (legacy) | OpenCypher parser | MatchStatement AST | Step-based (compatible) |
| Gremlin | TinkerPop bytecode | GraphTraversal | Strategy-based execution |
| GraphQL | GraphQL schema + SDL | Field resolvers | Direct DB access |
All query languages ultimately execute steps against the same TransactionContext, which provides MVCC-isolated reads from the page cache and index lookups. This unified execution layer allows cross-language features like SQL functions called from Cypher or Gremlin traversals accessing SQL-created indexes.
Sources: Diagram 2 from system architecture, engine/pom.xml73-95
ArcadeDB's behavior is controlled by GlobalConfiguration (engine/src/main/java/com/arcadedb/GlobalConfiguration.java), an enum with 150+ settings. Configuration sources are resolved in this order (highest priority first):
1. JVM system properties passed on the command line (-Darcadedb.server.rootPassword=...)
2. Environment variables (arcadedb_server_rootPassword)
3. Properties set programmatically (System.setProperty())
4. Configuration file (config/server-configuration.json)
5. Built-in defaults (GlobalConfiguration enum)

Configuration profiles provide preset tunings for different scenarios:
- default: Balanced settings for general use
- high-performance: Maximizes throughput (larger caches, more threads)
- low-ram: Minimizes memory footprint for constrained environments
- low-cpu: Reduces CPU usage (fewer threads, less parallelism)

The ArcadeDBServer class (server/src/main/java/com/arcadedb/server/ArcadeDBServer.java) orchestrates server startup:
1. Loads ServerSecurity from config/server-users.jsonl and config/server-groups.json
2. Starts HttpServer (Undertow) on port 2480 (default)
3. Starts HAServer for replication (if configured)
4. Loads plugins from the plugins/ directory (MongoDB, Postgres, Redis, Bolt, gRPC)

Sources: Diagram 7 from system architecture, server/pom.xml69-136
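The configuration precedence described in this section amounts to "the first source that defines a key wins". The sketch below shows that idea in plain Java. `LayeredConfig` and its helper methods are hypothetical and do not reflect how GlobalConfiguration is implemented; the sources are modeled as simple maps, and the dot-to-underscore mapping for environment variables is inferred from the example names above.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical resolver: sources are consulted in order and the first hit wins. */
class LayeredConfig {
  private final List<Function<String, Optional<String>>> sources;

  LayeredConfig(List<Function<String, Optional<String>>> sourcesHighestPriorityFirst) {
    this.sources = List.copyOf(sourcesHighestPriorityFirst);
  }

  /** Returns the value from the highest-priority source that defines the key. */
  Optional<String> get(String key) {
    for (Function<String, Optional<String>> source : sources) {
      Optional<String> value = source.apply(key);
      if (value.isPresent())
        return value;
    }
    return Optional.empty();
  }

  /** Adapts a map (system properties, parsed config file, defaults) to a source. */
  static Function<String, Optional<String>> fromMap(Map<String, String> map) {
    return key -> Optional.ofNullable(map.get(key));
  }

  /** Environment-style lookup: dots in the key are replaced by underscores (assumed mapping). */
  static Function<String, Optional<String>> fromEnv(Map<String, String> env) {
    return key -> Optional.ofNullable(env.get(key.replace('.', '_')));
  }
}
```

A resolver would be assembled by listing the sources from highest to lowest priority, for example `new LayeredConfig(List.of(fromMap(jvmProps), fromEnv(System.getenv()), fromMap(fileSettings), fromMap(defaults)))`, where `jvmProps`, `fileSettings`, and `defaults` are placeholder maps. Keeping each source behind the same function type makes it straightforward to test precedence without touching real process state.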
ArcadeDB releases include four pre-built distribution packages, assembled by package/pom.xml42-143:
| Variant | Included Modules | Size | Use Case |
|---|---|---|---|
| full | All modules (engine, server, console, studio, gremlin, graphql, all wire protocols, metrics) | ~200 MB | Development, evaluation, full-featured deployments |
| minimal | Excludes gremlin, redisw, mongodbw, graphql | ~150 MB | Production with SQL/Cypher only |
| headless | Excludes gremlin, redisw, mongodbw, graphql, studio | ~140 MB | Server-only deployments without web UI |
| base | Engine, server, network only (no console, studio, wire protocols, metrics) | ~100 MB | Minimal footprint, embedded or custom integrations |
The Maven Shade Plugin (pom.xml375-424) creates self-contained JARs for wire protocol modules, isolating dependencies to prevent classpath conflicts. For example, arcadedb-gremlin-shaded.jar bundles Apache TinkerPop dependencies without affecting other modules.
Custom distributions can be built using the Custom Package Builder script.
Sources: README.md152-168 package/pom.xml42-143 pom.xml375-424
| Layer | Technologies | Purpose |
|---|---|---|
| Core Language | Java 21+, jdk.incubator.vector (SIMD) | High-performance runtime with advanced JVM features |
| Build System | Maven 3.x, parent POM with 22 modules | Multi-module project management |
| Parsing | ANTLR4 (SQL, Cypher), JavaCC (legacy) | Query language lexing and parsing |
| Query Engines | Custom step-based execution, Apache TinkerPop 3.7.x, OpenCypher translator | Polyglot query support |
| Storage | Custom page-based engine with LZ4 compression, LSM-Tree indexes, Write-Ahead Log | Low-level file I/O with MVCC |
| Indexing | LSM-Tree (range queries), Hash (exact match), Lucene (full-text), JVector (similarity search) | Multi-index strategy |
| Networking | Undertow HTTP server, Netty (wire protocols), gRPC (Protocol Buffers) | Client-server communication |
| Web UI | React, Cytoscape.js, Webpack | Interactive graph visualization and query editor |
| Scripting | GraalVM Truffle (JavaScript, Python, Ruby) | Embedded polyglot scripting |
| Monitoring | Micrometer, Prometheus export, JMX | Metrics and observability |
| Testing | JUnit 5, Cucumber (OpenCypher TCK), Testcontainers | Unit, integration, and E2E tests |
| Packaging | Maven Assembly, Docker multi-platform builds | Distribution creation |
Sources: pom.xml48-104 engine/pom.xml36-50 server/pom.xml36-42 gremlin/pom.xml36-46
The fastest way to explore ArcadeDB is to run the official Docker image.
Access the web UI at http://localhost:2480 (default credentials: root / playwithdata).
Sources: README.md131-148 network/src/test/java/com/arcadedb/remote/RemoteDatabaseTest.java42-56