#3678 feat(grpc): add GraphBatchLoad RPC for bulk graph loading #3680
Conversation
Summary of Changes

This pull request significantly enhances the gRPC API by introducing a client-streaming endpoint for efficient bulk loading of graph data. This new capability allows users to stream large volumes of vertices and edges, including support for temporary ID resolution and configurable batching options, improving performance and flexibility for graph data ingestion.

Highlights
Code Review

Good addition — the streaming design is sound and the integration tests cover the happy paths well. A few issues need addressing before merge:

🐛 Bug: Proto3 boolean defaults break options that default to true. A plain proto3 bool reads as false when unset, so the server cannot distinguish "not specified, use the default" from an explicit false for true-default options such as parallel_flush, pre_allocate_edge_chunks, and bidirectional.
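A sketch of the fix applied in the follow-up commit: marking the fields `optional` in proto3 enables explicit presence tracking, so the generated `has…()` accessors let the server apply the true default only when a field is genuinely absent (field numbers here are illustrative):

```protobuf
message GraphBatchOptions {
  // 'optional' generates hasParallelFlush() etc., distinguishing
  // "unset" (apply default true) from an explicit false.
  optional bool parallel_flush = 1;
  optional bool pre_allocate_edge_chunks = 2;
  optional bool bidirectional = 3;
}
```

Server side, the read then becomes e.g. `options.hasParallelFlush() ? options.getParallelFlush() : true`.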
Code Review
This pull request introduces a new client-streaming gRPC RPC, GraphBatchLoad, for bulk graph loading, mirroring the HTTP POST /api/v1/batch/{database} endpoint. It adds new proto messages for options, records, chunks, and results, supporting streaming vertices and edges with temporary ID mapping and all GraphBatch tuning parameters. Integration tests cover the vertex+edge, vertex-only, and empty stream scenarios.
```java
        counts[1]++;
      } else {
        if (inEdgePhase.get())
          throw new IllegalArgumentException("Vertex record received after edges. All vertices must appear before edges");
      }
    }
  } catch (final Exception e) {
    resp.onError(Status.INTERNAL.withDescription("graphBatchLoad: " + e.getMessage()).asException());
  }
```

```java
} catch (final Exception e) {
  resp.onError(Status.INTERNAL.withDescription("graphBatchLoad: " + e.getMessage()).asException());
  closeQuietly(batchRef.get());
```

```java
if (batch != null) {
  try { batch.close(); } catch (final Exception ignored) { }
}
```
The closeQuietly method catches Exception but doesn't log it. At least log the exception at the DEBUG level to understand why the batch is failing to close.
```java
try { batch.close(); } catch (final Exception ignored) { LogManager.instance().log(this, Level.DEBUG, "Error during GraphBatch close", ignored); }
```
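A runnable sketch of such a closeQuietly helper, using java.util.logging in place of ArcadeDB's LogManager (the class and method names here are illustrative, not the PR's actual code):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class CloseUtil {
  private static final Logger LOG = Logger.getLogger(CloseUtil.class.getName());

  // Swallows close() failures but records them at DEBUG (FINE) level,
  // so a failing GraphBatch close is still diagnosable from the logs.
  public static void closeQuietly(final AutoCloseable resource) {
    if (resource == null)
      return;
    try {
      resource.close();
    } catch (final Exception e) {
      LOG.log(Level.FINE, "Error during close", e);
    }
  }

  public static void main(String[] args) {
    closeQuietly(null); // no-op
    closeQuietly(() -> { throw new IllegalStateException("boom"); }); // logged, not thrown
    System.out.println("ok");
  }
}
```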
Code Review

Good addition — the feature mirrors the HTTP batch endpoint cleanly and the test coverage is solid. A few issues worth addressing before merging:

Bug:
Coverage summary from Codacy

See diff coverage on Codacy

Coverage variation details

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch.

Diff coverage details

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified.
Code Review

This PR bundles two distinct features: a new gRPC GraphBatchLoad RPC and a set of OpenCypher count push-down optimizations.

gRPC GraphBatchLoad

Potential bug — mid-stream exception leaves transaction open

Hardcoded batch size — the vertex flush threshold of

"Vertices before edges" protocol constraint — the proto definition requires all vertex records to appear before edge records in the stream, and this is enforced server-side. This constraint is non-obvious and should be documented in the proto file itself (a comment on the
The HTTP

Test coverage

The new tests (
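A sketch of how that ordering constraint could be stated in the proto file itself. The message name GraphBatchRecord comes from the PR; the field layout is assumed:

```protobuf
// A single record in a GraphBatchLoad stream.
//
// Ordering constraint: ALL vertex records must be streamed before the
// first edge record. A vertex arriving after an edge is rejected with
// INVALID_ARGUMENT, because temporary vertex IDs must be resolvable
// by the time edges referencing them are processed.
message GraphBatchRecord {
  string type_name = 1;
  // ... vertex/edge payload fields elided ...
}
```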
OpenCypher Count Push-Down Optimizations
```java
if (variable instanceof RID rid) {
  final Object rawValue = rid.asVertex().get(propertyName);
  return convertFromStorage(rawValue);
}
```

```java
final Vertex v = rid.asVertex();
if (v == null)
  return null;
final Object rawValue = v.get(propertyName);
```

Also: algorithm procedures may legitimately store edge RIDs. If an edge RID ends up in this branch,
For star-join patterns, the count is computed as the product of per-arm degrees:

```java
count *= armDegree; // (conceptually)
```

For a high-degree central node (e.g., 10M+ edges per arm), the product can overflow. Guard before multiplying:

```java
if (count > Long.MAX_VALUE / armDegree) return Long.MAX_VALUE; // or throw
count *= armDegree;
```
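The guard can be packaged as a saturating multiply helper. This is a runnable sketch (the class and method names are illustrative, not the PR's code):

```java
public class SaturatingMath {
  // Multiplies two non-negative counts, clamping to Long.MAX_VALUE on
  // overflow instead of silently wrapping to a negative value.
  public static long saturatingMultiply(final long a, final long b) {
    try {
      return Math.multiplyExact(a, b);
    } catch (final ArithmeticException overflow) {
      return Long.MAX_VALUE;
    }
  }

  public static void main(String[] args) {
    System.out.println(saturatingMultiply(10, 10));                // 100
    System.out.println(saturatingMultiply(Long.MAX_VALUE / 2, 3)); // clamped to Long.MAX_VALUE
  }
}
```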
The tolerance in
The
The new optimizer path handles star-join patterns with OPTIONAL MATCH. Tests cover the happy path (
Inserting a new fast-path before the existing optimizer dispatch is fragile. If the star-join pattern detection has a false positive (matches a query it shouldn't), it silently produces wrong results instead of falling through to the correct path. Consider adding a

Minor / Style
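One way to keep a false positive recoverable is to gate the fast path behind a kill switch, so operators can fall back to the always-correct general planner. A hypothetical sketch (all names invented, not ArcadeDB's code):

```java
public class CountPlanner {
  // Hypothetical kill switch: lets operators disable the star-join
  // fast path if a pattern-detection false positive ships to the field.
  public static boolean starJoinFastPathEnabled = true;

  // Returns which plan was chosen; the real planner would return an
  // execution plan object instead of a marker string.
  public static String plan(final boolean looksLikeStarJoin) {
    if (starJoinFastPathEnabled && looksLikeStarJoin)
      return "fast-star-join-count"; // specialized count push-down
    return "general-count";          // always-correct fallback path
  }

  public static void main(String[] args) {
    System.out.println(plan(true));  // uses the fast path
    starJoinFastPathEnabled = false; // operator disables the optimization
    System.out.println(plan(true));  // falls back to the general planner
  }
}
```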
Summary
The gRPC feature and the OpenCypher optimizations are both well-structured and show clear design intent. Addressing the NPE in

🤖 Generated with Claude Code
…ading

Add a new GraphBatchLoad RPC to the gRPC module that exposes the same GraphBatch-based bulk loading as the HTTP POST /api/v1/batch endpoint. Supports streaming vertices and edges with temporary ID mapping, and all GraphBatch tuning parameters (batchSize, lightEdges, WAL, etc.).

Closes #3678

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use optional bool for proto fields that default to true (parallel_flush, pre_allocate_edge_chunks, bidirectional) to fix proto3 default semantics
- Guard against empty from_ref/to_ref in resolveRef
- Guard against empty database name on first chunk
- Replace unnecessary AtomicBoolean/AtomicReference with plain arrays
- Document vertex-before-edge constraint in proto comments
- Add error-path tests (unknown temp ID, vertex-after-edge, direct RID refs)
- Fix elapsed_ms assertion to avoid potential flake

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Pass database explicitly to RID constructor in resolveRef instead of relying on thread-local (safer for gRPC thread pool)
- Skip call.request(1) after sending error to avoid spurious requests
- Document partial-commit behavior of closeQuietly (by design, same as HTTP)
- Document configureGraphBatchOptions asymmetry for true-default fields
- Update credentials proto comment to reflect optional channel-level auth
- Add message-size warning for large id_mapping responses
- Remove sv() alias, use stringValue() consistently in tests
- Add clarifying comment to direct RID reference test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
36109e0 to 425efec
test

full review here
Code Review - Part 1 of 2

The prior reviews covered the main design issues well. This pass focuses on a few items not yet addressed.

Bug: Double-close risk on GraphBatch when transport error follows server error

When onNext throws (e.g. unknown temp ID), the server calls resp.onError() and then closeQuietly(batchRef.get()). The batch is now closed. However, if a transport error (client disconnect, network reset) arrives shortly after, gRPC will invoke the server-side onError callback on the request stream, which calls closeQuietly(batchRef.get()) on the already-closed batch. GraphBatch.close() calls flush(), connectDeferredIncomingEdges(), and batchUpdateVertexHeadChunks(). If any of these are non-idempotent (edge buffer state not zeroed after close), a second call can double-commit deferred incoming edges or corrupt vertex head pointers.

Fix: null out the reference immediately after closing, e.g. `closeQuietly(batchRef.getAndSet(null));`, so a second call is a no-op. Use this everywhere closeQuietly is currently called (onNext catch, onError, onCompleted catch).
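The idempotent-close pattern can be sketched in isolation. AtomicReference.getAndSet(null) guarantees close() runs at most once even if two callbacks race; the GraphBatch is simulated here by a counting AutoCloseable:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class IdempotentClose {
  public static final AtomicInteger CLOSES = new AtomicInteger();

  static void closeQuietly(final AutoCloseable c) {
    if (c == null) return;               // second caller sees null: no-op
    try { c.close(); } catch (final Exception ignored) { }
  }

  public static void main(String[] args) {
    final AtomicReference<AutoCloseable> batchRef =
        new AtomicReference<AutoCloseable>(() -> CLOSES.incrementAndGet());

    // Simulates onError firing after onNext's catch already cleaned up:
    closeQuietly(batchRef.getAndSet(null)); // closes the batch
    closeQuietly(batchRef.getAndSet(null)); // ref already null: no-op
    System.out.println(CLOSES.get());       // closed exactly once
  }
}
```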
SPDX-License-Identifier: Apache-2.0

ArcadeDB

This product includes software developed at

================================================================================

This product includes components originally developed for OrientDB, an Open

ArcadeDB was created by Luca Garulli, the same founder of OrientDB, and written

OrientDB source code is available at: https://github.com/orientechnologies/orientdb

================================================================================

This product includes openCypher grammar files and Technology Compatibility Kit

Copyright (c) "Neo4j"

Licensed under the Apache License, Version 2.0 (the "License");

openCypher project website: https://opencypher.org/

Cypher® is a registered trademark of Neo4j Inc.

The included openCypher TCK tests contain the following attribution notice:
"This work was created by the collective efforts of the openCypher community.

The following files contain openCypher content:

================================================================================

This product includes software from the OpenCypher-Gremlin project.
Copyright 2018-2019 Neueda

The following file is a security-patched override of unmaintained upstream code:

OpenCypher-Gremlin project: https://github.com/opencypher/cypher-for-gremlin

================================================================================

This product includes software developed by The Apache Software Foundation

Apache Lucene

This product includes software developed by

Includes software from other Apache copyright owners, as follows:

Apache ANTLR4
Apache TinkerPop (Gremlin)
Apache Commons Compress
Apache Commons Lang
Apache Commons Codec
Apache HttpComponents

================================================================================

This product includes software developed by GraalVM (Oracle).
GraalVM SDK, Truffle API, and JavaScript implementation

================================================================================

This product includes software developed by Google Inc.
Protocol Buffers
Guava Libraries
Gson
gRPC

================================================================================

This product includes software developed by Red Hat, Inc.
Undertow

================================================================================

This product includes software developed by the Netty project.
Netty

================================================================================

This product includes JUnit 5 test framework.
JUnit 5

================================================================================

This product includes Testcontainers framework.
Testcontainers

================================================================================

This product includes software developed by FasterXML, LLC.
Jackson JSON processor
https://github.com/FasterXML/jackson

================================================================================

For the complete list of third-party components and their licenses, see the
Code Review test
How to contribute to ArcadeDB

Contribution License

ArcadeDB is licensed under the Apache License 2.0. By submitting a contribution (pull request, patch, or any other form), you agree that your contribution is licensed under the same terms, as stated in Section 5 of the Apache License 2.0. This means:
By submitting a contribution, you represent that you have the right to license it under the Apache License 2.0 — i.e., the work is your own, or you have permission from the copyright holder.

Did you find a bug?
Did you write a patch that fixes a bug?
Did you fix whitespace, format code, or make a purely cosmetic patch?

Changes that are cosmetic in nature and do not add anything substantial to the stability, functionality, or testability of the project will generally not be accepted.

Do you intend to add a new feature or change an existing one?
Are you adding a new dependency?

When adding new third-party dependencies to ArcadeDB:
For questions about licensing, please ask in the discussion forum before adding the dependency.

Prepare your environment

Pre-commit

This project uses pre-commit. Every developer should install it locally, please

Developer guide

In order to use the latest distribution before the official release, you have to build it from the latest SNAPSHOT version in
It's super easy: just follow these steps.

(1) Download the sources

If you have git installed, execute the following command on a shell or command prompt:

```shell
$ git clone git@github.com:ArcadeData/arcadedb.git
```

With this command above you have just cloned the

(2) Now build the project using Maven.

Assure you have Apache Maven installed on your computer and execute the following commands:

```shell
$ cd arcadedb
$ mvn clean install -DskipTests
```

The build should take less than a minute.

(3) Use the fresh built distribution

You will find a fresh built distribution under the directory

For example, if the latest release is the

Some build targets

Build and run default tests:

```shell
$ mvn clean install
```

To run additional integration test locally use:

```shell
$ mvn clean install -P integration
```

Rebuild Docker images for

```shell
$ mvn clean install -DskipTests -Pdocker
```

Build and run in fail-at-end mode:

```shell
$ mvn clean install -fae
```

Print dependency tree:

```shell
$ mvn dependency:tree
```
ArcadeDB Plugin Architecture

Overview

ArcadeDB supports a plugin architecture that allows extending the server functionality through isolated plugins. Each plugin runs in its own class loader, enabling plugins to have different versions of dependencies without conflicts.

Plugin Types

ArcadeDB includes the following built-in plugins:

Architecture

Class Loading

The plugin system uses isolated class loaders with the following strategy:

This approach ensures:

Components

PluginManager

PluginClassLoader

PluginDescriptor

Plugin Lifecycle

Installation Priorities

Plugins are started in phases based on their installation priority:
Creating a Plugin

1. Implement ServerPlugin Interface

```java
package com.example.myplugin;

import com.arcadedb.ContextConfiguration;
import com.arcadedb.server.ArcadeDBServer;
import com.arcadedb.server.ServerPlugin;

public class MyPlugin implements ServerPlugin {
  private ArcadeDBServer server;
  private ContextConfiguration configuration;

  @Override
  public void configure(ArcadeDBServer arcadeDBServer, ContextConfiguration configuration) {
    this.server = arcadeDBServer;
    this.configuration = configuration;
    // Initialize your plugin configuration
  }

  @Override
  public void startService() {
    // Start your plugin services
    System.out.println("MyPlugin started!");
  }

  @Override
  public void stopService() {
    // Stop your plugin services and clean up resources
    System.out.println("MyPlugin stopped!");
  }

  @Override
  public INSTALLATION_PRIORITY getInstallationPriority() {
    return INSTALLATION_PRIORITY.AFTER_HTTP_ON;
  }
}
```

2. Create Service Provider Configuration

Create file:

Content:

3. Build Plugin JAR

```shell
mvn clean package
```

4. Deploy Plugin

Copy the plugin JAR to the

```shell
cp target/myplugin-1.0.0.jar $ARCADEDB_HOME/lib/plugins/
```

5. Start ArcadeDB

The plugin will be automatically discovered and loaded when ArcadeDB starts:

```shell
cd $ARCADEDB_HOME
bin/server.sh
```

Check the logs for:

Plugin Dependencies

Server API Dependencies

Plugin POMs should include server dependencies with

```xml
<dependency>
  <groupId>com.arcadedb</groupId>
  <artifactId>arcadedb-server</artifactId>
  <version>${arcadedb.version}</version>
  <scope>provided</scope>
</dependency>
```

Plugin-Specific Dependencies

Plugin-specific dependencies use normal

```xml
<dependency>
  <groupId>com.example</groupId>
  <artifactId>my-library</artifactId>
  <version>1.0.0</version>
  <scope>compile</scope>
</dependency>
```

Building Distributions with Plugins

Maven Assembly

The Maven assembly descriptor automatically places plugin JARs in

```xml
<dependencySet>
  <outputDirectory>lib/plugins</outputDirectory>
  <includes>
    <include>com.arcadedb:arcadedb-gremlin</include>
    <include>com.arcadedb:arcadedb-postgresw</include>
    <include>com.arcadedb:arcadedb-mongodbw</include>
    <include>com.arcadedb:arcadedb-redisw</include>
    <include>com.arcadedb:arcadedb-grpcw</include>
  </includes>
  <useTransitiveDependencies>false</useTransitiveDependencies>
</dependencySet>
```

Advanced Topics

Accessing Server Resources

Plugins have full access to the ArcadeDB server instance:

```java
@Override
public void configure(ArcadeDBServer arcadeDBServer, ContextConfiguration configuration) {
  this.server = arcadeDBServer;

  // Access databases
  ServerDatabase db = server.getDatabase("mydb");

  // Access HTTP server for custom endpoints
  HttpServer httpServer = server.getHttpServer();

  // Access security
  ServerSecurity security = server.getSecurity();
}
```

Thread Context Class Loader

The PluginManager automatically sets the thread context class loader during plugin operations:

This ensures proper class loading for frameworks that use the thread context class loader.

HTTP Endpoint Registration

Plugins can register custom HTTP endpoints:

```java
@Override
public void registerAPI(HttpServer httpServer, PathHandler routes) {
  routes.addExactPath("/api/myplugin", exchange -> {
    exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "application/json");
    exchange.getResponseSender().send("{\"status\":\"ok\"}");
  });
}
```

Troubleshooting

Plugin Not Discovered

Check that:
ClassNotFoundException

If you see

If you see

Plugin Conflicts

If two plugins have conflicting dependencies:
Migration from Legacy Plugin Loading

The new plugin system is backward compatible with the legacy configuration-based loading. Both systems can coexist:

Legacy Method (still supported)

```properties
arcadedb.server.plugins=gremlin:com.arcadedb.server.gremlin.GremlinServerPlugin
```

New Method (recommended)
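The service-provider discovery described above can be sketched with plain java.util.ServiceLoader (the interface name below is a stand-in for ArcadeDB's ServerPlugin; with no META-INF/services entry on the classpath, discovery simply yields nothing):

```java
import java.util.ServiceLoader;

public class PluginDiscovery {
  // Stand-in for com.arcadedb.server.ServerPlugin.
  public interface ServerPluginLike {
    void startService();
  }

  public static int countDiscovered() {
    int found = 0;
    // ServiceLoader reads META-INF/services/<interface-FQN> entries from
    // each (plugin) class loader; here the default loader has none.
    for (final ServerPluginLike plugin : ServiceLoader.load(ServerPluginLike.class))
      found++;
    return found;
  }

  public static void main(String[] args) {
    System.out.println(countDiscovered()); // 0 without a provider file
  }
}
```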
Best Practices
Examples

See the built-in plugins for complete examples:
Code Review

The prior reviews covered the main design issues well. This pass focuses on a few items not yet addressed.

Bug: Double-close risk on GraphBatch when transport error follows server error

When onNext throws (e.g. unknown temp ID), the server calls resp.onError() and then closeQuietly(batchRef.get()). The batch is now closed. However, if a transport error (client disconnect, network reset) arrives shortly after, gRPC will invoke the server-side onError callback on the request stream, which calls closeQuietly(batchRef.get()) on the already-closed batch. GraphBatch.close() calls flush(), connectDeferredIncomingEdges(), and batchUpdateVertexHeadChunks(). If any of these are non-idempotent (edge buffer state not zeroed after close), a second call can double-commit deferred incoming edges or corrupt vertex head pointers.

Fix: null out the reference immediately after closing, so a second call is a no-op:

```java
closeQuietly(batchRef.getAndSet(null));
```

Use this everywhere closeQuietly is currently called (onNext catch, onError, onCompleted catch).

Missing validation: type_name is never checked for blank

If a client sends a GraphBatchRecord without setting type_name (proto3 default ""), the call propagates to GraphBatch.createVertices("", ...) and fails with a cryptic schema exception or NPE deep inside the engine. A guard at the record-processing level produces a clear error:

```java
if (rec.getTypeName().isEmpty())
```

Minor: Test query assertion fragile for multi-page results

```java
assertThat(queryResp.getResultsList().get(0).getRecordsList()).hasSize(2);
```

getResultsList().get(0) assumes a single-page response. For 2 records this is fine in practice, but the intent is clearer if aggregated across all pages:

```java
final long total = queryResp.getResultsList().stream()
```

Positive notes
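The suggested blank-type guard can be sketched in isolation. Here the gRPC record is simulated by a plain string, and IllegalArgumentException stands in for the gRPC INVALID_ARGUMENT status the real handler would return (names illustrative):

```java
public class RecordValidation {
  // Rejects the proto3 default "" early, instead of letting it fail
  // deep inside the engine with a cryptic schema exception.
  public static String requireTypeName(final String typeName) {
    if (typeName == null || typeName.isEmpty())
      throw new IllegalArgumentException("type_name must be set on every GraphBatchRecord");
    return typeName;
  }

  public static void main(String[] args) {
    System.out.println(requireTypeName("Person")); // valid name passes through
    try {
      requireTypeName("");
    } catch (final IllegalArgumentException expected) {
      System.out.println("rejected blank type_name");
    }
  }
}
```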
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##               main    #3680    +/-   ##
==========================================
- Coverage     65.63%   65.62%   -0.01%
==========================================
  Files          1557     1557
  Lines        111125   111265    +140
  Branches      23299    23333     +34
==========================================
+ Hits          72933    73019     +86
- Misses        28354    28384     +30
- Partials       9838     9862     +24
```

☔ View full report in Codecov by Sentry.
Summary
- GraphBatchLoad: client-streaming gRPC RPC that mirrors the HTTP POST /api/v1/batch/{database} endpoint (issue Batch HTTP endpoint #3675)
- New proto messages: GraphBatchOptions, GraphBatchRecord, GraphBatchChunk, GraphBatchResult
- Supports all GraphBatch tuning parameters

Test plan
- mvn generate-sources (in grpc module)
- mvn clean compile
- graphBatchLoadVerticesAndEdges — streams 2 vertices + 1 edge, verifies counts and ID mapping
- graphBatchLoadVerticesOnly — streams 2 vertices only, verifies 0 edges
- graphBatchLoadEmptyStream — immediately completes, verifies empty result

🤖 Generated with Claude Code