ArcadeDB version
Observed on Docker images:
arcadedata/arcadedb:26.3.2
arcadedata/arcadedb:26.4.1-SNAPSHOT
arcadedata/arcadedb:26.4.2
Environment
- Host OS: Windows 10
- Architecture: x86_64
- Deployment: Docker
- ArcadeDB endpoint: HTTP /api/v1/command/arcade
- Request mode matches ArcadeDB Studio: language: opencypher, serializer: studio
- Differential comparison target: Neo4j Docker image neo4j:latest
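For anyone replaying the queries outside Studio, the request settings listed above can be sketched as a small Python helper. This is only a sketch under assumptions: the JSON field names (language, serializer, command) follow the ArcadeDB HTTP command API as I understand it, the path segment "arcade" is taken to be the database name, and the base URL and credentials are placeholders that do not come from this report.

```python
import base64
import json
import urllib.request

# Assumed placeholder; the report does not state the host or port.
BASE_URL = "http://localhost:2480"


def build_command_request(cypher, user, password, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/v1/command/arcade."""
    body = {
        "language": "opencypher",  # as listed under Environment
        "serializer": "studio",    # as listed under Environment
        "command": cypher,
    }
    # HTTP Basic auth header; user and password are caller-supplied placeholders.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/command/arcade",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the helper stops short of that so the payload can be inspected first.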
Describe the bug
When MERGE creates a relationship pattern whose endpoint variable is still unbound and specified only by label, ArcadeDB may incorrectly reuse an already existing node with that label.
In Neo4j, a query like:
MERGE (p)-[:WORKS_AT]->(c:Company)
with c unbound does not mean "match any existing Company".
It means the whole pattern is merged as a unit.
If no WORKS_AT relationship from p to a Company node exists, Neo4j creates every unbound part of the pattern, including a fresh Company node for c.
ArcadeDB instead appears to bind c to an existing Company node, even though c was never matched explicitly.
To Reproduce
Setup:
CREATE (:Person {name:'Alice'}),
(:Company {name:'TechCorp', industry:'Technology'});
Query:
MATCH (p:Person {name:'Alice'})
MERGE (p)-[r:WORKS_AT {since: 2020}]->(c:Company)
WITH c
MATCH (company:Company)
RETURN c.name AS merged_company_name,
c.industry AS merged_industry,
count(company) AS total_companies;
Expected behavior
Neo4j creates a new label-only Company node for the unbound endpoint c.
So the merged endpoint has no properties, and there are now two Company nodes total.
Observed Neo4j result:
merged_company_name = null
merged_industry = null
total_companies = 2
Actual behavior
ArcadeDB reuses the already existing TechCorp node:
merged_company_name = TechCorp
merged_industry = Technology
total_companies = 1
So the unbound label-only endpoint is treated as if it matched an arbitrary existing Company, rather than being created as part of the merged pattern.
Control cases
If no Company exists yet, both engines create a fresh label-only endpoint:
CREATE (:Person {name:'Alice'});
MATCH (p:Person {name:'Alice'})
MERGE (p)-[r:WORKS_AT {since: 2020}]->(c:Company)
WITH c
MATCH (company:Company)
RETURN c.name AS merged_company_name,
count(company) AS total_companies;
Observed result on Neo4j and ArcadeDB:
merged_company_name = null
total_companies = 1
So ArcadeDB is capable of creating the endpoint when no existing candidate node is present.
If the endpoint is explicitly bound first, both engines correctly reuse it:
MATCH (p:Person {name:'Alice'}), (c:Company {name:'TechCorp'})
MERGE (p)-[r:WORKS_AT {since: 2020}]->(c)
WITH c
MATCH (company:Company)
RETURN c.name AS merged_company_name,
c.industry AS merged_industry,
count(company) AS total_companies;
Observed result on Neo4j and ArcadeDB:
merged_company_name = TechCorp
merged_industry = Technology
total_companies = 1
So the problem is not with reusing an already matched endpoint.
It is specifically with an endpoint variable that is still unbound inside the MERGE pattern.
With multiple existing Company nodes, Neo4j still creates a fresh one, while ArcadeDB still reuses an existing one:
CREATE (:Person {name:'Alice'}),
(:Company {name:'TechCorp', industry:'Technology'}),
(:Company {name:'DataInc', industry:'Analytics'});
MATCH (p:Person {name:'Alice'})
MERGE (p)-[r:WORKS_AT {since: 2020}]->(c:Company)
WITH c
MATCH (company:Company)
RETURN c.name AS merged_company_name,
c.industry AS merged_industry,
count(company) AS total_companies;
Observed Neo4j result:
merged_company_name = null
merged_industry = null
total_companies = 3
Observed ArcadeDB result:
merged_company_name = TechCorp
merged_industry = Technology
total_companies = 2
Taken together, these cases isolate the boundary:
- no existing endpoint: both engines create one
- explicitly bound endpoint: both engines reuse it
- unbound label-only endpoint with existing candidates: ArcadeDB reuses an existing node where Neo4j creates a fresh one
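The three cases above can be reproduced with a toy in-memory model. This is a hypothetical sketch, not ArcadeDB or Neo4j code: ToyGraph, merge_whole_pattern, and merge_reusing_label are names invented here. The second function only mimics the observed ArcadeDB output, assuming it picks the first existing node with the label and attaches the relationship to it.

```python
class ToyGraph:
    """Tiny in-memory graph used only to illustrate the two MERGE behaviours."""

    def __init__(self):
        self.nodes = []  # each node: {"labels": set, "props": dict}
        self.edges = []  # each edge: (src_id, rel_type, props, dst_id)

    def create_node(self, label, **props):
        self.nodes.append({"labels": {label}, "props": props})
        return len(self.nodes) - 1

    def nodes_with_label(self, label):
        return [i for i, n in enumerate(self.nodes) if label in n["labels"]]

    def find_pattern(self, src, rel_type, rel_props, dst_label):
        """Return the endpoint id if (src)-[rel_type rel_props]->(:dst_label) exists."""
        for s, t, p, d in self.edges:
            if (s == src and t == rel_type
                    and all(p.get(k) == v for k, v in rel_props.items())
                    and dst_label in self.nodes[d]["labels"]):
                return d
        return None


def merge_whole_pattern(g, src, rel_type, rel_props, dst_label):
    """Behaviour reported for Neo4j: match the whole pattern or create its unbound parts."""
    dst = g.find_pattern(src, rel_type, rel_props, dst_label)
    if dst is None:
        dst = g.create_node(dst_label)  # fresh label-only endpoint
        g.edges.append((src, rel_type, dict(rel_props), dst))
    return dst


def merge_reusing_label(g, src, rel_type, rel_props, dst_label):
    """Assumed model of the observed ArcadeDB output: reuse any node carrying the label."""
    dst = g.find_pattern(src, rel_type, rel_props, dst_label)
    if dst is None:
        candidates = g.nodes_with_label(dst_label)
        # Assumption: the first existing candidate wins; create only if none exist.
        dst = candidates[0] if candidates else g.create_node(dst_label)
        g.edges.append((src, rel_type, dict(rel_props), dst))
    return dst


def run(merge, company_names):
    """Set up Alice plus the given companies, merge, and report (c.name, total companies)."""
    g = ToyGraph()
    alice = g.create_node("Person", name="Alice")
    for name in company_names:
        g.create_node("Company", name=name)
    c = merge(g, alice, "WORKS_AT", {"since": 2020}, "Company")
    return g.nodes[c]["props"].get("name"), len(g.nodes_with_label("Company"))
```

Calling run(merge_whole_pattern, ["TechCorp"]) gives (None, 2), while run(merge_reusing_label, ["TechCorp"]) gives ("TechCorp", 1). These match the Neo4j and ArcadeDB results in the main reproduction. With an empty company list both functions return (None, 1), as in the first control case.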