fix: Cypher -> SET n:Label on same node across row fanout must be idempotent#4025
fix: Cypher -> SET n:Label on same node across row fanout must be idempotent#4025
Conversation
…mpotent When the same node appears on multiple rows (e.g. cartesian MATCH), repeated SET n:Label crashed: row 1 deleted the old vertex and created a new one with a fresh RID; row 2 still held the deleted vertex and failed with "Record not found" or "Duplicated key". Fix: add a labelReplacements map (Map<RID,Vertex>) to the ResultSet anonymous class in SetStep.syncPull(), alongside the existing writtenDocs map. On each call to applyLabels, look up the incoming vertex's RID. If a prior replacement exists, redirect the row's variable to the already-replaced vertex and let the existing type-equality check return early (idempotent). Otherwise proceed with the normal delete+create flow and record old RID → new vertex for future rows. Fixes #4017 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 8 |
🟢 Coverage 77.27% diff coverage · -8.63% coverage variation
Metric Results Coverage variation ✅ -8.63% coverage variation Diff coverage ✅ 77.27% diff coverage Coverage variation details
Coverable lines Covered lines Coverage Common ancestor commit (9a5b0e1) 120132 88786 73.91% Head commit (48d049a) 151423 (+31291) 98839 (+10053) 65.27% (-8.63%) Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch:
<coverage of head commit> - <coverage of common ancestor commit>Diff coverage details
Coverable lines Covered lines Diff coverage Pull request (#4025) 22 17 77.27% Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified:
<covered lines added or modified>/<coverable lines added or modified> * 100%
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Code Review - PR #4025: Cypher SET n:Label idempotency on row fanoutOverviewThis PR fixes a real crash bug (issue #4017) where The approach is sound and consistent with the existing CorrectnessLooks good:
Potential issue - interaction between If a SET clause combines a property assignment and a label assignment on the same variable in that order - e.g.
The PR description only covers pure label-fanout. If the combined case is also broken before this PR, that should be noted; if it was working before, it now regresses. A test or explicit exclusion comment would help clarify this. The safest fix would be to remove Test CoverageGood:
Minor - inaccurate javadoc comment in test: // The cartesian MATCH produces Alice twice and Charlie twice (age > peer)Alice only qualifies once (age 30 > Bob's 25). The cartesian product yields the pairs Missing test - edge preservation under fanout: Code QualityComment cleanup is good: The old comment explaining why label SET was not covered by Consistent with existing style: The new map follows the same anonymous-ResultSet-field pattern as Minor - private void applyLabels(final SetClause.SetItem item, final Result result,
final Map<RID, Vertex> labelReplacements) {The two-line signature is fine given the project's PerformanceThe Summary
Overall this is a solid, targeted fix. Recommend addressing the test comment and considering a note on the 🤖 Generated with Claude Code |
There was a problem hiding this comment.
Code Review
This pull request introduces a labelReplacements map to SetStep to track vertex replacements occurring during label updates, which is necessary because ArcadeDB replaces the vertex (changing its RID) when labels are modified. This ensures that subsequent rows in a fanout operation correctly reference the new vertex instance. The feedback highlights the need to pre-resolve all vertex aliases in a row to prevent property updates from failing on deleted records and recommends a recursive lookup in the replacement map to handle scenarios where a vertex is replaced multiple times.
| if (setClause == null || setClause.isEmpty()) | ||
| return; |
There was a problem hiding this comment.
The labelReplacements map should be used to pre-resolve all vertex aliases in the current row before processing any SetItem. This is critical because other SET operations (like property updates) rely on resolveLatestDoc, which currently only checks writtenDocs. If a node was replaced by a label change in a previous row, its RID has changed, and resolveLatestDoc will fail to find the latest state or attempt to modify a deleted record. By pre-resolving all aliases at the start of the row, we ensure that all subsequent operations (properties, maps, and labels) observe the correct, latest vertex instance.
if (setClause == null || setClause.isEmpty())
return;
if (!labelReplacements.isEmpty()) {
for (final String propName : result.getPropertyNames()) {
final Object propValue = result.getProperty(propName);
if (propValue instanceof Vertex v) {
Vertex replacement = labelReplacements.get(v.getIdentity());
if (replacement != null) {
Vertex next;
while ((next = labelReplacements.get(replacement.getIdentity())) != null)
replacement = next;
((ResultInternal) result).setProperty(propName, replacement);
}
}
}
}There was a problem hiding this comment.
Implemented. Added the pre-resolution block at the start of applySetOperations (before the transaction begin) that walks all result properties, looks up any Vertex value whose RID appears in labelReplacements, and chain-traverses to the current head. This ensures resolveLatestDoc in property/map SET operations always sees the live vertex rather than the deleted original, even when upstream steps re-deliver the old RID on a new row. Commit: 48d049a.
| final Vertex prior = labelReplacements.get(originalRid); | ||
| if (prior != null) { | ||
| propagateUpdateToSameNodeAliases(result, vertex, prior); | ||
| vertex = prior; | ||
| } |
There was a problem hiding this comment.
The lookup in labelReplacements must be recursive to handle cases where a node is replaced multiple times within the same query (e.g., multiple label changes across different rows). Without recursion, if a node was replaced from RID A to B, and then from B to C, a later row seeing RID A would only redirect to the deleted RID B, causing a crash when attempting to load its labels. Additionally, if the pre-resolution logic is added to applySetOperations, this block becomes redundant for the initial variable resolution but still serves as a safety check for multiple label updates in the same row.
| final Vertex prior = labelReplacements.get(originalRid); | |
| if (prior != null) { | |
| propagateUpdateToSameNodeAliases(result, vertex, prior); | |
| vertex = prior; | |
| } | |
| Vertex prior = labelReplacements.get(originalRid); | |
| if (prior != null) { | |
| Vertex next; | |
| while ((next = labelReplacements.get(prior.getIdentity())) != null) | |
| prior = next; | |
| propagateUpdateToSameNodeAliases(result, vertex, prior); | |
| vertex = prior; | |
| } |
There was a problem hiding this comment.
Implemented the chain traversal as suggested. The loop follows prior.getIdentity() through labelReplacements until it finds the head, so a vertex replaced A->B->C across rows is always resolved to C. Same pattern also applied inside the pre-resolution block added to applySetOperations. Commit: 48d049a.
… writtenDocs entry on label replace, add edge-preservation fanout test - Fix inaccurate javadoc in setLabelIdempotentOnSameNodeAcrossRowFanout: Alice qualifies for one pair (Alice>Bob), not twice - Pass writtenDocs into applyLabels so the stale MutableDocument entry for the old RID is removed after vertex replacement, preventing subsequent rows from reading the replaced document's pre-replacement state - Add setLabelPreservesEdgesAcrossRowFanout test: verifies that edges copied during the first label-replacement are still present after the second fanout hit redirects to the already-replaced vertex Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… in SetStep - At the start of each row in applySetOperations, walk labelReplacements to update any result property that still holds a deleted (old-RID) vertex so that property-SET operations via resolveLatestDoc observe the live vertex - In applyLabels, follow the replacement chain recursively so multi-step replacements (A->B->C) are resolved to the current head regardless of how many label changes have been applied in prior rows Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code Review -
|
Code Review - PR #4025:
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4025 +/- ##
==========================================
- Coverage 64.91% 64.24% -0.67%
==========================================
Files 1597 1597
Lines 120132 120168 +36
Branches 25580 25587 +7
==========================================
- Hits 77981 77204 -777
- Misses 31377 32296 +919
+ Partials 10774 10668 -106 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Summary
Fixes #4017
SET n:Labelcrashed when the same node appeared on multiple rows (e.g. cartesianMATCH (n:Person) MATCH (m:Person) WHERE n.age > m.age SET n:Employee). Row 1 deleted the old vertex and created a new one with a fresh RID; row 2 still held the deleted vertex reference and failed with "Record not found" or "Duplicated key".labelReplacements: Map<RID, Vertex>field to theResultSetanonymous class inSetStep.syncPull(), alongside the existingwrittenDocsmap. On each call toapplyLabels, if the incoming vertex's RID was already replaced on a prior row, the current row's variable is redirected to the already-replaced vertex and the existing type-equality check returns early (idempotent). Otherwise the normal delete+create flow runs and the replacement is recorded for future rows.Test plan
setLabelIdempotentOnSameNodeAcrossRowFanout- exact reproducer from the issue (single label, cartesian MATCH)setMultipleLabelsIdempotentOnSameNodeAcrossRowFanout- stronger case with two labels (SET n:Employee:Checked)OpenCypher*suite: 5406 tests, 0 failures, 0 regressions🤖 Generated with Claude Code