[1/4] Multi-hop query: Mutation + multi-get query#226

Merged: em3s merged 7 commits into main from feat/214-mutation on Apr 1, 2026

@zipdoki zipdoki commented Mar 26, 2026

Part of #214

Summary

Introduces EdgeCache, a wide row structure that stores multiple edges per source in a single HBase row. Adds mutation support (CREATE/DELETE/UPDATE) and a seek API for range queries via ColumnRangeFilter.

Data Model

[EdgeCache - Wide Row]
Row: hash | directed source | tableCode | EDGE_CACHE | direction | cacheCode
Qualifier: cacheValues (indexField + order) | directed target
Value: version + properties

Unlike EdgeIndex (narrow row: one row per edge), EdgeCache stores multiple edges as qualifiers within a single row. Supports range queries via ColumnRangeFilter.
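
To make the wide-row idea concrete, here is a minimal in-memory sketch in plain Kotlin. It is not the PR's code: `CacheRowKey`, `CacheQualifier`, `CacheValue`, and `WideRowStore` are hypothetical names, the hash prefix and the EDGE_CACHE marker are omitted, and strings and numbers stand in for the real binary encodings, which this description does not show. It only illustrates that all edges of one source live under a single row key, sorted by qualifier, so a range read touches one row.

```kotlin
import java.util.TreeMap

// Hypothetical, simplified model of the layout described above.
data class CacheRowKey(
    val source: String,
    val tableCode: Int,
    val direction: String, // "OUT" or "IN"
    val cacheCode: Int,
)

data class CacheQualifier(val cacheValue: Long, val target: String) : Comparable<CacheQualifier> {
    // Sort by the index value first, then by target, mimicking the
    // ordering of an encoded (cacheValues | target) qualifier.
    override fun compareTo(other: CacheQualifier): Int =
        compareValuesBy(this, other, { it.cacheValue }, { it.target })
}

data class CacheValue(val version: Long, val properties: Map<String, Any?>)

// One wide row per key: many edges of the same source, kept sorted by qualifier.
class WideRowStore {
    private val rows = HashMap<CacheRowKey, TreeMap<CacheQualifier, CacheValue>>()

    fun put(key: CacheRowKey, qualifier: CacheQualifier, value: CacheValue) {
        val row = rows.getOrPut(key) { TreeMap() }
        row[qualifier] = value
    }

    // Range read within a single row over [min, max).
    fun range(key: CacheRowKey, min: CacheQualifier, max: CacheQualifier): List<CacheQualifier> =
        rows[key]?.subMap(min, true, max, false)?.keys?.toList() ?: emptyList()
}

fun main() {
    val store = WideRowStore()
    val key = CacheRowKey(source = "user:1", tableCode = 7, direction = "OUT", cacheCode = 1)
    store.put(key, CacheQualifier(300, "user:4"), CacheValue(1, emptyMap()))
    store.put(key, CacheQualifier(100, "user:2"), CacheValue(1, emptyMap()))
    store.put(key, CacheQualifier(200, "user:3"), CacheValue(1, emptyMap()))

    // Edges whose index value falls in [100, 300), read from one row.
    val hits = store.range(key, CacheQualifier(100, ""), CacheQualifier(300, ""))
    println(hits.map { it.target })
}
```

In a narrow-row layout the same read would span one row per edge; here the sorted map inside a single row plays the role that qualifier ordering plays in HBase.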

Test Plan

| Test | Coverage |
| --- | --- |
| EdgeCacheRecordMapperTest | Cache record encode/decode round-trip (key, qualifier, value) |
| EdgeMutationBuilderTest | Cache record generation in mutation pipeline (CREATE/DELETE/UPDATE) |
| V2BackedTableBindingTest | EdgeCache Put/Delete HBase mutation conversion |
| EdgeCacheQueryE2ETest | Seek API end-to-end: OUT/IN direction, cursor-based pagination, MultiEdge support |

Run: ./gradlew test

Introduces a wide row structure that stores multiple edges per source in a single HBase row, enabling efficient multi-get queries via ColumnRangeFilter.

- EdgeCacheRecord with encoder/decoder
- Mutation pipeline support (CREATE/DELETE/UPDATE)
- Seek API with cursor-based pagination
@zipdoki zipdoki requested a review from em3s March 26, 2026 06:18
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Mar 26, 2026
offset?.let {
    val decoded = CryptoUtils.decodeAndDecryptUrlSafe(it)
    decoded + 0x00.toByte()
}

Contributor Author

@em3s
Used qualifier + 0x00 to express "start after" instead of ColumnRangeFilter's minColumnInclusive=false.

// current: byte-append trick
val offsetNext = decoded + 0x00.toByte()
ColumnRangeFilter(min, minColumnInclusive = true, max, maxColumnInclusive = false)

// alternative: use native exclusive flag
val offsetNext = decoded
ColumnRangeFilter(min, minColumnInclusive = false, max, maxColumnInclusive = false)

The byte-append keeps hbaseGetWideRow simpler (no inclusive params), but the native flag avoids byte-level assumptions. Open to either approach.
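
The two options can be compared without an HBase cluster. The sketch below is an illustration only: a `TreeSet` ordered by unsigned lexicographic byte comparison (the order HBase uses for qualifiers) stands in for the columns of one row, and `columns`, `startAfterByAppend`, and `startAfterByExclusive` are hypothetical helper names, not functions from this PR. It requires JDK 9+ for `Arrays.compareUnsigned`.

```kotlin
import java.util.Arrays
import java.util.TreeSet

// Unsigned lexicographic order over byte arrays.
val unsignedLex = Comparator<ByteArray> { a, b -> Arrays.compareUnsigned(a, b) }

// Builds a sorted set of qualifiers standing in for one row's columns.
fun columns(vararg names: String): TreeSet<ByteArray> {
    val set = TreeSet<ByteArray>(unsignedLex)
    names.forEach { set.add(it.toByteArray()) }
    return set
}

// Strategy 1: append 0x00 to the cursor, then use an inclusive lower bound.
// cursor + 0x00 is the smallest byte string strictly greater than cursor.
fun startAfterByAppend(cols: TreeSet<ByteArray>, cursor: ByteArray): List<String> =
    cols.tailSet(cursor + 0x00.toByte(), true).map { it.decodeToString() }

// Strategy 2: keep the cursor unchanged and use an exclusive lower bound.
fun startAfterByExclusive(cols: TreeSet<ByteArray>, cursor: ByteArray): List<String> =
    cols.tailSet(cursor, false).map { it.decodeToString() }

fun main() {
    val cols = columns("a", "ab", "b", "c")
    val cursor = "a".toByteArray()
    println(startAfterByAppend(cols, cursor))
    println(startAfterByExclusive(cols, cursor))
}
```

Both strategies skip the cursor itself and resume at the next qualifier. The append variant depends on 0x00 being the lowest byte in the sort order, which is the byte-level assumption the exclusive flag avoids.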

Contributor

+1 for exclusive flag.

Contributor Author

@em3s
Done. Switched to native exclusive flag and removed the byte-append trick.

val nextOffset =
    if (hasNext) {
        results.lastOrNull()?.qualifier?.let {
            CryptoUtils.encryptAndEncodeUrlSafe(it)

Contributor Author

@em3s
Reusing v2's CryptoUtils for offset encoding, same as the existing scan path. This will need updating once #221 lands.

Contributor

OK, use CryptoUtils and will address in a separate task.
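
For readers following the pagination logic in this thread, here is a rough, self-contained sketch of a cursor flow built around a `hasNext` flag and a `nextOffset` taken from the last returned qualifier. Several parts are assumptions rather than the PR's implementation: `encodeCursor` and `decodeCursor` are hypothetical stand-ins for `CryptoUtils` that use plain URL-safe Base64 with no encryption, the "fetch limit + 1" way of computing `hasNext` is a guess, and a sorted list of strings stands in for one wide row.

```kotlin
import java.util.Base64

// Hypothetical stand-ins for CryptoUtils: URL-safe Base64 only, no encryption.
fun encodeCursor(qualifier: String): String =
    Base64.getUrlEncoder().encodeToString(qualifier.toByteArray())

fun decodeCursor(offset: String): String =
    Base64.getUrlDecoder().decode(offset).decodeToString()

data class Page(val items: List<String>, val nextOffset: String?)

// `sortedQualifiers` plays the role of one wide row's qualifiers, already sorted.
fun seek(sortedQualifiers: List<String>, limit: Int, offset: String? = null): Page {
    val after = offset?.let(::decodeCursor)
    // Exclusive lower bound: resume strictly after the previous page's last qualifier.
    val candidates = sortedQualifiers.filter { after == null || it > after }
    // Assumption: fetch one extra element to learn whether another page exists.
    val fetched = candidates.take(limit + 1)
    val hasNext = fetched.size > limit
    val items = fetched.take(limit)
    val nextOffset = if (hasNext) items.lastOrNull()?.let(::encodeCursor) else null
    return Page(items, nextOffset)
}

fun main() {
    val row = listOf("100|u2", "200|u3", "300|u4", "400|u5", "500|u6")
    var offset: String? = null
    do {
        val page = seek(row, limit = 2, offset = offset)
        println(page.items)
        offset = page.nextOffset
    } while (offset != null)
}
```

Because the cursor is opaque to the caller, the encoding behind it can be swapped later without changing the client-facing contract.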

@zipdoki zipdoki mentioned this pull request Mar 26, 2026
5 tasks
zipdoki commented Mar 26, 2026

@em3s
Aware this is a large diff — sorry about that. Keeping it as one PR for now to make the overall design easier to review. Plan to split into two when merging:

  1. Core + Mutation — data model, codec, mutation pipeline, metadata
  2. Seek API — query endpoint, controller, E2E tests

Happy to split sooner if preferred.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
em3s previously requested changes Mar 27, 2026

while (buffer.hasRemaining()) {
    val propertyHashKey: Int = buffer.getValue()
    val propertyValue: Any? = buffer.getValueOrNull()

Contributor

decodeValue in EdgeIndexRecordMapper uses buffer.getValue() — why the different approach here?

Contributor Author

@em3s
Nullable properties end up as null in HBase (specialStateValueToNull() converts __UNSET__ → null before write).

e.g. schema has receivedFrom (nullable=true), mutation omits it → stored as null in EdgeIndex value.

V2 decoder handles this fine (ValueUtils.deserialize() returns null), but V3 EdgeIndexRecordMapper.decodeValue() uses getValue() which throws. Should use getValueOrNull() like EdgeCacheRecordMapper.
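
The difference between a strict and a null-tolerant reader can be sketched as follows. Everything here is hypothetical: the one-byte tag format (0 = null, 1 = Int) is invented for illustration, and `readInt32Strict` and `readInt32OrNull` are made-up stand-ins for the `getValue()` and `getValueOrNull()` helpers named above, whose real encoding is not shown in this thread.

```kotlin
import java.nio.ByteBuffer

// Hypothetical wire format: 1-byte tag (0 = null, 1 = Int) followed by the payload.
const val TAG_NULL: Byte = 0
const val TAG_INT: Byte = 1

// Strict reader: throws when it meets a null marker.
fun ByteBuffer.readInt32Strict(): Int {
    val tag = get()
    check(tag == TAG_INT) { "unexpected tag $tag" }
    return getInt()
}

// Lenient reader: maps the null marker to Kotlin null.
fun ByteBuffer.readInt32OrNull(): Int? =
    when (val tag = get()) {
        TAG_NULL -> null
        TAG_INT -> getInt()
        else -> error("unexpected tag $tag")
    }

// Encodes (hashKey, value) pairs; a null value is written as a bare null tag.
fun encode(properties: List<Pair<Int, Int?>>): ByteBuffer {
    val buffer = ByteBuffer.allocate(properties.size * 10)
    for ((hashKey, value) in properties) {
        buffer.put(TAG_INT).putInt(hashKey)
        if (value == null) buffer.put(TAG_NULL) else buffer.put(TAG_INT).putInt(value)
    }
    buffer.flip()
    return buffer
}

fun decode(buffer: ByteBuffer): Map<Int, Int?> {
    val out = LinkedHashMap<Int, Int?>()
    while (buffer.hasRemaining()) {
        val propertyHashKey = buffer.readInt32Strict() // keys are never null
        val propertyValue = buffer.readInt32OrNull()   // values may be null
        out[propertyHashKey] = propertyValue
    }
    return out
}

fun main() {
    // The second property was omitted by the mutation and is stored as null.
    println(decode(encode(listOf(1 to 42, 2 to null))))
}
```

Reading the value slot with the strict reader would throw on the second property, which mirrors the failure described for the V3 EdgeIndexRecordMapper.
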

import org.junit.jupiter.api.Test

/**
 * EdgeCache (Wide Row) HBase layout:

Contributor

just EdgeCache (Wide Row) layout:

Contributor Author

@em3s
Done.

em3s commented Mar 27, 2026

@zipdoki

Large changes OK. This PR is not production-facing, so I suggest we merge first and iterate.
Just resolve the above reviews please.

zipdoki and others added 5 commits March 30, 2026 13:55
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Null properties are stored as "__UNSET__", not raw null.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace qualifier + 0x00 byte trick with ColumnRangeFilter's
minColumnInclusive=false for "start after" semantics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@zipdoki zipdoki self-assigned this Apr 1, 2026
zipdoki commented Apr 1, 2026

@em3s
Thanks for the review! I've addressed the requested changes in the latest commits. Please take another look when you have a chance.

@em3s em3s requested review from em3s and removed request for em3s April 1, 2026 07:02
@em3s em3s dismissed their stale review April 1, 2026 07:02

resolved.

@em3s em3s left a comment

@zipdoki

LGTM 👍 — let's merge and iterate.

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Apr 1, 2026
@em3s em3s merged commit 5dcdb78 into main Apr 1, 2026
7 checks passed
Labels

enhancement New feature or request lgtm This PR has been approved by a maintainer size:XXL This PR changes 1000+ lines, ignoring generated files.

2 participants