Skip to content

fix(#4453): prevent silent sample loss on concurrent single-row time series INSERTs#4455

Merged
robfrank merged 6 commits into
mainfrom
fix/4453-timeseries-concurrent-insert-lost-update
Jun 2, 2026
Merged

fix(#4453): prevent silent sample loss on concurrent single-row time series INSERTs#4455
robfrank merged 6 commits into
mainfrom
fix/4453-timeseries-concurrent-insert-lost-update

Conversation

@robfrank

@robfrank robfrank commented Jun 1, 2026

Copy link
Copy Markdown
Collaborator

Closes #4453

Summary

The silent sample loss on concurrent single-row time series INSERTs is caused by a single bug: a compaction Phase 4 race in TimeSeriesShard.

Compaction Phase 4 race (TimeSeriesShard)

Phase 4a read the last partial page (phase4aPageCount) as a snapshot under the write lock. Phase 4b ran lock-free, allowing concurrent appendSamples calls to add more samples to that same page. Phase 4c only read pages after phase4aPageCount, so those Phase-4b samples were lost when clearDataPages() reset the logical count to 0.

Fix: Phase 4a no longer reads the last partial page (condition changed from phase4aPageCount > lastFullPage to > lastFullPage + 1, reading [lastFullPage+1, phase4aPageCount-1]). Phase 4c re-reads phase4aPageCount (with its fully up-to-date count, including Phase-4b appends) plus any new pages, all under the write lock.

Note on the dropped executor-dispatch change

An earlier revision also dispatched TimeSeriesEngine.appendSamples to the shard executor (to make the per-row append a top-level transaction). That change was dropped: it is unnecessary (the compaction fix alone fixes the loss, verified by the A/B below) and it moves the append out of the caller's transaction, which under HA reorders shard page versions on the Raft log. appendSamples now runs on the caller's thread and nests into the enclosing command transaction, so the page writes ship as a single in-order transaction.

A/B on TimeSeriesConcurrentInsertTest (48 threads x 5000, SHARDS 1):

  • main (no fix): fails, ~100 samples lost per run.
  • compaction Phase-4 fix only (no dispatch): passes, zero loss.

Test plan

  • TimeSeriesConcurrentInsertTest.concurrentSingleRowInsertsDoNotLoseSamples (was @Disabled, now enabled): 48 threads x 5000 single-row INSERTs with SHARDS=1, asserts zero sample loss. Passes only with the compaction fix.
  • New TimeSeriesGrpcHaConcurrentInsertIT: embedded 2-node Raft HA + gRPC, concurrent single-row INSERTs; asserts no ingest errors and that every node converges to the full sample count. Guards against reintroducing the out-of-band append dispatch.
  • Full engine time series suite green (66 test classes).

Out of scope

A separate, flaky WAL page-version-gap under compaction + HA (concurrent inserts overlapping a background compaction, where compaction page-0 writes ride a deferred SCHEMA_ENTRY while appends ship immediate TX_ENTRY) is tracked in #4458. It is not addressed here.

…series INSERTs

Two fixes:

1. Dispatch appendSamples to the shard executor thread (which has no enclosing
   transaction) so the nested begin/commit always operates as a true level-1
   transaction and immediately publishes page updates to the cache. On the SQL
   INSERT path, the enclosing command transaction caused a nested commit that
   deferred the page-cache update outside appendLock; a following serialized
   append read a stale DATA_SAMPLE_COUNT_OFFSET and wrote to the same slot.

2. Fix compaction Phase 4: Phase 4a no longer reads the last partial page
   (phase4aPageCount). Phase 4c re-reads it under the write lock with its
   up-to-date count, capturing samples written to it during the lock-free Phase 4b.
@codacy-production

codacy-production Bot commented Jun 1, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 0 complexity

Metric Results
Complexity 0

View in Codacy

🟢 Coverage 100.00% diff coverage · -7.74% coverage variation

Metric Results
Coverage variation -7.74% coverage variation
Diff coverage 100.00% diff coverage

View coverage diff in Codacy

Coverage variation details
Coverable lines Covered lines Coverage
Common ancestor commit (d5463d5) 124318 92528 74.43%
Head commit (4256cf8) 156029 (+31711) 104054 (+11526) 66.69% (-7.74%)

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details
Coverable lines Covered lines Diff coverage
Pull request (#4455) 3 3 100.00%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a critical concurrency issue (#4453) where concurrent single-row SQL INSERTs into a time series engine could silently lose samples due to nested transactions and race conditions during compaction. The fix dispatches shard writes asynchronously on a dedicated executor to ensure top-level transactions are used, and adjusts the compaction phases to safely defer reading the last partial page under a write lock. Additionally, a previously disabled integration test has been re-enabled. The review feedback suggests two valuable improvements: directly propagating RuntimeExceptions instead of wrapping them in a generic IOException when handling CompletionException in TimeSeriesEngine, and simplifying a redundant conditional check in TimeSeriesShard that results in dead code.

Comment on lines +133 to +137
} catch (final CompletionException e) {
if (e.getCause() instanceof IOException ioe)
throw ioe;
throw new IOException("TimeSeries append failed", e.getCause());
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When catching CompletionException, any RuntimeException thrown during the async execution (such as NullPointerException or IllegalArgumentException) will be wrapped inside the CompletionException. Currently, the code wraps all non-IOException causes into a generic IOException. It is better to propagate RuntimeExceptions directly to avoid obscuring the true nature of the error and forcing callers to handle them as checked exceptions.

    } catch (final CompletionException e) {
      if (e.getCause() instanceof IOException ioe)
        throw ioe;
      if (e.getCause() instanceof RuntimeException re)
        throw re;
      throw new IOException("TimeSeries append failed", e.getCause());
    }

Comment on lines 543 to 548
Object[] tailData = null;
if (finalPageCount > phase4aPageCount)
if (phase4aPageCount > lastFullPage && finalPageCount >= phase4aPageCount)
// Include phase4aPageCount itself: it may have received new samples during Phase 4b.
tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount, finalPageCount);
else if (finalPageCount > phase4aPageCount)
tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount + 1, finalPageCount);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The condition phase4aPageCount > lastFullPage is always true because phase4aPageCount is at least snapshotDataPageCount (which is lastFullPage + 1). Additionally, finalPageCount >= phase4aPageCount is always true because page count can only increase or stay the same between Phase 4a and Phase 4c. Therefore, the first if condition is always met, making the else if branch dead code. This can be simplified by removing the redundant checks and the unreachable else if block.

Suggested change
Object[] tailData = null;
if (finalPageCount > phase4aPageCount)
if (phase4aPageCount > lastFullPage && finalPageCount >= phase4aPageCount)
// Include phase4aPageCount itself: it may have received new samples during Phase 4b.
tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount, finalPageCount);
else if (finalPageCount > phase4aPageCount)
tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount + 1, finalPageCount);
final Object[] tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount, finalPageCount);

@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Code Review posted via API - see inline

@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Code Review - PR 4455

Overview

This PR closes two independent data-loss paths in the time series engine:

  • Fix 1: Dispatches appendSamples via the shard executor to avoid nested-transaction stale reads.
  • Fix 2: Defers reading the last partial page in compaction Phase 4a to Phase 4c (under write lock), closing a race with concurrent appends during the lock-free Phase 4b.

The root-cause analysis is solid and well-documented. Overall this is a correctness-critical fix with a clear rationale.

Issues and Suggestions

  1. Documentation file committed to docs/ - should be removed: docs/4453-timeseries-concurrent-insert-lost-update.md is an investigation/design doc. This belongs in the GitHub issue 4453 or the PR description - not checked into the repo as a permanent artifact. The PR description already covers this material comprehensively. Please remove the file before merging.

  2. Fix 1 - appendBatch(n=1) fast path adds double-dispatch overhead: appendBatch has a fast path for single-element batches that delegates to the public appendSamples. After this fix, that fast path goes through a cross-thread dispatch plus a blocking .join(), adding scheduling overhead for every single-sample batch call. For multi-sample batches, the code calls shards[shardIdx].appendSamples() directly on the executor thread (the correct approach). Consider aligning the fast path the same way to avoid the redundant indirection.

  3. Fix 1 - Exception wrapping loses shard context: In the catch block the IOException message does not include the shard index or the original exception message, making production diagnosis harder. Consider including the shard index and original message.

  4. Fix 2 - Phase 4c else-if branch needs a clarifying comment: The else-if branch in Phase 4c handles the case where phase4aPageCount == lastFullPage (no partial page existed in Phase 4a, so it reads only new pages created during Phase 4b). The logic is correct but dense - a brief comment explaining when the else-if fires would help future readers distinguish the two branches.

  5. No test for Fix 2 (compaction Phase 4 race): Fix 1 is well covered by the re-enabled concurrent insert regression test. Fix 2 has no dedicated test. It is harder to trigger deterministically, but the absence of coverage is a future-regression risk. A comment in the test class noting the gap would help contributors know what is untested.

  6. @tag slow placement is correct: The class-level @tag(slow) on TimeSeriesConcurrentInsertTest is appropriate for 48 threads x 5000 SQL INSERTs and aligns with the project tagging guidelines.

Summary table:

  • docs/ investigation file should be removed: Medium severity, Blocking
  • appendBatch(n=1) fast path double-dispatch overhead: Low severity, Non-blocking
  • Exception message missing shard context: Low severity, Non-blocking
  • Phase 4c else-if branch lacks explanatory comment: Low severity, Non-blocking
  • No test for Fix 2 compaction race: Medium severity, Non-blocking

The core fixes are correct and the approach is consistent with the existing appendBatch executor pattern. The main blocker is removing the investigation document from the repo. Test coverage for Fix 1 is solid - well done on re-enabling the regression test.

@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Code Review

Overview

This PR closes two independent data-loss paths in the time series engine:

  1. Fix 1 - SQL INSERT path: appendSamples dispatched directly on the caller's thread could run inside a nested transaction, deferring the page-cache update until the outer commit (outside appendLock), causing slot collisions.
  2. Fix 2 - Compaction Phase 4 race: Phase 4a read the last partial page under the write lock, but Phase 4b ran lock-free, so concurrent appends to that page were lost when clearDataPages() ran in Phase 4c.

The root cause analysis is solid, the PR description is exceptionally clear, and the overall approach is correct. A few items worth discussing before merging:


Correctness

Fix 1 (TimeSeriesEngine.appendSamples)

The dispatch to shardExecutor correctly breaks the nested-transaction problem. Exception unwrapping handles IOException, RuntimeException, and the fallthrough case - no issue there.

Potential executor deadlock: shardExecutor is a fixed-size pool of shardCount threads. appendBatch already submits to this same pool and then calls .allOf(...).join() from the caller's thread, which is safe. However if appendSamples is ever called from within a shardExecutor thread (e.g. through some future appendBatchappendSamples n=1 fast-path path), blocking .join() on the same pool can deadlock when all threads are already waiting. Currently the fast-path appendBatch(n=1) calls appendSamples, which is fine as long as the caller is not a shard executor thread itself. Worth adding a guard or a brief comment acknowledging this constraint.

Fix 2 (TimeSeriesShard.compactInternal)

The boundary arithmetic is correct:

  • Phase 4a now reads [lastFullPage+1, phase4aPageCount-1] - correctly excludes the last partial page.
  • Phase 4c reads [phase4aPageCount, finalPageCount] under the write lock - re-reads the last partial page with its final, up-to-date sample count.
  • When phase4aPageCount == lastFullPage + 1 (only one partial page existed at Phase 4a snapshot time), phase4aData is null and Phase 4c picks it up in full. Correct.
  • The comment "phase4aPageCount is always >= lastFullPage + 1" depends on Phase 0's early-return guard being reliable - this invariant appears to hold from reading Phase 0's code.

Performance

Thread-switch overhead per single-row insert (Fix 1): Every SQL INSERT into a time series type now incurs a context switch to a shard executor thread and blocks until completion. For high-throughput single-row insert workloads (the exact scenario the regression test exercises: 48 threads × 5000 inserts), this adds measurable scheduling overhead compared to the original direct call. The existing appendBatch path already pays this cost and was designed for it; the single-sample path was not. Consider documenting this tradeoff in the method Javadoc, or noting a follow-up to investigate whether the nested-transaction issue could instead be closed at the transaction layer (e.g. getWrappedDatabaseInstance() pattern used elsewhere in the engine) to avoid the hop for callers that are not inside an enclosing transaction.


Test Coverage

Fix 1 is well-covered - concurrentSingleRowInsertsDoNotLoseSamples is a deterministic 48-thread × 5000-insert regression test that directly exercises the slot-collision path.

Fix 2 has no dedicated regression test. The compaction Phase 4 race (samples added to page phase4aPageCount during the lock-free Phase 4b are then lost) is a separate bug path that requires concurrent appends racing against a compaction cycle. It's harder to reproduce deterministically, but a test that:

  • inserts a batch large enough to trigger compaction
  • concurrently appends single rows during compaction
  • then verifies total sample count

...would close the gap and prevent regression of Fix 2 independently.


Documentation file

docs/4453-timeseries-concurrent-insert-lost-update.md has two ## Root Cause sections (one plain, one labeled (confirmed)) with partially overlapping content. The second supersedes the first - consider removing the first section or merging them to avoid confusion for future readers.


Style / minor notes

  • The three if (e.getCause() instanceof ...) branches in appendSamples are clear, but the comment // Propagate unchecked exceptions (e.g. NullPointerException from a bug) as-is explains what the code does rather than why. The why is the caller contract (IOException boundary) - a one-line note on that would be more durable.
  • The inline comments in compactInternal are verbose but the concurrency reasoning is genuinely non-obvious, so the length is justified here.
  • @Tag("slow") at the class level is correct per project conventions.

Summary

Both fixes are logically correct and well-motivated. The main items to address or acknowledge before merging:

  1. Medium - Document (or mitigate) the per-insert thread-switch cost introduced by Fix 1.
  2. Low - Add a regression test for Fix 2 (compaction Phase 4 race).
  3. Low - Clean up the duplicate Root Cause section in the docs file.
  4. Low - Add a comment in appendSamples acknowledging that callers must not be shard executor threads (deadlock guard).

…int, add shard index to error, remove tracking doc

- Document the per-insert thread hand-off cost and the constraint that
  appendSamples must not be called from a shard-executor thread (deadlock).
- Include the shard index in the append-failure IOException message.
- Note in the regression test that it also exercises the compaction Phase-4
  race because the maintenance scheduler compacts mid-run.
- Remove the investigation doc from the repo; the PR description covers it.
@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Code Review: fix(#4453) - prevent silent sample loss on concurrent single-row time series INSERTs

Overall: This is a well-reasoned, targeted fix for two genuine race conditions. The PR description is thorough, the analysis is accurate, and the changes are minimal. A few points worth discussing below.


Fix 1 - TimeSeriesEngine.appendSamples (dispatching via shardExecutor)

Correctness: Sound. Dispatching to shardExecutor ensures the begin()/commit() cycle inside TimeSeriesShard.appendSamples runs as a true top-level transaction with no enclosing outer transaction, so the page-cache update commits before appendLock.unlock(). This directly mirrors what appendBatch does for multi-sample writes.

Exception handling is mostly consistent with appendBatch, but there is one gap: if the async task throws an Error (e.g. OutOfMemoryError, AssertionError), neither instanceof IOException nor instanceof RuntimeException matches, so it gets wrapped in IOException and re-thrown as a checked exception. The existing appendBatch handler has the same gap. While this is an edge case (and consistent with existing behaviour), adding:

if (e.getCause() instanceof Error err)
    throw err;

before the fallback throw new IOException(...) would make the intent explicit and consistent with the "propagate as-is" comment above it.

Threading constraint is undocumented in enforcement: The Javadoc note ("this method must NOT be invoked from a shard-executor thread") is essential and well-placed, but there is no runtime guard. A future caller inside a task submitted to shardExecutor would block a pool thread on its own join(), consuming one of the shardCount threads and potentially causing all shardCount threads to block waiting on each other under high concurrency (the pool is Executors.newFixedThreadPool(shardCount) with an unbounded queue - not a hard deadlock but a starvation risk). Consider a lightweight assert:

assert !Thread.currentThread().getName().startsWith("ArcadeDB-TS-Shard-" + typeName)
    : "appendSamples must not be called from a shard-executor thread";

This costs nothing in production (asserts are off by default) but catches the mistake in test runs.

appendBatch fast path (n==1) is safe: It calls TimeSeriesEngine.appendSamples from the caller thread (not from within a shard task), so no recursive executor submission occurs. The multi-sample path already calls shards[shardIdx].appendSamples directly inside the executor task, bypassing TimeSeriesEngine.appendSamples, so there is no recursive dispatch there either. Both paths are fine.


Fix 2 - TimeSeriesShard.compactInternal Phase 4 (last-page race)

Correctness: Also sound. The key invariant is:

  • Phase 4a reads pages [lastFullPage+1 .. phase4aPageCount-1] (partial pages except the very last), under write lock.
  • Phase 4b runs lock-free; concurrent appendSamples calls may write more samples to page phase4aPageCount.
  • Phase 4c reads [phase4aPageCount .. finalPageCount] under write lock, capturing the fully up-to-date last page plus any pages added during Phase 4b.

The boundary condition when phase4aPageCount == lastFullPage + 1 (only one partial page exists) is handled correctly: the condition phase4aPageCount > lastFullPage + 1 evaluates to false, phase4aData is set to null, and Phase 4c reads the single page entirely on its own.

readPagesRangeForCompaction(phase4aPageCount, finalPageCount) when they are equal reads exactly one page (the loop pageNum <= toPage runs once), returning null if the page has no samples or a compaction array otherwise. The merge logic for toCompressFinal still null-checks tailData, so this is handled correctly.

The switch from Object[] tailData = null (conditional) to final Object[] tailData = ... (always assigned) is clean. The null case is preserved because readPagesRangeForCompaction returns null when all rows are empty, and the downstream merge already handles that.


Test Coverage

Removing @Disabled and enabling concurrentSingleRowInsertsDoNotLoseSamples is the right outcome. The test (48 threads x 5000 INSERTs, SHARDS=1, @Tag("slow")) directly exercises both fixed paths:

  1. Slot collision via nested transaction.
  2. Phase-4 race during a background compaction triggered mid-run.

The test is deterministic enough to be a reliable regression guard. Tagging it @Tag("slow") is correct per project conventions.


Minor nits

  • The Javadoc on appendSamples is now quite long (four paragraphs before the @param). The threading constraint paragraph is the most critical and should stay. The performance tradeoff paragraph could be condensed to one sentence since it repeats information already in the PR description.
  • The inline comment // Read only up to phase4aPageCount - 1; the last page is read in Phase 4c. is clear and appropriate.

Summary

Area Assessment
Correctness of Fix 1 Correct
Correctness of Fix 2 Correct
Exception handling Mostly consistent; Error subtypes wrapped as IOException (same as appendBatch)
Threading guard Documented but not runtime-enforced; consider a debug assert
Test coverage Excellent - both races covered
Style / conventions Consistent with existing code

Two actionable suggestions:

  1. Add an explicit Error re-throw branch in the CompletionException handler (or at minimum note that Error wrapping is intentional).
  2. Consider a assert !shard-executor-thread guard to catch future misuse early.

Neither is a blocker. The fix is correct and the test confirms it.

…st shard-thread reentry

- Re-throw Error (e.g. OOM/AssertionError) unwrapped instead of masking it as a
  checked IOException, in both appendSamples and appendBatch.
- Add a debug assert in appendSamples guarding against being called from a
  shard-executor thread, which would block its own fixed-size pool.
@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Code Review - PR #4455

Overview

This PR closes two independent data-loss paths in the time series engine. The root-cause analysis is clear and the fixes are well-targeted.

Fix 1 (TimeSeriesEngine.java): appendSamples was called directly on the caller's thread. When that thread already had an open transaction (e.g., an HTTP command handler), the inner db.begin()/commit() inside the shard created a nested transaction, which deferred page-cache updates to the outer commit - outside appendLock. The next serialized append then read a stale DATA_SAMPLE_COUNT_OFFSET and overwrote the same slot. Fix: dispatch through shardExecutor so the inner transaction always operates at level 1.

Fix 2 (TimeSeriesShard.java): Phase 4a read the last partial page under the write lock, then released it. During the lock-free Phase 4b, concurrent appends could add more samples to that same page. Phase 4c only read pages after phase4aPageCount, missing those Phase-4b samples before clearDataPages() reset the count. Fix: defer reading the last page to Phase 4c (under write lock) so all samples are captured.


Code Quality

Correct and consistent exception unwrapping. The pattern added to both appendSamples and appendBatch now correctly re-throws Error and RuntimeException without wrapping, which is the right behavior. Good consistency improvement.

Phase 4 boundary logic is sound. The condition change from phase4aPageCount > lastFullPage to phase4aPageCount > lastFullPage + 1 is necessary: with the new range ending at phase4aPageCount - 1, reading [lastFullPage+1, phase4aPageCount-1] would be empty (invalid) when phase4aPageCount == lastFullPage + 1, so the guard is correct.

tailData is now read unconditionally. The old code read tail pages only when finalPageCount > phase4aPageCount. The new code always calls readPagesRangeForCompaction(phase4aPageCount, finalPageCount). When the two values are equal, this reads exactly one page and the method returns null if it has 0 samples. This is correct but adds a small unconditional I/O call even for the common case where no new pages arrived during Phase 4b. A comment acknowledging this tradeoff would help future readers.


Potential Issues

Deadlock guard is assert-only. The check:

assert !Thread.currentThread().getName().startsWith("ArcadeDB-TS-Shard-" + typeName)
    : "appendSamples must not be called from a shard-executor thread ...";

Asserts are disabled in production by default. If this constraint is violated at runtime, the thread deadlocks silently with no diagnostic output. The Javadoc warning and architecture enforcement are probably sufficient here, but if this is a critical invariant, a checked guard (e.g., a ThreadLocal<Boolean> flag set in the executor task) would catch misuse in production. Not a blocker given the current call sites are well-controlled.

Thread-name-based detection is fragile. The assert relies on the naming convention "ArcadeDB-TS-Shard-" + typeName. If the convention changes or a type name is a prefix of another type's name, the guard could give false positives or miss the violation. Acceptable for a dev-time assertion.

shardExecutor is shared across writes and reads. appendSamples, appendBatch, and aggregateMulti (parallel sealed reads) all submit to the same pool of shardCount threads. Under sustained concurrent inserts, aggregation queries may queue behind write tasks. This is pre-existing behavior and the PR does not make it worse, but worth noting as a follow-up if latency-sensitive reads become a concern.


Performance

The PR correctly documents the thread-handoff cost in the Javadoc:

dispatching to the shard executor adds one thread hand-off plus a blocking join() per call

For bursty single-row inserts this is real overhead. The recommendation to use the batch API for high-throughput ingestion is appropriate. No issues here.


Test Coverage

The test is well-designed:

  • 48 threads x 5000 inserts = 240,000 rows, with unique (sid, ts) per row so any slot collision is detectable by count.
  • SHARDS=1 maximizes contention on the single shard, which is exactly the failure scenario.
  • The background compaction scheduler fires during the run, exercising Fix 2 concurrently with Fix 1 - a nice emergent property of the test design documented in the class Javadoc.
  • Tagged @Tag("slow") per project guidelines.
  • 5-minute timeout is generous and appropriate.

Minor gap: The assertion verifies total count only. In theory a pathological scenario could produce the right count with wrong data. Given the unique timestamp and unique sid per row, a correct count is a strong proxy for correctness. A secondary spot-check asserting no duplicate (sid, ts) pairs would make this bulletproof, but it is not required.


Summary

The analysis is thorough and well-documented, the fixes are minimal and targeted, and the regression test is solid. The two minor items (assert-only deadlock guard, unconditional Phase 4c read without comment) are worth noting but are not blockers.

LGTM. Ready to merge.

@codecov

codecov Bot commented Jun 1, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 65.27%. Comparing base (d5463d5) to head (4256cf8).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
...om/arcadedb/engine/timeseries/TimeSeriesShard.java 66.66% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4455      +/-   ##
==========================================
+ Coverage   65.23%   65.27%   +0.03%     
==========================================
  Files        1612     1612              
  Lines      124318   124323       +5     
  Branches    26910    26912       +2     
==========================================
+ Hits        81104    81146      +42     
+ Misses      31818    31780      -38     
- Partials    11396    11397       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…cessary append dispatch

The silent sample loss on concurrent single-row TimeSeries INSERTs is caused
solely by the compaction Phase-4 race: the last partial data page was read in
Phase 4a and then cleared, dropping samples written to it during the lock-free
Phase 4b. The fix (already in TimeSeriesShard) defers reading that page to
Phase 4c under the write lock so those samples are captured.

Revert the shard-executor dispatch in TimeSeriesEngine.appendSamples: it is
unnecessary. The per-shard appendLock plus the nested-commit page publish
already prevent the hypothesized stale-slot read, so a direct in-thread call
loses no samples. Reverting it removes a per-insert thread hand-off and keeps
the append inside the caller transaction.

Verified: the embedded TimeSeriesConcurrentInsertTest passes with the
compaction fix alone (and fails on main); all engine time-series tests pass.

Add TimeSeriesGrpcHaConcurrentInsertIT (embedded 2-node Raft HA + gRPC) as a
regression guard for concurrent single-row TS inserts over gRPC. The separate
flaky WAL page-version-gap under compaction + HA is tracked in #4458.
@claude

claude Bot commented Jun 2, 2026

Copy link
Copy Markdown

Code Review: fix(#4453) - prevent silent sample loss on concurrent single-row time series INSERTs

Overview

This PR fixes a compaction Phase-4 race in TimeSeriesShard that caused silent sample loss when concurrent single-row INSERTs overlapped with a background compaction. The regression test that previously reproduced the bug is now enabled. A new HA integration test is also added.

The core fix is surgical and correct. A few items worth addressing before merge:


Critical: Port conflict in the new HA test

TimeSeriesGrpcHaConcurrentInsertIT uses BASE_GRPC_PORT = 51071, which is identical to GrpcFollowerForwardingIT.BASE_GRPC_PORT = 51071. When both tests run in the same Surefire fork (or on the same machine with slow port release), they will collide.

// TimeSeriesGrpcHaConcurrentInsertIT.java
private static final int BASE_GRPC_PORT = 51071;  // ← conflicts with GrpcFollowerForwardingIT

// GrpcFollowerForwardingIT.java
private static final int BASE_GRPC_PORT = 51071;  // ← pre-existing

Suggestion: use a port that isn't already taken - e.g. 51081 (checking no other test uses it) or derive the port dynamically.


PR description no longer matches the diff

The description still leads with "Fix 1 - dispatch single-row INSERTs via CompletableFuture.runAsync(..., shardExecutor).join()". That executor-dispatch approach was dropped in the final commit (fix(#4453): fix TS sample loss via compaction Phase-4 only; drop unne…).

The actual diff only contains the Phase-4 compaction race fix. The stale description will mislead anyone who reads it later. Please update to describe only what landed.


Fix correctness analysis

The Phase-4 fix is sound:

Before:

  • Phase 4a read pages [lastFullPage+1, phase4aPageCount] (inclusive of the last partial page) under the write lock
  • Phase 4b ran lock-free - concurrent appends could add more samples to page phase4aPageCount
  • Phase 4c read only pages [phase4aPageCount+1, finalPageCount], so the extra Phase-4b samples on page phase4aPageCount were never captured before clearDataPages() erased them

After:

  • Phase 4a reads [lastFullPage+1, phase4aPageCount-1] (excludes the last page)
  • Phase 4b runs lock-free as before
  • Phase 4c reads [phase4aPageCount, finalPageCount] under the write lock, capturing everything including the Phase-4b writes to page phase4aPageCount

The condition change from phase4aPageCount > lastFullPage to phase4aPageCount > lastFullPage + 1 correctly excludes the last partial page from Phase 4a.

Edge case to verify: Phase 4c now calls readPagesRangeForCompaction(phase4aPageCount, finalPageCount) unconditionally (no if finalPageCount > phase4aPageCount guard). The method's Javadoc says toPage ≥ fromPage is a precondition. The inline comment justifies this ("pages only grow"), which is valid since clearDataPages() only runs inside the write lock. readPagesRangeForCompaction returns null for an empty result, and the downstream merge code handles null correctly - so even when finalPageCount == phase4aPageCount (no new pages since Phase 4a), this is safe.


HA test coverage gap

TimeSeriesGrpcHaConcurrentInsertIT verifies that no ingestion errors occur (no WAL version-gap panics). It does not verify data integrity on followers (e.g., that the follower's sample count matches the leader's). Consider adding a post-ingest query against the follower to confirm data was replicated correctly.

Example:

// After ingestion completes, query the follower's sample count
final Database followerDb = getServerDatabase(1 - leaderIndex, getDatabaseName());
final long followerCount = /* SELECT count(*) from sensor */;
assertThat(followerCount).as("follower count matches leader").isEqualTo(NUM_THREADS * POINTS_PER_THREAD);

Minor

  • TimeSeriesConcurrentInsertTest: removing @Disabled and updating the Javadoc is clean. The @Tag("slow") annotation is correct per CLAUDE.md guidance.
  • Comment volume: Phase 4a/4b/4c comment blocks are quite detailed. The concurrency context is genuinely non-obvious, so this is acceptable, but a few lines per phase (rather than the current 5-7 line blocks) would be more in line with project style.
  • appendSamples single-row path: still runs on the caller's thread (the executor dispatch was dropped). If the caller holds an enclosing transaction, it creates a nested TX. This is apparently tolerable now that the Phase-4 race is fixed, but a short Javadoc note explaining this threading constraint on appendSamples would help future maintainers avoid reintroducing Fix 1.

Summary

The compaction Phase-4 fix is correct and the test coverage is a clear improvement. The port conflict with GrpcFollowerForwardingIT should be fixed before merge, and the PR description should be updated to match what actually landed.

… check, document appendSamples threading

- TimeSeriesGrpcHaConcurrentInsertIT used BASE_GRPC_PORT 51071, identical to
  GrpcFollowerForwardingIT; move to 51081 to avoid collision.
- Verify data integrity: poll every node to converge to the full sample count
  after ingestion (followers apply Raft entries asynchronously, so poll rather
  than assert immediately). A WAL version gap would block convergence.
- Document on appendSamples that it runs on the caller thread and nests into
  the enclosing transaction, so the append must not be routed onto another
  thread (which would publish shard pages out of band and, under HA, reorder
  page versions on the Raft log).
@robfrank

robfrank commented Jun 2, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed in 4256cf871:

  • Port conflict (critical): confirmed - GrpcFollowerForwardingIT also uses 51071. Moved TimeSeriesGrpcHaConcurrentInsertIT to 51081 (verified unused; gRPC test ports in this module are 50051, 51051, 51071).
  • PR description: updated to describe only the compaction Phase-4 fix that landed; documented why the executor-dispatch change was dropped.
  • HA follower integrity: added a per-node convergence check on the full sample count. Followers apply Raft entries asynchronously after commit, so it polls (60s) instead of asserting immediately, to avoid apply-lag flakiness; a version gap would block convergence.
  • appendSamples threading note: added a behavioral Javadoc note that it runs on the caller thread and nests into the enclosing transaction, and that routing it onto another thread reorders shard page versions on the Raft log under HA.

Not changing:

  • Comment volume in Phase 4a/b/c: leaving as-is. The concurrency invariants are non-obvious and the blocks document them precisely; trimming would lose the rationale.
  • Phase 4c edge case / fix correctness: agreed with the analysis, no change needed.

Re-ran TimeSeriesGrpcHaConcurrentInsertIT after the changes: passes, 0 WAL version-gap occurrences. The separate flaky compaction+HA page-0 ordering race is tracked in #4458.

@robfrank robfrank merged commit 8bd740d into main Jun 2, 2026
22 of 25 checks passed
@robfrank robfrank deleted the fix/4453-timeseries-concurrent-insert-lost-update branch June 2, 2026 10:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Time series: concurrent single-row SQL INSERTs silently lose samples (sealed-slot lost update)

1 participant