fix(#4453): prevent silent sample loss on concurrent single-row time series INSERTs#4455
Conversation
…series INSERTs Two fixes: 1. Dispatch appendSamples to the shard executor thread (which has no enclosing transaction) so the nested begin/commit always operates as a true level-1 transaction and immediately publishes page updates to the cache. On the SQL INSERT path, the enclosing command transaction caused a nested commit that deferred the page-cache update outside appendLock; a following serialized append read a stale DATA_SAMPLE_COUNT_OFFSET and wrote to the same slot. 2. Fix compaction Phase 4: Phase 4a no longer reads the last partial page (phase4aPageCount). Phase 4c re-reads it under the write lock with its up-to-date count, capturing samples written to it during the lock-free Phase 4b.
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 0 |
🟢 Coverage 100.00% diff coverage · -7.74% coverage variation
Metric Results Coverage variation ✅ -7.74% coverage variation Diff coverage ✅ 100.00% diff coverage Coverage variation details
Coverable lines Covered lines Coverage Common ancestor commit (d5463d5) 124318 92528 74.43% Head commit (4256cf8) 156029 (+31711) 104054 (+11526) 66.69% (-7.74%) Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch:
<coverage of head commit> - <coverage of common ancestor commit>Diff coverage details
Coverable lines Covered lines Diff coverage Pull request (#4455) 3 3 100.00% Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified:
<covered lines added or modified>/<coverable lines added or modified> * 100%
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Code Review
This pull request addresses a critical concurrency issue (#4453) where concurrent single-row SQL INSERTs into a time series engine could silently lose samples due to nested transactions and race conditions during compaction. The fix dispatches shard writes asynchronously on a dedicated executor to ensure top-level transactions are used, and adjusts the compaction phases to safely defer reading the last partial page under a write lock. Additionally, a previously disabled integration test has been re-enabled. The review feedback suggests two valuable improvements: directly propagating RuntimeExceptions instead of wrapping them in a generic IOException when handling CompletionException in TimeSeriesEngine, and simplifying a redundant conditional check in TimeSeriesShard that results in dead code.
| } catch (final CompletionException e) { | ||
| if (e.getCause() instanceof IOException ioe) | ||
| throw ioe; | ||
| throw new IOException("TimeSeries append failed", e.getCause()); | ||
| } |
There was a problem hiding this comment.
When catching CompletionException, any RuntimeException thrown during the async execution (such as NullPointerException or IllegalArgumentException) will be wrapped inside the CompletionException. Currently, the code wraps all non-IOException causes into a generic IOException. It is better to propagate RuntimeExceptions directly to avoid obscuring the true nature of the error and forcing callers to handle them as checked exceptions.
} catch (final CompletionException e) {
if (e.getCause() instanceof IOException ioe)
throw ioe;
if (e.getCause() instanceof RuntimeException re)
throw re;
throw new IOException("TimeSeries append failed", e.getCause());
}| Object[] tailData = null; | ||
| if (finalPageCount > phase4aPageCount) | ||
| if (phase4aPageCount > lastFullPage && finalPageCount >= phase4aPageCount) | ||
| // Include phase4aPageCount itself: it may have received new samples during Phase 4b. | ||
| tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount, finalPageCount); | ||
| else if (finalPageCount > phase4aPageCount) | ||
| tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount + 1, finalPageCount); |
There was a problem hiding this comment.
The condition phase4aPageCount > lastFullPage is always true because phase4aPageCount is at least snapshotDataPageCount (which is lastFullPage + 1). Additionally, finalPageCount >= phase4aPageCount is always true because page count can only increase or stay the same between Phase 4a and Phase 4c. Therefore, the first if condition is always met, making the else if branch dead code. This can be simplified by removing the redundant checks and the unreachable else if block.
| Object[] tailData = null; | |
| if (finalPageCount > phase4aPageCount) | |
| if (phase4aPageCount > lastFullPage && finalPageCount >= phase4aPageCount) | |
| // Include phase4aPageCount itself: it may have received new samples during Phase 4b. | |
| tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount, finalPageCount); | |
| else if (finalPageCount > phase4aPageCount) | |
| tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount + 1, finalPageCount); | |
| final Object[] tailData = mutableBucket.readPagesRangeForCompaction(phase4aPageCount, finalPageCount); |
…de Phase 4c branch
|
Code Review posted via API - see inline |
|
Code Review - PR 4455 Overview This PR closes two independent data-loss paths in the time series engine:
The root-cause analysis is solid and well-documented. Overall this is a correctness-critical fix with a clear rationale. Issues and Suggestions
Summary table:
The core fixes are correct and the approach is consistent with the existing appendBatch executor pattern. The main blocker is removing the investigation document from the repo. Test coverage for Fix 1 is solid - well done on re-enabling the regression test. |
Code ReviewOverviewThis PR closes two independent data-loss paths in the time series engine:
The root cause analysis is solid, the PR description is exceptionally clear, and the overall approach is correct. A few items worth discussing before merging: CorrectnessFix 1 (
|
…int, add shard index to error, remove tracking doc - Document the per-insert thread hand-off cost and the constraint that appendSamples must not be called from a shard-executor thread (deadlock). - Include the shard index in the append-failure IOException message. - Note in the regression test that it also exercises the compaction Phase-4 race because the maintenance scheduler compacts mid-run. - Remove the investigation doc from the repo; the PR description covers it.
Code Review: fix(#4453) - prevent silent sample loss on concurrent single-row time series INSERTsOverall: This is a well-reasoned, targeted fix for two genuine race conditions. The PR description is thorough, the analysis is accurate, and the changes are minimal. A few points worth discussing below. Fix 1 -
|
| Area | Assessment |
|---|---|
| Correctness of Fix 1 | Correct |
| Correctness of Fix 2 | Correct |
| Exception handling | Mostly consistent; Error subtypes wrapped as IOException (same as appendBatch) |
| Threading guard | Documented but not runtime-enforced; consider a debug assert |
| Test coverage | Excellent - both races covered |
| Style / conventions | Consistent with existing code |
Two actionable suggestions:
- Add an explicit
Errorre-throw branch in theCompletionExceptionhandler (or at minimum note thatErrorwrapping is intentional). - Consider a
assert !shard-executor-threadguard to catch future misuse early.
Neither is a blocker. The fix is correct and the test confirms it.
…st shard-thread reentry - Re-throw Error (e.g. OOM/AssertionError) unwrapped instead of masking it as a checked IOException, in both appendSamples and appendBatch. - Add a debug assert in appendSamples guarding against being called from a shard-executor thread, which would block its own fixed-size pool.
Code Review - PR #4455OverviewThis PR closes two independent data-loss paths in the time series engine. The root-cause analysis is clear and the fixes are well-targeted. Fix 1 (TimeSeriesEngine.java): Fix 2 (TimeSeriesShard.java): Phase 4a read the last partial page under the write lock, then released it. During the lock-free Phase 4b, concurrent appends could add more samples to that same page. Phase 4c only read pages after Code QualityCorrect and consistent exception unwrapping. The pattern added to both Phase 4 boundary logic is sound. The condition change from
Potential IssuesDeadlock guard is assert-only. The check: assert !Thread.currentThread().getName().startsWith("ArcadeDB-TS-Shard-" + typeName)
: "appendSamples must not be called from a shard-executor thread ...";Asserts are disabled in production by default. If this constraint is violated at runtime, the thread deadlocks silently with no diagnostic output. The Javadoc warning and architecture enforcement are probably sufficient here, but if this is a critical invariant, a checked guard (e.g., a Thread-name-based detection is fragile. The assert relies on the naming convention
PerformanceThe PR correctly documents the thread-handoff cost in the Javadoc:
For bursty single-row inserts this is real overhead. The recommendation to use the batch API for high-throughput ingestion is appropriate. No issues here. Test CoverageThe test is well-designed:
Minor gap: The assertion verifies total count only. In theory a pathological scenario could produce the right count with wrong data. Given the unique timestamp and unique SummaryThe analysis is thorough and well-documented, the fixes are minimal and targeted, and the regression test is solid. The two minor items (assert-only deadlock guard, unconditional Phase 4c read without comment) are worth noting but are not blockers. LGTM. Ready to merge. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4455 +/- ##
==========================================
+ Coverage 65.23% 65.27% +0.03%
==========================================
Files 1612 1612
Lines 124318 124323 +5
Branches 26910 26912 +2
==========================================
+ Hits 81104 81146 +42
+ Misses 31818 31780 -38
- Partials 11396 11397 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…cessary append dispatch The silent sample loss on concurrent single-row TimeSeries INSERTs is caused solely by the compaction Phase-4 race: the last partial data page was read in Phase 4a and then cleared, dropping samples written to it during the lock-free Phase 4b. The fix (already in TimeSeriesShard) defers reading that page to Phase 4c under the write lock so those samples are captured. Revert the shard-executor dispatch in TimeSeriesEngine.appendSamples: it is unnecessary. The per-shard appendLock plus the nested-commit page publish already prevent the hypothesized stale-slot read, so a direct in-thread call loses no samples. Reverting it removes a per-insert thread hand-off and keeps the append inside the caller transaction. Verified: the embedded TimeSeriesConcurrentInsertTest passes with the compaction fix alone (and fails on main); all engine time-series tests pass. Add TimeSeriesGrpcHaConcurrentInsertIT (embedded 2-node Raft HA + gRPC) as a regression guard for concurrent single-row TS inserts over gRPC. The separate flaky WAL page-version-gap under compaction + HA is tracked in #4458.
Code Review: fix(#4453) - prevent silent sample loss on concurrent single-row time series INSERTsOverviewThis PR fixes a compaction Phase-4 race in The core fix is surgical and correct. A few items worth addressing before merge: Critical: Port conflict in the new HA test
// TimeSeriesGrpcHaConcurrentInsertIT.java
private static final int BASE_GRPC_PORT = 51071; // ← conflicts with GrpcFollowerForwardingIT
// GrpcFollowerForwardingIT.java
private static final int BASE_GRPC_PORT = 51071; // ← pre-existingSuggestion: use a port that isn't already taken - e.g. PR description no longer matches the diffThe description still leads with "Fix 1 - dispatch single-row INSERTs via The actual diff only contains the Phase-4 compaction race fix. The stale description will mislead anyone who reads it later. Please update to describe only what landed. Fix correctness analysisThe Phase-4 fix is sound: Before:
After:
The condition change from Edge case to verify: Phase 4c now calls HA test coverage gap
Example: // After ingestion completes, query the follower's sample count
final Database followerDb = getServerDatabase(1 - leaderIndex, getDatabaseName());
final long followerCount = /* SELECT count(*) from sensor */;
assertThat(followerCount).as("follower count matches leader").isEqualTo(NUM_THREADS * POINTS_PER_THREAD);Minor
SummaryThe compaction Phase-4 fix is correct and the test coverage is a clear improvement. The port conflict with |
… check, document appendSamples threading - TimeSeriesGrpcHaConcurrentInsertIT used BASE_GRPC_PORT 51071, identical to GrpcFollowerForwardingIT; move to 51081 to avoid collision. - Verify data integrity: poll every node to converge to the full sample count after ingestion (followers apply Raft entries asynchronously, so poll rather than assert immediately). A WAL version gap would block convergence. - Document on appendSamples that it runs on the caller thread and nests into the enclosing transaction, so the append must not be routed onto another thread (which would publish shard pages out of band and, under HA, reorder page versions on the Raft log).
|
Addressed in
Not changing:
Re-ran |
Closes #4453
Summary
The silent sample loss on concurrent single-row time series INSERTs is caused by a single bug: a compaction Phase 4 race in
TimeSeriesShard.Compaction Phase 4 race (
TimeSeriesShard)Phase 4a read the last partial page (
phase4aPageCount) as a snapshot under the write lock. Phase 4b ran lock-free, allowing concurrentappendSamplescalls to add more samples to that same page. Phase 4c only read pages afterphase4aPageCount, so those Phase-4b samples were lost whenclearDataPages()reset the logical count to 0.Fix: Phase 4a no longer reads the last partial page (condition changed from
phase4aPageCount > lastFullPageto> lastFullPage + 1, reading[lastFullPage+1, phase4aPageCount-1]). Phase 4c re-readsphase4aPageCount(with its fully up-to-date count, including Phase-4b appends) plus any new pages, all under the write lock.Note on the dropped executor-dispatch change
An earlier revision also dispatched
TimeSeriesEngine.appendSamplesto the shard executor (to make the per-row append a top-level transaction). That change was dropped: it is unnecessary (the compaction fix alone fixes the loss, verified by the A/B below) and it moves the append out of the caller's transaction, which under HA reorders shard page versions on the Raft log.appendSamplesnow runs on the caller's thread and nests into the enclosing command transaction, so the page writes ship as a single in-order transaction.A/B on
TimeSeriesConcurrentInsertTest(48 threads x 5000, SHARDS 1):Test plan
TimeSeriesConcurrentInsertTest.concurrentSingleRowInsertsDoNotLoseSamples(was@Disabled, now enabled): 48 threads x 5000 single-row INSERTs with SHARDS=1, asserts zero sample loss. Passes only with the compaction fix.TimeSeriesGrpcHaConcurrentInsertIT: embedded 2-node Raft HA + gRPC, concurrent single-row INSERTs; asserts no ingest errors and that every node converges to the full sample count. Guards against reintroducing the out-of-band append dispatch.Out of scope
A separate, flaky WAL page-version-gap under compaction + HA (concurrent inserts overlapping a background compaction, where compaction page-0 writes ride a deferred
SCHEMA_ENTRYwhile appends ship immediateTX_ENTRY) is tracked in #4458. It is not addressed here.