Time series: concurrent single-row SQL INSERTs silently lose samples (sealed-slot lost update)

## Summary

Under many **concurrent single-row** `INSERT INTO <timeseriesType> SET ts=…, <tag>=…, <field>=…` statements (each in its own transaction, as happens over the HTTP/wire protocols), the time series engine **silently drops samples**. The client receives success on every insert (HTTP 200, no exception, no `ConcurrentModificationException`), yet the stored row count comes up short. Confirmed real data loss: a full table scan returns the same short count as `count(*)`, and a `GROUP BY sensor_id` shows specific series missing rows.

The batched InfluxDB line-protocol path (`POST /api/v1/ts/{db}/write`, which appends many samples in one engine call/transaction) is **not** affected. Single-threaded inserts are **not** affected.

## Impact

Silent, unacknowledged data loss on the time series write path under concurrency. Severity is high because there is no error surfaced to the client — an application doing concurrent per-row ingestion over SQL (HTTP or gRPC) loses samples without any signal.

## Environment

- Observed on `26.6.1-SNAPSHOT` (== the `arcadedata/arcadedb:latest` image at time of writing).
- Reproduced both embedded (in-process engine) and over HTTP against the Docker image.

## Reproduction

### A. Embedded (fast, deterministic at high concurrency)

A disabled regression test is included in PR #4450:
`engine/src/test/java/com/arcadedb/engine/timeseries/TimeSeriesConcurrentInsertTest.java`

It creates a `TIMESERIES TYPE … SHARDS 1`, runs **48 threads × 5000** single-row `INSERT`s (each in its own `database.transaction(...)`), then asserts `count(*) == 48*5000`. Result: e.g. **239911 / 240000** — 89 samples lost, with no exception and no failed transaction.

Enable it by removing the `@Disabled` annotation.

### B. Over HTTP (against the published image)

```sql
CREATE TIMESERIES TYPE sensor TIMESTAMP ts TAGS (sid STRING) FIELDS (v DOUBLE) SHARDS 1
```

Then run 16 threads, each sending 5000 distinct single-row inserts to `POST /api/v1/command/{db}`:
```json
{"language":"sql","command":"INSERT INTO sensor SET ts=<unique>, sid='s<thread>', v=<i>"}
```
`SELECT count(*) FROM sensor` returns short by **10–22** out of 80000 across runs.
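
For anyone who wants to script this, the following is a minimal sketch of such a client using the JDK's `java.net.http.HttpClient`. The class name `HttpInsertRepro`, the `BASE_TS` offset, the epoch-millisecond timestamp format, and the use of HTTP Basic authentication are assumptions made for illustration; only the endpoint, the JSON shape, and the 16 × 5000 sizing come from the steps above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical load client for reproduction B; not part of the ArcadeDB code base. */
final class HttpInsertRepro {
  static final int THREADS = 16;
  static final int ROWS_PER_THREAD = 5000;
  static final long BASE_TS = 1_700_000_000_000L; // assumed epoch-millis base

  /** Every (thread, i) pair maps to a distinct timestamp. */
  static long uniqueTs(int thread, int i) {
    return BASE_TS + (long) thread * ROWS_PER_THREAD + i;
  }

  /** JSON body for one single-row INSERT, in the shape shown above. */
  static String insertBody(int thread, int i) {
    return "{\"language\":\"sql\",\"command\":\"INSERT INTO sensor SET ts=" + uniqueTs(thread, i)
        + ", sid='s" + thread + "', v=" + i + "\"}";
  }

  /** Sends all inserts concurrently and returns how many were answered with HTTP 200. */
  static int run(String baseUrl, String db, String user, String password) throws Exception {
    HttpClient client = HttpClient.newHttpClient();
    // Assumption: the server accepts HTTP Basic credentials.
    String auth = "Basic " + Base64.getEncoder()
        .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
    URI uri = URI.create(baseUrl + "/api/v1/command/" + db);
    AtomicInteger acknowledged = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(THREADS);
    for (int t = 0; t < THREADS; t++) {
      final int thread = t;
      pool.submit(() -> {
        for (int i = 0; i < ROWS_PER_THREAD; i++) {
          HttpRequest request = HttpRequest.newBuilder(uri)
              .header("Authorization", auth)
              .header("Content-Type", "application/json")
              .POST(HttpRequest.BodyPublishers.ofString(insertBody(thread, i)))
              .build();
          try {
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() == 200)
              acknowledged.incrementAndGet();
          } catch (Exception e) {
            // A failed request is simply not counted as acknowledged.
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(30, TimeUnit.MINUTES);
    return acknowledged.get();
  }
}
```

Comparing the value returned by `run(...)` with the result of `SELECT count(*) FROM sensor` gives the number of acknowledged-but-missing samples.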

### Observations on the trigger
- **Concurrency level** (thread count) is the driver, not total row count. 5 threads often look clean; 16–48 threads reliably lose.
- `SHARDS 1` maximizes it (all appends contend on one shard).
- Single-threaded never loses.
- The unfiltered `SELECT count(*)` reflects the true (short) count; a full row scan agrees with it (so it is real loss, not a `count(*)` artifact).
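
To see which series lost rows, the per-series counts from a `GROUP BY` query can be compared against what each thread sent. The helper below is a small illustrative sketch: the class name `SeriesLossReport` and the convention that thread `t` writes series `s<t>` (as in reproduction B) are assumptions, and the observed counts are expected to be collected by the caller.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: diffs observed per-series counts against the expected count. */
final class SeriesLossReport {

  /**
   * Returns, for each series "s<t>" with t in [0, threads), how many rows are missing.
   * Series with the full expected count are omitted from the result.
   */
  static Map<String, Integer> missingPerSeries(Map<String, Integer> observed, int threads, int rowsPerThread) {
    Map<String, Integer> missing = new TreeMap<>();
    for (int t = 0; t < threads; t++) {
      String sid = "s" + t;
      int lost = rowsPerThread - observed.getOrDefault(sid, 0);
      if (lost != 0)
        missing.put(sid, lost);
    }
    return missing;
  }

  public static void main(String[] args) {
    // Example input: s0 is complete, s1 is short by 3 rows.
    System.out.println(missingPerSeries(Map.of("s0", 5000, "s1", 4997), 2, 5000));
  }
}
```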

## Root cause analysis

The write position for a sample is a read-modify-write of the per-data-page sample count. In `TimeSeriesBucket.appendSamples` (`engine/.../engine/timeseries/TimeSeriesBucket.java`):

```java
final int sampleCountInPage = dataPage.readShort(DATA_SAMPLE_COUNT_OFFSET) & 0xFFFF; // read slot
final int rowOffset = DATA_ROWS_OFFSET + sampleCountInPage * rowSize;
dataPage.writeLong(rowOffset, timestamps[i]);                                         // write row at slot
...
dataPage.writeShort(DATA_SAMPLE_COUNT_OFFSET, (short) (sampleCountInPage + 1));        // advance slot
```

`TimeSeriesShard.appendSamples` serializes appends per shard with an `appendLock` around a nested `db.begin()/append/db.commit()` on `getWrappedDatabaseInstance()`. The comment there notes that the lock exists specifically to avoid a commit-time `ConcurrentModificationException` on page 0. However, the append commit is **nested inside the enclosing request transaction**: `LocalDatabase.begin()` pushes a nested `TransactionContext` when a transaction is already active. The durable publish is therefore effectively deferred to the **outer** transaction commit. This is intentional, so that the bucket page writes ship to followers via the parent Raft WAL `TX_ENTRY` (the code references issue #4382). That outer commit happens **outside** `appendLock`.

A temporary collision detector added to `appendSamples` (recording `fileId:pageNumber:slot` per append) confirmed the mechanism. A later serialized append reads a **stale** `sampleCountInPage` because the prior append's `count → N+1` increment is not yet visible, so both appends compute the **same** `rowOffset` and write the same slot N; whichever commits later overwrites the earlier row. The page version advances normally, and `PageManager.updatePageVersion`'s version check does **not** raise a conflict for these commits, so the existing outer-transaction `retries` never kick in.

Sample collision-detector output (48-thread embedded run):
```
DIAG COLLISION slot=1:1:0 pageVer=239 prevThread=74 curThread=88
DIAG COLLISION slot=1:1:1 pageVer=240 prevThread=68 curThread=87
DIAG COLLISION slot=1:1:2 pageVer=241 prevThread=42 curThread=68
...
```
(slot reused across threads; page version high and advancing while the slot/count read is stale.)
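
The interleaving described above can be replayed deterministically in a toy model. The sketch below is not ArcadeDB code: `LostUpdateSketch`, `Page`, and `Tx` are hypothetical names, and the model assumes only what the analysis states, namely that the slot is chosen from the published count at append time while the publish itself happens later with no conflict check.

```java
/** Toy model of the slot read-modify-write; not ArcadeDB code. */
final class LostUpdateSketch {

  /** Published page state: a sample count plus fixed-size row slots. */
  static final class Page {
    final long[] rows = new long[8];
    int sampleCount;
  }

  /** One single-row append whose publish is deferred to a later commit. */
  static final class Tx {
    private final Page page;
    private final int slot;
    private final long timestamp;

    Tx(Page page, long timestamp) {
      this.page = page;
      // Conceptually under the append lock: read the published count, pick the slot.
      // The incremented count stays private to this transaction until commit().
      this.slot = page.sampleCount;
      this.timestamp = timestamp;
    }

    /** Deferred publish, outside the append lock and with no conflict check. */
    void commit() {
      page.rows[slot] = timestamp;
      page.sampleCount = slot + 1;
    }
  }

  /** Replays the interleaving and returns the resulting page. */
  static Page replay() {
    Page page = new Page();
    Tx first = new Tx(page, 1000L);  // sees count 0, picks slot 0
    Tx second = new Tx(page, 2000L); // first is unpublished: also sees count 0
    first.commit();                  // slot 0 = 1000, count = 1
    second.commit();                 // slot 0 = 2000, count = 1 (overwrite)
    return page;
  }

  public static void main(String[] args) {
    Page page = replay();
    // prints "acknowledged=2 stored=1 slot0=2000"
    System.out.println("acknowledged=2 stored=" + page.sampleCount + " slot0=" + page.rows[0]);
  }
}
```

Both appends complete without error, yet only one row survives, which matches the symptom of a short `count(*)` with no failed transaction.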

### Why the batched path is safe
`TimeSeriesEngine.appendBatch` groups all of a shard's samples into a **single** transaction per shard, so there is never a second concurrent transaction reading a stale slot on the same page.
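
As a contrast with the single-row path, here is a minimal sketch of why appending many samples within one transaction does not reuse a slot: each sample sees the count incremented by the previous one. `BatchAppendSketch` and its `Page` type are hypothetical and model only the count/slot bookkeeping, not the real `appendBatch` implementation.

```java
/** Toy model of a batched append inside a single transaction; not ArcadeDB code. */
final class BatchAppendSketch {

  /** Published page state: a sample count plus fixed-size row slots. */
  static final class Page {
    final long[] rows = new long[8];
    int sampleCount;
  }

  /** Appends all timestamps in one unit of work, then publishes once. */
  static int appendBatch(Page page, long[] timestamps) {
    int workingCount = page.sampleCount;    // read the published count once
    long[] workingRows = page.rows.clone(); // private working copy
    for (long ts : timestamps) {
      workingRows[workingCount] = ts;       // slot comes from the transaction's own running count
      workingCount++;
    }
    // Single publish of rows and count together.
    System.arraycopy(workingRows, 0, page.rows, 0, workingRows.length);
    page.sampleCount = workingCount;
    return page.sampleCount;
  }

  public static void main(String[] args) {
    Page page = new Page();
    System.out.println(appendBatch(page, new long[] {1000L, 2000L, 3000L})); // prints 3
  }
}
```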

### Relevant code
- `engine/.../query/sql/executor/SaveElementStep.java` — `saveToTimeSeries` routes a SQL `INSERT` into `engine.appendSamples`.
- `engine/.../engine/timeseries/TimeSeriesEngine.java` — `appendSamples` (per-row) vs `appendBatch` (line protocol).
- `engine/.../engine/timeseries/TimeSeriesShard.java` — `appendSamples` (`appendLock` + nested begin/commit on the wrapped instance).
- `engine/.../engine/timeseries/TimeSeriesBucket.java` — `getOrCreateActiveDataPage` + `appendSamples` (the `DATA_SAMPLE_COUNT_OFFSET` slot RMW).
- `engine/.../engine/PageManager.java` — `updatePageVersion` (the version conflict check that is not catching this).

## Suggested direction (for discussion)

The core tension is that `appendLock` serializes the in-memory append but the durable publish is deferred to the enclosing transaction's commit (kept that way so writes ship via the parent Raft WAL). Any fix needs to make the slot read-modify-write atomic with respect to the actual publish without breaking HA replication — e.g. ensuring the per-shard append publishes/serializes the slot under the lock, or ensuring the concurrent same-page commits raise a conflict so the existing outer-transaction retry path heals them. This is a delicate change to the transaction/HA path and warrants careful design plus HA + performance verification.
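
To make the second option concrete, the toy model below shows how a version check at publish time would turn the silent overwrite into a conflict that a retry can heal. This is a sketch for discussion only: `ConflictRetrySketch`, `Page`, `Tx`, and `ConflictException` are hypothetical, and the code makes no claim about how `PageManager.updatePageVersion` or the outer-transaction retry logic actually work.

```java
/** Toy model of conflict detection plus retry; not ArcadeDB code. */
final class ConflictRetrySketch {

  /** Published page state with a version that advances on every publish. */
  static final class Page {
    final long[] rows = new long[8];
    int sampleCount;
    int version;
  }

  static final class ConflictException extends RuntimeException {
    ConflictException(String message) {
      super(message);
    }
  }

  /** One single-row append that remembers the page version it read. */
  static final class Tx {
    private final Page page;
    private final int seenVersion;
    private final int slot;
    private final long timestamp;

    Tx(Page page, long timestamp) {
      this.page = page;
      this.seenVersion = page.version;
      this.slot = page.sampleCount;
      this.timestamp = timestamp;
    }

    /** Publishes only if nobody else published since this transaction read the page. */
    void commit() {
      if (page.version != seenVersion)
        throw new ConflictException("page changed since it was read");
      page.rows[slot] = timestamp;
      page.sampleCount = slot + 1;
      page.version++;
    }
  }

  /** Replays the same interleaving as the bug, but with the check and one retry. */
  static Page replay() {
    Page page = new Page();
    Tx first = new Tx(page, 1000L);
    Tx second = new Tx(page, 2000L); // reads the same count and version as first
    first.commit();                  // version 0 -> 1
    try {
      second.commit();               // rejected: version is now 1, not 0
    } catch (ConflictException e) {
      new Tx(page, 2000L).commit();  // retry re-reads count 1 and writes slot 1
    }
    return page;
  }

  public static void main(String[] args) {
    Page page = replay();
    // prints "stored=2 slot0=1000 slot1=2000"
    System.out.println("stored=" + page.sampleCount + " slot0=" + page.rows[0] + " slot1=" + page.rows[1]);
  }
}
```

The first option (publishing under the lock) avoids the retry entirely but moves the publish out of the enclosing transaction, which is exactly the part that interacts with HA replication.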

## Notes

- A reliable embedded regression test is included (disabled) in PR #4450; enable once fixed.
- The time series load tests in the same PR catch this at scale on the SQL protocols.

Time series: concurrent single-row SQL INSERTs silently lose samples (sealed-slot lost update) #4453
