Invalid Position during Index Compaction #2434

@gramian

Description

ArcadeDB Server v25.7.1 (build ec313a157411da9cbfc1fdcfd554c966e0befcc9/1755021836738/main)

Running on Linux 6.14.0-28-generic - OpenJDK 64-Bit Server VM 21.0.8

During a large ingest, LSMTreeIndex warnings about index compaction began to appear in the log.

After a while, the following errors started to appear alongside these warnings:

2025-08-20 08:42:29.465 INFO  [LSMTreeIndex] Compacting index 'metadata_2_3983062778932827(374)' (pages=11 pageSize=262144 threadId=38)...
2025-08-20 08:42:29.465 WARNI [LSMTreeIndex] - Compacting pages 0-9 (threadId=38)
2025-08-20 08:42:29.465 WARNI [LSMTreeIndex] - This turn compacting 10 pages using root page PageId(metadatalake/148/70) v.0 (threadId=38)
2025-08-20 08:42:29.466 WARNI [LSMTreeIndex] - Creating a new entry in index 'metadata_2_3983062778932827(374)' root page [?]->71 (entry in page=0 threadId=38)
2025-08-20 08:42:29.503 WARNI [LSMTreeIndex] - Creating a new entry in index 'metadata_2_3983062778932827(374)' root page [draws]->72 (entry in page=1 threadId=38)
2025-08-20 08:42:29.545 WARNI [LSMTreeIndex] - Creating a new entry in index 'metadata_2_3983062778932827(374)' root page [causality]->73 (entry in page=2 threadId=38)
2025-08-20 08:42:29.581 WARNI [LSMTreeIndex] - Creating a new entry in index 'metadata_2_3983062778932827(374)' root page [fundamentalism]->74 (entry in page=3 threadId=38)
2025-08-20 08:42:29.618 WARNI [LSMTreeIndex] - Creating a new entry in index 'metadata_2_3983062778932827(374)' root page [drill]->75 (entry in page=4 threadId=38)
2025-08-20 08:42:29.653 WARNI [LSMTreeIndex] - Creating a new entry in index 'metadata_2_3983062778932827(374)' root page [sttr]->76 (entry in page=5 threadId=38)
2025-08-20 08:42:29.681 WARNI [LSMTreeIndex] - Creating last entry in index 'metadata_2_3983062778932827(374)' root page [zwischen] (entriesInRootPage=7, threadId=38)
2025-08-20 08:42:29.681 WARNI [LSMTreeIndex] - compacted 10 pages, remaining 0 pages (totalKeys=54448 totalValues=140825 totalMergedKeys=746 totalMergedValues=1632907, threadId=38)
2025-08-20 08:42:29.686 WARNI [LSMTreeIndex] Index 'metadata_2_100106024536' compacted in 221ms (keys=54448 values=140825 mutablePages=2 immutablePages=77 iterations=54450 oldLevel0File=metadata_2_3983062778932827(374) newLevel0File=metadata_2_3987672206952732(468) newLevel1File=metadata_2_3939355416616505(148) threadId=38)
2025-08-20 08:42:29.686 INFO  [LSMTreeIndex] Compacting index 'metadata_3_99996499795(84)' (pages=10 pageSize=262144 threadId=38)...
2025-08-20 08:42:29.686 WARNI [LSMTreeIndex] - Creating sub-index 'metadata_3_3987672211338751(469)' with fileId=469 (threadId=38)...
2025-08-20 08:42:29.686 WARNI [LSMTreeIndex] - Compacting pages 0-8 (threadId=38)
2025-08-20 08:42:29.686 WARNI [LSMTreeIndex] - This turn compacting 9 pages using root page PageId(metadatalake/469/0) v.0 (threadId=38)Error on executing compaction of index 'metadata_3_99996499795'
java.lang.IllegalArgumentException: Invalid position -2086 (size=262136)
	at com.arcadedb.database.Binary.position(Binary.java:177)
	at com.arcadedb.database.Binary.putByteArray(Binary.java:304)
	at com.arcadedb.index.lsm.LSMTreeIndexCompacted.appendDuringCompaction(LSMTreeIndexCompacted.java:129)
	at com.arcadedb.index.lsm.LSMTreeIndexCompactor.compact(LSMTreeIndexCompactor.java:238)
	at com.arcadedb.index.lsm.LSMTreeIndex.compact(LSMTreeIndex.java:235)
	at com.arcadedb.database.async.DatabaseAsyncIndexCompaction.execute(DatabaseAsyncIndexCompaction.java:43)
	at com.arcadedb.database.async.DatabaseAsyncExecutorImpl$AsyncThread.run(DatabaseAsyncExecutorImpl.java:143)

Roughly every 30 seconds, such a block is appended to the logs.
This behavior seems to coincide with ArcadeDB using close to 100% of available memory (total, not only heap) inside a container. This happens with both 4 GB and 8 GB resource limits.
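Since the container hits its total memory limit rather than just the heap, one thing worth trying is splitting the container's budget explicitly between heap and direct (off-heap) memory at launch. This is a minimal sketch; the flag values are illustrative assumptions for a 4 GB container, and it assumes the launch script honors `JAVA_OPTS`:

```shell
# Illustrative JVM memory caps for a 4 GB container (values are assumptions):
# keep the heap well below the container limit so off-heap buffers have headroom
JAVA_OPTS="-Xms2g -Xmx2g -XX:MaxDirectMemorySize=1g"
export JAVA_OPTS
# assumption: the distribution's server launch script picks up JAVA_OPTS
./bin/server.sh
```

The idea is that an uncapped `MaxDirectMemorySize` defaults to roughly the heap size, so heap plus direct buffers plus JVM overhead can exceed a tight container limit even when the heap itself looks healthy.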

  • Is (just) more memory needed?
  • Is this a bug in the index (for example, due to overlapping transactions from different threads)?
  • Can this be mitigated via configuration?
    • For example, since DatabaseAsyncExecutorImpl is in the stack trace, maybe one of the asyncXXX settings, like asyncOperationsQueueImpl, asyncOperationsQueueSize, etc., could help?
  • Currently I am using 8 buckets for the type being inserted into; would it be better to use more (16) or fewer (1) bucket(s)?
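For experimenting with the asyncXXX settings mentioned above, they could be passed as JVM system properties. The property names below follow ArcadeDB's `arcadedb.` prefix convention, but the exact keys and safe values are assumptions to verify against the settings documentation before use:

```shell
# Hypothetical tuning attempt (keys and values are assumptions, verify first):
# a smaller async queue makes the ingest back-pressure instead of buffering in RAM
JAVA_OPTS="$JAVA_OPTS -Darcadedb.asyncOperationsQueueSize=1024"
JAVA_OPTS="$JAVA_OPTS -Darcadedb.asyncOperationsQueueImpl=standard"
export JAVA_OPTS
./bin/server.sh
```

A bounded queue would not fix a genuine compaction bug, but it could rule memory pressure in or out as the trigger.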
