-
-
Notifications
You must be signed in to change notification settings - Fork 94
Description
Description
When creating multiple indexes sequentially on large datasets, subsequent index creation commands fail with NeedRetryException because background LSMTree compaction from previous index creation is still running.
Tested on: 25.10.1-SNAPSHOT
Steps to Reproduce
- Import a large dataset (e.g., 33.8M records into a table)
- Create indexes sequentially:
CREATE INDEX ON Movie (movieId) UNIQUE; -- Works CREATE INDEX ON Rating (userId) NOTUNIQUE; -- Works, triggers compaction on 33.8M records CREATE INDEX ON Rating (movieId) NOTUNIQUE; -- FAILS: compaction still running CREATE INDEX ON Link (movieId) UNIQUE; -- FAILS: compaction still running CREATE INDEX ON Tag (movieId) NOTUNIQUE; -- FAILS: compaction still running
Expected Behavior
One of the following:
- Synchronous behavior:
CREATE INDEXblocks until any running compaction completes, then creates the new index - Queuing: Index creation requests are queued and executed after current compaction finishes
- Retry handling: The database automatically retries internally without throwing exception to client
Actual Behavior
✅ CREATE INDEX ON Movie (movieId) UNIQUE
✅ CREATE INDEX ON Rating (userId) NOTUNIQUE
→ Triggers LSMTree compaction on 33.8M records (takes 30-60 seconds)
❌ CREATE INDEX ON Rating (movieId) NOTUNIQUE
→ com.arcadedb.exception.NeedRetryException: Cannot create a new index while asynchronous tasks are running
❌ CREATE INDEX ON Link (movieId) UNIQUE
→ com.arcadedb.exception.NeedRetryException: Cannot create a new index while asynchronous tasks are running
❌ CREATE INDEX ON Tag (movieId) NOTUNIQUE
→ com.arcadedb.exception.NeedRetryException: Cannot create a new index while asynchronous tasks are running
The exception is thrown immediately even though the index creation could simply wait for compaction to finish.
Impact
- Developer Experience: ❌ Requires manual retry logic for a common operation
- API Consistency:
⚠️ Other database operations handle internal async tasks transparently - Batch Scripts: ❌ Cannot create multiple indexes in a single script without retry logic
- Documentation Gap:
⚠️ Behavior is not documented, developers encounter unexpected failures
Environment
- Version:
25.10.1-SNAPSHOT - Dataset: MovieLens ml-latest (33,832,163 ratings)
- Interface: Python bindings via arcadedb_embedded
- JVM Heap: 8GB
- Mode: Embedded
Logs
Error message:
Command failed: com.arcadedb.exception.NeedRetryException:
Cannot create a new index while asynchronous tasks are running
Compaction logs show LSMTree operations taking time:
LSMTreeIndex 'Rating[userId]' compacted 50 pages, remaining 0 pages
(totalKeys=289037 totalValues=2251732)
Subsequent attempts during compaction window:
2025-10-25 14:04:14.318 WARNI [LSMTreeIndex] Index 'Rating_14_175471788647837'
compacted in 174ms (keys=289166 values=2251404 mutablePages=2 immutablePages=51)
Reproduction Script
import arcadedb_embedded as arcadedb
import time
# After importing large dataset...
indexes = [
("Movie", "movieId", "UNIQUE"),
("Rating", "userId", "NOTUNIQUE"),
("Rating", "movieId", "NOTUNIQUE"), # This will fail
("Link", "movieId", "UNIQUE"), # This will fail
("Tag", "movieId", "NOTUNIQUE"), # This will fail
]
for table, column, uniqueness in indexes:
try:
db.command("sql", f"CREATE INDEX ON {table} ({column}) {uniqueness}")
print(f"✅ Created {table}({column})")
except Exception as e:
print(f"❌ Failed {table}({column}): {e}")Output:
✅ Created Movie(movieId)
✅ Created Rating(userId)
❌ Failed Rating(movieId): NeedRetryException: Cannot create a new index while asynchronous tasks are running
❌ Failed Link(movieId): NeedRetryException: Cannot create a new index while asynchronous tasks are running
❌ Failed Tag(movieId): NeedRetryException: Cannot create a new index while asynchronous tasks are running
Analysis
The issue occurs because:
- Creating an index on a large table triggers background LSMTree compaction
- Compaction can take 30-60+ seconds for tables with tens of millions of records
- During compaction, the database rejects new index creation requests
- No internal queuing or waiting mechanism exists
Suggested Solutions
Option 1: Synchronous Blocking (Simplest)
Make CREATE INDEX wait internally for any running compaction to complete before proceeding.
Pros:
- Simple, predictable behavior
- No API changes needed
- Works like other databases
Cons:
- Blocks the calling thread
Option 2: Internal Queuing
Queue index creation requests and execute them serially after compaction completes.
Pros:
- Non-blocking API
- Handles multiple concurrent requests
Cons:
- More complex implementation
- Need to communicate queue status to client
Option 3: Automatic Retry
Internally retry the index creation with exponential backoff instead of throwing exception.
Pros:
- Transparent to client
- Eventually succeeds
Cons:
- Hidden latency
- Harder to set appropriate timeout
Option 4: Document as Expected Behavior
If this is intentional, document that applications must implement retry logic.
Pros:
- No code changes
Cons:
- Poor developer experience
- Inconsistent with typical database behavior
Current Workaround
Implement retry logic with delays:
max_retries = 60 # 10 minutes max (60 * 10 seconds)
retry_delay = 10 # seconds between retries
for table, column, uniqueness in indexes:
created = False
for attempt in range(max_retries):
try:
db.command("sql", f"CREATE INDEX ON {table} ({column}) {uniqueness}")
print(f"✅ Created {table}({column})")
created = True
break
except Exception as e:
if "NeedRetryException" in str(e) and "asynchronous tasks" in str(e):
if attempt < max_retries - 1:
print(f"⏳ Compaction running, retry {attempt+1}/{max_retries}...")
time.sleep(retry_delay)
else:
print(f"❌ Timeout after {max_retries} attempts")
else:
# Different error - don't retry
print(f"❌ Error: {e}")
breakThis works but requires every application to implement the same retry pattern.
Related Issues
This is separate from but related to issue #2701 (duplicate timestamped indexes during compaction). Both stem from LSMTree compaction behavior on large datasets.