fix(#4333): persist migratedFileIds in schema.json to survive restart by robfrank · Pull Request #4342 · ArcadeData/arcadedb

robfrank · 2026-05-26T10:04:13Z

Summary

LocalSchema.migratedFileIds (the old->new fileId map written by LSM/vector index compaction) was a heap-only ConcurrentHashMap that was never included in schema.json. After a restart the map was always empty, so TransactionManager.applyChanges could not distinguish a safe compaction-migration skip from an unknown missing-file condition, silently dropping WAL pages at FINE level after every compaction+restart.

Changes:

LocalSchema.toJSON(): serialize migratedFileIds as a JSON object under the key "migratedFileIds" (only emitted when non-empty, so existing schema files without the key remain compatible)
LocalSchema loading: restore migratedFileIds from the JSON block on open, clearing the map first for idempotency
TransactionManager.applyChanges(): consult the migration map when a fileId is missing; log FINE for known compaction-migrations (safe skip, data already in the new file), WARNING for unknown missing files (deleted by schema change or incomplete migration map)

Test plan

MigratedFileIdsPersistenceTest#migratedFileIdsArePersisted - after compaction triggers setMigratedFileId, close and reopen the database; migration map must still be populated
MigratedFileIdsPersistenceTest#migratedFileIdsRoundTrip - round-trip the map through close+reopen and verify exact old->new entries
LSMTreeIndexCompactionTest - all existing compaction tests pass
LockFilesInOrderFileMigrationTest - existing file-migration regression test passes
ConcurrentCompactionTest - concurrent compaction tests pass

gemini-code-assist

Code Review

This pull request addresses issue #4333 by persisting the migratedFileIds map in schema.json to prevent WAL recovery from silently skipping pages of missing files after a restart. It serializes and deserializes the map in LocalSchema and updates TransactionManager.applyChanges() to check this map, logging a warning for genuinely unknown missing files. Feedback points out that setMigratedFileId needs to call saveConfiguration() to ensure the updated map is actually written to disk. Additionally, checking for null values when deserializing migratedFileIds is recommended to avoid potential JSONExceptions, and the warning log level for missing files should be reconsidered to avoid noise during normal Raft replay or crash recovery.

gemini-code-assist · 2026-05-26T10:05:35Z

+    if (!migratedFileIds.isEmpty()) {
+      final JSONObject migratedJSON = new JSONObject();
+      for (final Map.Entry<Integer, Integer> entry : migratedFileIds.entrySet())
+        migratedJSON.put(String.valueOf(entry.getKey()), entry.getValue());
+      root.put("migratedFileIds", migratedJSON);
+    }


While migratedFileIds is now correctly serialized and deserialized, setMigratedFileId(int, int) does not call saveConfiguration() or mark the schema as dirty.

If a compaction occurs and updates the migration map, but no other schema changes happen before the database is closed, the schema will not be considered dirty (dirtyGeneration.get() > savedGeneration will be false) and close() will not persist the updated map to schema.json.

To ensure the migration map is always persisted, setMigratedFileId should call saveConfiguration() (which will either save immediately or postpone it by marking the schema dirty if a transaction is active):

public void setMigratedFileId(final int oldFileId, final int newFileId) { LogManager.instance().log(this, Level.FINE, "Migrating file id %d to %d", null, oldFileId, newFileId); migratedFileIds.put(oldFileId, newFileId); saveConfiguration(); }

gemini-code-assist · 2026-05-26T10:05:35Z

+      if (root.has("migratedFileIds")) {
+        final JSONObject migratedJSON = root.getJSONObject("migratedFileIds");
+        for (final String key : migratedJSON.keySet())
+          migratedFileIds.put(Integer.parseInt(key), migratedJSON.getInt(key));
+      }


If migratedFileIds is present in the JSON but has a null value (e.g., "migratedFileIds": null), root.getJSONObject("migratedFileIds") will throw a JSONException. Using !root.isNull(...) is safer and prevents potential crashes during schema loading.

Suggested change

if (root.has("migratedFileIds")) {

final JSONObject migratedJSON = root.getJSONObject("migratedFileIds");

for (final String key : migratedJSON.keySet())

migratedFileIds.put(Integer.parseInt(key), migratedJSON.getInt(key));

}

if (root.has("migratedFileIds") && !root.isNull("migratedFileIds")) {

final JSONObject migratedJSON = root.getJSONObject("migratedFileIds");

for (final String key : migratedJSON.keySet())

migratedFileIds.put(Integer.parseInt(key), migratedJSON.getInt(key));

}

gemini-code-assist · 2026-05-26T10:05:35Z

+        else
+          LogManager.instance().log(this, Level.WARNING,
+              "Skipping page for missing file %d during transaction apply, file was deleted or migration map is incomplete", null,
+              txPage.fileId);


Logging a WARNING for any missing file not present in the migration map might be too noisy. During normal Raft replay or crash recovery, files deleted intentionally by schema changes (e.g., DROP TYPE or DROP INDEX) will also trigger this warning, potentially flooding the logs with false positives. Consider if INFO or keeping FINE is more appropriate, or if there is a way to track intentionally deleted files.

codacy-production · 2026-05-26T10:06:17Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 13 complexity

Metric Results

Complexity 13

View in Codacy

🟢 Coverage 64.71% diff coverage

Metric Results

Coverage variation Report missing for a68f663¹

Diff coverage ✅ 64.71% diff coverage

View coverage diff in Codacy

Coverage variation details

Coverable lines Covered lines Coverage

Common ancestor commit (a68f663) Report Missing Report Missing Report Missing

Head commit (2789ec0) 155481 103252 66.41%

Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: <coverage of head commit> - <coverage of common ancestor commit>

Diff coverage details

Coverable lines Covered lines Diff coverage

Pull request (#4342) 17 11 64.71%

Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: <covered lines added or modified>/<coverable lines added or modified> * 100%

¹ Codacy didn't receive coverage data for the commit, or there was an error processing the received data. Check your integration for errors and validate that your coverage setup is correct.

_{NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer}
_{TIP This summary will be updated as you push new changes.}

claude · 2026-05-26T10:09:44Z

test comment please ignore

claude · 2026-05-26T10:10:18Z

Code Review - fix(#4333): persist migratedFileIds in schema.json

The fix is correct in its core approach: serializing migratedFileIds to schema.json and restoring it on open so WAL recovery can distinguish safe compaction-migration skips from genuinely missing files. A few issues need attention before merge.

Bug: double `saveConfiguration()` per compaction

LSMTreeIndex.java and LSMVectorIndex.java both already call saveConfiguration() immediately after setMigratedFileId():

// LSMTreeIndex.java ~line 675
((LocalSchema) database.getSchema()).setMigratedFileId(fileId, newMutableIndex.getFileId());
database.getSchema().getEmbedded().saveConfiguration();  // still here

// LSMVectorIndex.java ~line 4618
((LocalSchema) getDatabase().getSchema()).setMigratedFileId(fileId, newMutableIndex.getFileId());
getDatabase().getSchema().getEmbedded().saveConfiguration();  // still here

With this PR, setMigratedFileId now internally calls saveConfiguration() too, resulting in two consecutive schema-file writes per compaction. Each write re-serializes the full schema JSON. This isn't a correctness bug but it is a needless performance hit on what is already an I/O-intensive path.

Fix: remove the explicit saveConfiguration() calls from LSMTreeIndex and LSMVectorIndex (they are now redundant), or remove it from setMigratedFileId and document that callers are responsible.

Missing assertion in `migratedFileIdsRoundTrip`

migratedFileIdsArePersisted correctly guards against compaction being a no-op:

final boolean compacted = bucketIndex.compact();
if (!compacted)
  error.set(new AssertionError("compact() returned false"));

migratedFileIdsRoundTrip omits this check - compact() return value is silently discarded. If compaction is skipped for any reason (not enough pages, concurrent activity), getMigratedFileId() returns null and the test fails with a misleading NullPointerException-style message rather than a clear "compaction did not run" message. Add the same guard used in the first test.

`docs/4333-persist-migrated-file-ids.md` should not be committed

This file is an AI review-cycle tracking document ("Review Cycles", "Cycle 1 - a47f16b", "gemini-code-assist reviewed", "State: timeout"). That kind of meta-tracking belongs in the PR description or issue, not in the repository. It will rot immediately and confuse future contributors. Please drop this file from the PR.

Design note: `migratedFileIds` grows without bound

Once an old->new file mapping is written to schema.json it is never removed - not even after the old file has been dropped and WAL segments predating the last checkpoint are irrelevant. For a long-lived database with frequent compactions this map will accumulate stale entries forever.

This is a pre-existing design question (the map already lived in memory indefinitely), but persisting it makes the problem permanent. A follow-up should add a cleanup pass - e.g., remove entries whose oldFileId no longer exists on disk after WAL recovery completes.

Minor: log message accuracy at INFO level

The INFO log reads:

"Skipping page for missing file %d during transaction apply, file was deleted or migration map is incomplete"

"migration map is incomplete" suggests a partial write/corruption, which is alarming. Given that the most common benign cause is a schema DROP, something like "file was deleted (schema change or DROP TYPE/INDEX)" reads more clearly for operators.

Summary

Severity	Item
Medium	Double `saveConfiguration()` call - redundant disk write per compaction
Medium	`migratedFileIdsRoundTrip` - compaction success not verified, test may give misleading failures
Medium	`docs/4333-persist-migrated-file-ids.md` - tracking document should not be in the repo
Low	Map grows without bound (pre-existing, but persistence makes it permanent)
Nit	INFO log wording slightly alarming

The core logic (serialize/deserialize migratedFileIds, consult map in applyChanges) is sound. The double-save and the docs file are the most important things to fix before merge.

claude · 2026-05-26T10:10:24Z

Code Review

Summary: This PR correctly identifies and fixes a real bug - LocalSchema.migratedFileIds was heap-only and lost on restart, causing WAL recovery to silently drop compaction-related pages. The fix is straightforward and the approach is sound. A few concerns worth addressing before merge.

Bug: WARNING log for intentional DROP TYPE / DROP INDEX

The original comment in TransactionManager.applyChanges() said:

Referencing a deleted file is expected during Raft replay and crash recovery - schema changes may delete files before their prior TX entries are replayed.

That case (DROP TYPE / DROP INDEX) is still completely valid and now logs at WARNING since the dropped file won't be in migratedFileIds. This will produce operator noise on any cluster that does Raft replay or crash-recovers after a DROP. Example: a user drops an index, the server crashes, and on restart the WAL replay logs a WARNING for every page of that index.

The intent of the new WARNING is good (flag truly unexpected missing files), but currently it fires on both unexpected files and intentional schema-drop files. Consider whether a separate mechanism - such as recording intentionally dropped file IDs in the schema JSON as a droppedFileIds set - is needed to distinguish the two cases, or at minimum add a comment acknowledging the false-positive risk.

Unbounded map growth in `schema.json`

migratedFileIds has no eviction. Every compaction appends a new oldId -> newId entry. With persistence, the map now grows in schema.json across the lifetime of the database. Long-lived databases with many compactions will see this bloat accumulate indefinitely.

Consider pruning entries whose old file ID no longer appears in any live WAL file, or at least document this as a known limitation so it can be tracked as a follow-up.

Test: second test ignores `compact()` return value

In migratedFileIdsRoundTrip, the compaction thread does not check whether compact() returned true. If compaction silently bails out (e.g. not enough pages), getMigratedFileId() will return null and assertThat(newFidBefore).isNotNull() fails with a misleading message. The first test handles this correctly - replicate the error.set(new AssertionError("compact() returned false")) guard.

Also, assertThat(done.await(30, TimeUnit.SECONDS)).isTrue() and assertThat(error.get()).isNull() in the second test lack descriptive messages, unlike the first test. Small but inconsistent.

Test code duplication

The two test methods are nearly identical (same schema setup, insert 1000 docs, run compaction in a thread, close/reopen, assert). Consider extracting shared setup into a helper method that takes the type name and property name as parameters. This would make both test bodies much shorter and the intent clearer.

Docs file

docs/4333-persist-migrated-file-ids.md is an issue-specific design document. The PR description already covers this content well. Committing issue-specific analysis docs to the repo tends to become dead weight as the code evolves. Worth considering whether this belongs in the PR description / commit message instead.

Minor: the PR body references a test named migratedFileIdsAreLoadedOnReopen but the actual test method is migratedFileIdsRoundTrip.

What works well

Backward-compatible serialization (key omitted when empty, load handles missing key with no-op)
migratedFileIds.clear() before repopulation in readConfiguration() ensures idempotency on re-read
The serialization uses string keys (required by JSON) with Integer.parseInt deserialization - correct
readConfiguration() is synchronized so the clear+populate sequence is safe under concurrent access
The getMigratedFileId call in TransactionManager is consistent with existing usage in TransactionContext
Test coverage proves the key scenario (persistence survives close+reopen)

claude · 2026-05-26T10:16:16Z

Code Review - fix(#4333): persist migratedFileIds in schema.json to survive restart

Overview

This PR fixes a real correctness bug: LocalSchema.migratedFileIds was heap-only, so after a restart the WAL recovery path in TransactionManager.applyChanges could not distinguish a safe compaction-skip from an unknown missing-file condition. The fix serializes the map to schema.json and restores it on open. Good find and clear root cause analysis.

Issues

1. Double saveConfiguration() after compaction (redundant write)

Both callers of setMigratedFileId already call saveConfiguration() immediately after:

LSMTreeIndex.java:677: database.getSchema().getEmbedded().saveConfiguration();
LSMVectorIndex.java:4619: getDatabase().getSchema().getEmbedded().saveConfiguration();

Now that setMigratedFileId itself calls saveConfiguration(), those two call sites each trigger two serializations of the schema in quick succession. This is idempotent and not a correctness problem, but it is wasteful since schema serialization is a synchronized operation. Consider removing the now-redundant explicit saveConfiguration() calls in the two index classes, or removing it from setMigratedFileId and documenting that callers are responsible for persisting.

2. migratedFileIds map grows without bounds

There is no eviction or pruning of entries from migratedFileIds. Every compaction permanently adds an entry that persists across restarts. For a long-lived database with frequent compactions, this map will grow indefinitely (unbounded memory and schema.json bloat). There should be a lifecycle event where stale entries can be removed - for example, when the old file is confirmed to no longer appear in any recoverable WAL segment, or when a subsequent compaction replaces oldId -> newId with newId -> newerNewId. At minimum, this should be noted as a known limitation with a follow-up issue.

3. docs/4333-persist-migrated-file-ids.md should not be committed

This file contains internal review-cycle notes ("gemini-code-assist reviewed (COMMENTED)", "State: timeout", links back to the PR being merged). It is meta-documentation about the review process, not documentation about the feature or schema format. It will become stale immediately after merge. PR descriptions and issue comments are the right place for this content.

Minor Points

4. Second test does not assert on compact() return value

In migratedFileIdsRoundTrip, the compaction thread calls bucketIndex.compact() without checking the return value. If compaction is skipped (returns false), the test will proceed and fail with a confusing "newFidBefore is null" message rather than a clear "compaction did not run" failure. The first test (migratedFileIdsArePersisted) handles this correctly - apply the same pattern to the second test.

5. The two tests are nearly identical

Both tests: create a schema, insert 1000 records, compact, close, reopen, and assert the migration entry survived. The only difference is the entity name. Consider merging into a single test or giving the second a distinct scenario (e.g., multiple sequential compactions, or a migration followed by a type drop).

What Works Well

The !root.isNull("migratedFileIds") null guard is correct and the backward-compatibility approach (omitting the key when empty) keeps existing schema files compatible.
Restoring with migratedFileIds.clear() before repopulating is correct for idempotent reloads.
Downgrading the applyChanges log from WARNING to INFO for the unknown-file case is reasonable - this path fires for every DROP TYPE/INDEX replay and WARNING would be noisy.
The setMigratedFileId -> saveConfiguration addition correctly addresses the race where a crash between compaction and the next schema change left the map unpersisted.

LocalSchema.migratedFileIds (old->new fileId map from LSM compaction) was a heap-only ConcurrentHashMap omitted from toJSON()/schema.json. After a restart the map was always empty, so TransactionManager.applyChanges could not distinguish a safe compaction-skip from a genuine missing-file data loss and silently dropped WAL pages at FINE level. Changes: - LocalSchema.toJSON(): serialize migratedFileIds as JSON object under "migratedFileIds" key (only emitted when non-empty) - LocalSchema loading: restore migratedFileIds from the JSON block on open - TransactionManager.applyChanges: consult the migration map when a fileId is missing; log FINE for known migrations, WARNING for unknown ones

…evel - setMigratedFileId() now calls saveConfiguration() so the migration map is persisted even when no other schema change follows compaction - loading: add !root.isNull("migratedFileIds") guard before getJSONObject() to avoid JSONException if the field is explicitly null - TransactionManager: downgrade unknown-missing-file log from WARNING to INFO to avoid noise during intentional schema-drop file deletions in Raft replay

codecov · 2026-05-26T15:07:57Z

Codecov Report

❌ Patch coverage is 58.82353% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.96%. Comparing base (128872a) to head (2789ec0).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
...n/java/com/arcadedb/engine/TransactionManager.java	0.00%	6 Missing ⚠️
...src/main/java/com/arcadedb/schema/LocalSchema.java	90.90%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4342      +/-   ##
==========================================
- Coverage   65.37%   64.96%   -0.42%     
==========================================
  Files        1610     1610              
  Lines      123789   123803      +14     
  Branches    26776    26781       +5     
==========================================
- Hits        80925    80426     -499     
- Misses      31429    32048     +619     
+ Partials    11435    11329     -106

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

gemini-code-assist Bot reviewed May 26, 2026

View reviewed changes

robfrank added 3 commits May 26, 2026 14:22

docs: update tracking doc with PR and review history

2789ec0

robfrank force-pushed the fix/4333-persist-migrated-file-ids branch from 014a198 to 2789ec0 Compare May 26, 2026 12:22

robfrank merged commit d1ee393 into main May 26, 2026
27 of 40 checks passed

robfrank deleted the fix/4333-persist-migrated-file-ids branch May 26, 2026 16:15

robfrank added this to the 26.6.1 milestone May 27, 2026

robfrank self-assigned this May 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(#4333): persist migratedFileIds in schema.json to survive restart#4342

fix(#4333): persist migratedFileIds in schema.json to survive restart#4342
robfrank merged 3 commits into
mainfrom
fix/4333-persist-migrated-file-ids

robfrank commented May 26, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 26, 2026

Uh oh!

gemini-code-assist Bot May 26, 2026

Uh oh!

gemini-code-assist Bot May 26, 2026

Uh oh!

codacy-production Bot commented May 26, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 26, 2026

Uh oh!

claude Bot commented May 26, 2026

Uh oh!

claude Bot commented May 26, 2026

Uh oh!

claude Bot commented May 26, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

robfrank commented May 26, 2026

Summary

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

codacy-production Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Up to standards ✅

Uh oh!

claude Bot commented May 26, 2026

Uh oh!

claude Bot commented May 26, 2026

Code Review - fix(#4333): persist migratedFileIds in schema.json

Bug: double saveConfiguration() per compaction

Missing assertion in migratedFileIdsRoundTrip

docs/4333-persist-migrated-file-ids.md should not be committed

Design note: migratedFileIds grows without bound

Minor: log message accuracy at INFO level

Summary

Uh oh!

claude Bot commented May 26, 2026

Code Review

Bug: WARNING log for intentional DROP TYPE / DROP INDEX

Unbounded map growth in schema.json

Test: second test ignores compact() return value

Test code duplication

Docs file

What works well

Uh oh!

claude Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review - fix(#4333): persist migratedFileIds in schema.json to survive restart

Overview

Issues

Minor Points

What Works Well

Uh oh!

codecov Bot commented May 26, 2026

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codacy-production Bot commented May 26, 2026 •

edited

Loading

Bug: double `saveConfiguration()` per compaction

Missing assertion in `migratedFileIdsRoundTrip`

`docs/4333-persist-migrated-file-ids.md` should not be committed

Design note: `migratedFileIds` grows without bound

Unbounded map growth in `schema.json`

Test: second test ignores `compact()` return value

claude Bot commented May 26, 2026 •

edited

Loading