Fix: ensure chunk-size warnings show original value and keep WAL off …#3134
Fix: ensure chunk-size warnings show original value and keep WAL off …#3134lvca merged 1 commit intoArcadeData:mainfrom
Conversation
…during chunked graph persistence
Summary of ChangesHello @tae898, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request focuses on enhancing the robustness and diagnostic capabilities of the graph persistence mechanism within Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🧪 CI InsightsHere's what we observed from your CI run for b3f756e. 🟢 All jobs passed!But CI Insights is watching 👀 |
There was a problem hiding this comment.
Code Review
The pull request introduces several important improvements to the LSMVectorIndex, primarily focusing on robust transaction management during graph persistence and enhanced logging. The changes ensure that WAL (Write-Ahead Log) is consistently disabled for chunked graph persistence, preventing potential data inconsistencies or performance overhead. The new logic for validating arcadedb.index.buildChunkSizeMB and providing informative warnings when misconfigured values are encountered is a welcome addition. Furthermore, the improved error logging with more contextual information will greatly assist in debugging persistence failures. The explicit commit in the else branch of the startedTransaction block is a critical fix for ensuring data durability when transactions are managed by the caller.
| getDatabase().getTransaction().setUseWAL(false); | ||
| } else { | ||
| getDatabase().getTransaction().setUseWAL(false); |
There was a problem hiding this comment.
This change correctly ensures that WAL is explicitly disabled for graph persistence, regardless of whether the transaction was initiated by the method or was already active. This is crucial for performance during bulk operations and for preventing potential deadlocks or inconsistencies if WAL was inadvertently left enabled.
| getDatabase().commit(); | ||
| LogManager.instance().log(this, Level.FINE, "Vector graph persisted and committed for index: %s", indexName); | ||
| } else { | ||
| getDatabase().commit(); |
There was a problem hiding this comment.
Adding getDatabase().commit() in the else branch is a critical fix. Previously, if a transaction was already active (managed by the caller), the graph persistence changes would not be committed, leading to data loss or inconsistency. This ensures that the changes are persisted correctly in all scenarios.
| long chunkSizeMB = getDatabase().getConfiguration().getValueAsLong(GlobalConfiguration.INDEX_BUILD_CHUNK_SIZE_MB); | ||
| if (chunkSizeMB <= 0) { | ||
| final long configuredChunkSize = chunkSizeMB; | ||
| chunkSizeMB = 50; | ||
| LogManager.instance().log(this, Level.WARNING, | ||
| "arcadedb.index.buildChunkSizeMB was %dMB during graph persistence; forcing fallback to 50MB", configuredChunkSize); | ||
| } |
There was a problem hiding this comment.
The addition of validation for chunkSizeMB and the fallback to a default value of 50MB, along with a warning log, is a good defensive programming practice. It prevents potential issues arising from misconfigured or invalid chunk sizes, such as excessively large single commits that could lead to out-of-memory errors or very slow transactions.
| LogManager.instance().log(this, Level.SEVERE, | ||
| "PERSIST: Failed to persist graph for %s (nodes=%d, storeVectorsInGraph=%b, txStatus=%s): %s - %s", | ||
| indexName, | ||
| totalNodes, | ||
| metadata.storeVectorsInGraph, | ||
| getDatabase().getTransaction().getStatus(), | ||
| e.getClass().getSimpleName(), | ||
| e.getMessage(), | ||
| e); |
There was a problem hiding this comment.
| if (LogManager.instance().isDebugEnabled()) | ||
| e.printStackTrace(); |
| long chunkSizeMB = db.getConfiguration().getValueAsLong(GlobalConfiguration.INDEX_BUILD_CHUNK_SIZE_MB); | ||
| if (chunkSizeMB <= 0) { | ||
| final long configuredChunkSize = chunkSizeMB; | ||
| chunkSizeMB = 50; | ||
| LogManager.instance().log(this, Level.WARNING, | ||
| "arcadedb.index.buildChunkSizeMB was %dMB; forcing fallback to 50MB for graph persistence chunking", configuredChunkSize); | ||
| } |
There was a problem hiding this comment.
| long chunkSizeMB = db.getConfiguration().getValueAsLong(GlobalConfiguration.INDEX_BUILD_CHUNK_SIZE_MB); | ||
| if (chunkSizeMB <= 0) { | ||
| final long configuredChunkSize = chunkSizeMB; | ||
| chunkSizeMB = 50; | ||
| LogManager.instance().log(this, Level.WARNING, | ||
| "arcadedb.index.buildChunkSizeMB was %dMB during graph persistence; forcing fallback to 50MB", configuredChunkSize); | ||
| } |
There was a problem hiding this comment.
|
this fix works. |
lvca
left a comment
There was a problem hiding this comment.
It looks great to me, thanks!
Summary
arcadedb.index.buildChunkSizeMBvalue before clamping to the 50MB fallback.Testing