-
Notifications
You must be signed in to change notification settings - Fork 136
[linstor] Update piraeus-server patches with critical fixes #1850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Update piraeus-server patches to address critical production issues: - Add fix-duplicate-tcp-ports.diff to prevent duplicate TCP ports after toggle-disk operations (upstream PR #476) - Update skip-adjust-when-device-inaccessible.diff with comprehensive fixes for resources stuck in StandAlone after reboot, Unknown state race condition, and encrypted LUKS resource deletion (upstream PR #477) ```release-note [linstor] Fix DRBD resources stuck in StandAlone state after reboot and encrypted resource deletion issues ``` Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
📝 WalkthroughWalkthroughThis PR updates Linstor's piraeus-server patch documentation and adds two targeted fixes: one addressing duplicate TCP port assignments after toggle-disk operations, and another handling inaccessible devices during resource operations by skipping unnecessary regeneration and adjustment steps. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @kvaps, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the stability and reliability of LINSTOR's DRBD resource management by updating existing patches and introducing a new one. It addresses critical production issues related to Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request updates piraeus-server patches to address several critical production issues. The changes include adding a new patch to fix duplicate TCP ports and overhauling another patch to resolve issues with DRBD resources being stuck, race conditions, and encrypted resource deletion failures. The commit messages within the patch files are detailed and clearly explain the rationale behind the changes. The logic appears sound and effectively addresses the described problems. I've included a couple of suggestions to enhance code readability and maintainability by using Java Streams for checking conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
packages/system/linstor/images/piraeus-server/patches/skip-adjust-when-device-inaccessible.diff (1)
34-54: Code duplication in child device accessibility checks.The child device accessibility check logic is duplicated between the
canRegenerateResFileblock (lines 34-54) and thecanAdjustblock (lines 66-88). Both iterate overdrbdRscData.getVlmLayerObjects().values()and checkchildVlm == null || !childVlm.exists() || childVlm.getDevicePath() == null.Consider extracting this into a helper method like
areChildDevicesAccessible(DrbdRscData<Resource> drbdRscData)to reduce duplication and improve maintainability.♻️ Suggested helper method extraction
private boolean areChildDevicesAccessible(DrbdRscData<Resource> drbdRscData) throws AccessDeniedException { AbsRscLayerObject<Resource> dataChild = drbdRscData.getChildBySuffix(RscLayerSuffixes.SUFFIX_DATA); if (dataChild != null) { for (DrbdVlmData<Resource> drbdVlmData : drbdRscData.getVlmLayerObjects().values()) { VlmProviderObject<Resource> childVlm = dataChild.getVlmProviderObject(drbdVlmData.getVlmNr()); if (childVlm == null || !childVlm.exists() || childVlm.getDevicePath() == null) { return false; } } } return true; }Also applies to: 66-113
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
packages/system/linstor/images/piraeus-server/patches/README.mdpackages/system/linstor/images/piraeus-server/patches/fix-duplicate-tcp-ports.diffpackages/system/linstor/images/piraeus-server/patches/skip-adjust-when-device-inaccessible.diff
🧰 Additional context used
📓 Path-based instructions (2)
packages/system/**
📄 CodeRabbit inference engine (AGENTS.md)
Use Helm Chart umbrella pattern with vendored upstream charts in
charts/directory
Files:
packages/system/linstor/images/piraeus-server/patches/skip-adjust-when-device-inaccessible.diffpackages/system/linstor/images/piraeus-server/patches/fix-duplicate-tcp-ports.diffpackages/system/linstor/images/piraeus-server/patches/README.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
Git commits should follow
[component] Descriptionformat and include--signoffflag
Files:
packages/system/linstor/images/piraeus-server/patches/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build
🔇 Additional comments (4)
packages/system/linstor/images/piraeus-server/patches/fix-duplicate-tcp-ports.diff (1)
1-87: Well-documented patch with comprehensive root cause analysis.The patch is well-structured with:
- Clear problem description and root cause analysis
- Detailed impact assessment
- Minimal, targeted fix (removing redundant
ensureStackDataExists()call)- Proper testing verification
- Correct sign-off and attribution
The fix correctly addresses the TCP port duplication issue by ensuring
resetStoragePools()only handles storage pool assignments while relying on callers to properly initialize layer data with correct TCP ports.packages/system/linstor/images/piraeus-server/patches/skip-adjust-when-device-inaccessible.diff (2)
163-172: Sound fix for race condition during device adjustment.The added
Files.exists(Paths.get(devicePath))check correctly handles the race condition where kernel device nodes may not exist immediately afterdrbdadm adjust. Deferring the check to the next DeviceManager cycle is a reasonable approach.
230-254: Critical fix for StandAlone resources after reboot.The
needsDiskAccessdetermination logic correctly identifies when disk access is truly required:
- New volume creation (
!drbdVlmData.exists())- Metadata operations (
checkMetaData()– a state flag indicating metadata needs checking/creation)- Resize operations (
Volume.Flags.RESIZE,Volume.Flags.DRBD_RESIZE)This allows network reconnect operations (StandAlone → Connected) to proceed without blocking on child device availability, which is the root cause of resources staying stuck in StandAlone after node reboot.
packages/system/linstor/images/piraeus-server/patches/README.md (1)
11-14: Documentation properly updated.The new patch entries follow the established format with:
- Bold patch filename
- Em-dash separator
- Concise description
- Upstream PR reference links
The descriptions accurately reflect the patch purposes documented in the diff files.
|
/retest |
|
Successfully created backport PR for |
|
Successfully created backport PR for |
What this PR does
This PR updates piraeus-server patches to address several critical production issues with DRBD resources and LUKS encryption:
Add fix-duplicate-tcp-ports.diff - Prevents duplicate TCP ports after toggle-disk operations (upstream PR revert-precommit #476)
Update skip-adjust-when-device-inaccessible.diff - Comprehensive fix for multiple issues:
These patches resolve scenarios where DRBD resources fail to automatically reconnect after node reboots and improve LUKS resource lifecycle management.
Upstream PRs:
Release note
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.