
[Perf] [GDS] Performance improvements to GDS backend #2637

Merged
DongDongJu merged 14 commits into LMCache:dev from oferki:gds_backend_perf
Mar 12, 2026

@oferki (Contributor) commented Feb 25, 2026

What this PR does / why we need it:
Performance improvements to the critical paths of gds_backend:

  1. Replaced Python's built-in open() with os.open().
  2. Added the O_NOATIME flag so that reads do not update the file access time, which improves performance. Added error handling for the case where this flag is not available.
  3. Removed the assertions that access memory_obj.tensor, a property that performs validity checks, crops the raw data, and accesses views. They now call is_valid() instead.
  4. Moved the remaining assertions behind a dedicated flag that can be enabled for debugging (it is unlikely that everyone runs python -O on their setup to strip the assertions themselves).
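
For readers unfamiliar with items 1 and 2, here is a minimal sketch of the general technique, not the code from this PR. The helper name `read_small_file` and its `chunk_size` parameter are hypothetical. It assumes a Linux-style platform: `os.O_NOATIME` only exists on Linux, so the sketch falls back to `0` elsewhere, and it retries without the flag when the kernel rejects it with EPERM (which happens when the caller does not own the file).

```python
import os

# O_NOATIME is Linux-specific; use 0 (no extra flag) where it is missing.
_O_NOATIME = getattr(os, "O_NOATIME", 0)


def read_small_file(path: str, chunk_size: int = 4096) -> bytes:
    """Read a whole file through raw file descriptors (hypothetical helper)."""
    try:
        # First attempt: ask the kernel not to update the access time.
        fd = os.open(path, os.O_RDONLY | _O_NOATIME)
    except PermissionError:
        # EPERM: O_NOATIME is refused for files the caller does not own.
        # Retry with a plain read-only open.
        fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Compared with the built-in open(), this skips the buffered file object layer, which mostly matters when many small files are read on a hot path.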

Special notes for your reviewers:
Please review the error handling. In general, there should be no change to the control flow of the class.

… and O_NOATIME, remove accesses to memory_obj.tensor

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>
@gemini-code-assist (Contributor)

Summary of Changes

Hello @oferki, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance of the GDS backend by optimizing critical file I/O operations and streamlining assertion checks. It introduces direct operating system calls for metadata reading, leverages the O_NOATIME flag to reduce disk overhead, and refactors assertion logic to avoid unnecessary computations, all while providing a mechanism to enable detailed debugging when needed.

Highlights

  • File I/O Optimization: Replaced Python's high-level open() with os.open() and os.read() for metadata file access, bypassing buffering for small files to improve read performance.
  • O_NOATIME Flag Implementation: Integrated the O_NOATIME flag with os.open() to prevent updating file access times, further enhancing performance. Includes robust error handling to fall back to normal os.open() if O_NOATIME is unsupported.
  • Assertion Refinement: Modified assertion logic to use memory_obj.is_valid() instead of directly accessing memory_obj.tensor, avoiding expensive property checks and validity operations in critical paths.
  • Debug Assertion Control: Encapsulated performance-impacting assertions within a new _debug_asserts flag, allowing them to be enabled for debugging purposes without affecting production performance.
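
The last two highlights can be illustrated with a small, self-contained sketch. Everything here is hypothetical and only mimics the idea: `FakeMemoryObj` stands in for a memory object whose `tensor` property is comparatively expensive, and `Loader` stands in for a backend with a `_debug_asserts` flag. It is not the LMCache implementation.

```python
class FakeMemoryObj:
    """Hypothetical stand-in for a memory object with a costly property."""

    def __init__(self, raw: bytearray, valid: bool = True):
        self.raw = raw
        self._valid = valid
        self.tensor_accesses = 0  # counts how often the property is used

    def is_valid(self) -> bool:
        # Cheap check: no view construction, no cropping.
        return self._valid

    @property
    def tensor(self):
        # Pretend this is expensive (validity checks, cropping, views).
        self.tensor_accesses += 1
        return memoryview(self.raw) if self._valid else None


class Loader:
    """Hypothetical loader whose extra checks are opt-in."""

    def __init__(self, debug_asserts: bool = False):
        self._debug_asserts = debug_asserts

    def load(self, obj: FakeMemoryObj, payload: bytes) -> bool:
        if not obj.is_valid():  # cheap guard on the hot path
            return False
        if self._debug_asserts:
            # Only evaluated when debugging is explicitly enabled.
            assert len(obj.raw) >= len(payload), "buffer too small"
        obj.raw[: len(payload)] = payload
        return True
```

With the default `debug_asserts=False`, the hot path never touches `tensor` and never evaluates the assertion, regardless of whether the interpreter was started with `-O`.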

Changelog
  • lmcache/v1/storage_backend/gds_backend.py
    • Added _debug_asserts and _use_noatime instance flags to control assertion behavior and O_NOATIME usage.
    • Introduced a new _read_metadata_info method that uses os.open and os.read for faster metadata retrieval, incorporating O_NOATIME with a fallback mechanism.
    • Updated _read_metadata to call the new _read_metadata_info method.
    • Wrapped existing assert statements in _load_bytes_from_disk_with_allocation and _load_bytes_from_disk_with_memory methods under the _debug_asserts flag.
    • Replaced memory_obj.tensor is None check with not memory_obj.is_valid() in _load_bytes_from_disk_with_memory.
Activity
  • No specific activity (comments, reviews, progress updates) has been recorded for this pull request yet.
@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces several valuable performance optimizations to the GdsBackend's critical paths. The changes, including using os.open with O_NOATIME for faster metadata reads and moving debug assertions behind a feature flag, are well-reasoned. My review includes a suggestion to improve error handling for cross-platform compatibility and points out a minor code cleanup to remove unreachable code. Overall, these are solid enhancements that should improve performance.

Comment thread lmcache/v1/storage_backend/gds_backend.py
Comment thread lmcache/v1/storage_backend/gds_backend.py Outdated
Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>
@sammshen (Contributor)

@ekaynar could you take a look?

@sammshen requested a review from deng451e on Mar 4, 2026, 02:50

@sammshen (Contributor) left a comment

LGTM

@ekaynar commented Mar 4, 2026

LGTM

@DongDongJu (Collaborator) left a comment

LGTM

@DongDongJu enabled auto-merge (squash) on Mar 4, 2026, 18:13
@github-actions (Bot) added the "full" label (Run comprehensive tests on this PR) on Mar 4, 2026
@DongDongJu merged commit 81c9472 into LMCache:dev on Mar 12, 2026
27 of 28 checks passed
@oferki deleted the gds_backend_perf branch on Mar 12, 2026, 05:04
realAaronWu pushed a commit to realAaronWu/LMCache that referenced this pull request Mar 20, 2026
* [Perf] [GDS] performance improvements to GDS backend: use OS file ops and O_NOATIME, remove accesses to memory_obj.tensor

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>

* Fixes to Gemini comments

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>

---------

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>
Signed-off-by: Aaron Wu <aaron.wu@dell.com>
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
* [Perf] [GDS] performance improvements to GDS backend: use OS file ops and O_NOATIME, remove accesses to memory_obj.tensor

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>

* Fixes to Gemini comments

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>

---------

Signed-off-by: Ofer Kiselov Nahman <ofer.kiselovnahman@weka.io>