
feat(gds): enable parallel I/O thread pool for all cuFile filesystems#2802

Merged
deng451e merged 2 commits into LMCache:dev from glimchb:gds-thread
Mar 28, 2026

Conversation

@glimchb
Contributor

@glimchb glimchb commented Mar 17, 2026

Previously the thread pool for batched_get_blocking was only created on WekaFS, so cuFile on ext4, xfs, or other filesystems fell back to serial reads.

Move use_thread_pool out of the WekaFS branch so it is set whenever cuFile is active, regardless of filesystem type. Also rename the thread-name prefix from "weka-gds-io" to "gds-io" to reflect the broader scope.
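The described change can be sketched as follows. This is a hypothetical simplification of the init logic, not the real LMCache class; the constructor arguments and attribute names are illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor

class GdsBackendSketch:
    """Hypothetical simplification of the gds_backend init logic."""

    def __init__(self, use_cufile: bool, num_threads: int = 4):
        # Before this PR: use_thread_pool was only set inside the WekaFS
        # branch, so cuFile on ext4/xfs fell back to serial reads.
        # After this PR: the pool is enabled whenever cuFile is active,
        # regardless of filesystem type.
        self.use_cufile = use_cufile
        self.use_thread_pool = use_cufile
        if self.use_thread_pool:
            # Prefix renamed from "weka-gds-io" to "gds-io" to reflect
            # the broader scope.
            self.executor = ThreadPoolExecutor(
                max_workers=num_threads, thread_name_prefix="gds-io"
            )
        else:
            self.executor = None
```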

If applicable:

  • this PR contains user facing changes - docs added
  • this PR contains unit tests

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance of I/O operations by extending the use of a parallel I/O thread pool to all filesystems that utilize cuFile. Previously, this optimization was restricted to WekaFS. The change ensures that cuFile operations benefit from parallel processing across a wider range of storage configurations, leading to more efficient data handling.

Highlights

  • Parallel I/O for all cuFile filesystems: Enabled the parallel I/O thread pool for all filesystems where cuFile is active, removing the previous limitation to WekaFS only.
  • Thread pool activation logic: Refactored the logic to activate the use_thread_pool flag whenever cuFile is in use, ensuring broader applicability.
  • Thread name prefix: Renamed the thread pool's name prefix from 'weka-gds-io' to 'gds-io' to reflect its expanded scope beyond WekaFS.
Changelog
  • lmcache/v1/storage_backend/gds_backend.py
    • Removed the initial explicit setting of self.use_thread_pool = False.
    • Modified the assignment of self.use_thread_pool to be dependent on self.use_cufile globally, rather than only within the WekaFS-specific block.
    • Updated the thread_name_prefix for the ThreadPoolExecutor from 'weka-gds-io' to 'gds-io'.
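The batched read path this flag controls can be sketched as below. This is a hedged illustration of the dispatch pattern, assuming a per-key read callable; it is not the real `batched_get_blocking` signature from LMCache.

```python
from concurrent.futures import ThreadPoolExecutor

def batched_get_blocking(keys, read_one, executor=None):
    # Hypothetical sketch: with the pool enabled, per-key reads run
    # concurrently on the executor; without it, the old serial loop.
    if executor is None:
        # Serial fallback (the pre-PR behavior on non-WekaFS filesystems).
        return [read_one(k) for k in keys]
    # executor.map preserves input order while the reads overlap.
    return list(executor.map(read_one, keys))
```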
Activity
  • No human activity has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature               Command              Description
Code Review           /gemini review       Performs a code review for the current pull request in its current state.
Pull Request Summary  /gemini summary      Provides a summary of the current pull request in its current state.
Comment               @gemini-code-assist  Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                  /gemini help         Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request enables the parallel I/O thread pool for all cuFile-enabled filesystems, not just WekaFS. This is achieved by tying the use_thread_pool flag to use_cufile. The thread name prefix is also made more generic. The changes are logical and correct. However, enabling this feature more broadly reveals a potential issue in the thread pool implementation (_batched_get_blocking_by_thread_pool_impl) where the memory format is not correctly handled during allocation. I've added a comment with details on this.


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b7336239a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Contributor

@sammshen sammshen left a comment


LGTM

@sammshen sammshen requested a review from deng451e March 20, 2026 01:55
Collaborator

@DongDongJu DongDongJu left a comment


LGTM. Please revise two things.

Collaborator

@deng451e deng451e left a comment


LGTM

Collaborator

@DongDongJu DongDongJu left a comment


Thanks for the work. LGTM

@DongDongJu DongDongJu enabled auto-merge (squash) March 27, 2026 20:57
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Mar 27, 2026
auto-merge was automatically disabled March 28, 2026 00:30

Head branch was pushed to by a user without write access

@github-actions github-actions Bot removed the full Run comprehensive tests on this PR label Mar 28, 2026
glimchb and others added 2 commits March 27, 2026 23:17
Previously the thread pool for batched_get_blocking was only created
on WekaFS, so cuFile on ext4, xfs, or other filesystems fell back to
serial reads.

Move `use_thread_pool` out of the WekaFS branch so it is set whenever
cuFile is active, regardless of filesystem type. Also rename the
thread-name prefix from "weka-gds-io" to "gds-io" to reflect the
broader scope.

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
The thread pool implementation of batched_get_blocking was not passing
the fmt parameter to allocate(), causing all allocations to use the
default memory format regardless of what was stored in the cache.

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
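The fmt bug described above can be illustrated with a minimal sketch. The allocator class and the `allocate()` signature here are stand-ins, not the real LMCache API; only the pattern (threading `fmt` through instead of silently dropping it) reflects the fix.

```python
class FakeAllocator:
    """Stand-in allocator; the real LMCache allocate() may differ."""
    DEFAULT_FMT = "default"

    def allocate(self, shape, dtype, fmt=None):
        # When fmt is omitted, the buffer comes back in the default
        # memory format -- this is exactly what the bug produced.
        return {"shape": shape, "dtype": dtype,
                "fmt": fmt if fmt is not None else self.DEFAULT_FMT}

def fetch_one(allocator, shape, dtype, fmt):
    # Buggy version: allocator.allocate(shape, dtype) -- fmt dropped.
    # Fixed version: pass fmt through so the allocation matches the
    # format stored in the cache.
    return allocator.allocate(shape, dtype, fmt=fmt)
```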
@deng451e deng451e enabled auto-merge (squash) March 28, 2026 05:52
@github-actions github-actions Bot added the full Run comprehensive tests on this PR label Mar 28, 2026
@deng451e deng451e merged commit f6a9919 into LMCache:dev Mar 28, 2026
34 checks passed
@glimchb glimchb deleted the gds-thread branch March 28, 2026 10:45
jooho-XCENA pushed a commit to xcena-dev/LMCache that referenced this pull request Apr 2, 2026
…LMCache#2802)

* feat(gds): enable parallel I/O thread pool for all cuFile filesystems

Previously the thread pool for batched_get_blocking was only created
on WekaFS, so cuFile on ext4, xfs, or other filesystems fell back to
serial reads.

Move `use_thread_pool` out of the WekaFS branch so it is set whenever
cuFile is active, regardless of filesystem type. Also rename the
thread-name prefix from "weka-gds-io" to "gds-io" to reflect the
broader scope.

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>

* fix(gds): pass memory format in batched_get_blocking thread pool path

The thread pool implementation of batched_get_blocking was not passing
the fmt parameter to allocate(), causing all allocations to use the
default memory format regardless of what was stored in the cache.

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>

---------

Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>

Labels

full Run comprehensive tests on this PR

4 participants