[Platform]: Add Intel Gaudi (HPU) Support #2822
Conversation
Signed-off-by: Tony Lin <tony.lin@intel.com>
Summary of Changes (Gemini Code Assist): This pull request integrates support for Intel Gaudi (HPU) accelerators into LMCache. It enables the LMCache engine to detect, utilize, and manage memory on HPU devices, expanding the range of hardware platforms compatible with the system. The changes primarily involve adding HPU-specific device detection, information retrieval, and a new GPU connector to facilitate efficient KV cache operations on Gaudi.
Code Review
This pull request introduces support for Intel Gaudi (HPU) by adding a new HPU connector and integrating it into the device detection and GPU information retrieval logic. The changes are well structured, following the existing patterns used for XPU support. The new VLLMPagedMemHPUConnectorV2 class provides the necessary to_gpu and from_gpu functionality using HPU-specific PyTorch operations. However, the get_shape method in the new connector is not implemented, which is a critical omission for a class inheriting from an abstract base class.
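To illustrate why the missing get_shape matters: a class that inherits an abstract method and does not implement it cannot even be instantiated. The sketch below is hypothetical (the interface and connector names mirror the PR, but the method signatures and bodies are illustrative, not LMCache's actual API):

```python
# Hypothetical sketch: a connector interface with abstract methods, and a
# subclass that omits one of them. Names mirror the PR but signatures are
# assumptions for illustration only.
import abc


class GPUConnectorInterface(abc.ABC):
    @abc.abstractmethod
    def to_gpu(self, memory_obj, start: int, end: int, **kwargs): ...

    @abc.abstractmethod
    def from_gpu(self, memory_obj, start: int, end: int, **kwargs): ...

    @abc.abstractmethod
    def get_shape(self, num_tokens: int) -> tuple: ...


class IncompleteConnector(GPUConnectorInterface):
    # get_shape is deliberately *not* implemented here.
    def to_gpu(self, memory_obj, start, end, **kwargs):
        pass

    def from_gpu(self, memory_obj, start, end, **kwargs):
        pass


try:
    IncompleteConnector()
except TypeError as e:
    # Python refuses to instantiate a class with unimplemented abstract methods.
    print(f"instantiation fails: {e}")
```

This is why the reviewer flags the omission as critical: the gap surfaces at instantiation time rather than as a subtle runtime bug.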
hi @ApostaC @sammshen @YaoJiayi @DongDongJu This is a rework of #1066. The previous implementation was overly complex. I've made several refactors on both the vLLM and LMCache sides over the past months to ensure that today's HPU connector in LMCache is clean and straightforward, similar in structure to the GPU connector. HPU is an ASIC accelerator, and this PR demonstrates that LMCache's architecture-level design is not only compatible with GPUs but also extensible to other ASIC processing units. Looking ahead, I hope this opens the door to a broader LMCache ecosystem supporting hardware from diverse vendors, if that aligns with the LMCache project's vision. I'd greatly appreciate it if the maintainers could take 5 minutes to review this. Thank you! 😊
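The PR's HPU-specific device detection can be sketched roughly as follows. This is a hedged illustration, not LMCache's actual code: the helper name detect_device and the probe order are assumptions, though habana_frameworks.torch is the documented PyTorch bridge for Gaudi and importing it registers the hpu backend with torch.

```python
# Hypothetical sketch of accelerator detection with an HPU branch.
# detect_device and the probe order are illustrative assumptions.
def detect_device() -> str:
    try:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    except ImportError:
        pass
    try:
        # Importing the Habana bridge registers the 'hpu' backend with torch.
        import habana_frameworks.torch  # noqa: F401
        import torch
        if torch.hpu.is_available():
            return "hpu"
    except (ImportError, AttributeError):
        pass
    return "cpu"


print(detect_device())
```

Falling back to "cpu" when neither backend is present keeps the engine importable on machines without any accelerator, which matches the pattern of treating HPU as one more detectable platform rather than a special case.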
Signed-off-by: Tony Lin <tony.lin@intel.com>
sammshen
left a comment
LGTM! clean ups can come later
DongDongJu
left a comment
much cleaner than before. Thanks for the work.
LGTM.
* feat: add Intel Gaudi (HPU) support (Signed-off-by: Tony Lin <tony.lin@intel.com>)
* address review comments (Signed-off-by: Tony Lin <tony.lin@intel.com>)
Add HPU connector to support Intel Gaudi Platform