
Add shell script to deploy LLM service on AMD GPU #2321

Merged
cb-github-robot merged 4 commits into cloud-barista:main from leehyeoklee:improve-amd-gpu-llm-flow
Feb 23, 2026

Conversation

@leehyeoklee
Contributor

🚀 Key Changes

This update extends LLM inference and serving capabilities to AMD GPUs, in addition to the existing NVIDIA support.

We have added shell scripts to automate ROCm driver installation, vLLM/Ollama environment setup, and model serving (vLLM). This allows users in cloud environments (Azure, AWS) with AMD GPUs to deploy LLM services without complex manual configuration.

✨ Implementation Details

1. ROCm Driver Installation (installRocmDriver.sh)

  • Installs the AMD GPU driver and the ROCm (Radeon Open Compute) stack, which are essential for using AMD GPUs.
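
The script itself is not shown in the PR description; a minimal sketch of the usual ROCm install flow on Ubuntu might look like the following. The version numbers, repository URL layout, and `--usecase` flag are assumptions, not copied from the actual installRocmDriver.sh:

```shell
#!/bin/bash
# Hypothetical sketch of a ROCm driver install on Ubuntu 22.04 ("jammy").
# All versions and the URL layout are assumptions, not the PR's actual script
# (the PR reports testing with ROCm 7.0.1).
set -eu

ROCM_VER="6.2.2"             # assumed example release
INSTALLER_REV="6.2.60202-1"  # assumed amdgpu-install package revision

# Pure helper: build the amdgpu-install .deb URL for a release/revision.
rocm_installer_url() {
  printf 'https://repo.radeon.com/amdgpu-install/%s/ubuntu/jammy/amdgpu-install_%s_all.deb' "$1" "$2"
}

install_rocm() {
  local url
  url="$(rocm_installer_url "$ROCM_VER" "$INSTALLER_REV")"
  wget -q "$url" -O /tmp/amdgpu-install.deb
  sudo apt-get install -y /tmp/amdgpu-install.deb
  sudo amdgpu-install -y --usecase=rocm   # kernel driver + ROCm userspace
  sudo usermod -aG render,video "$USER"   # allow GPU access without root
}

# Only install when explicitly requested, so the file can be sourced
# for its helpers without touching the system.
if [ "${1:-}" = "--install" ]; then install_rocm; fi
```

A reboot is typically required after the driver install before `rocminfo` reports the GPU.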

2. vLLM Environment Deployment (deployvLLMAmd.sh)

  • Configures the vLLM environment using the official rocm/vllm Docker image provided by AMD.
  • Automates Docker installation and HuggingFace cache directory setup.
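
As a hedged sketch of those two steps (the image tag matches the `rocm/vllm` name given above, but the cache path and Docker bootstrap route are assumptions, not the actual deployvLLMAmd.sh):

```shell
#!/bin/bash
# Hypothetical sketch of the vLLM-on-AMD environment setup.
# Cache path and Docker install route are assumptions.
set -eu

VLLM_IMAGE="rocm/vllm"                             # official AMD-provided image (per the PR)
HF_CACHE="${HF_CACHE:-$HOME/.cache/huggingface}"   # assumed conventional HF cache path

# Ensure the HuggingFace cache directory exists so downloaded model weights
# survive container restarts; prints the path it guaranteed.
ensure_hf_cache() {
  mkdir -p "$1"
  printf '%s' "$1"
}

setup_environment() {
  # Install Docker via the upstream convenience script if it is missing.
  if ! command -v docker >/dev/null 2>&1; then
    curl -fsSL https://get.docker.com | sudo sh
  fi
  ensure_hf_cache "$HF_CACHE" >/dev/null
  sudo docker pull "$VLLM_IMAGE"
}

if [ "${1:-}" = "--deploy" ]; then setup_environment; fi
```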

3. vLLM Model Serving (servevLLMAmd.sh)

  • Launches a specified HuggingFace model as a vLLM-powered, OpenAI-compatible API server.
  • Runs in a Docker container and includes features for stable operation, such as automatic shutdown of existing servers and health checks.
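
The shutdown-then-serve-then-health-check sequence described above can be sketched as follows; the container name, port, device flags, and `vllm serve` invocation are assumptions about a typical rocm/vllm setup, not the actual servevLLMAmd.sh:

```shell
#!/bin/bash
# Hypothetical sketch of serving a HuggingFace model as an OpenAI-compatible
# API via the rocm/vllm image. Names, port, and flags are assumptions.
set -eu

CONTAINER="vllm-server"   # assumed container name
PORT="${PORT:-8000}"      # vLLM's default OpenAI-compatible API port

# Pure helper: assemble docker run arguments for a given model/port.
build_serve_args() {
  local model="$1" port="$2"
  printf -- '--name %s -p %s:8000 --device=/dev/kfd --device=/dev/dri rocm/vllm vllm serve %s' \
    "$CONTAINER" "$port" "$model"
}

stop_existing() {
  # Remove any previous server container so redeploys are idempotent.
  sudo docker rm -f "$CONTAINER" >/dev/null 2>&1 || true
}

wait_for_health() {
  # Poll vLLM's /health endpoint until the server answers or we time out.
  local tries=0
  until curl -sf "http://localhost:${PORT}/health" >/dev/null; do
    tries=$((tries + 1))
    if [ "$tries" -ge 60 ]; then return 1; fi
    sleep 5
  done
}

if [ "${1:-}" = "--serve" ]; then
  stop_existing
  # Word splitting of the helper's output into arguments is intentional here.
  sudo docker run -d $(build_serve_args "${2:?model id required}" "$PORT")
  wait_for_health
fi
```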

💻 Test Environment and Results

Azure (Radeon PRO V710)

  • Result: Works Perfectly
  • After installing the ROCm 7.0.1 driver, model serving and inference via vLLM/Ollama were confirmed to be running smoothly.

Ollama: [screenshot]

vLLM: [screenshot]

@seokho-son
Member

@leehyeoklee Let's check if the suggested scripts can be (simply) merged with the existing scripts. :)

  1. ROCm Driver Installation (installRocmDriver.sh)
  2. vLLM Environment Deployment (deployvLLMAmd.sh)
  3. vLLM Model Serving (servevLLMAmd.sh)

https://github.com/cloud-barista/cb-tumblebug/tree/main/scripts/usecases/llm

@seokho-son
Member

@leehyeoklee
Is this PR ready for additional review round?

@leehyeoklee leehyeoklee force-pushed the improve-amd-gpu-llm-flow branch from 66ef178 to 5742ef7 Compare February 23, 2026 07:08
@leehyeoklee
Contributor Author

leehyeoklee commented Feb 23, 2026

@seokho-son

Yes, it's ready for another round of review.😊
I have unified the vLLM deployment and serving scripts to support both NVIDIA and AMD GPUs.

Additionally, I’ve created a single installGpuDriver.sh script that handles both GPU driver and CUDA/ROCm installations.

And I've confirmed that LLM models load and run correctly on both NVIDIA and AMD GPU VMs!
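
The unified installGpuDriver.sh itself is not shown in the thread; one plausible way to branch on the GPU vendor (a sketch assuming `lspci` output is available, not the script's actual logic) is:

```shell
#!/bin/bash
# Hypothetical vendor-detection sketch for a unified installGpuDriver.sh;
# the real script's branching logic is not shown in the PR thread.
set -eu

# Pure helper: classify a GPU from lspci-style text (testable without hardware).
detect_gpu_vendor() {
  case "$1" in
    *NVIDIA*)              echo nvidia ;;
    *AMD*|*ATI*|*Radeon*)  echo amd ;;
    *)                     echo unknown ;;
  esac
}

install_gpu_driver() {
  local vendor
  vendor="$(detect_gpu_vendor "$(lspci | grep -iE 'vga|3d|display' || true)")"
  case "$vendor" in
    nvidia) echo "Installing NVIDIA driver + CUDA..." ;;  # e.g. nvidia-driver + cuda-toolkit
    amd)    echo "Installing AMD driver + ROCm..."    ;;  # e.g. amdgpu-install --usecase=rocm
    *)      echo "No supported GPU detected" >&2; return 1 ;;
  esac
}

if [ "${1:-}" = "--install" ]; then install_gpu_driver; fi
```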

@leehyeoklee
Contributor Author

For AMD vLLM deployment, I referred to this documentation: https://docs.vllm.ai/en/stable/getting_started/installation/gpu/

Note:
Since Python 3.12 is required to use the current pre-built wheels, I configured the script to install it when proceeding with an AMD GPU.
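
A hedged sketch of that requirement check follows; the deadsnakes PPA route is an assumption about how such an install could be done on Ubuntu, not necessarily what the script actually does:

```shell
#!/bin/bash
# Hypothetical sketch of a Python 3.12 requirement check; the deadsnakes PPA
# route is an assumption, not necessarily the PR script's approach.
set -eu

# Pure helper: true if dotted version $1 is >= dotted version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

ensure_python312() {
  local cur
  cur="$(python3 --version 2>/dev/null | awk '{print $2}')"
  if [ -n "$cur" ] && version_ge "$cur" "3.12"; then
    return 0  # system python3 is already new enough
  fi
  sudo add-apt-repository -y ppa:deadsnakes/ppa
  sudo apt-get update
  sudo apt-get install -y python3.12 python3.12-venv
}

if [ "${1:-}" = "--ensure" ]; then ensure_python312; fi
```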

@seokho-son
Member

/approve

@github-actions github-actions bot added the approved This PR is approved and will be merged soon. label Feb 23, 2026
@cb-github-robot cb-github-robot merged commit 31cc77b into cloud-barista:main Feb 23, 2026
2 checks passed
@leehyeoklee leehyeoklee deleted the improve-amd-gpu-llm-flow branch February 23, 2026 08:02
