Support vGPU type instance driver installation #2336

Merged
cb-github-robot merged 2 commits into cloud-barista:main from seokho-son:main
Feb 28, 2026
@seokho-son (Member) commented Feb 28, 2026

The NVIDIA driver installation script does not support some vGPU-backed instances (for example, g6f.2xlarge, g6f.4xlarge, and gr6f.4xlarge):

| n | CSP | Region | SpecName | Arch | vCPU | Mem(Gi) | Cost($/h) | Accelerator |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | AWS | us-east-2 | g6f.2xlarge | x86_64 | 8 | 32 | $0.475 | NVIDIA L4 (C:undefined 5) |
| 2 | AWS | us-east-2 | g6f.4xlarge | x86_64 | 16 | 64 | $0.95 | NVIDIA L4 (C:undefined 11) |
| 3 | AWS | us-east-2 | g6.2xlarge | x86_64 | 8 | 32 | $0.9776 | NVIDIA L4 (C:1 23) |
| 4 | AWS | us-east-2 | gr6f.4xlarge | x86_64 | 16 | 128 | $1.066 | NVIDIA L4 (C:undefined 11) |
| 5 | AWS | us-east-2 | g6.4xlarge | x86_64 | 16 | 64 | $1.3232 | NVIDIA L4 (C:1 23) |

This PR adds a --vgpu parameter to the driver installation scripts so that vGPU instances can be handled explicitly.

Signed-off-by: Seokho Son <shsongist@gmail.com>
Copilot AI review requested due to automatic review settings February 28, 2026 19:36

Copilot AI left a comment

Pull request overview

Adds an explicit --vgpu flag to the GPU/CUDA driver installation scripts to force use of the proprietary NVIDIA kernel modules on fractional/vGPU-backed instances (e.g., AWS g6f), where open kernel modules are unsupported.

Changes:

  • Add --vgpu CLI flag to installGpuDriver.sh and installCudaDriver.sh.
  • When --vgpu is provided, force IS_VGPU=true and prefer proprietary driver candidates.
  • Extend vGPU detection with an additional heuristic based on PCI BAR size.
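
The first two changes can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: FORCE_VGPU and IS_VGPU are the variable names quoted in the review below, while parse_args is a hypothetical helper introduced only for this example, and the handling of other arguments is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: parse an optional --vgpu flag and seed IS_VGPU from it.
# FORCE_VGPU and IS_VGPU mirror the names quoted in the review;
# parse_args is an illustrative helper, not a function from the repository.

FORCE_VGPU=false
IS_VGPU=false

parse_args() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --vgpu)
                # The user states explicitly that this is a vGPU instance.
                FORCE_VGPU=true
                ;;
            *)
                # Other arguments are ignored in this sketch (assumption).
                ;;
        esac
        shift
    done

    # An explicit flag decides the result before any auto-detection runs.
    if [ "$FORCE_VGPU" = true ]; then
        IS_VGPU=true
    fi
}

parse_args "$@"
echo "FORCE_VGPU=$FORCE_VGPU IS_VGPU=$IS_VGPU"
```

Keeping FORCE_VGPU separate from IS_VGPU lets later code tell "the user forced it" apart from "a heuristic detected it", which is useful for log messages.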

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scripts/usecases/llm/installGpuDriver.sh | Adds --vgpu option and extends vGPU detection logic/heuristics to force proprietary driver selection. |
| scripts/usecases/llm/installCudaDriver.sh | Adds --vgpu option and aligns vGPU detection/driver-selection behavior with the CUDA-focused installer. |

Comments suppressed due to low confidence (1)

scripts/usecases/llm/installCudaDriver.sh:339

  • The comment says the --vgpu flag will “skip auto-detection entirely”, but Method 2 (/proc/driver/nvidia grep) still runs even when IS_VGPU is already true. Consider guarding Method 2 with if [ "$IS_VGPU" = false ] (like Methods 1/3/4) or adjust the comment to match actual behavior.
# If --vgpu flag was passed, skip auto-detection entirely
if [ "$FORCE_VGPU" = true ]; then
    IS_VGPU=true
    echo "  → --vgpu flag set: forcing proprietary driver."
fi
# Method 1: Check PCI descriptions for explicit NVIDIA GRID/vGPU identifiers
if [ "$IS_VGPU" = false ] && sudo lspci -nnk 2>/dev/null | grep -i nvidia | grep -Eqi "vGPU|GRID"; then
    IS_VGPU=true
fi
# Method 2: Check for NVIDIA GRID/vGPU signals in driver state (after driver is present)
if [ -d /proc/driver/nvidia/gpus ] && grep -q -ri "vGPU\|GRID" /proc/driver/nvidia/ 2>/dev/null; then
    IS_VGPU=true
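
One way to apply the suggested guard to Method 2 is sketched below. It is an illustration under assumptions, not the repository's code: detect_vgpu_from_driver_state and DRIVER_STATE_PROBED are hypothetical names, the directory argument exists only so the check can be exercised against a test directory, and grep -E is used here in place of the quoted "vGPU\|GRID" pattern.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a guarded "Method 2" check.
# IS_VGPU is the name quoted in the review; the function name, the
# DRIVER_STATE_PROBED marker, and the directory argument are illustrative.

IS_VGPU=false
DRIVER_STATE_PROBED=false

detect_vgpu_from_driver_state() {
    # Defaults to the path used in the quoted snippet.
    proc_dir="${1:-/proc/driver/nvidia}"

    # Guard: if --vgpu or an earlier method already set IS_VGPU, skip the probe.
    if [ "$IS_VGPU" != false ]; then
        return 0
    fi

    DRIVER_STATE_PROBED=true
    # Look for GRID/vGPU markers in the driver state, if the driver is loaded.
    if [ -d "$proc_dir/gpus" ] \
        && grep -q -r -i -E "vGPU|GRID" "$proc_dir/" 2>/dev/null; then
        IS_VGPU=true
    fi
}

detect_vgpu_from_driver_state
echo "IS_VGPU=$IS_VGPU"
```

The guard does not change the final value of IS_VGPU; it only makes the behavior match the "skip auto-detection entirely" comment by avoiding the extra probe.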

Signed-off-by: Seokho Son <shsongist@gmail.com>
@seokho-son (Member, Author) commented

/approve

@github-actions bot added the approved label Feb 28, 2026
@cb-github-robot cb-github-robot merged commit 536f7a2 into cloud-barista:main Feb 28, 2026
1 of 2 checks passed