Support vGPU type instance driver installation #2336
Merged
cb-github-robot merged 2 commits into cloud-barista:main on Feb 28, 2026
Signed-off-by: Seokho Son <shsongist@gmail.com>
Pull request overview
Adds an explicit --vgpu flag to the GPU/CUDA driver installation scripts to force use of the proprietary NVIDIA kernel modules on fractional/vGPU-backed instances (e.g., AWS g6f), where open kernel modules are unsupported.
Changes:
- Add `--vgpu` CLI flag to `installGpuDriver.sh` and `installCudaDriver.sh`.
- When `--vgpu` is provided, force `IS_VGPU=true` and prefer proprietary driver candidates.
- Extend vGPU detection with an additional heuristic based on PCI BAR size.
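The first two changes can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: the function name `parse_args` is hypothetical, and only the `--vgpu` flag and the `FORCE_VGPU`/`IS_VGPU` variables come from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: parse_args is an illustrative name, not taken from the PR.
FORCE_VGPU=false
IS_VGPU=false

parse_args() {
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --vgpu) FORCE_VGPU=true ;;
            *) ;;  # other options are ignored in this sketch
        esac
        shift
    done
    # An explicit flag takes precedence over any auto-detection that follows.
    if [ "$FORCE_VGPU" = true ]; then
        IS_VGPU=true
    fi
}

parse_args --vgpu
echo "FORCE_VGPU=$FORCE_VGPU IS_VGPU=$IS_VGPU"
```

Keeping `FORCE_VGPU` separate from `IS_VGPU` lets later code distinguish a user-forced choice from a detected one.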
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scripts/usecases/llm/installGpuDriver.sh | Adds --vgpu option and extends vGPU detection logic/heuristics to force proprietary driver selection. |
| scripts/usecases/llm/installCudaDriver.sh | Adds --vgpu option and aligns vGPU detection/driver-selection behavior with the CUDA-focused installer. |
Comments suppressed due to low confidence (1)
scripts/usecases/llm/installCudaDriver.sh:339
- The comment says the `--vgpu` flag will “skip auto-detection entirely”, but Method 2 (`/proc/driver/nvidia` grep) still runs even when `IS_VGPU` is already true. Consider guarding Method 2 with `if [ "$IS_VGPU" = false ]` (like Methods 1/3/4) or adjust the comment to match actual behavior.
# If --vgpu flag was passed, skip auto-detection entirely
if [ "$FORCE_VGPU" = true ]; then
    IS_VGPU=true
    echo " → --vgpu flag set: forcing proprietary driver."
fi
# Method 1: Check PCI descriptions for explicit NVIDIA GRID/vGPU identifiers
if [ "$IS_VGPU" = false ] && sudo lspci -nnk 2>/dev/null | grep -i nvidia | grep -Eqi "vGPU|GRID"; then
    IS_VGPU=true
fi
# Method 2: Check for NVIDIA GRID/vGPU signals in driver state (after driver is present)
if [ -d /proc/driver/nvidia/gpus ] && grep -q -ri "vGPU\|GRID" /proc/driver/nvidia/ 2>/dev/null; then
    IS_VGPU=true
fi
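One way to apply the suggested guard to Method 2 is sketched below. The helper name `check_proc_for_vgpu` and its directory parameter are assumptions made so the check can be tried against a test directory; the merged script may structure this differently. The sketch also uses `grep -E` with `vGPU|GRID`, matching the style of Method 1, rather than the `\|` form.

```shell
#!/usr/bin/env bash
# Hypothetical helper; the name and the directory parameter are illustrative only.
IS_VGPU=${IS_VGPU:-false}

check_proc_for_vgpu() {
    local nvidia_dir="${1:-/proc/driver/nvidia}"
    # Guard: skip the check when IS_VGPU is already true (e.g. forced by --vgpu).
    if [ "$IS_VGPU" = false ] && [ -d "$nvidia_dir/gpus" ] \
        && grep -Eqri "vGPU|GRID" "$nvidia_dir/" 2>/dev/null; then
        IS_VGPU=true
    fi
}

check_proc_for_vgpu
echo "IS_VGPU=$IS_VGPU"
```

With the guard in place, a forced `IS_VGPU=true` short-circuits the condition, so the directory is never scanned and the "skip auto-detection" comment matches the behavior.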
/approve
The NVIDIA driver installation script does not support some vGPU-backed instances (for example, g6f.2xlarge, g6f.4xlarge, and gr6f.4xlarge):
| n | CSP | Region | SpecName | Arch | vCPU | Mem(Gi) | Cost($/h) | Accelerator |
|---|---|---|---|---|---|---|---|---|
| 1 | AWS | us-east-2 | g6f.2xlarge | x86_64 | 8 | 32 | $0.475 | NVIDIA L4 (C:undefined 5) |
| 2 | AWS | us-east-2 | g6f.4xlarge | x86_64 | 16 | 64 | $0.95 | NVIDIA L4 (C:undefined 11) |
| 3 | AWS | us-east-2 | g6.2xlarge | x86_64 | 8 | 32 | $0.9776 | NVIDIA L4 (C:1 23) |
| 4 | AWS | us-east-2 | gr6f.4xlarge | x86_64 | 16 | 128 | $1.066 | NVIDIA L4 (C:undefined 11) |
| 5 | AWS | us-east-2 | g6.4xlarge | x86_64 | 16 | 64 | $1.3232 | NVIDIA L4 (C:1 23) |
This PR adds a `--vgpu` parameter to the driver installation scripts for such vGPU instances.
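As a rough illustration of how a caller might decide when to pass the flag, the sketch below maps a spec name to extra installer arguments. The helper `vgpu_args_for_spec` is hypothetical, and the `g6f.*`/`gr6f.*` patterns are an assumption that generalizes from the three instance types listed above. The loop only prints the command line and installs nothing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps an AWS spec name to extra installer arguments.
vgpu_args_for_spec() {
    case "$1" in
        g6f.*|gr6f.*) echo "--vgpu" ;;  # assumed vGPU families, per the list above
        *)            echo "" ;;
    esac
}

for spec in g6f.2xlarge g6.2xlarge gr6f.4xlarge; do
    # Print the command instead of running it; this sketch installs nothing.
    echo "$spec: installGpuDriver.sh $(vgpu_args_for_spec "$spec")"
done
```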