Enhance LLM telemetry export_metrics and GPU driver for remote cmd #2334

Merged
cb-github-robot merged 5 commits into cloud-barista:main from seokho-son:main on Feb 28, 2026

Conversation

@seokho-son
Member

  • LLM telemetry export_metrics: support execution via remote command.
  • GPU driver: extend coverage of NVIDIA GPU models and CSP instance types.

However, this still needs more testing... endlessly? ;)

Signed-off-by: Seokho Son <shsongist@gmail.com>
Copilot AI review requested due to automatic review settings February 28, 2026 15:35
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the LLM operational scripts to better support remote, non-interactive usage: the telemetry exporter can now run with defaults or CLI flags (not only via a config file), and the GPU driver installers add pre-cleanup plus a more robust multi-candidate driver installation flow.
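The three execution modes described above could be resolved roughly as follows. This is a minimal sketch and not the actual export_metrics.sh logic: the function names `resolve_mode` and `parse_flags`, the flags `--interval` and `--output`, and the default values are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names, flags, and defaults are assumptions.

# Decide the execution mode from the arguments:
#   no arguments                   -> "defaults"
#   first argument starts with "-" -> "cli"
#   first argument is a file       -> "legacy-config"
resolve_mode() {
  if [ "$#" -eq 0 ]; then
    echo "defaults"
  elif [ "${1#-}" != "$1" ]; then
    echo "cli"
  elif [ -f "$1" ]; then
    echo "legacy-config"
  else
    echo "unknown"
  fi
}

# Parse the hypothetical flags, keeping defaults for anything not given.
parse_flags() {
  INTERVAL=15            # assumed default, in seconds
  OUTPUT="/tmp/metrics"  # assumed default output path
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --interval) [ "$#" -ge 2 ] || return 1; INTERVAL="$2"; shift 2 ;;
      --output)   [ "$#" -ge 2 ] || return 1; OUTPUT="$2";   shift 2 ;;
      *) echo "unknown option: $1" >&2; return 1 ;;
    esac
  done
}
```

Resolving the mode before parsing keeps the legacy config path untouched while letting a remote command run the script with no arguments at all.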

Changes:

  • Add “remote-friendly” CLI parsing to export_metrics.sh (defaults / CLI flags / legacy config mode).
  • Add NVIDIA driver pre-cleanup and iterative fallback installation across multiple driver package candidates.
  • Add vGPU-vs-non-vGPU detection and ensure nvidia-modprobe is installed to create /dev/nvidia* nodes.
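The iterative fallback across driver package candidates can be pictured with a loop like the one below. It is a sketch under stated assumptions, not the code from this PR: `install_first_working` and `apt_install` are hypothetical helper names, and the apt-get wrapper assumes a Debian/Ubuntu host with root privileges.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a candidate-based install loop.

# Try each candidate package in order with the given install command,
# stopping at the first one that succeeds.
install_first_working() {
  local installer="$1"; shift
  local pkg
  for pkg in "$@"; do
    if "$installer" "$pkg"; then
      echo "$pkg"        # report which candidate was installed
      return 0
    fi
    echo "candidate failed: $pkg" >&2
  done
  return 1               # every candidate failed
}

# Example installer (assumption: Debian/Ubuntu host, run as root).
apt_install() {
  apt-get install -y --no-install-recommends "$1"
}
```

A call such as `install_first_working apt_install nvidia-driver-550 nvidia-driver-535` would try the newer package first and fall back to the older one; the package names here are illustrative only and are not taken from the PR.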

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| scripts/usecases/llm/telemetry/export_metrics.sh | Adds CLI/default execution modes to support remote execution without a config file. |
| scripts/usecases/llm/installGpuDriver.sh | Adds NVIDIA package pre-cleanup, vGPU detection, candidate-based driver install loop, and nvidia-modprobe guard. |
| scripts/usecases/llm/installCudaDriver.sh | Mirrors the NVIDIA driver install robustness improvements for the CUDA-driver-focused script. |
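The nvidia-modprobe guard mentioned for installGpuDriver.sh might look like the sketch below, which runs an install command only when the binary is missing from PATH. The helper name `ensure_command` is hypothetical and the actual script may check differently; vGPU detection is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical guard: run the fallback command only if a binary is missing.
ensure_command() {
  local bin="$1"; shift
  if command -v "$bin" >/dev/null 2>&1; then
    return 0             # already present, nothing to do
  fi
  "$@"                   # otherwise run the supplied install command
}
```

For example, `ensure_command nvidia-modprobe apt-get install -y nvidia-modprobe` would skip the install on hosts where the tool already exists (assuming a Debian/Ubuntu host and root privileges).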

Signed-off-by: Seokho Son <shsongist@gmail.com>
@seokho-son
Member Author

/approve

@github-actions bot added the approved label ("This PR is approved and will be merged soon.") on Feb 28, 2026
@cb-github-robot cb-github-robot merged commit 26dea14 into cloud-barista:main Feb 28, 2026
1 of 2 checks passed

Labels

approved This PR is approved and will be merged soon.

3 participants