Update a3ultra to 570 and cuda 12-8 #3859

Merged: samskillman merged 2 commits into GoogleCloudPlatform:develop from samskillman:feat/upgrade-a3u-570 on Apr 22, 2025

Conversation

@samskillman

Collaborator

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@samskillman samskillman requested a review from cdunbar13 March 29, 2025 15:51
@samskillman samskillman requested a review from a team as a code owner March 29, 2025 15:51
@samskillman samskillman added the release-version-updates Added to release notes under the "Version Updates" heading. label Mar 29, 2025
ighosh98 previously approved these changes Mar 29, 2025
@samskillman

Collaborator Author

/gcbrun

@tpdownes tpdownes left a comment

Contributor

LGTM. Observed the following output on an A3 Ultra node:

tpdownes_google_com@a3u5590-a3ultranodeset-1:~$ sudo nvidia-smi
Tue Apr 22 18:42:32 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H200                    On  |   00000000:8F:00.0 Off |                    0 |
| N/A   31C    P0             77W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H200                    On  |   00000000:90:00.0 Off |                    0 |
| N/A   32C    P0             78W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H200                    On  |   00000000:96:00.0 Off |                    0 |
| N/A   31C    P0             76W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H200                    On  |   00000000:97:00.0 Off |                    0 |
| N/A   34C    P0             76W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA H200                    On  |   00000000:C4:00.0 Off |                    0 |
| N/A   31C    P0             78W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA H200                    On  |   00000000:C5:00.0 Off |                    0 |
| N/A   32C    P0             75W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA H200                    On  |   00000000:CB:00.0 Off |                    0 |
| N/A   30C    P0             77W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA H200                    On  |   00000000:CC:00.0 Off |                    0 |
| N/A   32C    P0             78W /  700W |       1MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
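For anyone who wants to repeat this check without reading the table by eye, here is a minimal sketch that pulls the driver and CUDA versions out of the nvidia-smi banner and compares them with the targets in this PR's title (570 and 12.8). The function names `parse_versions` and `meets_expectation` are hypothetical helpers written for illustration, not part of the Toolkit, and the sample line is copied from the output above.

```python
import re

# Banner line copied from the nvidia-smi output in this comment.
SAMPLE_HEADER = (
    "| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15"
    "      CUDA Version: 12.8     |"
)


def parse_versions(text):
    """Return (driver_version, cuda_version) found in nvidia-smi banner text."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", text
    )
    if match is None:
        raise ValueError("no driver/CUDA version banner found")
    return match.group(1), match.group(2)


def meets_expectation(text, driver_major="570", cuda="12.8"):
    """Hypothetical check: driver major version and CUDA version match targets."""
    driver, cuda_found = parse_versions(text)
    return driver.split(".")[0] == driver_major and cuda_found == cuda


if __name__ == "__main__":
    print(parse_versions(SAMPLE_HEADER))
    print(meets_expectation(SAMPLE_HEADER))
```

On a live node, the same functions could be fed the real output, for example via `subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout`, assuming `nvidia-smi` is on the PATH.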

@tpdownes tpdownes assigned samskillman and unassigned tpdownes Apr 22, 2025
@samskillman samskillman merged commit 20d8a02 into GoogleCloudPlatform:develop Apr 22, 2025

Labels

release-version-updates Added to release notes under the "Version Updates" heading.

4 participants