zh-trans: update schedule GPUs #14943

Merged: k8s-ci-robot merged 1 commit into kubernetes:release-1.14 from mysunshine92:update-schedule-gpu-1 on Jun 20, 2019

@mysunshine92
Contributor

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Remember to delete this note before submitting your pull request.

For pull requests on 1.15 Features: set Milestone to 1.15 and Base Branch to dev-1.15

For pull requests on Chinese localization, set Base Branch to release-1.14

For pull requests on Korean Localization: set Base Branch to dev-1.14-ko.<latest team milestone>

If you need Help on editing and submitting pull requests, visit:
https://kubernetes.io/docs/contribute/start/#improve-existing-content.

If you need Help on choosing which branch to use, visit:
https://kubernetes.io/docs/contribute/start#choose-which-git-branch-to-use.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/zh Issues or PRs related to Chinese language sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 17, 2019
@mysunshine92 mysunshine92 force-pushed the update-schedule-gpu-1 branch 2 times, most recently from 71279ba to b25c2bd Compare June 17, 2019 14:46
@mysunshine92
Contributor Author

mysunshine92 commented Jun 17, 2019

cc @markthink

@mysunshine92
Contributor Author

cc @tengqm

@chenrui333
Member

will review later when I have time. 😄 👍


这个页面介绍了用户如何在不同的 Kubernetes 版本中使用 GPU,以及当前存在的一些限制。
这个页面介绍了用户如何在不同的 Kubernetes 版本中使用 GPU,以及当前存在的一些限制
Member

Suggested change

Kubernetes 支持对节点上的 AMD 和 NVIDIA GPU 进行管理,目前处于**实验**状态。对 NVIDIA GPU 的支持在 v1.6 中加入,已经经历了多次不向后兼容的迭代。而对 AMD GPU 的支持则在 v1.9 中通过 [device plugin](#deploying-amd-gpu-device-plugin) 加入。

这个页面介绍了用户如何在不同的 Kubernetes 版本中使用 GPU,以及当前存在的一些限制。
这个页面介绍了用户如何在不同的 Kubernetes 版本中使用 GPU,以及当前存在的一些限制
Member

Suggested change
这个页面介绍了用户如何在不同的 Kubernetes 版本中使用 GPU,以及当前存在的一些限制
这个页面介绍了用户如何在不同的 Kubernetes 版本中使用 GPU,以及当前存在的一些限制


如果你的集群已经启动并且上述要求满足的话,可以这样部署 AMD device plugin:

requirements are satisfied:
Member

This part wasn't translated?

* 设备 ID (-device-id)
* VRAM 大小 (-vram)
* SIMD 数量(-simd-count)
* 计算单位(-cu-count)
Member

Suggested change
* 计算单位(-cu-count)
* 计算单位数量 (-cu-count)

--->
## 集群内存在不同类型的 NVIDIA GPU

如果集群内部的不同节点上有不同类型的 NVIDIA GPU,那么你可以使用 [Node Labels 和 Node Selectors](/docs/tasks/configure-pod-container/assign-pods-nodes/) 来将pod调度到合适的节点上。
Member

Suggested change
如果集群内部的不同节点上有不同类型的 NVIDIA GPU,那么你可以使用 [Node Labels 和 Node Selectors](/docs/tasks/configure-pod-container/assign-pods-nodes/) 来将pod调度到合适的节点上
如果集群内部的不同节点上有不同类型的 NVIDIA GPU,那么你可以使用 [节点标签和节点选择器](/docs/tasks/configure-pod-container/assign-pods-nodes/) 来将 pod 调度到合适的节点上


如果集群内部的不同节点上有不同类型的 NVIDIA GPU,那么你可以使用 [Node Labels 和 Node Selectors](/docs/tasks/configure-pod-container/assign-pods-nodes/) 来将pod调度到合适的节点上。

举一个例子:
Member

Suggested change
举一个例子
例如

<!--
For AMD GPUs, you can deploy [Node Labeller](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller), which automatically labels your nodes with GPU properties. Currently supported properties:
--->
对于 AMD GPUs,您可以部署 [Node Labeller](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller),它会自动给节点打上 GPU 属性标签。目前支持的属性:
Member

Suggested change
对于 AMD GPUs,您可以部署 [Node Labeller](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller),它会自动给节点打上 GPU 属性标签。目前支持的属性:
对于 AMD GPUs,您可以部署 [节点标签器](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller),它会自动给节点打上 GPU 属性标签。目前支持的属性:

kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/k8s-1.9/nvidia-driver-installer/ubuntu/daemonset.yaml
kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/container-engine-accelerators/stable/nvidia-driver-installer/ubuntu/daemonset.yaml

# 安装 device plugin:
Member

Suggested change
# 安装 device plugin:
# 安装设备插件:

在你 1.12 版本的集群上,你能使用下面的命令来安装 NVIDIA 驱动以及 device plugin:

```
# 在 Container-Optimized OS 上安装 NVIDIA 驱动:
Member

Suggested change
# 在 Container-Optimized OS 上安装 NVIDIA 驱动:
# 在容器优化的操作系统上安装 NVIDIA 驱动:

on [Container-Optimized OS](https://cloud.google.com/container-optimized-os/)
and has experimental code for Ubuntu from 1.9 onwards.
--->
#### GCE 中使用的 NVIDIA GPU device plugin
Member

Suggested change
#### GCE 中使用的 NVIDIA GPU device plugin
#### GCE 中使用的 NVIDIA GPU 设备插件

#### GCE 中使用的 NVIDIA GPU device plugin

On your 1.9 cluster, you can use the following commands to install the NVIDIA drivers and device plugin:
[GCE 使用的 NVIDIA GPU device plugin](https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/cmd/nvidia_gpu) 并不要求使用 nvidia-docker,并且对于任何实现了 Kubernetes CRI 的容器运行时,都应该能够使用。这一实现已经在 [Container-Optimized OS](https://cloud.google.com/container-optimized-os/) 上进行了测试,并且在 1.9 版本之后会有对于 Ubuntu 的实验性代码。
Member

Suggested change
[GCE 使用的 NVIDIA GPU device plugin](https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/cmd/nvidia_gpu) 并不要求使用 nvidia-docker,并且对于任何实现了 Kubernetes CRI 的容器运行时,都应该能够使用。这一实现已经在 [Container-Optimized OS](https://cloud.google.com/container-optimized-os/) 上进行了测试,并且在 1.9 版本之后会有对于 Ubuntu 的实验性代码。
[GCE 使用的 NVIDIA GPU 设备插件](https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/cmd/nvidia_gpu) 并不要求使用 nvidia-docker,并且对于任何实现了 Kubernetes CRI 的容器运行时,都应该能够使用。这一实现已经在 [容器优化的操作系统](https://cloud.google.com/container-optimized-os/) 上进行了测试,并且在 1.9 版本之后会有对于 Ubuntu 的实验性代码。

- Docker 的[默认运行时](https://github.com/NVIDIA/k8s-device-plugin#preparing-your-gpu-nodes)必须设置为 nvidia-container-runtime,而不是 runc
- NVIDIA 驱动版本 ~= 361.93

如果你的集群已经启动并且满足上述要求的话,可以这样部署 NVIDIA device plugin:
Member

Suggested change
如果你的集群已经启动并且满足上述要求的话,可以这样部署 NVIDIA device plugin
如果你的集群已经启动并且满足上述要求的话,可以这样部署 NVIDIA 设备插件

The [official NVIDIA GPU device plugin](https://github.com/NVIDIA/k8s-device-plugin)
has the following requirements:
--->
### 部署 NVIDIA GPU device plugin
Member

Suggested change
### 部署 NVIDIA GPU device plugin
### 部署 NVIDIA GPU 设备插件

--->
### 部署 NVIDIA GPU device plugin

对于 NVIDIA GPUs,目前存在两种 device plugin 的实现:
Member

Suggested change
对于 NVIDIA GPUs,目前存在两种 device plugin 的实现
对于 NVIDIA GPUs,目前存在两种设备插件的实现


对于 NVIDIA GPUs,目前存在两种 device plugin 的实现:

#### 官方的 NVIDIA GPU device plugin
Member

Suggested change
#### 官方的 NVIDIA GPU device plugin
#### 官方的 NVIDIA GPU 设备插件


#### 官方的 NVIDIA GPU device plugin

[官方的 NVIDIA GPU device plugin](https://github.com/NVIDIA/k8s-device-plugin) 有以下要求:
Member

Suggested change
[官方的 NVIDIA GPU device plugin](https://github.com/NVIDIA/k8s-device-plugin) 有以下要求:
[官方的 NVIDIA GPU 设备插件](https://github.com/NVIDIA/k8s-device-plugin) 有以下要求:

<!--
Report issues with this device plugin to [RadeonOpenCompute/k8s-device-plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin).
--->
请到 [RadeonOpenCompute/k8s-device-plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin) 报告有关此 device plugin 的问题。
Member

Suggested change
请到 [RadeonOpenCompute/k8s-device-plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin) 报告有关此 device plugin 的问题
请到 [RadeonOpenCompute/k8s-device-plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin) 报告有关此设备插件的问题

The [official AMD GPU device plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin)
has the following requirements:
--->
### 部署 AMD GPU device plugin
Member

Suggested change
### 部署 AMD GPU device plugin
### 部署 AMD GPU 设备插件

When the above conditions are true, Kubernetes will expose `nvidia.com/gpu` or
`amd.com/gpu` as a schedulable resource.
--->
接着你需要在主机节点上安装对应厂商的 GPU 驱动并运行对应厂商的 device plugin ([AMD](#deploying-amd-gpu-device-plugin)、[NVIDIA](#deploying-nvidia-gpu-device-plugin))。
Member

@chenrui333 chenrui333 Jun 19, 2019

Suggested change
接着你需要在主机节点上安装对应厂商的 GPU 驱动并运行对应厂商的 device plugin ([AMD](#deploying-amd-gpu-device-plugin)[NVIDIA](#deploying-nvidia-gpu-device-plugin))。
接着你需要在主机节点上安装对应厂商的 GPU 驱动并运行对应厂商的设备插件([AMD](#deploying-amd-gpu-device-plugin)[NVIDIA](#deploying-nvidia-gpu-device-plugin))。

Member

@chenrui333 chenrui333 left a comment

lgtm except device plugin -> 设备插件

@mysunshine92 mysunshine92 force-pushed the update-schedule-gpu-1 branch from b25c2bd to 3cf51f7 Compare June 19, 2019 16:16
@mysunshine92
Contributor Author

mysunshine92 commented Jun 19, 2019

lgtm except device plugin -> 设备插件

@chenrui333 Thank you for the review. I have made all of the requested changes; could you please add the lgtm and approve labels?

Member

@chenrui333 chenrui333 left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 20, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chenrui333

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 20, 2019
@chenrui333
Member

@mysunshine92 Thanks for your contributions, 🎉 🎉 🎉

@k8s-ci-robot k8s-ci-robot merged commit d525853 into kubernetes:release-1.14 Jun 20, 2019
SataQiu pushed a commit to SataQiu/website that referenced this pull request Oct 9, 2019
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. language/zh Issues or PRs related to Chinese language lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/docs Categorizes an issue or PR as relevant to SIG Docs. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

3 participants