Add CUDA Vital #58059
This pull request was exported from Phabricator. Differential Revision: D28357615
Codecov Report
@@ Coverage Diff @@
## master #58059 +/- ##
==========================================
+ Coverage 76.44% 76.46% +0.02%
==========================================
Files 1990 1992 +2
Lines 199690 199960 +270
==========================================
+ Hits 152651 152900 +249
- Misses 47039 47060 +21
xuzhao9 left a comment
I am not familiar with the Vitals project, so just adding a few questions here.
Why are we only returning the "second" values here, but not "first"? In the test below, it looks like both the name and the value should be returned, e.g. 'CUDA.used\t\t true'.
I believe second calls toString() which has both the name and value in it and concatenates the two. So first is actually also inside second :)
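To make the reply above concrete, here is a minimal Python sketch of the same idea. It is not the PR's C++ code: `Vital` and `render_vitals` are hypothetical names, and the exact output layout is only modelled on the 'CUDA.used\t\t true' string quoted from the test. The point it illustrates is that when each stored value already renders its own name, concatenating only the mapped values still yields both name and value.

```python
class Vital:
    """Hypothetical stand-in for one vital sign with named attributes."""

    def __init__(self, name):
        self.name = name
        self.attrs = {}

    def __str__(self):
        # The rendered text repeats the vital's own name, so the map key
        # is not needed when building the report.
        return "".join(
            f"{self.name}.{attr}\t\t {value}\n"
            for attr, value in self.attrs.items()
        )


def render_vitals(vitals):
    # Only the mapped values ("second" in C++ map terms) are concatenated.
    return "".join(str(vital) for vital in vitals.values())


vitals = {"CUDA": Vital("CUDA")}
vitals["CUDA"].attrs["used"] = "true"
report = render_vitals(vitals)
```

Here the dictionary key plays the role of "first" and the `Vital` object the role of "second"; dropping the key loses nothing because the object carries the name itself.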
May I ask what is the field of "force" doing here?
This makes sure the vital data is updated even if the TORCH_VITAL env variable is not set. It is there to enable testing, because the env variable isn't always set when the test runs, and we have no easy workaround.
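The gating described in this reply can be sketched in Python as follows. This is an illustration under stated assumptions, not the PR's implementation: `VitalStore` and its methods are hypothetical, the environment variable is spelled `TORCH_VITAL` as in the later commit messages in this thread, and the rule for which values count as "enabled" is a guess made for the sketch.

```python
import os


class VitalStore:
    """Hypothetical store that only records vitals when enabled or forced."""

    def __init__(self, environ=None):
        environ = os.environ if environ is None else environ
        # Assumption: any non-empty value other than "0" enables vitals.
        self.enabled = environ.get("TORCH_VITAL", "0") not in ("", "0")
        self._values = {}

    def set_vital(self, name, attr, value, force=False):
        # A write is dropped unless vitals are enabled or the caller forces it.
        if not (self.enabled or force):
            return False
        self._values[(name, attr)] = value
        return True

    def read(self, name, attr):
        return self._values.get((name, attr))


# With the variable unset, an ordinary write is ignored but a forced one lands.
store = VitalStore(environ={})
ignored = store.set_vital("Dataloader", "enabled", "True")
forced = store.set_vital("CUDA", "used", "true", force=True)
```

Using a single setter with `force=False` as the default keeps ordinary call sites unchanged while letting a test opt in to writing without the environment variable.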
xuzhao9 left a comment
Looks good; I have left two inline comments.
nit: do you need an extra API, or is it enough to have setVital with default force=false arg?
The conditional here also seems like an unnecessary complication: if you had write(value, force), you could have avoided the conditional.
Why are there 2 functions instead of write(const T& t, bool force)?
Why a toString function instead of overloading << so that you can directly call std::cout << this?
Probably because I have python semantics on my mind instead of c++ semantics :)
I implemented a << operator.
a081618 to 4679bdf
4679bdf to 27a06bf
Summary:
Pull Request resolved: pytorch#58059

Add CUDA.used vital sign which is true only if CUDA was "used" which technically means the context was created.

Also adds the following features:
- Force vitals to be written even if vitals are disabled, to enable testing when the env variable is not set from the start of execution
- Add a read_vitals call for python to read existing vital signs.

Test Plan: buck test mode/dbg caffe2/test:torch -- --regex basic_vitals

Reviewed By: xuzhao9

Differential Revision: D28357615

fbshipit-source-id: 78a8755801b11ccc7e33a9eac669ebb7a83ddb87
1e5e092 to bb7409c
This pull request has been merged in 8b6487c.
Summary: Pull Request resolved: #58059 Add CUDA.used vital sign which is true only if CUDA was "used" which technically means the context was created. Also adds the following features: - Force vitals to be written even if vitals are disabled, to enable testing when the env variable is not set from the start of execution - Add a read_vitals call for python to read existing vital signs. Test Plan: buck test mode/dbg caffe2/test:torch -- --regex basic_vitals Reviewed By: xuzhao9 Differential Revision: D28357615 fbshipit-source-id: 681bf9ef63cb1458df9f1c241d301a3ddf1e5252
TorchVitals was an undocumented feature gated behind the TORCH_VITAL environment variable that allowed setting and reading key-value "vital signs" (e.g. whether CUDA or DataLoader was used). It was never documented and the test itself had a "FIXME: document or deprecate" comment.

Originally added in PR #51047 (Jan 2021), with a Python API in #53238 and a CUDA vital in #58059. There have been no feature changes since June 2021; all subsequent commits were mechanical cleanups (clang-tidy, thread-safe getenv, etc.).

A search on cs.github.com and general web search confirmed no external projects depend on torch.set_vital, torch.read_vitals, or torch.vitals_enabled; usage was limited to pytorch/pytorch itself.

Authored with Claude.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ghstack-source-id: b56228f
Pull-Request: #178479
TorchVitals was an undocumented feature gated behind the TORCH_VITAL environment variable that allowed setting and reading key-value "vital signs" (e.g. whether CUDA or DataLoader was used). It was never documented and the test itself had a "FIXME: document or deprecate" comment.

Originally added in PR #51047 (Jan 2021), with a Python API in #53238 and a CUDA vital in #58059. There have been no feature changes since June 2021; all subsequent commits were mechanical cleanups (clang-tidy, thread-safe getenv, etc.).

Update: `torch.set_vital` was used from TorchData (see meta-pytorch/data#1537), so the dummy API will be kept around.

Authored with Claude.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull Request resolved: #178479
Approved by: https://github.com/msaroufim, https://github.com/albanD, https://github.com/mlazos
Summary: Add CUDA.used vital sign which is true only if CUDA was "used" which technically means the context was created.
Test Plan: buck test mode/dbg caffe2/test:torch -- --regex basic_vitals
Differential Revision: D28357615