[BE][Easy] use pathlib.Path instead of dirname / ".." / pardir#129374
[BE][Easy] use pathlib.Path instead of dirname / ".." / pardir#129374XuehaiPan wants to merge 24 commits intogh/XuehaiPan/67/basefrom
pathlib.Path instead of dirname / ".." / pardir#129374Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/129374
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 8df444f with merge base 9694158 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
pathlib.Path in tools and torchgenpathlib.Path instead of dirname / ".." / pardir
ghstack-source-id: 6291baa Pull Request resolved: pytorch#129374
malfet
left a comment
There was a problem hiding this comment.
Before landing this change, can you please explain why:
- You sometime replace
parents[1]with.parent.parent - Why
parents[x - 1]notation? Path("/").parents[0]will throw an exception, whilePath("/").parentis always safe to call, should this be a considered in some of the cases?
|
@pytorchbot successfully started a revert job. Check the current status here. |
|
@XuehaiPan your PR has been successfully reverted. |
|
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
|
FYI. Probably fine in this context but Pathlib can be really slow in performance critical situations compared to string handling. |
|
@pytorchmergebot revert -m "failing internal ROCM builds with error: ModuleNotFoundError: No module named hipify" -c ghfirst File "/re_cwd/buck-out/v2/gen/fbcode/a82b0712bf2bb755/caffe2/tools/amd_build/build_amd/build_amd#link-tree/build_amd.py", line 13, in |
|
@pytorchbot successfully started a revert job. Check the current status here. |
Reverting PR 129374 failedReason: Command Details for Dev Infra teamRaised by workflow job |
|
@pytorchmergebot revert -m "failing internal ROCM builds with error: ModuleNotFoundError: No module named hipify" -c ghfirst |
|
@pytorchbot successfully started a revert job. Check the current status here. |
|
@XuehaiPan your PR has been successfully reverted. |
tools/amd_build/build_amd.py
Outdated
| REPO_ROOT = Path(__file__).absolute().parents[2] | ||
| sys.path.append(str((REPO_ROOT / "torch" / "utils").resolve())) |
There was a problem hiding this comment.
I vaguely remember talking about it in the past: it will resolve to a different folder if __file__ is a symlink to some other place. I.e. imagine a folder structure like (which is used by build system that bazel/buck, but could be the case of many other builds):
- actual_files
- build_amd.py
- hipify.py
- deploy
- tools
- amd_build
- build_amd.py -> ../../../build_amd.py
- torch
- utils
- hipify.py -> ../../../hipify.py
In the previous version of the logic, it will resolve path correctly, but after the reorder it will point to non-existing /torch/utils/ folder
There was a problem hiding this comment.
Alright, today I have learned os.path.join(file, os.pardir) may have different behavior of os.path.dirname(file).
- If
fileis a regular file, both paths resolve to the same path. - If
fileis a symlink,resolvewill treat the "file" as a directory.
import os
from pathlib import Path
REPO_ROOT = Path(__file__).absolute().parents[2]
print(__file__)
print(os.path.realpath(os.path.join(os.path.dirname(__file__), os.pardir, os.pardir, 'torch', 'utils')))
print(os.path.realpath(os.path.join(__file__, os.pardir, os.pardir, os.pardir, 'torch', 'utils')))
print((REPO_ROOT / "torch" / "utils").resolve())The output is:
/Users/PanXuehai/Projects/pytorch/tools/amd_build/test.py
/Users/PanXuehai/Projects/pytorch/torch/utils
/Users/PanXuehai/Projects/torch/utils
/Users/PanXuehai/Projects/pytorch/torch/utils
| # NOTE: `tools/amd_build/build_amd.py` could be a symlink. | ||
| # The behavior of `symlink / '..'` is different from `symlink.parent`. | ||
| # Use `pardir` three times rather than using `path.parents[2]`. | ||
| REPO_ROOT = ( | ||
| Path(__file__).absolute() / os.path.pardir / os.path.pardir / os.path.pardir | ||
| ).resolve() | ||
| sys.path.append(str(REPO_ROOT / "torch" / "utils")) |
There was a problem hiding this comment.
I have updated the patch and added a note for it.
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Stack from ghstack (oldest at bottom):
path.resolve()->path.absolute()#129409pathlib.Pathinstead ofdirname/".."/pardir#129374Changes by apply order:
Replace all
".."andos.pardirusage withos.path.dirname(...).Replace nested
os.path.dirname(os.path.dirname(...))call withstr(Path(...).parent.parent).Reorder
.absolute()/and.resolve().parent: always resolve the path first..parent{...}.absolute()->.absolute().parent{...}Replace chained
.parent x Nwith.parents[${N - 1}]: the code is easier to read (see 5.).parent.parent.parent.parent->.parents[3]Replace.parents[${N - 1}]with.parents[${N} - 1]: the code is easier to read and does not introduce any runtime overhead..parents[3]->.parents[4 - 1]Replace.parents[2 - 1]with.parent.parent: because the code is shorter and easier to read.cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @rec @peterbell10