Fix refcount handling for dtype, layout and memory format #125271

Closed
albanD wants to merge 4 commits into pytorch:main from albanD:refcount_dtype

albanD (Collaborator) commented Apr 30, 2024

Finish fixing #124868.
Re-use our wrap() utils as much as possible and use NewRef in other places.

albanD requested review from colesbury and malfet April 30, 2024 22:10

pytorch-bot bot commented Apr 30, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125271

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit efef1a8 with merge base e09f98c:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

albanD marked this pull request as ready for review April 30, 2024 22:10
albanD added the "topic: not user facing" label Apr 30, 2024
colesbury (Member) left a comment

lgtm

huydhn (Contributor) commented May 1, 2024

@pytorchbot drci

albanD force-pushed the refcount_dtype branch from e25fde5 to 8a30612 May 1, 2024 17:05

linux-foundation-easycla bot commented May 1, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

albanD force-pushed the refcount_dtype branch from 8a30612 to efef1a8 May 1, 2024 18:46

albanD (Collaborator, Author) commented May 1, 2024

@pytorchbot merge

pytorch-bot bot added the "ciflow/trunk" label May 1, 2024

pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

huydhn added a commit to pytorch/test-infra that referenced this pull request May 2, 2024
#5160)

The context is in #5151. This
reland PR adds 2 more fixes:

* Do a left join from `workflow_job` to `push`, so that Dr.CI can always
find all the jobs from the PR even when the commit SHA is not found on
`push` in the case of forked PRs. The `head_sha_timestamp` field will be
empty then.
* When the `head_sha_timestamp` is empty, call `fetchCommitTimestamp` to
get the timestamp directly from GitHub. This is done once per commit.

Note that if the GitHub query fails and `head_sha_timestamp` is still
empty, Dr.CI won't apply the similar-failure flaky search, to avoid false
positives; otherwise the search query would expand to the current date.
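
The two fixes and the fallback rule above can be illustrated with a small sketch. This is not the Dr.CI implementation (which lives in pytorch/test-infra); it is a minimal Python model under stated assumptions. Only `head_sha_timestamp`, `workflow_job`, `push`, and `fetchCommitTimestamp` come from the description; the function names, the dictionary-based rows, and the `fetch_commit_timestamp` callback are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Optional


def left_join_jobs_with_push(
    jobs: List[dict], push_timestamps: Dict[str, str]
) -> List[dict]:
    """Model of a left join from workflow_job to push.

    Every job is kept. When the commit SHA is absent from `push`
    (for example on a forked PR), head_sha_timestamp is left empty.
    """
    return [
        {**job, "head_sha_timestamp": push_timestamps.get(job["head_sha"], "")}
        for job in jobs
    ]


def fill_missing_timestamps(
    jobs: List[dict],
    fetch_commit_timestamp: Callable[[str], Optional[str]],
) -> List[dict]:
    """Fill empty timestamps, querying at most once per commit SHA.

    `fetch_commit_timestamp` is a hypothetical stand-in for the
    fetchCommitTimestamp call; it returns None when the lookup fails.
    """
    cache: Dict[str, str] = {}
    filled = []
    for job in jobs:
        timestamp = job["head_sha_timestamp"]
        if not timestamp:
            sha = job["head_sha"]
            if sha not in cache:
                # A failed lookup is cached as "" so it is not retried per job.
                cache[sha] = fetch_commit_timestamp(sha) or ""
            timestamp = cache[sha]
        filled.append({**job, "head_sha_timestamp": timestamp})
    return filled


def can_search_similar_failures(job: dict) -> bool:
    """Skip the flaky search when no timestamp bounds the query window."""
    return bool(job["head_sha_timestamp"])
```

Caching failed lookups as an empty string keeps the "once per commit" property even when the lookup fails, and the final predicate leaves such jobs out of the similar-failure search rather than letting the query window grow to the current date.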

### Testing

```
curl --request POST \
--url "http://localhost:3000/api/drci/drci?prNumber=PR_NUMBER" \
--header "Authorization: TOKEN" \
--data 'repo=pytorch'
```

1. pytorch/pytorch#125271, new forked PR, no
ciflow. `head_sha_timestamp` from Rockset is empty and
`fetchCommitTimestamp` is invoked. Dr.CI continues to work.

<details open><summary><b>NEW FAILURES</b> - The following jobs have
failed:</summary><p>

* [Lint / lintrunner-clang /
linux-job](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449212917)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585059/job/24449212917))
    `>>> Lint for torch/csrc/utils/tensor_memoryformats.cpp:`
* [pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 2, 5,
linux.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449643728)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449643728))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24450124622)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24450124622))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.11-clang10 / test (crossref, 2, 2,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449335282)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449335282))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.11-clang10 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449334520)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449334520))

`test_tensor_creation_ops.py::TestTensorCreationCPU::test_constructor_dtypes_cpu`
* [pull / linux-focal-py3.11-clang10 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449334757)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449334757))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.11-clang10 / test (dynamo, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449335837)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449335837))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.12-clang10 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449281229)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449281229))

`test_tensor_creation_ops.py::TestTensorCreationCPU::test_constructor_dtypes_cpu`
* [pull / linux-focal-py3.12-clang10 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449281368)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449281368))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.12-clang10 / test (dynamo, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449282003)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449282003))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.8-clang10 / test (crossref, 2, 2,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449309061)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449309061))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.8-clang10 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449308208)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449308208))

`test_tensor_creation_ops.py::TestTensorCreationCPU::test_constructor_dtypes_cpu`
* [pull / linux-focal-py3.8-clang10 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449308391)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449308391))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-focal-py3.8-clang10 / test (dynamo, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449309632)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449309632))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-jammy-py3.10-clang15-asan / test (default, 2, 6,
linux.4xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449403443)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449403443))
    `test_autograd.py::TestAutograd::test_type_conversions`
* [pull / linux-jammy-py3.8-gcc11 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449357342)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449357342))

`test_tensor_creation_ops.py::TestTensorCreationCPU::test_constructor_dtypes_cpu`
* [pull / linux-jammy-py3.8-gcc11 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125271#24449357569)
([gh](https://github.com/pytorch/pytorch/actions/runs/8902585046/job/24449357569))
    `test_autograd.py::TestAutograd::test_type_conversions`
</p></details>

2. pytorch/pytorch#125225. Another forked PR
with `ciflow/trunk`. `head_sha_timestamp` is now available from Rockset
and `fetchCommitTimestamp` is not needed

<details open><summary><b>NEW FAILURES</b> - The following jobs have
failed:</summary><p>

* [pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 1, 5,
linux.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445851668)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445851668))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 2, 5,
linux.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445852045)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445852045))

`test_transformers.py::TestTransformersCUDA::test_script_encoder_subclass_cuda`
* [pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 3, 5,
linux.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445852311)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445852311))

`dynamo/test_autograd_function.py::AutogradFunctionTests::test_amp_custom_fwd_bwd`
* [pull / linux-focal-cuda12.1-py3.10-gcc9 / test (default, 4, 5,
linux.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445852638)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445852638))
`test_jit.py::TestScript::test_torchscript_multi_head_attn_fast_path`
* [pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 1, 5,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24446408907)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24446408907))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 2, 5,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24446409189)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24446409189))
`test_jit.py::TestScript::test_torchscript_multi_head_attn_fast_path`
* [pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 3, 5,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24446409446)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24446409446))

`test_transformers.py::TestTransformersCUDA::test_script_encoder_subclass_cuda`
* [pull / linux-focal-cuda12.1-py3.10-gcc9-sm86 / test (default, 4, 5,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24446409676)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24446409676))

`test_transformers.py::TestTransformersCUDA::test_transformerencoderlayer_subclass_cuda`
* [pull / linux-focal-py3.11-clang10 / test (crossref, 1, 2,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445471589)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445471589))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-py3.11-clang10 / test (crossref, 2, 2,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445471884)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445471884))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.11-clang10 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445470929)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445470929))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-py3.11-clang10 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445471168)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445471168))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.11-clang10 / test (default, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445471397)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445471397))
`test_jit.py::TestScript::test_torchscript_multi_head_attn_fast_path`
* [pull / linux-focal-py3.11-clang10 / test (dynamo, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445472530)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445472530))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.12-clang10 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445428834)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445428834))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-py3.12-clang10 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445429085)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445429085))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.12-clang10 / test (dynamo, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445429974)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445429974))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.8-clang10 / test (crossref, 1, 2,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445479567)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445479567))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-py3.8-clang10 / test (crossref, 2, 2,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445479782)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445479782))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.8-clang10 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445478904)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445478904))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-focal-py3.8-clang10 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445479120)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445479120))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-focal-py3.8-clang10 / test (dynamo, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445480497)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445480497))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-jammy-py3.10-clang15-asan / test (default, 1, 6,
linux.4xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445500236)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445500236))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-jammy-py3.10-clang15-asan / test (default, 3, 6,
linux.4xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445500673)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445500673))

`test_transformers.py::TestTransformersCPU::test_transformerencoderlayer_subclass_model_cpu`
* [pull / linux-jammy-py3.10-clang15-asan / test (default, 4, 6,
linux.4xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445500892)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445500892))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-jammy-py3.10-clang15-asan / test (default, 5, 6,
linux.4xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445501108)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445501108))
`test_jit.py::TestScript::test_torchscript_multi_head_attn_fast_path`
* [pull / linux-jammy-py3.8-gcc11 / test (default, 1, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445495672)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445495672))
    `test_fx.py::TestVisionTracing::test_torchvision_models_vit_b_16`
* [pull / linux-jammy-py3.8-gcc11 / test (default, 2, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445495930)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445495930))

`test_transformers.py::TestTransformersCPU::test_script_encoder_subclass_cpu`
* [pull / linux-jammy-py3.8-gcc11 / test (default, 3, 3,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445496144)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445496144))
`test_jit.py::TestScript::test_torchscript_multi_head_attn_fast_path`
* [pull / linux-jammy-py3.8-gcc11 / test (jit_legacy, 1, 1,
linux.2xlarge)](https://hud.pytorch.org/pr/pytorch/pytorch/125225#24445496582)
([gh](https://github.com/pytorch/pytorch/actions/runs/8893561548/job/24445496582))

`test_jit_legacy.py::TestScript::test_torchscript_multi_head_attn_fast_path`
</p></details>

3. pytorch/executorch#3353, non-ghstack,
non-forked PR.

`{"3353":{"FAILED":[],"FLAKY":[],"BROKEN_TRUNK":[],"UNSTABLE":[]}}`

4. pytorch/pytorch#125292, ghstack, non-forked
PR.

<details open><summary><b>NEW FAILURE</b> - The following job has
failed:</summary><p>

* [inductor / cuda12.1-py3.10-gcc9-sm86 / test
(dynamic_inductor_torchbench, 2, 2,
linux.g5.4xlarge.nvidia.gpu)](https://hud.pytorch.org/pr/pytorch/pytorch/125292#24455309482)
([gh](https://github.com/pytorch/pytorch/actions/runs/8904802497/job/24455309482))
    `resnet18`
</p></details>

pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
Finish fixing #124868
re-use our wrap() utils as much as possible and NewRef in other places.
Pull Request resolved: #125271
Approved by: https://github.com/colesbury

Labels

ci-td-distributed, ciflow/trunk, Merged, topic: not user facing