Skip to content

Refactor TensorAccessor for headeronly.#166855

Closed
pearu wants to merge 33 commits intogh/pearu/150/basefrom
gh/pearu/150/head
Closed

Refactor TensorAccessor for headeronly.#166855
pearu wants to merge 33 commits intogh/pearu/150/basefrom
gh/pearu/150/head

Conversation

@pearu
Copy link
Copy Markdown
Collaborator

@pearu pearu commented Nov 3, 2025

This PR moves the implementations of Tensor accessor classes to headeronly with the following modifications:

  • Add ArrayRef and IndexBoundsCheck template parameters to refactor out the usages of IntArrayRef and TORCH_CHECK_INDEX from Tensor accessor implementations.
  • Eliminate usage of c10::irange as it is not headeronly-compatible.
  • Introduce torch::headeronly::{TensorAccessorBase,TensorAccessor, GenericPackedTensorAccessorBase, GenericPackedTensorAccessor} that are headeronly-equivalent to at::{TensorAccessorBase,TensorAccessor, GenericPackedTensorAccessorBase, GenericPackedTensorAccessor}. Both these sets of template classes use original implementations from torch::headeronly::detail that have new template parameters ArrayRefCls and IndexBoundsCheck to facilitate at and torch::headeronly implementations of ArrayRef and checking indices.

TODO:

Stack from ghstack (oldest at bottom):

cc @jbschlosser

[ghstack-poisoned]
@pearu pearu requested a review from janeyx99 as a code owner November 3, 2025 13:26
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Nov 3, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/166855

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 966eea7 with merge base d01a7b0 (image):

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pearu added a commit that referenced this pull request Nov 3, 2025
ghstack-source-id: 7ae42bc
Pull Request resolved: #166855
@pearu pearu added module: cpp Related to C++ API open source topic: not user facing topic category labels Nov 3, 2025
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 3, 2025
ghstack-source-id: cbe7a34
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 4, 2025
ghstack-source-id: 679aabf
Pull Request resolved: #166855
pearu added a commit that referenced this pull request Nov 5, 2025
ghstack-source-id: 679aabf
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 5, 2025
ghstack-source-id: 8bdb8b0
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 5, 2025
ghstack-source-id: 7028577
Pull Request resolved: #166855
pearu added 2 commits November 5, 2025 23:38
[ghstack-poisoned]
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 5, 2025
ghstack-source-id: 651240f
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 6, 2025
ghstack-source-id: a761208
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 6, 2025
ghstack-source-id: 010c0ba
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 6, 2025
ghstack-source-id: 9554e5d
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 6, 2025
ghstack-source-id: 668d747
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 10, 2025
ghstack-source-id: 25e7ad2
Pull Request resolved: #166855
Copy link
Copy Markdown
Contributor

@janeyx99 janeyx99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This looks p good to me, but let's rename the headeronly APIs so they don't share names to minimize confusion.

Comment thread aten/src/ATen/core/TensorAccessor.h
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread aten/src/ATen/core/TensorAccessor.h
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread torch/headeronly/core/TensorAccessor.h Outdated
Comment thread c10/util/ArrayRef.h
/// at::TensorAccessor, for instance) when `final` specifier is used.
template <typename T>
class ArrayRef final : public HeaderOnlyArrayRef<T> {
class ArrayRef : public HeaderOnlyArrayRef<T> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

Copy link
Copy Markdown
Collaborator Author

@pearu pearu Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is given in the note before the class definition:

/// NOTE: ArrayRef cannot be derived from. Normally, we would use
/// `final` specifier to force this constraint at compile time.
/// However, Intel compiler does not recognize ArrayRef as a class
/// template (which is required in the definition of
/// at::TensorAccessor, for instance) when `final` specifier is
/// used. So, we cannot define ArrayRef as final because of the Intel
/// compiler issue.

When ArrayRef is defined as final, the xpu CI fails with

2025-11-12T12:58:10.7402385Z [7169/8004] Building SYCL (Device) object torch_xpu_ops_gen_SparseSoftmaxKernels.cpp.o�[K
2025-11-12T12:58:10.7405002Z �[31mFAILED: �[0mcaffe2/aten_xpu/src/CMakeFiles/torch_xpu_ops.dir/ATen/native/sparse/xpu/sycl/torch_xpu_ops_gen_SparseSoftmaxKernels.cpp.o /var/lib/jenkins/workspace/build/caffe2/aten_xpu/src/CMakeFiles/torch_xpu_ops.dir/ATen/native/sparse/xpu/sycl/torch_xpu_ops_gen_SparseSoftmaxKernels.cpp.o 
2025-11-12T12:58:10.7409903Z cd /var/lib/jenkins/workspace/build/caffe2/aten_xpu/src/CMakeFiles/torch_xpu_ops.dir/ATen/native/sparse/xpu/sycl && /opt/conda/envs/py_3.10/lib/python3.10/site-packages/cmake/data/bin/cmake -E make_directory /var/lib/jenkins/workspace/build/caffe2/aten_xpu/src/CMakeFiles/torch_xpu_ops.dir/ATen/native/sparse/xpu/sycl/. && /opt/conda/envs/py_3.10/lib/python3.10/site-packages/cmake/data/bin/cmake -D verbose:BOOL=OFF -D generated_file:STRING=/var/lib/jenkins/workspace/build/caffe2/aten_xpu/src/CMakeFiles/torch_xpu_ops.dir/ATen/native/sparse/xpu/sycl/./torch_xpu_ops_gen_SparseSoftmaxKernels.cpp.o -P /var/lib/jenkins/workspace/build/caffe2/aten_xpu/src/CMakeFiles/torch_xpu_ops.dir/ATen/native/sparse/xpu/sycl/torch_xpu_ops_gen_SparseSoftmaxKernels.cpp.o.Release.cmake
2025-11-12T12:58:10.7413388Z In file included from <command-line>:
2025-11-12T12:58:10.7414602Z /tmp/icx-8814ad46d2/SparseSoftmaxKernels-header-221199.h:1621:248: error: ‘c10::ArrayRef’ is not a template
2025-11-12T12:58:10.7416885Z  1621 | template <> struct KernelInfo<::sycl::detail::RoundedRangeKernel<::sycl::item<1, true>, 1, ::at::native::xpu::MaxRowKernelFunctor<double, long *, ::torch::headeronly::detail::GenericPackedTensorAccessor<::torch::headeronly::detail::TensorAccessor<::c10::ArrayRef<long>, double, 1, torch::headeronly::DefaultPtrTraits, long>, ::at::IndexBoundsCheck<2, long>, double, 2, torch::headeronly::DefaultPtrTraits, long>, double *>>> {
2025-11-12T12:58:10.7418863Z       |                                                                                                                                                                                                                                                        ^~

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @swolchok does the final specifier matter to keep around?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds like a compiler bug, but no I dont think that final is gating optimization opportunities in the absence of virtual methods

Copy link
Copy Markdown
Contributor

@janeyx99 janeyx99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add a kernel use case in libtorch_agnostic_extension in kernel.cpp that uses the HeaderOnlyPackedTensorAccessor in a more real-world way?

[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 12, 2025
ghstack-source-id: 2ec5972
Pull Request resolved: #166855
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 12, 2025
ghstack-source-id: 8c466f6
Pull Request resolved: #166855
@pearu
Copy link
Copy Markdown
Collaborator Author

pearu commented Nov 12, 2025

Can you also add a kernel use case in libtorch_agnostic_extension in kernel.cpp that uses the HeaderOnlyPackedTensorAccessor in a more real-world way?

Yes, I have added mv_tensor_accessor with CPU and CUDA kernels using headeronly tensor accessors.

@pearu pearu requested a review from janeyx99 November 12, 2025 20:21
Comment thread test/cpp_extensions/libtorch_agnostic_extension/libtorch_agnostic/csrc/kernel.h Outdated
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 13, 2025
ghstack-source-id: 2cd8e32
Pull Request resolved: #166855
@pearu pearu requested a review from janeyx99 November 13, 2025 11:05
[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 13, 2025
ghstack-source-id: 5bba9eb
Pull Request resolved: #166855
Copy link
Copy Markdown
Contributor

@janeyx99 janeyx99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! looking forward to tensor.packed_tensor_accessor apis 😛

one comment about the anon namespace

[ghstack-poisoned]
pearu added a commit that referenced this pull request Nov 15, 2025
ghstack-source-id: 3880b24
Pull Request resolved: #166855
@pearu
Copy link
Copy Markdown
Collaborator Author

pearu commented Nov 15, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 15, 2025
@pytorchmergebot
Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Khanaksahu pushed a commit to Khanaksahu/pytorch that referenced this pull request Nov 17, 2025
Silv3S pushed a commit to Silv3S/pytorch that referenced this pull request Nov 18, 2025
This PR moves the implementations of Tensor accessor classes to headeronly with the following modifications:
- Add ArrayRef and IndexBoundsCheck template parameters to refactor out the usages of `IntArrayRef` and `TORCH_CHECK_INDEX` from Tensor accessor implementations.
- Eliminate usage of `c10::irange` as it is not headeronly-compatible.
- Introduce `torch::headeronly::{TensorAccessorBase,TensorAccessor, GenericPackedTensorAccessorBase, GenericPackedTensorAccessor}` that are headeronly-equivalent to `at::{TensorAccessorBase,TensorAccessor, GenericPackedTensorAccessorBase, GenericPackedTensorAccessor}`. Both these sets of template classes use original implementations from `torch::headeronly::detail` that have new template parameters `ArrayRefCls` and `IndexBoundsCheck` to facilitate `at` and `torch::headeronly` implementations of ArrayRef and checking indices.

TODO:
- ~when pytorch#164991 lands, eliminate the placeholder class HeaderOnlyArrayRef~ UPDATE: done.

Pull Request resolved: pytorch#166855
Approved by: https://github.com/janeyx99
@github-actions github-actions Bot deleted the gh/pearu/150/head branch December 16, 2025 02:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/trunk Trigger trunk jobs on your pull request Merged module: cpp Related to C++ API open source topic: not user facing topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants