Port CPU torch.geqrf to ATen #56249
IvanYashchuk wants to merge 6 commits into gh/ivanyashchuk/10/base
Conversation
This PR ports `torch.geqrf` from TH to ATen. The CUDA path will be implemented in a follow-up PR. With the ATen port, support for complex and batched inputs is added. There were no correctness tests; they are added in this PR, along with an OpInfo for this operation. We can implement the QR decomposition as a composition of geqrf and orgqr (torch.linalg.householder_product). We can also implement the least squares solver with geqrf + ormqr + trtrs. So it's useful to have this function renewed, at least for the internal code. Resolves #24705
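For a concrete picture of that composition, here is a minimal sketch (mine, not code from the PR); the shapes and the `allclose` check are illustrative:

```python
import torch

# A tall real matrix; after this port geqrf also accepts complex and batched inputs.
a = torch.randn(5, 3, dtype=torch.float64)

# geqrf packs R on and above the diagonal and the Householder
# reflectors (which implicitly define Q) below it.
QR, tau = torch.geqrf(a)
R = QR[:3].triu()  # the reduced 3x3 upper-triangular factor

# householder_product (LAPACK orgqr) materializes the reduced Q from the reflectors.
Q = torch.linalg.householder_product(QR, tau)

assert torch.allclose(Q @ R, a)
```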
💊 CI failures summary (as of commit dd13e03, via Dr. CI): 💚 Looks good so far! There are no failures yet. 💚
static void apply_geqrf(const Tensor& self, const Tensor& tau, int64_t m, int64_t n) {
#ifndef USE_LAPACK
-  AT_ERROR("qr: LAPACK library not found in compilation");
+  AT_ERROR("geqrf: LAPACK library not found in compilation");
We have a different error string for LAPACK not being available in the other linalg ops, right?
It's not consistent; some ops use the message as here and others use a nicer message. I'll change it here to the nicer message.
def test_geqrf(self, device, dtype):

    def run_test(shape):
        # NumPy outputs the result of geqrf operation
This comment is a little confusing as written. It's that NumPy doesn't have a function named geqrf, but np.linalg.qr with mode='raw' computes the same operation, so this test compares against that function.
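A hedged sketch of that comparison; the transposed `(N, M)` layout of NumPy's raw output follows my reading of the NumPy docs:

```python
import numpy as np
import torch

a = np.random.randn(5, 3)

# NumPy has no geqrf, but mode='raw' returns the raw LAPACK geqrf result:
# h with shape (N, M) (Fortran-order, i.e. transposed) plus the tau scalars.
h, tau_np = np.linalg.qr(a, mode='raw')

QR, tau = torch.geqrf(torch.from_numpy(a))

np.testing.assert_allclose(QR.numpy(), h.T)
np.testing.assert_allclose(tau.numpy(), tau_np)
```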
batches = [(), (0, ), (2, ), (2, 1)]
ns = [5, 2, 0]
for batch, (m, n) in product(batches, product(ns, ns)):
    # TODO: CUDA path doesn't work with batched or empty inputs
Assert this fails instead of just continuing, so that when the behavior changes the test demands an update, too.
The test is updated in #56251. So can we leave this one as is? In other cases, I agree that it's better to assert failures and not skip the test.
I'll add the assert.
haha, really negotiating against yourself there
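A minimal sketch of the suggested assert-the-failure pattern, assuming a unittest-style class; the class name, the exact set of unsupported cases, and the raised exception type are illustrative, not the PR's actual test:

```python
import unittest
from itertools import product

import torch

class TestGeqrf(unittest.TestCase):
    def test_geqrf_cuda_unsupported_cases(self):
        if not torch.cuda.is_available():
            self.skipTest('CUDA not available')
        batches = [(), (0,), (2,), (2, 1)]
        ns = [5, 2, 0]
        for batch, (m, n) in product(batches, product(ns, ns)):
            if batch != () or m == 0 or n == 0:
                a = torch.randn(*batch, m, n, device='cuda')
                # Assert the known failure instead of `continue`, so the
                # test demands an update once the CUDA path gains support.
                with self.assertRaises(RuntimeError):
                    torch.geqrf(a)
```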
Rather, this directly calls the underlying LAPACK function `?geqrf`
Computes a QR decomposition of :attr:`input`.
Both `Q` and `R` matrices are stored in the same tensor `a`.
"in the same output tensor `a`."
The elements of `R` are stored on and above the diagonal.
Elementary reflectors (or Householder vectors) implicitly defining matrix `Q`
are stored below the diagonal.
Result of this function can be used together with :func:`torch.linalg.householder_product`
"Result" -> "The results"
are stored below the diagonal.
Result of this function can be used together with :func:`torch.linalg.householder_product`
to obtain the `Q` matrix or
with :func:`torch.ormqr` that uses implicit representation of matrix `Q` for efficient matrix-matrix multiplication.
"with :func:`torch.ormqr` that uses implicit representation of matrix `Q` for efficient matrix-matrix multiplication." -> "with :func:`torch.ormqr`, which uses an implicit representation of the `Q` matrix, for an efficient matrix-matrix multiplication."
to obtain the `Q` matrix or
with :func:`torch.ormqr` that uses implicit representation of matrix `Q` for efficient matrix-matrix multiplication.

This function directly calls the underlying LAPACK function `geqrf`
This sentence seems redundant - maybe it can be removed?
Yes, we can remove it.
@@ -3368,24 +3368,35 @@ def merge_dicts(*dicts):
This is a low-level function for calling LAPACK directly. This function
"for calling LAPACK's geqrf directly." ?
.. note::
    To obtain explicit Q and R matrices it is recommended to use :func:`torch.linalg.qr`.

.. note::
I would combine this note with the previous to make one "see also" note.
"See also :func:`torch.linalg.qr`, which computes explicit Q and R matrices, and :func:`torch.linalg.lstsq` with the ``driver="gels"`` option for a function that can solve matrix equations using a QR decomposition." ?
def sample_inputs_geqrf(op_info, device, dtype, requires_grad=False):
    """
    This function generates input for torch.geqrf
This comment doesn't hurt anything but I don't think it adds anything to the code, either
I agree, we can remove it.
out = []
for batch, (m, n) in product(batches, product(ns, ns)):
    # TODO: CUDA path doesn't work with batched or empty inputs
    if 'cuda' in device and (batch != () or m == 0 or n == 0):
if torch.device(device).type == 'cuda'
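A tiny sketch of why the suggested check is more robust (examples are mine):

```python
import torch

# Substring checks work for plain strings but raise for device objects:
dev = torch.device('cuda', 0)
# 'cuda' in dev  # TypeError: argument of type 'torch.device' is not iterable

# torch.device(...).type normalizes strings with or without an index:
assert torch.device('cuda').type == 'cuda'
assert torch.device('cuda:1').type == 'cuda'
assert dev.type == 'cuda'
```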
| """ | ||
| batches = [(), (0, ), (2, ), (1, 1)] | ||
| ns = [5, 2, 0] | ||
| out = [] |
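Putting this hunk together, a hedged reconstruction of the whole generator; the tensor constructor and the omitted `SampleInput` wrapping are assumptions, since the real OpInfo machinery lives in `torch.testing._internal`:

```python
from itertools import product

import torch

def sample_inputs_geqrf(op_info, device, dtype, requires_grad=False):
    batches = [(), (0,), (2,), (1, 1)]
    ns = [5, 2, 0]
    out = []
    for batch, (m, n) in product(batches, product(ns, ns)):
        # TODO: CUDA path doesn't work with batched or empty inputs
        if torch.device(device).type == 'cuda' and (batch != () or m == 0 or n == 0):
            continue
        a = torch.randn(*batch, m, n, dtype=dtype, device=device,
                        requires_grad=requires_grad)
        out.append(a)  # the real generator wraps each tensor in SampleInput
    return out
```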
@mruberry thank you for your suggestions! I've updated this pull request.
}

static void geqrf_out_helper(const Tensor& input, const Tensor& QR, const Tensor& tau) {
  TORCH_INTERNAL_ASSERT(input.dim() >= 2);
Do you want to make these TORCH_INTERNAL_ASSERT_DEBUG_ONLY? You've already checked all these conditions, and this function is not user facing.
@IvanYashchuk just ping me when you're happy with this PR and I'll start merging this stack.
Summary: Pull Request resolved: pytorch#56249. This PR ports `torch.geqrf` from TH to ATen. The CUDA path will be implemented in a follow-up PR. With the ATen port, support for complex and batched inputs is added. There were no correctness tests; they are added in this PR, along with an OpInfo for this operation. We can implement the QR decomposition as a composition of geqrf and orgqr (torch.linalg.householder_product). We can also implement the least squares solver with geqrf + ormqr + trtrs. So it's useful to have this function renewed, at least for the internal code. Resolves pytorch#24705. Test Plan: Imported from OSS. Reviewed By: ngimel. Differential Revision: D27907357. Pulled By: mruberry. fbshipit-source-id: 94e1806078977417e7903db76eab9d578305f585
Stack from ghstack:

This PR ports `torch.geqrf` from TH to ATen. The CUDA path will be implemented in a follow-up PR. With the ATen port, support for complex and batched inputs is added. There were no correctness tests; they are added in this PR, along with an OpInfo for this operation. We can implement the QR decomposition as a composition of geqrf and orgqr (torch.linalg.householder_product). We can also implement the least squares solver with geqrf + ormqr + trtrs. So it's useful to have this function renewed, at least for the internal code.

Resolves #24705

Differential Revision: D27907357