
Port linalg_eigh and linalg_eigvalsh to structured #79072

Closed
lezcano wants to merge 5 commits into gh/Lezcano/88/base from gh/Lezcano/88/head

Conversation

@lezcano
Collaborator

@lezcano lezcano commented Jun 7, 2022

Stack from ghstack:

This follows the structure of linalg.svd.

[ghstack-poisoned]
@facebook-github-bot
Contributor

facebook-github-bot commented Jun 7, 2022

🔗 Helpful links

✅ No Failures (0 Pending)

As of commit 735a532 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


lezcano added a commit that referenced this pull request Jun 7, 2022
This follows the structure of linalg.svd.

ghstack-source-id: 78f42dc
Pull Request resolved: #79072
@lezcano added labels Jun 7, 2022: module: linear algebra, release notes: composability, topic: not user facing
@lezcano added labels Jun 8, 2022: ciflow/all, ciflow/periodic
@pytorch-bot

pytorch-bot Bot commented Jun 8, 2022

The ciflow/all label was recently removed. It ran very expensive periodic CI jobs when most contributors did not need them. If you just want to check that you won't be reverted, use ciflow/trunk. If you really want the old ciflow/all behavior, add ciflow/trunk and ciflow/periodic. You can use any of the following:

  • ciflow/trunk (.github/workflows/trunk.yml): all jobs we run per-commit on master
  • ciflow/periodic (.github/workflows/periodic.yml): all jobs we run periodically on master
  • ciflow/android (.github/workflows/run_android_tests.yml): android build and test
  • ciflow/nightly (.github/workflows/nightly.yml): all jobs we run nightly
  • ciflow/binaries: all binary build and upload jobs
  • ciflow/binaries_conda: binary build and upload job for conda
  • ciflow/binaries_libtorch: binary build and upload job for libtorch
  • ciflow/binaries_wheel: binary build and upload job for wheel

This follows the structure of linalg.svd.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Jun 11, 2022
This follows the structure of linalg.svd.

ghstack-source-id: a7d5033
Pull Request resolved: #79072
This follows the structure of linalg.svd.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Jun 13, 2022
This follows the structure of linalg.svd.

ghstack-source-id: 5bcc072
Pull Request resolved: #79072
Comment on lines +12409 to +12410
# Could not run 'aten::linalg_eigh' with arguments from the 'AutogradMeta' backend.
DecorateInfo(unittest.expectedFailure, 'TestFakeTensorNonErroring', 'test_fake'),
Collaborator Author

No clue why I'm seeing this error both in this test and in test_meta.py. Any idea, @albanD?

Collaborator

There were moves to make Meta a backend bit
cc @bdhirsh do you know what this error is about?

Collaborator

It probably has to do with Meta being moved to the backend bit, but that error is pretty strange. Let me see if I can debug locally.

Collaborator

Ah, found it....

Diagnosis: we have a meta function for linalg_eigh written directly in Python that gets registered to the Meta dispatch key:

@torch.library.impl(meta_lib, "linalg_eigh")

I think we should leave the Python function around in case it's useful, but you can remove the @torch.library.impl(meta_lib, "linalg_eigh") line that registers it to the dispatcher (since this PR will give us meta tensor support through structured kernels instead).

Why did that break stuff? linalg_eigh is now a CompositeImplicitAutograd kernel, which basically means that we should be getting support for it on all backends (including meta) through its decomposition. Registering a meta kernel for it directly tells the dispatcher "hey, now that I have a direct kernel for the meta backend, don't run the decomposition anymore when you're using the meta backend."
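The shadowing behavior described above can be sketched with a toy dispatch table (illustrative only: `dispatch`, the table, and the kernel-name strings are hypothetical stand-ins, not PyTorch dispatcher internals):

```python
# Toy model of why a direct Meta-key registration hides the
# CompositeImplicitAutograd decomposition. Not the real PyTorch dispatcher.
def dispatch(op_table, backend):
    # A kernel registered directly for a backend key wins over the
    # CompositeImplicitAutograd decomposition fallback.
    return op_table.get(backend, op_table["CompositeImplicitAutograd"])

linalg_eigh_table = {"CompositeImplicitAutograd": "decomposition"}

# A python meta kernel registered directly to the Meta key shadows it:
linalg_eigh_table["Meta"] = "python_meta_kernel"
assert dispatch(linalg_eigh_table, "Meta") == "python_meta_kernel"

# Dropping that registration restores the decomposition path for meta:
del linalg_eigh_table["Meta"]
assert dispatch(linalg_eigh_table, "Meta") == "decomposition"
```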

Collaborator

I'm also curious what you both think about the naming: we now have a private _linalg_eigh op that implements the behavior of the 4 public ops.

Consolidating the implementation into one op seems great, although backends will now need to support a private operator in order to support better linalg coverage (this is the case for a couple other linalg ops too).

I know there have been some conversations around "decomposing into private / unstable APIs" being kinda annoying for backends. What do you think about that w.r.t. the private linalg ops? Are they just more likely to change over time, so making them private is safer?

Collaborator Author

These are all very good points. I had a discussion about this with @mruberry the other day. What we thought is that, for now, the prim operations can simply be those from the public API. Going forward, we may want a lower-level operation with something similar to a LAPACK API to be the prim, and then have svd, eigh, and so on dispatch to it in Python.

In any case, following the current structure:

  • Makes implementing operations for backends fairly easy: if they have the LAPACK-like operation, they can always write two helper functions that call into it in fewer than 10 LOC.
  • Having these operations written as structured kernels makes prims easy to write, and will make moving to a LAPACK-style API later on rather easy.
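As an illustration of the first bullet, here is a hypothetical backend where one LAPACK-like primitive backs both public ops through tiny wrappers. The primitive is restricted to 2x2 real symmetric matrices (closed-form solution) to keep it self-contained; all names and signatures are illustrative, not PyTorch's actual API:

```python
import math

def _linalg_eigh(A, compute_v=True):
    # Hypothetical LAPACK-like primitive: closed-form eigendecomposition
    # of a 2x2 real symmetric matrix [[a, b], [b, c]].
    (a, b), (_, c) = A
    mean = (a + c) / 2.0
    r = math.hypot((a - c) / 2.0, b)
    w = [mean - r, mean + r]  # eigenvalues in ascending order
    if not compute_v:
        return w, None
    if b == 0.0:
        V = [[1.0, 0.0], [0.0, 1.0]] if a <= c else [[0.0, 1.0], [1.0, 0.0]]
    else:
        # (w_i - c, b) solves (A - w_i I) v = 0 for a symmetric 2x2 matrix.
        v1, v2 = [w[0] - c, b], [w[1] - c, b]
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        # Columns of V are the normalized eigenvectors.
        V = [[v1[0] / n1, v2[0] / n2], [v1[1] / n1, v2[1] / n2]]
    return w, V

# The two public ops are then wrappers over the single primitive,
# each well under 10 LOC:
def linalg_eigh(A):
    return _linalg_eigh(A, compute_v=True)

def linalg_eigvalsh(A):
    return _linalg_eigh(A, compute_v=False)[0]
```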

Collaborator
@IvanYashchuk left a comment

Looks good!

I don't know about TestFakeTensorNonErroring and how to fix it.

This follows the structure of linalg.svd.

[ghstack-poisoned]
Collaborator
@albanD left a comment

You most likely don't want the ideep submodule change in this PR?


This follows the structure of linalg.svd.

[ghstack-poisoned]
lezcano added a commit that referenced this pull request Jun 14, 2022
This follows the structure of linalg.svd.

ghstack-source-id: 66a97bd
Pull Request resolved: #79072
@lezcano
Collaborator Author

lezcano commented Jun 14, 2022

Sorry for the submodule point. Fixed. Should we just merge this one and figure out the MetaAutograd point later or should we wait to fix it?

Collaborator
@albanD left a comment

I think it's fine to merge this for now while we wait on a resolution for the meta part.
Could you move the discussion to an issue, though, and ping Brian there?

@lezcano
Collaborator Author

lezcano commented Jun 14, 2022

@pytorchbot merge -g

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here

facebook-github-bot pushed a commit that referenced this pull request Jun 16, 2022
Summary:
This follows the structure of linalg.svd.

Pull Request resolved: #79072

Approved by: https://github.com/IvanYashchuk, https://github.com/albanD

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/549a597c004f7e0254cd4dd8f1777c57e0f00c00

Reviewed By: malfet

Differential Revision: D37156838

fbshipit-source-id: 84607565591fb7ab97238a09dbcc40a1d39cb102
lezcano added a commit that referenced this pull request Jun 17, 2022
lezcano added a commit that referenced this pull request Jun 17, 2022
Following #79072 (comment)

ghstack-source-id: 809acf7
Pull Request resolved: #79786
pytorchmergebot pushed a commit that referenced this pull request Jun 17, 2022
@facebook-github-bot facebook-github-bot deleted the gh/Lezcano/88/head branch June 18, 2022 14:17
facebook-github-bot pushed a commit that referenced this pull request Jun 20, 2022
Summary:
Following #79072 (comment)

Pull Request resolved: #79786

Approved by: https://github.com/ngimel, https://github.com/bdhirsh

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/648a6658ec7dbf17f815a7d7f4464f9e6f7cb08f

Reviewed By: malfet

Differential Revision: D37278842

fbshipit-source-id: 29197e92bdfd5f53e901367e4ca96645c6d6f9db
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026
This follows the structure of linalg.svd.

Pull Request resolved: pytorch#79072

Approved by: https://github.com/IvanYashchuk, https://github.com/albanD
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 25, 2026

Labels

ciflow/periodic · cla signed · Merged · module: linear algebra · open source · release notes: composability · topic: not user facing


7 participants