Fix sparse windows on CPU with MKL by mantaionut · Pull Request #102604 · pytorch/pytorch

mantaionut · 2023-05-31T09:30:35Z

Fix #97352.
This PR changes how PyTorch links to Intel MKL and updates MKL on Windows to mkl-2021.4.0.
Both conda and pip provide MKL packages that can be linked dynamically. mkl-devel contains the static versions of the DLLs, and MKL contains the DLLs needed at runtime. Starting with 2021.4.0, the MKL DLLs and static libs carry the version in their names (MKL 2023 has mkl_core.2.dll and 2021.4.0 has mkl_core.1.dll), so multiple versions can be installed side by side and still work properly.
For the wheel build, I added a dependency on the MKL wheel; for conda, a dependency on the conda MKL package; and for libtorch, I copied the MKL binaries into libtorch.
To test this PR, I had to use a custom builder: pytorch/builder#1467

cc @alexsamardzic @nikitaved @pearu @cpuhrsch @amjames @bhosmer @jcaip @peterjc123 @mszhanyi @skyline75489 @nbcsm @vladimir-aubrecht @iremyux @Blackhex @cristianPanaite

pytorch-bot · 2023-05-31T09:30:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/102604

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 4156083 with merge base 6049998:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

periodic / linux-focal-rocm5.7-py3.8 / test (distributed, 1, 2, linux.rocm.gpu) (gh)
distributed/test_c10d_functional_native.py::C10DFunctionalNativeTest::test_inductor_all_to_all_single

This comment was automatically generated by Dr. CI and updates every 15 minutes.

IvanYashchuk

Changes to the aten/ directory look good. Exciting update!

lezcano · 2023-10-16T15:16:19Z

cc @amjames for visibility

amjames

Mainly asking for clarification on a couple of things.

aten/src/ATen/native/mkl/SparseCsrLinearAlgebra.cpp

amjames · 2023-10-16T19:17:49Z

setup.py

        "fsspec",
    ]
+    if IS_WINDOWS:
+        install_requires.append("mkl==2021.4.0")


Is an exact pin needed here? I don't see why it can't be 2021.4.0 or later.

MKL has the version in its DLL names. Since we still link with the static version of the DLLs, the loader will search for DLLs with 1 in their name. If we install MKL 2023.1.0, it will provide those libraries but with 2 in their names, and loading will fail. Based on my tests, the only compatible MKL versions are 2021.1.1 through 2021.4.0.

Thanks, I figured you had a good reason here.
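For illustration, the compatibility window described in the reply above can be sketched as a small version check. This is a minimal sketch, not code from the PR: the helper names `parse_version` and `is_compatible_mkl` are hypothetical, and the bounds are taken only from the author's statement that 2021.1.1 through 2021.4.0 were the versions that worked in testing.

```python
# Hypothetical helpers; the bounds come from the author's reported test results.
COMPATIBLE_MIN = (2021, 1, 1)
COMPATIBLE_MAX = (2021, 4, 0)


def parse_version(text):
    """Turn a dotted version such as '2021.4.0' into a 3-tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    # Pad short versions (e.g. '2020.4') so tuple comparison is well defined.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def is_compatible_mkl(version):
    """Return True if the version falls in the range reported to load correctly."""
    return COMPATIBLE_MIN <= parse_version(version) <= COMPATIBLE_MAX


# Equivalent requirement specifier for the range above.
MKL_SPECIFIER = "mkl>=2021.1.1,<=2021.4.0"
```

A bounded range like this admits any of the releases that ship the `.1` DLL names while still rejecting 2023.1.0, whose DLLs carry `.2`.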

cpuhrsch · 2023-10-23T17:39:32Z

@malfet - Does this work for you?

malfet · 2023-10-23T19:44:34Z

@cpuhrsch The change cannot go in as is, since it depends on a builder change that hasn't been merged yet.

malfet

Please make suggested changes (and rebase your builder PR first, but looks ok to me)

malfet · 2023-10-23T19:43:25Z

.github/templates/windows_binary_build_workflow.yml.j2

      !{{ set_runner_specific_vars() }}
      !{{ common.checkout(deep_clone=False, directory="pytorch") }}
-      !{{ common.checkout(deep_clone=False, directory="builder", repository=common.builder_repo, branch=common.builder_branch) }}
+      !{{ common.checkout(deep_clone=False, directory="builder", repository="mantaionut/builder", branch="copy_mkl_windows") }}


This needs to be updated when builder PR is merged

malfet · 2023-12-08T05:36:11Z

setup.py

    ]
+    if IS_WINDOWS:
+        install_requires.append("mkl>=2021.1.1,<=2021.4.0")


Hmm, why build Windows with MKL-2021.x, while Linux is built with MKL-2022?

Also, in order to play nice with poetry, one needs to add it unconditionally but with an arch + platform constraint

Suggested change

-    ]
-    if IS_WINDOWS:
-        install_requires.append("mkl>=2021.1.1,<=2021.4.0")
+        "mkl>=2021.1.1,<=2021.4.0; platform_system == \"Windows\" and platform_machine == \"x86_64\"",
+    ]
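As a rough sketch of the suggestion above, the requirement is added to the list unconditionally and the platform restriction moves into a PEP 508 environment marker. The helper `marker_matches` is hypothetical and only mimics how the two marker variables would evaluate; `platform_system` and `platform_machine` correspond to `platform.system()` and `platform.machine()`.

```python
import platform

# Requirement string taken from the suggested change (PEP 508 marker syntax).
MKL_REQUIREMENT = (
    "mkl>=2021.1.1,<=2021.4.0; "
    'platform_system == "Windows" and platform_machine == "x86_64"'
)


def build_install_requires(base):
    """Return the base requirements plus the MKL entry, with no IS_WINDOWS branch."""
    return list(base) + [MKL_REQUIREMENT]


def marker_matches(system=None, machine=None):
    """Hypothetical helper: evaluate the marker's two conditions by hand."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Windows" and machine == "x86_64"


install_requires = build_install_requires(["fsspec"])
```

Because the marker is evaluated by the installer on the target machine, the metadata is the same no matter where the wheel is built. One thing worth checking before relying on it: on 64-bit Windows, `platform.machine()` typically reports `AMD64` rather than `x86_64`, so `marker_matches("Windows", "AMD64")` is `False` in this sketch.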

My plan was to land this implementation first, since previously we were using MKL 2020, and see whether everything works well. After that, it will be easier to simply change the version to 2022.

@mantaionut do you still want to update it to 2022? Also, are you sure that we are not double packing mkl into the wheel, but keeping dependency at the same time?

@malfet the MKL binaries are not copied into the wheel; they are copied only into libtorch. I created a draft PR #118200 for updating to 2022. However, for conda it might not be possible, since numpy 1.19 -> mkl[version='>=2019.4,<2021.0a0|>=2021.4.0,<2022.0a0|>=2023.1.0,<2024.0a0'].
So based on this we could update to 2022, but on conda we should use 2023 instead. Let me know if you think this will work for you.
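To make the conda constraint quoted above easier to read, here is a small sketch that checks a version against the three alternative ranges. The function names are hypothetical, and the sketch simplifies the `a0` pre-release bounds to plain release numbers (for example `<2022.0a0` is treated as `<2022.0.0`), so it is an approximation of conda's matching, not a reimplementation.

```python
# Alternative ranges from the quoted numpy 1.19 constraint, as (lower, upper) pairs.
# Assumption: 'a0' pre-release suffixes on the upper bounds are dropped.
NUMPY_MKL_RANGES = [
    ((2019, 4, 0), (2021, 0, 0)),
    ((2021, 4, 0), (2022, 0, 0)),
    ((2023, 1, 0), (2024, 0, 0)),
]


def to_tuple(version):
    """Convert 'YYYY.N.M' (or shorter) into a zero-padded 3-tuple of ints."""
    parts = [int(p) for p in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def allowed_by_numpy(version):
    """True if the version satisfies any one of the alternative ranges."""
    v = to_tuple(version)
    return any(lower <= v < upper for lower, upper in NUMPY_MKL_RANGES)
```

Under these ranges no 2022.x release is accepted, while 2021.4.0 and 2023.1.0 are, which matches the conclusion in the comment above.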

cpuhrsch · 2024-01-09T22:42:34Z

@mantaionut @malfet - Do we still want to move forward with this?

mantaionut · 2024-01-11T17:49:53Z

@mantaionut @malfet - Do we still want to move forward with this?

I would like to move forward, but it also depends on pytorch/builder#1467, for which I had to make additional changes after rebasing.

malfet · 2024-01-22T17:49:25Z

@mantaionut @malfet - Do we still want to move forward with this?

I would like to move forward, but it also depends on pytorch/builder#1467, for which I had to make additional changes after rebasing.

Merged builder PR, please remove temporary changes from this PR and merge it

Added support for intel Sparse on Windows

mantaionut · 2024-01-23T14:13:45Z

@pytorchbot merge

pytorchmergebot · 2024-01-23T14:16:00Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


pytorch-bot bot added the release notes: releng release notes category label May 31, 2023

pytorchbot added the open source label May 31, 2023

mantaionut force-pushed the Fix_sparse_windows branch 4 times, most recently from 06a2d02 to 987b456 Compare June 5, 2023 09:37

mantaionut added the ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR label Jun 5, 2023

mantaionut force-pushed the Fix_sparse_windows branch 2 times, most recently from 4eb772e to b387cd9 Compare June 6, 2023 05:09

mantaionut added the ciflow/binaries Trigger all binary build and upload jobs on the PR label Jun 6, 2023

mantaionut force-pushed the Fix_sparse_windows branch 2 times, most recently from 3769714 to 097ea78 Compare June 11, 2023 17:00

mantaionut force-pushed the Fix_sparse_windows branch 2 times, most recently from f22be71 to fc8b4da Compare June 26, 2023 06:55

mantaionut force-pushed the Fix_sparse_windows branch 4 times, most recently from 85b363a to aaf3d35 Compare June 30, 2023 06:08

mantaionut force-pushed the Fix_sparse_windows branch 11 times, most recently from b2a7af2 to b72a468 Compare August 6, 2023 09:32

mantaionut force-pushed the Fix_sparse_windows branch 4 times, most recently from 8e5d591 to 1139d36 Compare October 16, 2023 05:17

mantaionut marked this pull request as ready for review October 16, 2023 09:41

mantaionut requested review from a team, IvanYashchuk, lezcano and nikitaved as code owners October 16, 2023 09:41

mantaionut force-pushed the Fix_sparse_windows branch from 1139d36 to 814df50 Compare October 16, 2023 12:51

IvanYashchuk approved these changes Oct 16, 2023


IvanYashchuk requested review from malfet and removed request for lezcano and nikitaved October 16, 2023 14:13

IvanYashchuk added the module: sparse Related to torch.sparse label Oct 16, 2023

amjames reviewed Oct 16, 2023


malfet approved these changes Dec 8, 2023


Use intel Sparse on Windows

4156083

Added support for intel Sparse on Windows

atalman mentioned this pull request Mar 1, 2024

Add windows constraint to mkl package in wheel #121014

Closed

atalman mentioned this pull request May 7, 2024

Windows: 2.3.0 wheel can not be imported if installed for a single user only #125109

Closed

This was referenced Feb 13, 2025

Windows support of Intel oneMKL Sparse BLAS APIs and possible outdated comment #147124

Closed

Remove outdated comment in ATen/mkl/Sparse.h about lack of Windows support #147125

Closed


8 participants
