Skip to content

[Inductor] Pick ISA for inductor based on ATEN_CPU_CAPABILITY#123514

Closed
CaoE wants to merge 59 commits intogh/CaoE/31/basefrom
gh/CaoE/31/head
Closed

[Inductor] Pick ISA for inductor based on ATEN_CPU_CAPABILITY#123514
CaoE wants to merge 59 commits intogh/CaoE/31/basefrom
gh/CaoE/31/head

Conversation

@CaoE
Copy link
Collaborator

@CaoE CaoE commented Apr 7, 2024

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Apr 7, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/123514

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 38ec5c4 with merge base 67883e7 (image):

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@CaoE CaoE marked this pull request as draft April 7, 2024 03:26
@CaoE CaoE changed the title Set sisdlen according to _get_cpu_capability Set simdlen according to _get_cpu_capability Apr 7, 2024
@CaoE CaoE added ciflow/trunk Trigger trunk jobs on your pull request ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR labels Apr 7, 2024
[ghstack-poisoned]
[ghstack-poisoned]
CaoE added a commit that referenced this pull request Apr 7, 2024
ghstack-source-id: 75ae45b
Pull Request resolved: #123514
@CaoE CaoE requested a review from jgong5 April 7, 2024 12:09
Comment on lines +1235 to +1236
and not isinstance(_valid_vec_isa_list[0], VecNEON)
and not isinstance(_valid_vec_isa_list[0], VecZVECTOR)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compute_cpu_capability also handles vsx and zvector. We should also consider them here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking VecNEON and VecZVECTOR into account.


CPUCapability get_cpu_capability() {
static CPUCapability capability = compute_cpu_capability();
CPUCapability capability = compute_cpu_capability();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing static here?

Copy link
Collaborator Author

@CaoE CaoE Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it back

Comment on lines 1242 to 1244
# If the simdlen is None, it indicates determin the vectorization length automatically
if config.cpp.simdlen is None:
assert _valid_vec_isa_list
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps we don't need this check any longer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the check

[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
CaoE added a commit that referenced this pull request Apr 9, 2024
ghstack-source-id: 8b0e45e
Pull Request resolved: #123514
@CaoE CaoE changed the title Set simdlen according to _get_cpu_capability Set simdlen based on the environment ATEN_CPU_CAPABILITY Apr 9, 2024
@CaoE CaoE changed the title Set simdlen based on the environment ATEN_CPU_CAPABILITY Set simdlen based on ATEN_CPU_CAPABILITY Apr 9, 2024
@CaoE
Copy link
Collaborator Author

CaoE commented Jun 25, 2024

This PR depends on #124245

@github-actions
Copy link
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

[ghstack-poisoned]
Copy link
Collaborator

@jgong5 jgong5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems a lot of UT failures. Can you check?

[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
atol=atol,
rtol=rtol,
atol=1e-3,
rtol=1e-3,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the tolerance for all devices. Is it expected?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just changed this back to how it was before in this PR, no new changes were introduced.

@CaoE
Copy link
Collaborator Author

CaoE commented Sep 27, 2024

Hi @jgong5
Here is a new thing: Inductor recently added AMX ISA level which inherits AVX512.

class VecAMX(VecAVX512):
    _arch_flags = VecAVX512._arch_flags + " -mamx-tile -mamx-bf16 -mamx-int8"
    def __str__(self) -> str:
        return super().__str__() + " amx_tile"
    __hash__: Callable[[VecISA], Any] = VecISA.__hash__
...
supported_vec_isa_list = [VecAMX(), VecAVX512(), VecAVX2(), VecNEON()]

Eager does not have the option to set the AMX ISA level. When ATEN_CPU_CAPABILITY is set to avx512, both VecAMX and VecAVX512 of Inductor satisfy this environment variable, and VecAMX will be prioritized for selection.

@CaoE
Copy link
Collaborator Author

CaoE commented Sep 30, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@huydhn
Copy link
Contributor

huydhn commented Sep 30, 2024

@pytorchbot revert -m 'Sorry for reverting your change but its test_cpu_repro test is failing in trunk https://hud.pytorch.org/pytorch/pytorch/commit/6931c1644afdba53e63ce5671455e4e1b7265dd9' -c nosignal

I think a rebase is needed

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@CaoE your PR has been successfully reverted.

CaoE added 2 commits October 8, 2024 23:53
[ghstack-poisoned]
[ghstack-poisoned]
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please commit the suggested changes from pytorch's linter.

CaoE added 2 commits October 10, 2024 23:57
[ghstack-poisoned]
[ghstack-poisoned]
@CaoE
Copy link
Collaborator Author

CaoE commented Oct 13, 2024

@huydhn The PR is rebased. Could you please help import this this PR to see if it will break internal checks ?

@huydhn
Copy link
Contributor

huydhn commented Oct 14, 2024

Umm, unfortunately, I couldn't import this PR because it uses ghstack. Only the stack owner can do so (folks from Meta do that themselves). There is no work around that I know atm, so I think let's just merge this and let our oncall check it later then.

@CaoE
Copy link
Collaborator Author

CaoE commented Oct 17, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants