Skip to content

Support S32/U32 indices for BWD embedding and index ops & Neuron implicit downcast#8450

Closed
rpsilva-aws wants to merge 17 commits intopytorch:masterfrom
rpsilva-aws:rpsilva_downcast
Closed

Support S32/U32 indices for BWD embedding and index ops & Neuron implicit downcast#8450
rpsilva-aws wants to merge 17 commits intopytorch:masterfrom
rpsilva-aws:rpsilva_downcast

Conversation

@rpsilva-aws
Copy link
Copy Markdown
Collaborator

@rpsilva-aws rpsilva-aws commented Dec 4, 2024

In this PR, we extend tensor operations to allow S32 indices. This follows suits with other operations, in order to add flexibility and potentially performance benefits for accelerator backends. Reference for embedding dense bwd: https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/Embedding.cpp#L117

I re-structure a good portion of the tests to validate both types.

In addition, we also re-introduce the implicit downcasting for Neuron S64/U64 types, since the Neuron compiler does not support 64 bits.

@rpsilva-aws rpsilva-aws marked this pull request as ready for review December 4, 2024 18:41
@miladm miladm requested review from bhavya01 and tengyifei December 5, 2024 00:04
@miladm
Copy link
Copy Markdown
Collaborator

miladm commented Dec 5, 2024

is this PR ready for review @rpsilva-aws?

@rpsilva-aws
Copy link
Copy Markdown
Collaborator Author

@miladm Yes, but there seem to be some issues with the test infra - I need to validate with the TPU docker. I can ping once done/fixed.

@tengyifei
Copy link
Copy Markdown
Collaborator

Can just ping me again when the "Build and test" has passed

@rpsilva-aws
Copy link
Copy Markdown
Collaborator Author

Unfortunately, part of this PR has a dependency on pytorch/pytorch#142160.

Instead, creating 2 PRs:

@rpsilva-aws rpsilva-aws closed this Dec 6, 2024
@rpsilva-aws rpsilva-aws deleted the rpsilva_downcast branch December 9, 2024 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants