Vector optimization for adding AES encryption and decryption to the loongarch64 architecture #19364

Closed
zhuchen1911 wants to merge 2 commits into openssl:master from zhuchen1911:Loongarch64
@zhuchen1911
Contributor

Checklist
  • documentation is added or updated
  • tests are added or updated

@github-actions bot added the "severity: fips change" label (The pull request changes FIPS provider sources) Oct 8, 2022
@paulidale added the "branch: master" (Applies to master branch), "approval: review pending" (This pull request needs review by a committer), and "triaged: feature" (The issue/pr requests/adds a feature) labels Oct 9, 2022
#ifndef OSSL_CRYPTO_LOONGARCH_ARCH_H
#define OSSL_CRYPTO_LOONGARCH_ARCH_H

//extern int have_lasx;
Member

Please use regular C style comments.

Contributor Author

Thank you very much for your review. This comment has been removed. If there are other problems in the code, please point them out and I will fix them as required.

Loongarch64 architecture defines 128 bit vector extension lsx and 256 bit
vector extension lasx. The cpucfg instruction can be used to obtain whether
the CPU has a corresponding extension. This part of code is added to prepare
for the subsequent addition of corresponding vector instruction optimization.

Signed-off-by: zhuchen <zhuchen@loongson.cn>
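
A minimal sketch of the detection described in this commit message, not the code from the PR. It assumes that LSX and LASX are reported in bits 6 and 7 of CPUCFG word 2, which is my reading of the LoongArch reference manual. The names `CFG2_LSX`, `CFG2_LASX`, `read_cpucfg2`, `has_lsx` and `has_lasx` are hypothetical; the PR itself uses `OPENSSL_loongarchcap_P` and `LOONGARCH_CFG2_LSX`. On targets other than LoongArch the reader returns 0, so the sketch compiles anywhere.

```c
#include <stdint.h>

/* Assumed CPUCFG word 2 feature bits (hypothetical names). */
#define CFG2_LSX  (1u << 6)   /* 128-bit vector extension */
#define CFG2_LASX (1u << 7)   /* 256-bit vector extension */

/* Read CPUCFG word 2; returns 0 when not built for LoongArch. */
static inline uint32_t read_cpucfg2(void)
{
#if defined(__loongarch__)
    uint32_t val;

    /* cpucfg rd, rj: rj holds the index of the configuration word. */
    __asm__ volatile("cpucfg %0, %1" : "=r"(val) : "r"(2));
    return val;
#else
    return 0;
#endif
}

/* Test the capability bits of a previously read configuration word. */
static inline int has_lsx(uint32_t cfg2)
{
    return (cfg2 & CFG2_LSX) != 0;
}

static inline int has_lasx(uint32_t cfg2)
{
    return (cfg2 & CFG2_LASX) != 0;
}
```

A caller would read the word once at start-up, cache it, and select the vector code path only when `has_lsx()` reports true for the cached value.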
Comment on lines +161 to +166
#if defined(__loongarch__) || defined(__loongarch64)
#include "loongarch_arch.h"
# if defined(VPAES_ASM)
# define VPAES_CAPABLE (OPENSSL_loongarchcap_P & LOONGARCH_CFG2_LSX)
# endif
#endif
Member

@t8m t8m Oct 10, 2022

Please make the preprocessor directive indentations right. I.e., add one more space in front of everything, except the # define, and add two spaces in #include.

Contributor Author

Thank you very much for your review. The preprocessor directive indentation has been corrected.
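
The indentation rule discussed in this thread can be illustrated as follows. As I understand the OpenSSL coding style, each nested preprocessor level adds one space between the `#` and the directive name. This is a sketch only: every `DEMO_` name is a hypothetical stand-in so that the fragment compiles on any platform, and the nesting depth shown starts at the top level, which may differ from the depth at the reviewed location in the real header.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the architecture, assembler and capability macros. */
#define DEMO_ARCH_LOONGARCH 1
#define DEMO_VPAES_ASM 1
#define DEMO_CFG2_LSX (1u << 6)

/* Hypothetical capability word; the PR uses OPENSSL_loongarchcap_P. */
static uint32_t demo_cap = DEMO_CFG2_LSX;

/* One extra space after '#' for every level of nesting. */
#if defined(DEMO_ARCH_LOONGARCH)
# if defined(DEMO_VPAES_ASM)
#  define DEMO_VPAES_CAPABLE (demo_cap & DEMO_CFG2_LSX)
# endif
#endif

/* Fallback so callers can always test the macro. */
#ifndef DEMO_VPAES_CAPABLE
# define DEMO_VPAES_CAPABLE 0
#endif
```

Because the macro expands to a run-time bit test, the vector path is chosen per machine rather than per build.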

Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture
to AES. The test result on the 3A5000 improves performance by about 40%~50%.

Signed-off-by: zhuchen <zhuchen@loongson.cn>
Member

@t8m t8m left a comment

@paulidale still OK?

@paulidale
Contributor

Still good.

@paulidale added the "approval: done" label (This pull request has the required number of approvals) and removed the "approval: review pending" label (This pull request needs review by a committer) Oct 11, 2022
@openssl-machine
Collaborator

24 hours have passed since 'approval: done' was set, but this PR has failing CI tests. Once the tests pass it will be moved to 'approval: ready to merge' automatically; alternatively, please review and set the label manually.

@paulidale
Contributor

Merged, thanks.

@paulidale paulidale closed this Oct 12, 2022
openssl-machine pushed a commit that referenced this pull request Oct 12, 2022
Loongarch64 architecture defines 128 bit vector extension lsx and 256 bit
vector extension lasx. The cpucfg instruction can be used to obtain whether
the CPU has a corresponding extension. This part of code is added to prepare
for the subsequent addition of corresponding vector instruction optimization.

Signed-off-by: zhuchen <zhuchen@loongson.cn>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from #19364)
openssl-machine pushed a commit that referenced this pull request Oct 12, 2022
Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture
to AES. The test result on the 3A5000 improves performance by about 40%~50%.

Signed-off-by: zhuchen <zhuchen@loongson.cn>

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from #19364)
@cheungxi

Merge to the openssl-3.0 branch?

@paulidale
Contributor

No, our stability policy prevents performance enhancements on stable branches.

It would be a possibility for 3.1 however.

t8m pushed a commit to t8m/openssl that referenced this pull request Nov 21, 2022
(cherry picked from commit 7f2d618)
t8m pushed a commit to t8m/openssl that referenced this pull request Nov 21, 2022
(cherry picked from commit ef91754)
openssl-machine pushed a commit that referenced this pull request Nov 23, 2022
(cherry picked from commit 7f2d618)
openssl-machine pushed a commit that referenced this pull request Nov 23, 2022
(cherry picked from commit ef91754)
beldmit pushed a commit to beldmit/openssl that referenced this pull request Dec 26, 2022
beldmit pushed a commit to beldmit/openssl that referenced this pull request Dec 26, 2022
@xen0n
xen0n commented Jul 1, 2023

This breaks with upstream toolchains that do NOT yet have LSX/LASX support (in fact, support was merged only about a week ago!). We have to probe for assembler support and disable the asm if support is absent.
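
One way to sketch the probe suggested here is to feed a single vector instruction to the assembler through the compiler driver and check the exit status. This is an illustration under stated assumptions, not the fix that OpenSSL adopted: `assembler_accepts` is a hypothetical helper, the driver is assumed to accept `-x assembler -c -o /dev/null -` (as GCC and Clang do), a POSIX shell is assumed, and `vadd.b $vr0, $vr0, $vr0` is used as a representative LSX instruction.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the compiler driver `cc` assembles the
 * single instruction `insn` without error, 0 otherwise (including when the
 * driver is missing or the command line would not fit in the buffer).
 */
static int assembler_accepts(const char *cc, const char *insn)
{
    char cmd[512];
    int n;

    /* Refuse input that would break out of the single-quoted shell strings. */
    if (strchr(cc, '\'') != NULL || strchr(insn, '\'') != NULL)
        return 0;

    /* Pipe the instruction to the assembler; discard output and diagnostics. */
    n = snprintf(cmd, sizeof(cmd),
                 "printf '%%s\\n' '%s' | '%s' -x assembler -c -o /dev/null - "
                 "2>/dev/null",
                 insn, cc);
    if (n < 0 || (size_t)n >= sizeof(cmd))
        return 0;

    /* The pipeline's exit status is that of the compiler driver. */
    return system(cmd) == 0;
}
```

A configuration step could call `assembler_accepts("cc", "vadd.b $vr0, $vr0, $vr0")` and leave the LSX assembly out of the build when it returns 0.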

Labels

  • approval: done (This pull request has the required number of approvals)
  • branch: master (Applies to master branch)
  • severity: fips change (The pull request changes FIPS provider sources)
  • triaged: feature (The issue/pr requests/adds a feature)

6 participants