Vector optimization for adding AES encryption and decryption to the loongarch64 architecture#19364
zhuchen1911 wants to merge 2 commits into openssl:master from zhuchen1911:Loongarch64
crypto/loongarch_arch.h
#ifndef OSSL_CRYPTO_LOONGARCH_ARCH_H
#define OSSL_CRYPTO_LOONGARCH_ARCH_H

//extern int have_lasx;
Please use regular C style comments.
Thank you very much for your review. This comment has been removed. If there are other problems in the code, please point them out and I will fix them as you require.
Loongarch64 architecture defines 128 bit vector extension lsx and 256 bit vector extension lasx. The cpucfg instruction can be used to obtain whether the CPU has a corresponding extension. This part of code is added to prepare for the subsequent addition of corresponding vector instruction optimization. Signed-off-by: zhuchen <zhuchen@loongson.cn>
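The detection described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper names `demo_read_cpucfg`, `demo_has_lsx` and `demo_has_lasx` are hypothetical, and `LOONGARCH_CFG2_LASX` is assumed by analogy with the `LOONGARCH_CFG2_LSX` name used in the patch. The bit positions (6 for LSX, 7 for LASX in CPUCFG word 2) follow the LoongArch reference manual.

```c
#include <stdint.h>

/* Feature bits in CPUCFG word 2 (LoongArch reference manual). */
#define LOONGARCH_CFG2_LSX  (1u << 6)   /* 128-bit vector extension */
#define LOONGARCH_CFG2_LASX (1u << 7)   /* 256-bit vector extension (name assumed) */

/*
 * Hypothetical helper: read one CPUCFG configuration word.
 * On non-LoongArch targets it returns 0 so the sketch still compiles.
 */
static uint32_t demo_read_cpucfg(uint32_t word)
{
#if defined(__loongarch__) || defined(__loongarch64)
    uint32_t value;

    __asm__ volatile("cpucfg %0, %1" : "=r"(value) : "r"(word));
    return value;
#else
    (void)word;
    return 0;
#endif
}

/* Decode the capability bits from a CPUCFG word 2 value. */
static int demo_has_lsx(uint32_t cfg2)
{
    return (cfg2 & LOONGARCH_CFG2_LSX) != 0;
}

static int demo_has_lasx(uint32_t cfg2)
{
    return (cfg2 & LOONGARCH_CFG2_LASX) != 0;
}
```

A caller would typically read word 2 once at startup, for example `uint32_t cap = demo_read_cpucfg(2);`, cache it in a global, and test it with `demo_has_lsx(cap)` before selecting a vector code path.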
include/crypto/aes_platform.h
#if defined(__loongarch__) || defined(__loongarch64)
#include "loongarch_arch.h"
# if defined(VPAES_ASM)
# define VPAES_CAPABLE (OPENSSL_loongarchcap_P & LOONGARCH_CFG2_LSX)
# endif
#endif
Please make the preprocessor directive indentations right. I.e., add one more space in front of everything, except the # define, and add two spaces in #include.
Thank you very much for your review. The preprocessor directive indentation has been corrected.
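For illustration, the requested layout might look like the sketch below, assuming the block sits one level inside an enclosing `#if` and that each nesting level adds one space after the `#`. The outer guard `DEMO_OUTER_GUARD`, the stand-in definitions, and the helper `demo_vpaes_usable` are hypothetical and exist only so the fragment compiles outside the OpenSSL tree; the `#include` is shown as a comment because the header is not available here.

```c
/* Hypothetical stand-ins so the fragment compiles on its own. */
#define DEMO_OUTER_GUARD 1              /* plays the role of the enclosing #if */
#define VPAES_ASM 1
#define LOONGARCH_CFG2_LSX (1u << 6)
static unsigned int OPENSSL_loongarchcap_P = 0;

#if DEMO_OUTER_GUARD
# if defined(__loongarch__) || defined(__loongarch64)
/* #  include "loongarch_arch.h"   <- two spaces after '#' at this depth */
#  if defined(VPAES_ASM)
#   define VPAES_CAPABLE (OPENSSL_loongarchcap_P & LOONGARCH_CFG2_LSX)
#  endif
# endif
#endif

/* Returns non-zero when the vector AES path may be used. */
static int demo_vpaes_usable(void)
{
#ifdef VPAES_CAPABLE
    return VPAES_CAPABLE != 0;
#else
    return 0;
#endif
}
```

On a non-LoongArch build `VPAES_CAPABLE` is never defined, so `demo_vpaes_usable()` falls through to the generic path.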
Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture to AES. The test result on the 3A5000 improves performance by about 40%~50%. Signed-off-by: zhuchen <zhuchen@loongson.cn>
Still good.
24 hours has passed since 'approval: done' was set, but this PR has failing CI tests. Once the tests pass it will get moved to 'approval: ready to merge' automatically, alternatively please review and set the label manually.
Merged, thanks.
Loongarch64 architecture defines 128 bit vector extension lsx and 256 bit vector extension lasx. The cpucfg instruction can be used to obtain whether the CPU has a corresponding extension. This part of code is added to prepare for the subsequent addition of corresponding vector instruction optimization. Signed-off-by: zhuchen <zhuchen@loongson.cn> Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from #19364)
Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture to AES. The test result on the 3A5000 improves performance by about 40%~50%. Signed-off-by: zhuchen <zhuchen@loongson.cn> Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from #19364)
merge to
No, our stability policy prevents performance enhancements on stable branches. It would be a possibility for 3.1, however.
Loongarch64 architecture defines 128 bit vector extension lsx and 256 bit vector extension lasx. The cpucfg instruction can be used to obtain whether the CPU has a corresponding extension. This part of code is added to prepare for the subsequent addition of corresponding vector instruction optimization. Signed-off-by: zhuchen <zhuchen@loongson.cn> Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from openssl#19364) (cherry picked from commit 7f2d618)
Add 128 bit lsx vector expansion optimization code of Loongarch64 architecture to AES. The test result on the 3A5000 improves performance by about 40%~50%. Signed-off-by: zhuchen <zhuchen@loongson.cn> Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from openssl#19364) (cherry picked from commit ef91754)
This breaks with upstream toolchains that do NOT yet have LSX/LASX support (in fact, support was merged only about a week ago!). We have to probe for assembler support and disable the asm if support is absent.
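One way such a probe could work is to feed a single LSX instruction to the toolchain and check the exit status. The sketch below is only an illustration of that idea, not what OpenSSL's configuration scripts do: the function name `demo_assembler_accepts` is hypothetical, it assumes a POSIX shell and a GCC-compatible driver that accepts `-x assembler -` on standard input, and the instruction text must not contain single quotes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical probe: returns 1 if the compiler driver `cc` assembles the
 * single instruction `insn` successfully, 0 otherwise.
 * Assumes a POSIX shell; `insn` must not contain single quotes.
 */
static int demo_assembler_accepts(const char *cc, const char *insn)
{
    char cmd[512];
    int n, status;

    /* Pipe the instruction into the driver and discard all output. */
    n = snprintf(cmd, sizeof(cmd),
                 "printf '%%s\\n' '%s' | %s -c -x assembler -o /dev/null - "
                 ">/dev/null 2>&1",
                 insn, cc);
    if (n < 0 || (size_t)n >= sizeof(cmd))
        return 0;                       /* command too long: treat as failure */

    status = system(cmd);
    return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A build script could call something like `demo_assembler_accepts("cc", "vxor.v $vr0, $vr0, $vr0")` and leave `VPAES_ASM` undefined when it returns 0, so that the generic C implementation is used instead.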