Add optimized crc32 for Power 8+ processors #750

Open
mmatti-sw wants to merge 3 commits into madler:master from mmatti-sw:crc32-power

@mmatti-sw

This pull request includes all Power8 optimisations, rebased onto v1.2.13.
The reference PR is #478.

Manjunath S Matti and others added 3 commits November 16, 2022 04:40
Optimized functions for Power will make use of GNU indirect functions,
an extension to support different implementations of the same function,
which can be selected during runtime. This will be used to provide
optimized functions for different processor versions.

Since this is a GNU extension, we placed the definition of the Z_IFUNC
macro under `contrib/gcc`. This can be reused by other archs as well.

Author: Matheus Castanho <msc@linux.ibm.com>
Author: Rogerio Alves <rcardoso@linux.ibm.com>
Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com>

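To illustrate the mechanism this commit message describes, here is a minimal sketch of a GNU indirect function. It is not the patch's `Z_IFUNC` macro: every name in it (`demo_sum`, `demo_sum_generic`, `demo_sum_fast`, `demo_sum_resolver`, `cpu_has_fast_path`) is hypothetical, and the feature probe is a placeholder that always picks the generic version. The sketch also assumes a GCC-compatible compiler on an ELF/glibc target and falls back to an ordinary wrapper elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical and are not taken from the patch. */
typedef uint32_t demo_sum_fn(const unsigned char *buf, size_t len);

/* Portable baseline: add the bytes one at a time. */
static uint32_t demo_sum_generic(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    while (len--)
        sum += *buf++;
    return sum;
}

/* Stand-in for a processor-specific version: same result, unrolled by four. */
static uint32_t demo_sum_fast(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    for (; len >= 4; len -= 4, buf += 4)
        sum += (uint32_t)buf[0] + buf[1] + buf[2] + buf[3];
    while (len--)
        sum += *buf++;
    return sum;
}

/* Placeholder probe (assumption): real code would inspect the hardware
 * capabilities of the running processor instead of returning a constant. */
static int cpu_has_fast_path(void)
{
    return 0;
}

#if defined(__GNUC__) && defined(__ELF__) && defined(__GLIBC__)
/* The dynamic loader calls the resolver once and binds demo_sum to
 * whichever implementation it returns. */
static demo_sum_fn *demo_sum_resolver(void)
{
    return cpu_has_fast_path() ? demo_sum_fast : demo_sum_generic;
}

uint32_t demo_sum(const unsigned char *buf, size_t len)
    __attribute__((ifunc("demo_sum_resolver")));
#else
/* Fallback without the GNU extension: choose on every call. */
uint32_t demo_sum(const unsigned char *buf, size_t len)
{
    return cpu_has_fast_path() ? demo_sum_fast(buf, len)
                               : demo_sum_generic(buf, len);
}
#endif
```

Callers simply call `demo_sum(buf, len)`. With the extension, the selection cost is paid once at load time rather than on each call, which is presumably why the patch prefers it over a function-pointer check.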
This commit adds an optimized version for the crc32 function based
on crc32-vpmsum from https://github.com/antonblanchard/crc32-vpmsum/

This is the C implementation created by Rogerio Alves
<rogealve@br.ibm.com>

It makes use of vector instructions to speed up the CRC32 algorithm.

Author: Rogerio Alves <rcardoso@linux.ibm.com>
Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com>

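Any vectorized CRC32 has to return the same values as the scalar algorithm it replaces. As a point of comparison, a minimal bitwise reference could look like the sketch below. The name `crc32_reference` is hypothetical and not part of the patch, and the vpmsum-based algorithm itself is not reproduced here. The sketch assumes the same calling convention as zlib's `crc32()`: pass 0 as the starting value and feed the previous result back in to continue over more data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: bit-at-a-time CRC-32 using the
 * reflected polynomial 0xEDB88320. Slow, but easy to check by eye. */
uint32_t crc32_reference(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;                      /* undo the final inversion of the previous call */
    while (len--) {
        crc ^= *buf++;               /* fold the next byte into the low 8 bits */
        for (int k = 0; k < 8; k++)  /* shift out one bit at a time */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;                     /* final inversion */
}
```

For the ASCII string `123456789` this yields the standard CRC-32 check value `0xCBF43926`, so an optimized routine can be compared against it on random buffers of varying length and alignment.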
Clang 7 changed the behavior of vec_xxpermdi in order to match GCC's
behavior.  After this change, code that used to work on Clang 6 stopped
working on Clang >= 7.

Tested on Clang 6, 7, 8 and 9.

Reference: https://bugs.llvm.org/show_bug.cgi?id=38192

Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

@Neustradamus

@madler: Can you look?

@ljavorsk

I tried to apply your patch on top of the patch from #410, and some hunks were rejected (configure.rej, Makefile.in.rej; I can send you the output if you want).

The previous patch (#478) was rebased on top of that patch. Could you please preserve this order?

@nmoinvaz
Contributor

The patches for Power are also maintained and incorporated in zlib-ng if anybody is interested.

@ljavorsk

Okay, @iii-i has rebased his patch on top of yours and provided it to me.

I would like to agree on the order in which you'll have them applied, so I don't need to change it too often. Is that okay with you?

@mmatti-sw
Author

I am ok with any order you or @iii-i would like to follow.

@iii-i
Contributor

iii-i commented Nov 29, 2022

I'd prefer POWER patches to go first, since they provide a foundation for adding optimized CRC32 implementations.

@ljavorsk

Okay, I agree with that. Thank you

@ljavorsk

Hi, could you please rebase your patches on top of the zlib-1.3 release?

@Neustradamus

@ljavorsk

Hi, sorry @mmatti-sw for the inconvenience. As of Fedora 40 we have switched to zlib-ng, so we no longer plan to rebase the zlib patches.

This means that you can fully focus on the zlib-ng PRs from now on.

@fneddy mentioned this pull request on Feb 25, 2025