Skip to content

Provide an inline asm fallback for the ARMv8 intrinsics#1697

Merged
Dead2 merged 4 commits intozlib-ng:developfrom
ccawley2011:armv8-inline-asm
Feb 12, 2025
Merged

Provide an inline asm fallback for the ARMv8 intrinsics#1697
Dead2 merged 4 commits intozlib-ng:developfrom
ccawley2011:armv8-inline-asm

Conversation

@ccawley2011
Copy link
Copy Markdown
Contributor

@ccawley2011 ccawley2011 commented Feb 29, 2024

The configure checks have also been modified to better match the ARMv6 code.

Summary by CodeRabbit

  • New Features

    • Introduced support for ARMv8 architecture, including new CRC32 intrinsic functions.
    • Updated build configurations to reflect ARMv8 support across various files.
  • Bug Fixes

    • Renamed and updated function pointers and test cases to ensure correct CRC32 implementations are being validated.
  • Documentation

    • Updated README.md to reflect changes in CRC32 implementation and build options.
  • Chores

    • Renamed configuration flags and variables to align with ARMv8 support throughout build scripts and Makefiles.

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 29, 2024

Codecov Report

Attention: Patch coverage is 50.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 80.38%. Comparing base (8cebc9c) to head (369b43a).
Report is 14 commits behind head on develop.

Files with missing lines Patch % Lines
test/test_crc32.cc 0.00% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1697      +/-   ##
===========================================
+ Coverage    80.34%   80.38%   +0.04%     
===========================================
  Files          138      139       +1     
  Lines        11147    11176      +29     
  Branches      2860     2867       +7     
===========================================
+ Hits          8956     8984      +28     
- Misses        1217     1223       +6     
+ Partials       974      969       -5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Mar 2, 2024

Renaming ACLE to just ARMV8 is misleading... ARMv8 is a lot more than just ACLE CRC instructions.

@nmoinvaz nmoinvaz added Build Env Architecture Architecture specific labels Mar 6, 2024
@nmoinvaz
Copy link
Copy Markdown
Member

nmoinvaz commented Mar 6, 2024

I'm not familar enough with ARM, why rename ACLE to ARMv8?

@ccawley2011
Copy link
Copy Markdown
Contributor Author

I'm not familar enough with ARM, why rename ACLE to ARMv8?

ACLE is the name of the header for non-NEON intrinsics rather than a specific CPU extension. It's used for both ARMv8 CRC32 instructions and ARMv6 SIMD instructions, however the build system for zlib-ng only uses ARM_ACLE to refer to the ARMv8 code. The ARMv6 code uses a separate option.

@mtl1979
Copy link
Copy Markdown
Collaborator

mtl1979 commented Mar 10, 2024

We can't just rename command-line options for configure or CMake build stage, as those have to be backwards compatible. Otherwise we break build for every project that use zlib-ng. Adding support for inline asm fallback should be separated as own pull request, as it is less controversial, even though we have no precedence for supporting mismatched toolchains, which is clearly the case here as GNU assembler would be newer than GNU C compiler (GCC).

@Dead2
Copy link
Copy Markdown
Member

Dead2 commented Apr 3, 2024

Renaming should be split into a separate PR from the added features/fixes.

@Un1q32
Copy link
Copy Markdown
Contributor

Un1q32 commented Apr 24, 2024

The improved acle/armv8 check fixes a false positive when building for 32 bit arm iOS, I ripped out the improved check and put it in https://github.com/Un1q32/zlib-ng/tree/acle, I would make a PR with just that but I'll wait to see if the author of this PR finishes the main PR

@Un1q32
Copy link
Copy Markdown
Contributor

Un1q32 commented May 17, 2024

No activity here for a while, gonna make a PR for the bit I ripped out

@Un1q32 Un1q32 mentioned this pull request May 17, 2024
@nmoinvaz
Copy link
Copy Markdown
Member

We can't just rename command-line options for configure or CMake build stage, as those have to be backwards compatible.

Backwards compatibility can be provided like so:

if(DEFINED WITH_ACLE)
  set(WITH_ARMV8 ${WITH_ACLE})
endif()

If the new names are more accurate, I'd prefer that.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 21, 2024

Walkthrough

This pull request introduces a comprehensive update to the zlib-ng project's ARM architecture support, focusing on transitioning from ARM C Language Extensions (ACLE) to ARMv8 CRC32 implementation. The changes span multiple files across the project, including workflow configurations, CMake scripts, source files, and build configurations. The primary goal is to replace ACLE-specific implementations with ARMv8-optimized versions, updating preprocessor directives, function names, and build flags to reflect this architectural shift.

Changes

File Change Summary
.github/workflows/cmake.yml, .github/workflows/configure.yml Renamed jobs and updated CMake/configure arguments from ACLE to ARMv8
CMakeLists.txt Replaced WITH_ACLE option with WITH_ARMV8, updated feature detection
README.md Updated CRC32-B implementation description and build options
arch/arm/* Renamed files, updated function names, and modified CRC32 implementation
cmake/detect-intrinsics.cmake Renamed macros and variables from ACLE to ARMv8
functable.c, test/* Updated function pointers and test cases to use crc32_armv8
win32/Makefile.* Modified build configurations to support ARMv8

Suggested reviewers

  • nmoinvaz
  • mtl1979
  • Dead2

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
arch/arm/crc32_armv8.c (1)

Line range hint 16-24: Ensure partial reads don’t cause out-of-bounds access.
Because of alignment checks and partial consumption of bytes, carefully verify that the pointer increments do not exceed buffer bounds for corner cases like len=2 or len=3.

🧹 Nitpick comments (6)
arch/arm/acle_intrins.h (2)

18-19: Ensure consistent naming for preprocessor directives.
Currently, the code uses both "ARM_CRC32" and "ARM_CRC32_INTRIN" to guard CRC-related logic. Consider ensuring naming consistency to reduce confusion, for example by adopting a single preprocessor directive (e.g., ARM_CRC32) throughout.


44-48: Consider code duplication between aarch64 and non-aarch64 code paths.
The functions __crc32b, __crc32h, __crc32w differ primarily by the inline assembly instruction names. Refactoring these into macros could reduce duplication and ease maintainability.

arch/arm/Makefile.in (1)

25-25: Confirm correctness of conditionally building ARMv8 objects.
Ensure your build system does not inadvertently build the ARMv8 object files on architectures that lack hardware CRC instructions. This can be enforced by retaining or adding relevant checks in the top-level configure or CMake.

arch/arm/arm_functions.h (1)

58-61: Check backward compatibility.
Removing references to the previous “crc32_acle” might break older build scripts or code expecting that symbol. Consider preserving the old name or providing a migration note for external downstream projects.

cmake/detect-intrinsics.cmake (1)

Line range hint 4-54: LGTM! The ARMv8 detection implementation is thorough.

The macro has been successfully renamed and enhanced to provide comprehensive ARMv8 support with proper fallbacks. The implementation:

  • Checks both CRC and SIMD support
  • Verifies compiler support for both -march and -Wa,-march flags
  • Separates inline assembly and intrinsics checks

Consider adding comments to document the purpose of each check block for better maintainability.

configure (1)

1124-1174: Approve with suggestions: ARMv8 feature detection is comprehensive

The implementation thoroughly checks for ARMv8 support through multiple compiler flags and includes both inline assembly and intrinsics checks. However, consider these improvements:

  1. Add comments explaining the fallback strategy between different march flags
  2. Add error handling or warning when no compiler flags work
 check_armv8_intrinsics() {
+    # Try compiler flags in order of preference:
+    # 1. -march=armv8-a+crc (preferred, most compilers)
+    # 2. -march=armv8-a+crc+simd (some compilers require explicit SIMD)
+    # 3. -Wa,-march=armv8-a+crc (assembler-level support)
+    # 4. -Wa,-march=armv8-a+crc+simd (assembler-level with SIMD)
     cat > $test.c << EOF
 int main() { return 0; }
 EOF
     if try $CC -c $CFLAGS -march=armv8-a+crc $test.c; then
         echo "Check whether -march=armv8-a+crc works ... Yes." | tee -a configure.log
         armv8flag="-march=armv8-a+crc"
     else
         echo "Check whether -march=armv8-a+crc works ... No." | tee -a configure.log
         if try $CC -c $CFLAGS -march=armv8-a+crc+simd $test.c; then
             echo "Check whether -march=armv8-a+crc+simd works ... Yes." | tee -a configure.log
             armv8flag="-march=armv8-a+crc+simd"
         else
             echo "Check whether -march=armv8-a+crc+simd works ... No." | tee -a configure.log
             if try $CC -c $CFLAGS -Wa,-march=armv8-a+crc $test.c; then
                 echo "Check whether -Wa,-march=armv8-a+crc works ... Yes." | tee -a configure.log
                 armv8flag="-Wa,-march=armv8-a+crc"
             else
                 echo "Check whether -Wa,-march=armv8-a+crc works ... No." | tee -a configure.log
                 if try $CC -c $CFLAGS -Wa,-march=armv8-a+crc+simd $test.c; then
                     echo "Check whether -Wa,-march=armv8-a+crc+simd works ... Yes." | tee -a configure.log
                     armv8flag="-Wa,-march=armv8-a+crc+simd"
                 else
                     echo "Check whether -Wa,-march=armv8-a+crc+simd works ... No." | tee -a configure.log
+                    echo "Warning: No ARMv8 compiler flags worked. ARMv8 optimizations will be disabled." | tee -a configure.log
                 fi
             fi
         fi
     fi
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04d1b75 and 7cef706.

📒 Files selected for processing (17)
  • .github/workflows/cmake.yml (2 hunks)
  • .github/workflows/configure.yml (2 hunks)
  • CMakeLists.txt (4 hunks)
  • README.md (2 hunks)
  • arch/arm/Makefile.in (3 hunks)
  • arch/arm/acle_intrins.h (1 hunks)
  • arch/arm/arm_features.c (1 hunks)
  • arch/arm/arm_functions.h (2 hunks)
  • arch/arm/crc32_armv8.c (1 hunks)
  • cmake/detect-intrinsics.cmake (3 hunks)
  • configure (10 hunks)
  • functable.c (1 hunks)
  • test/CMakeLists.txt (1 hunks)
  • test/benchmarks/benchmark_crc32.cc (1 hunks)
  • test/test_crc32.cc (1 hunks)
  • win32/Makefile.a64 (1 hunks)
  • win32/Makefile.arm (2 hunks)
🔇 Additional comments (24)
arch/arm/acle_intrins.h (3)

38-42: Inline assembly usage for 64-bit operands on AArch64
The "crc32x" mnemonic is used correctly for 64-bit registers in AArch64. Just ensure that each time "crc32x" is utilized, the macro expansions and compilation flags are aligned with the targeted CPU to avoid illegal instruction errors on CPUs lacking the CRC extension.


62-63: Confirm correctness of the fallback for 64-bit calculations.
The fallback re-uses __crc32w twice to handle the two halves of a 64-bit operand. This approach is correct, but ensure that 32-bit alignment is guaranteed and that there's no unintended overhead if 64-bit instructions are available.


20-24: Validate inline assembly constraints for correctness.
It looks correct to reference general-purpose registers for input/output, but inline assembly constraints should carefully handle any potential register clobbers (e.g., the condition flags). Confirm that no additional clobber list (“cc”) is needed.

✅ Verification successful

No clobber list needed for CRC32 instructions

The CRC32 instructions used in the inline assembly (crc32b, crc32h, crc32w, crc32x) are data processing instructions that do not modify the condition flags (NZCV) according to the ARM Architecture Reference Manual. The current constraints using "=r" for output and "r" for inputs are sufficient, as:

  • These instructions only compute CRC and write to the destination register
  • They don't modify any processor state flags
  • No temporary registers are clobbered during execution

Looking at the other similar CRC32 implementations in the codebase (configure, cmake files), they follow the same constraint pattern consistently.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Checking for inline assembly usage from within the code:
rg -A 5 '__asm__ __volatile__' 

Length of output: 4146

arch/arm/crc32_armv8.c (2)

Line range hint 26-49: Check that leftover data is processed correctly.
When the code transitions among 8-byte, 4-byte, 2-byte, and 1-byte reads, ensure no data is missed or doubly processed. Given len is decremented in each step, it seems correct, but further tests might be warranted.


12-14: Validate usage of Z_REGISTER macros and pointer alignment checks.
While these macros and pointer alignment shifts often help performance, verify that your compiler or environment does indeed honor these optimizations.

✅ Verification successful

Let me analyze the findings and generate additional verification scripts to check the compiler optimizations and alignment aspects.


Based on the comprehensive analysis of the code, I can now provide the final response:

The Z_REGISTER and alignment optimizations are properly implemented and safe to use. The code shows a well-structured implementation with:

  • Z_REGISTER is properly defined in zbuild.h as either register or empty based on compiler support
  • Alignment checks are handled correctly with proper fallbacks for unaligned access
  • The implementation includes proper architecture-specific optimizations (ARM CRC32 intrinsics)
  • The code includes proper boundary checks and handles different data sizes appropriately

The implementation follows best practices for ARM architecture:

  • Uses ARM-specific CRC32 intrinsics when available
  • Handles alignment requirements properly with progressive width increases (8-bit → 16-bit → 32-bit → 64-bit)
  • Falls back gracefully when alignment requirements aren't met
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Checking references to Z_REGISTER macros:
rg -A 5 'Z_REGISTER'

Length of output: 3103


Script:

#!/bin/bash
# Check for alignment and optimization related macros/checks
rg -A 5 'UNALIGNED|ALIGNED|Z_TARGET_CRC'

# Check for architecture-specific configurations
cat arch/arm/crc32_armv8.c

Length of output: 20488

arch/arm/Makefile.in (2)

53-57: Verify parallel or incremental builds.
Check that any parallel building behavior is unaffected by the new ARMV8 object targets. Some build systems can get confused by changed target names if not properly declared as dependencies or prerequisites.


11-11: Rename Makefile variable usage consistently.
Now that ACLEFLAG is replaced with ARMV8FLAG, ensure that references to ACLEFLAG are removed entirely throughout the repository, including documentation.

arch/arm/arm_functions.h (1)

22-23: Validate the condition for ARM_CRC32 availability.
If you define ARM_CRC32 in one place and rely on __ARM_FEATURE_CRC32 in others, ensure they line up consistently in various compilers.

test/benchmarks/benchmark_crc32.cc (1)

65-66: LGTM! Consistent renaming from ACLE to ARMv8

The benchmark renaming aligns with the PR's objective to transition from ACLE to ARMv8 terminology while maintaining the same functionality.

win32/Makefile.a64 (1)

97-102: LGTM! Verify build configuration consistency

The changes to WFLAGS and OBJS align with the transition from ACLE to ARMv8 terminology and implementation.

✅ Verification successful

Build configuration is consistent across build systems

The verification confirms:

  • No remaining references to crc32_acle in any build files
  • ARM_CRC32 flag is consistently defined in CMakeLists.txt and Windows build files
  • crc32_armv8 is properly integrated across all build systems:
    • CMake build system (CMakeLists.txt)
    • Windows builds (Makefile.a64, Makefile.arm)
    • Unix/autotools build (arch/arm/Makefile.in)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify build configuration consistency across different build systems

# Check for any remaining references to crc32_acle in build files
rg "crc32_acle" -g "CMakeLists.txt" -g "Makefile*"

# Check for consistent usage of ARM_CRC32 flag
rg "ARM_CRC32" -g "CMakeLists.txt" -g "Makefile*"

# Verify that all build systems include the new crc32_armv8 object
rg "crc32_armv8" -g "CMakeLists.txt" -g "Makefile*"

Length of output: 1037

win32/Makefile.arm (2)

43-43: LGTM: Configuration flag follows existing pattern.

The addition of WITH_ARMV8 flag follows the same pattern as other architecture-specific flags like WITH_NEON and WITH_ARMV6.


101-103: LGTM: Build configuration properly handles ARMv8 support.

The conditional block correctly:

  1. Adds ARM_CRC32 flag when WITH_ARMV8 is defined
  2. Includes crc32_armv8.obj in the build
test/CMakeLists.txt (1)

194-194: LGTM: Test comment updated to reflect ARMv8 terminology.

The comment update from crc32_acle() to crc32_armv8() maintains consistency with the architectural changes.

functable.c (1)

186-190: Verify the impact of ARM_CRC32 macro replacement.

The changes correctly:

  1. Replace ARM_ACLE with ARM_CRC32 macro
  2. Use proper feature detection via cf.arm.has_crc32
  3. Assign crc32_armv8 function pointer

Let's verify the consistency of this change across the codebase:

✅ Verification successful

The ARM_CRC32 macro replacement is consistent across the codebase

The verification shows:

  1. ARM_ACLE references only remain in build configuration files (configure, CMakeLists.txt) and header guards
  2. ARM_CRC32 is properly defined in all relevant build files
  3. crc32_acle references only exist in ABI test files which are historical snapshots
  4. crc32_armv8 implementation and usage is consistent across the codebase
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the transition from ARM_ACLE to ARM_CRC32 and related function changes

# Check for any remaining ARM_ACLE references
echo "Checking for remaining ARM_ACLE references..."
rg "ARM_ACLE"

# Check for ARM_CRC32 usage consistency
echo "Checking ARM_CRC32 usage..."
rg "ARM_CRC32"

# Check for crc32_acle references
echo "Checking for remaining crc32_acle references..."
rg "crc32_acle"

# Check for crc32_armv8 implementation
echo "Checking crc32_armv8 implementation..."
rg "crc32_armv8"

Length of output: 4038

.github/workflows/configure.yml (1)

57-62: LGTM! Verify impact on existing build configurations.

The job renamings and configure argument updates from --without-acle to --without-armv8 are consistent with the PR objectives.

Run the following script to verify the impact on existing build configurations:

Also applies to: 85-90

✅ Verification successful

Build configurations are consistent across workflows

The verification shows that:

  • No remaining references to "ACLE" exist in the workflow files
  • ARMv8-related configurations are consistently used across both configure.yml and cmake.yml
  • The naming convention and flags are aligned:
    • Configure workflow uses --without-armv8
    • CMake workflow uses -DWITH_ARMV8=OFF
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any remaining references to ACLE in build configurations
# and verify the new ARMv8 configurations.

# Check for any remaining ACLE references in workflow files
echo "Checking for remaining ACLE references in workflow files..."
rg -i "acle" .github/workflows/

# Check for consistency in ARMv8 related configurations
echo "Checking ARMv8 related configurations..."
rg -i "armv8" .github/workflows/

Length of output: 1209

test/test_crc32.cc (1)

215-216: LGTM! Test case updated correctly.

The test case has been properly updated to use the new ARMv8 nomenclature while maintaining the same functionality and CPU feature check.

README.md (1)

23-23: LGTM! Documentation updates are accurate and consistent.

The changes to both the features section and build options table accurately reflect the transition from ACLE to ARMv8 while maintaining consistency with the documentation style.

Also applies to: 207-207

cmake/detect-intrinsics.cmake (1)

28-42: LGTM! The ARM architecture handling is robust.

The implementation correctly:

  • Handles both 32-bit and 64-bit ARM architectures
  • Uses appropriate assembly syntax for each architecture (%w0 for aarch64 vs %0 for arm32)
  • Properly checks for intrinsics support using standard headers
.github/workflows/cmake.yml (1)

170-175: LGTM! CI configuration updated correctly.

The workflow changes properly reflect the transition from ACLE to ARMv8:

  • Job names updated appropriately
  • Build flags changed from WITH_ACLE=OFF to WITH_ARMV8=OFF
  • CodeCov identifiers updated to maintain coverage tracking

Also applies to: 197-201

CMakeLists.txt (2)

106-106: LGTM! Build option updated appropriately.

The WITH_ARMV8 option correctly replaces the previous ACLE option while maintaining the same functionality.


1388-1388: LGTM! Feature information updated correctly.

The feature information properly reflects the new ARMv8 support in the build system.

configure (3)

95-95: LGTM: Variable initialization follows established pattern

The buildarmv8 flag is correctly initialized with other architecture flags, maintaining consistency with the codebase's configuration system.


1996-1996: LGTM: Build system integration is consistent

The ARMv8 flag is correctly integrated into both the configure log and architecture-specific Makefile, following the same pattern as other architecture-specific flags.

Also applies to: 2137-2137


1779-1792: Approve with verification: ARMv8 feature enablement looks good

The implementation correctly enables ARMv8 features based on available support. However, let's verify the existence of the referenced object files.

✅ Verification successful

ARMv8 CRC32 implementation is properly integrated

The verification confirms that:

  • The source file arch/arm/crc32_armv8.c exists and will generate the required object files
  • The implementation is properly integrated into the function table system via functable.c
  • The function is correctly declared in arm_functions.h and can be set as the native implementation
  • Architecture detection is handled in both cmake/detect-arch.c and arch/arm/arm_features.c
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that the ARMv8 object files exist in the source tree

# Check for the source file that generates crc32_armv8.o
echo "Checking for ARMv8 implementation file..."
fd "crc32_armv8.c" --type f

# Check for references to ARMv8 in other architecture files
echo "Checking for other ARMv8 references..."
rg "armv8" --type c

Length of output: 803

Comment on lines +55 to 58
#elif defined(ARM_NOCHECK_CRC32)
return 1;
#else
return 0;
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot Dec 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Issue found: ARM_NOCHECK_CRC32 is used but not defined

The verification reveals that while ARM_NOCHECK_CRC32 is used in arm_features.c, it's not defined anywhere in the codebase. Other ARM-related NOCHECK macros (like ARM_NOCHECK_SIMD and ARM_NOCHECK_NEON) exist, but ARM_NOCHECK_CRC32 is missing its definition. This could lead to unexpected behavior.

  • arch/arm/arm_features.c: Uses ARM_NOCHECK_CRC32 in conditional compilation
  • The codebase uses ARM_CRC32 for feature detection, but the corresponding NOCHECK macro is not defined
🔗 Analysis chain

LGTM! Verify the macro definition consistency

The change from ARM_NOCHECK_ACLE to ARM_NOCHECK_CRC32 is consistent with the PR's objectives.


Let me gather more information about the macro definitions in the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify that ARM_NOCHECK_CRC32 is defined consistently across the codebase
# and that there are no lingering references to ARM_NOCHECK_ACLE

# Check for any remaining references to ARM_NOCHECK_ACLE
rg "ARM_NOCHECK_ACLE"

# Check where ARM_NOCHECK_CRC32 is defined/used
rg "ARM_NOCHECK_CRC32"

Length of output: 101


Script:

#!/bin/bash
# Check for any macro definitions or configurations related to ARM CRC32
rg -A 3 "#define.*ARM.*CRC32|#ifdef.*ARM.*CRC32|#ifndef.*ARM.*CRC32"

# Check build configuration files for these macros
fd "CMakeLists.txt|configure|Makefile|*.cmake" --exec rg "ARM.*CRC32|NOCHECK"

# Check header files for macro definitions
fd -e h --exec rg "ARM.*CRC32|NOCHECK"

Length of output: 1976

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks intentional.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nmoinvaz, understood. Let me know if there’s anything else you’d like me to check or clarify.

@nmoinvaz
Copy link
Copy Markdown
Member

This PR looks pretty complete. I like the naming changes as they are more accurate.

| WITH_PCLMULQDQ | | Build with PCLMULQDQ intrinsics | ON |
| WITH_VPCLMULQDQ | --without-vpclmulqdq | Build with VPCLMULQDQ intrinsics | ON |
| WITH_ACLE | --without-acle | Build with ACLE intrinsics | ON |
| WITH_ARMV8 | --without-armv8 | Build with ARMv8 intrinsics | ON |
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: One space too much in description field.

@Dead2
Copy link
Copy Markdown
Member

Dead2 commented Dec 23, 2024

We can't just rename command-line options for configure or CMake build stage, as those have to be backwards compatible.

Backwards compatibility can be provided like so:

if(DEFINED WITH_ACLE)
  set(WITH_ARMV8 ${WITH_ACLE})
endif()

If the new names are more accurate, I'd prefer that.

@ccawley2011 Could you add the requested backwards compatibility for CMake and Configure? No need to document the obsolete parameter, just enable the correct setting and print a warning that the old-name parameter has been deprecated and will be removed in the future.

@Dead2 Dead2 added the Rebase needed Please do a 'git rebase develop yourbranch' label Dec 23, 2024
@ccawley2011
Copy link
Copy Markdown
Contributor Author

@ccawley2011 Could you add the requested backwards compatibility for CMake and Configure? No need to document the obsolete parameter, just enable the correct setting and print a warning that the old-name parameter has been deprecated and will be removed in the future.

OK, done.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
CMakeLists.txt (1)

719-734: Inline ASM Fallback and Feature Check for ARMv8 Intrinsics
The new block checking for ARMv8 support is implemented cleanly. It calls check_armv8_compiler_flag() and conditionally adds ARM_CRC32 definitions and source files if inline assembly or intrinsics are available. One suggestion is to add a status message (or warning) in the fallback paths (both in the “else()” after the if(HAVE_ARMV8_INLINE_ASM OR HAVE_ARMV8_INTRIN) and in the outer else block) to facilitate troubleshooting when the feature is disabled due to missing compiler support. For example:

+  else()
+   message(STATUS "ARMv8 intrinsics not supported by the compiler; disabling WITH_ARMV8 fallback")
+   set(WITH_ARMV8 OFF)
+  endif()

This would make diagnosing build issues easier.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7cef706 and f0a93c5.

📒 Files selected for processing (17)
  • .github/workflows/cmake.yml (2 hunks)
  • .github/workflows/configure.yml (2 hunks)
  • CMakeLists.txt (4 hunks)
  • README.md (2 hunks)
  • arch/arm/Makefile.in (3 hunks)
  • arch/arm/acle_intrins.h (1 hunks)
  • arch/arm/arm_features.c (1 hunks)
  • arch/arm/arm_functions.h (2 hunks)
  • arch/arm/crc32_armv8.c (1 hunks)
  • cmake/detect-intrinsics.cmake (3 hunks)
  • configure (10 hunks)
  • functable.c (1 hunks)
  • test/CMakeLists.txt (1 hunks)
  • test/benchmarks/benchmark_crc32.cc (1 hunks)
  • test/test_crc32.cc (3 hunks)
  • win32/Makefile.a64 (1 hunks)
  • win32/Makefile.arm (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • README.md
  • functable.c
  • arch/arm/arm_features.c
  • test/CMakeLists.txt
  • test/benchmarks/benchmark_crc32.cc
  • .github/workflows/configure.yml
  • arch/arm/Makefile.in
  • win32/Makefile.a64
  • .github/workflows/cmake.yml
🧰 Additional context used
📓 Learnings (1)
test/test_crc32.cc (1)
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:0-0
Timestamp: 2024-12-22T20:40:03.280Z
Learning: In the crc32_align test class (test/test_crc32.cc), the alignment offset parameter is always guaranteed to be non-negative because the test parameters are controlled and it never makes sense to have a negative offset.
⏰ Context from checks skipped due to timeout of 90000ms (55)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS Clang ASAN (ARM64)
  • GitHub Check: Windows ClangCl Win64
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: Windows MSVC 2019 v140 Win64
  • GitHub Check: Ubuntu Clang Debug
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
🔇 Additional comments (14)
arch/arm/crc32_armv8.c (2)

1-8: LGTM! The file header and preprocessor directive changes look good.

The transition from ARM_ACLE to ARM_CRC32 better reflects the actual functionality being used.


12-75: LGTM! The CRC32 implementation is well-optimized.

The function efficiently handles:

  • Different input sizes with specialized processing
  • Memory alignment for optimal performance
  • Proper use of ARMv8 CRC32 intrinsics
arch/arm/arm_functions.h (2)

22-24: LGTM! Function declaration is consistent with implementation.

The declaration properly reflects the renamed function and updated preprocessor directive.


59-62: LGTM! Native function mapping is correctly updated.

The native CRC32 function is properly mapped to the new crc32_armv8 implementation.

arch/arm/acle_intrins.h (3)

11-17: LGTM! Preprocessor directives and target definitions are well-structured.

The code properly handles different architectures and compiler configurations.


18-42: LGTM! AArch64 inline assembly implementations are correct.

The implementations:

  • Use proper crc32b, crc32h, crc32w, and crc32x instructions
  • Have correct register constraints
  • Include volatile keyword for proper optimization

43-64: LGTM! Non-AArch64 implementations handle CRC32 correctly.

The implementations:

  • Use appropriate assembly syntax for 32-bit ARM
  • Correctly implement 64-bit CRC32 using two 32-bit operations
win32/Makefile.arm (2)

43-43: LGTM! New flag definition is properly added.

The WITH_ARMV8 flag is correctly defined for conditional compilation.


101-104: LGTM! Conditional compilation section is correctly updated.

The section properly:

  • Checks for WITH_ARMV8
  • Sets appropriate flags
  • Includes the correct object file
test/test_crc32.cc (1)

200-250: LGTM! Test case renames align with ARMv8 transition.

The test case renames from ACLE to ARMv8 are consistent with the broader architectural transition while maintaining the same test coverage and functionality.

cmake/detect-intrinsics.cmake (1)

4-57: LGTM! Comprehensive ARMv8 support detection.

The implementation provides thorough checks for ARMv8 support through:

  1. Multiple compiler flag variants
  2. Both intrinsics and inline assembly support
  3. Proper handling of 32-bit and 64-bit ARM architectures
configure (2)

95-96: LGTM! Clean transition with backward compatibility.

The changes properly handle the ACLE to ARMv8 transition:

  1. Renames buildacle to buildarmv8
  2. Maintains backward compatibility by supporting --without-acle
  3. Adds clear comments indicating backward compatibility support

Also applies to: 201-203


1126-1192: LGTM! Robust ARMv8 detection implementation.

The implementation provides comprehensive ARMv8 support detection:

  1. Checks for various compiler flags
  2. Tests both intrinsics and inline assembly support
  3. Properly handles different ARM architectures
CMakeLists.txt (1)

106-112: Backward Compatibility for ARMv8 Option
The newly added option for WITH_ARMV8 is well integrated, and the block that checks if WITH_ACLE is defined to assign its value to WITH_ARMV8 correctly maintains backward compatibility. Ensure that any legacy builds relying on WITH_ACLE behave as expected.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
CMakeLists.txt (2)

105-112: Backward Compatibility for ARMv8 Option:
The changes introduce a new option WITH_ARMV8 (line 106) for enabling ARMv8 CRC32 intrinsics. The backward compatibility block (lines 109–112) resets WITH_ARMV8 to the value of the legacy WITH_ACLE if it is defined. This ensures that users of the old option aren’t immediately broken.
Consider adding an explicit deprecation notice or comment indicating that WITH_ACLE is now deprecated, so that users can gradually migrate to WITH_ARMV8.


719-728: Inline ASM Fallback for ARMv8 Intrinsics:
This block (lines 719–728) calls check_armv8_compiler_flag() and, if either inline assembly or intrinsic support is detected, defines -DARM_CRC32 and sets up the associated source file (crc32_armv8.c) with the proper compile flags. In the fallback branch (lines 731–735), WITH_ARMV8 is disabled when the required support isn’t found.
The implementation is clear and functionally sound. For improved diagnostics, consider adding a verbose message or comment when falling back (i.e. when WITH_ARMV8 is set to OFF) to aid future debugging.

Also applies to: 731-735

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0a93c5 and 369b43a.

📒 Files selected for processing (17)
  • .github/workflows/cmake.yml (2 hunks)
  • .github/workflows/configure.yml (2 hunks)
  • CMakeLists.txt (4 hunks)
  • README.md (2 hunks)
  • arch/arm/Makefile.in (3 hunks)
  • arch/arm/acle_intrins.h (1 hunks)
  • arch/arm/arm_features.c (1 hunks)
  • arch/arm/arm_functions.h (2 hunks)
  • arch/arm/crc32_armv8.c (1 hunks)
  • cmake/detect-intrinsics.cmake (3 hunks)
  • configure (9 hunks)
  • functable.c (1 hunks)
  • test/CMakeLists.txt (1 hunks)
  • test/benchmarks/benchmark_crc32.cc (1 hunks)
  • test/test_crc32.cc (3 hunks)
  • win32/Makefile.a64 (1 hunks)
  • win32/Makefile.arm (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
  • README.md
  • test/CMakeLists.txt
  • win32/Makefile.arm
  • test/benchmarks/benchmark_crc32.cc
  • win32/Makefile.a64
  • functable.c
  • arch/arm/Makefile.in
  • .github/workflows/configure.yml
  • test/test_crc32.cc
  • arch/arm/arm_features.c
  • arch/arm/crc32_armv8.c
  • .github/workflows/cmake.yml
⏰ Context from checks skipped due to timeout of 90000ms (61)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC (ARM64)
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC (ARM64)
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC (ARM64)
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC (ARM64)
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC (ARM64)
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC (ARM64)
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
  • GitHub Check: Windows ClangCl Win32
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC Symbol Prefix & Compat
  • GitHub Check: macOS GCC Symbol Prefix (ARM64)
🔇 Additional comments (11)
arch/arm/arm_functions.h (1)

22-24: LGTM! The function declaration changes look good.

The renaming from crc32_acle to crc32_armv8 and the updated conditional compilation directive from ARM_ACLE to ARM_CRC32 are consistent with the PR's goal to transition from ACLE to ARMv8.

arch/arm/acle_intrins.h (4)

11-17: LGTM! The conditional compilation setup is correct.

The code properly sets up the target flags for CRC support based on the architecture.


18-42: LGTM! The AArch64 inline assembly implementations are correct.

The code provides efficient inline assembly implementations for CRC32 operations on AArch64:

  • __crc32b for byte operations
  • __crc32h for halfword operations
  • __crc32w for word operations
  • __crc32d for doubleword operations

44-64: LGTM! The non-AArch64 inline assembly implementations are correct.

The code provides equivalent inline assembly implementations for 32-bit ARM architectures.


62-64: LGTM! The non-AArch64 64-bit operation is efficiently implemented.

The __crc32d implementation for non-AArch64 cleverly uses two __crc32w calls to handle 64-bit input.

cmake/detect-intrinsics.cmake (3)

4-57: LGTM! The ARMv8 compiler flag detection is comprehensive.

The macro correctly checks for various ARMv8 compiler flags in order of preference:

  1. -march=armv8-a+crc
  2. -march=armv8-a+crc+simd
  3. -Wa,-march=armv8-a+crc
  4. -Wa,-march=armv8-a+crc+simd

156-175: LGTM! The ARMv8 inline assembly check is thorough.

The code properly verifies inline assembly support for both AArch64 and non-AArch64 architectures.


177-191: LGTM! The ARMv8 intrinsics check is well-implemented.

The code verifies compiler support for ARMv8 intrinsics using a simple test case.

configure (2)

95-95: LGTM! The backward compatibility handling is well-implemented.

The code maintains backward compatibility by:

  1. Adding buildarmv8 variable
  2. Updating help text to document the new --without-armv8 option
  3. Supporting the old --without-acle option for backward compatibility

Also applies to: 169-169, 201-203


1126-1192: LGTM! The ARMv8 intrinsics detection is thorough.

The code properly checks for:

  1. Compiler support for various ARMv8 flags
  2. ARMv8 inline assembly support
  3. ARMv8 intrinsics support
CMakeLists.txt (1)

1402-1405: Feature Information Update for ARMv8:
The addition of the feature info entry for WITH_ARMV8 (line 1402) clearly communicates that ARMv8 CRC32 intrinsics support is enabled. This aligns well with the new option and the fallback implementation.
Ensure that the documentation and any user-visible build messages mention this new flag and its behavior (including fallback conditions) for clarity.

@Dead2 Dead2 removed the Rebase needed Please do a 'git rebase develop yourbranch' label Feb 11, 2025
Copy link
Copy Markdown
Member

@Dead2 Dead2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Dead2 Dead2 merged commit aecfe16 into zlib-ng:develop Feb 12, 2025
@ccawley2011 ccawley2011 deleted the armv8-inline-asm branch February 12, 2025 17:05
@Dead2 Dead2 mentioned this pull request Nov 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants