Provide an inline asm fallback for the ARMv8 intrinsics#1697
Provide an inline asm fallback for the ARMv8 intrinsics#1697Dead2 merged 4 commits intozlib-ng:developfrom
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #1697 +/- ##
===========================================
+ Coverage 80.34% 80.38% +0.04%
===========================================
Files 138 139 +1
Lines 11147 11176 +29
Branches 2860 2867 +7
===========================================
+ Hits 8956 8984 +28
- Misses 1217 1223 +6
+ Partials 974 969 -5 ☔ View full report in Codecov by Sentry. |
|
Renaming ACLE to just ARMV8 is misleading... ARMv8 is a lot more than just ACLE CRC instructions. |
|
I'm not familar enough with ARM, why rename ACLE to ARMv8? |
ACLE is the name of the header for non-NEON intrinsics rather than a specific CPU extension. It's used for both ARMv8 CRC32 instructions and ARMv6 SIMD instructions, however the build system for zlib-ng only uses ARM_ACLE to refer to the ARMv8 code. The ARMv6 code uses a separate option. |
|
We can't just rename command-line options for |
|
Renaming should be split into a separate PR from the added features/fixes. |
|
The improved acle/armv8 check fixes a false positive when building for 32 bit arm iOS, I ripped out the improved check and put it in https://github.com/Un1q32/zlib-ng/tree/acle, I would make a PR with just that but I'll wait to see if the author of this PR finishes the main PR |
|
No activity here for a while, gonna make a PR for the bit I ripped out |
Backwards compatibility can be provided like so: if(DEFINED WITH_ACLE)
set(WITH_ARMV8 ${WITH_ACLE})
endif()If the new names are more accurate, I'd prefer that. |
cc24587 to
7cef706
Compare
WalkthroughThis pull request introduces a comprehensive update to the zlib-ng project's ARM architecture support, focusing on transitioning from ARM C Language Extensions (ACLE) to ARMv8 CRC32 implementation. The changes span multiple files across the project, including workflow configurations, CMake scripts, source files, and build configurations. The primary goal is to replace ACLE-specific implementations with ARMv8-optimized versions, updating preprocessor directives, function names, and build flags to reflect this architectural shift. Changes
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
arch/arm/crc32_armv8.c (1)
Line range hint
16-24: Ensure partial reads don’t cause out-of-bounds access.
Because of alignment checks and partial consumption of bytes, carefully verify that the pointer increments do not exceed buffer bounds for corner cases like len=2 or len=3.
🧹 Nitpick comments (6)
arch/arm/acle_intrins.h (2)
18-19: Ensure consistent naming for preprocessor directives.
Currently, the code uses both "ARM_CRC32" and "ARM_CRC32_INTRIN" to guard CRC-related logic. Consider ensuring naming consistency to reduce confusion, for example by adopting a single preprocessor directive (e.g., ARM_CRC32) throughout.
44-48: Consider code duplication between aarch64 and non-aarch64 code paths.
The functions __crc32b, __crc32h, __crc32w differ primarily by the inline assembly instruction names. Refactoring these into macros could reduce duplication and ease maintainability.arch/arm/Makefile.in (1)
25-25: Confirm correctness of conditionally building ARMv8 objects.
Ensure your build system does not inadvertently build the ARMv8 object files on architectures that lack hardware CRC instructions. This can be enforced by retaining or adding relevant checks in the top-level configure or CMake.arch/arm/arm_functions.h (1)
58-61: Check backward compatibility.
Removing references to the previous “crc32_acle” might break older build scripts or code expecting that symbol. Consider preserving the old name or providing a migration note for external downstream projects.cmake/detect-intrinsics.cmake (1)
Line range hint
4-54: LGTM! The ARMv8 detection implementation is thorough.The macro has been successfully renamed and enhanced to provide comprehensive ARMv8 support with proper fallbacks. The implementation:
- Checks both CRC and SIMD support
- Verifies compiler support for both
-marchand-Wa,-marchflags- Separates inline assembly and intrinsics checks
Consider adding comments to document the purpose of each check block for better maintainability.
configure (1)
1124-1174: Approve with suggestions: ARMv8 feature detection is comprehensiveThe implementation thoroughly checks for ARMv8 support through multiple compiler flags and includes both inline assembly and intrinsics checks. However, consider these improvements:
- Add comments explaining the fallback strategy between different march flags
- Add error handling or warning when no compiler flags work
check_armv8_intrinsics() { + # Try compiler flags in order of preference: + # 1. -march=armv8-a+crc (preferred, most compilers) + # 2. -march=armv8-a+crc+simd (some compilers require explicit SIMD) + # 3. -Wa,-march=armv8-a+crc (assembler-level support) + # 4. -Wa,-march=armv8-a+crc+simd (assembler-level with SIMD) cat > $test.c << EOF int main() { return 0; } EOF if try $CC -c $CFLAGS -march=armv8-a+crc $test.c; then echo "Check whether -march=armv8-a+crc works ... Yes." | tee -a configure.log armv8flag="-march=armv8-a+crc" else echo "Check whether -march=armv8-a+crc works ... No." | tee -a configure.log if try $CC -c $CFLAGS -march=armv8-a+crc+simd $test.c; then echo "Check whether -march=armv8-a+crc+simd works ... Yes." | tee -a configure.log armv8flag="-march=armv8-a+crc+simd" else echo "Check whether -march=armv8-a+crc+simd works ... No." | tee -a configure.log if try $CC -c $CFLAGS -Wa,-march=armv8-a+crc $test.c; then echo "Check whether -Wa,-march=armv8-a+crc works ... Yes." | tee -a configure.log armv8flag="-Wa,-march=armv8-a+crc" else echo "Check whether -Wa,-march=armv8-a+crc works ... No." | tee -a configure.log if try $CC -c $CFLAGS -Wa,-march=armv8-a+crc+simd $test.c; then echo "Check whether -Wa,-march=armv8-a+crc+simd works ... Yes." | tee -a configure.log armv8flag="-Wa,-march=armv8-a+crc+simd" else echo "Check whether -Wa,-march=armv8-a+crc+simd works ... No." | tee -a configure.log + echo "Warning: No ARMv8 compiler flags worked. ARMv8 optimizations will be disabled." | tee -a configure.log fi fi fi fi
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
.github/workflows/cmake.yml(2 hunks).github/workflows/configure.yml(2 hunks)CMakeLists.txt(4 hunks)README.md(2 hunks)arch/arm/Makefile.in(3 hunks)arch/arm/acle_intrins.h(1 hunks)arch/arm/arm_features.c(1 hunks)arch/arm/arm_functions.h(2 hunks)arch/arm/crc32_armv8.c(1 hunks)cmake/detect-intrinsics.cmake(3 hunks)configure(10 hunks)functable.c(1 hunks)test/CMakeLists.txt(1 hunks)test/benchmarks/benchmark_crc32.cc(1 hunks)test/test_crc32.cc(1 hunks)win32/Makefile.a64(1 hunks)win32/Makefile.arm(2 hunks)
🔇 Additional comments (24)
arch/arm/acle_intrins.h (3)
38-42: Inline assembly usage for 64-bit operands on AArch64
The "crc32x" mnemonic is used correctly for 64-bit registers in AArch64. Just ensure that each time "crc32x" is utilized, the macro expansions and compilation flags are aligned with the targeted CPU to avoid illegal instruction errors on CPUs lacking the CRC extension.
62-63: Confirm correctness of the fallback for 64-bit calculations.
The fallback re-uses __crc32w twice to handle the two halves of a 64-bit operand. This approach is correct, but ensure that 32-bit alignment is guaranteed and that there's no unintended overhead if 64-bit instructions are available.
20-24: Validate inline assembly constraints for correctness.
It looks correct to reference general-purpose registers for input/output, but inline assembly constraints should carefully handle any potential register clobbers (e.g., the condition flags). Confirm that no additional clobber list (“cc”) is needed.
✅ Verification successful
No clobber list needed for CRC32 instructions
The CRC32 instructions used in the inline assembly (crc32b, crc32h, crc32w, crc32x) are data processing instructions that do not modify the condition flags (NZCV) according to the ARM Architecture Reference Manual. The current constraints using "=r" for output and "r" for inputs are sufficient, as:
- These instructions only compute CRC and write to the destination register
- They don't modify any processor state flags
- No temporary registers are clobbered during execution
Looking at the other similar CRC32 implementations in the codebase (configure, cmake files), they follow the same constraint pattern consistently.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Checking for inline assembly usage from within the code:
rg -A 5 '__asm__ __volatile__'
Length of output: 4146
arch/arm/crc32_armv8.c (2)
Line range hint 26-49: Check that leftover data is processed correctly.
When the code transitions among 8-byte, 4-byte, 2-byte, and 1-byte reads, ensure no data is missed or doubly processed. Given len is decremented in each step, it seems correct, but further tests might be warranted.
12-14: Validate usage of Z_REGISTER macros and pointer alignment checks.
While these macros and pointer alignment shifts often help performance, verify that your compiler or environment does indeed honor these optimizations.
✅ Verification successful
Let me analyze the findings and generate additional verification scripts to check the compiler optimizations and alignment aspects.
Based on the comprehensive analysis of the code, I can now provide the final response:
The Z_REGISTER and alignment optimizations are properly implemented and safe to use. The code shows a well-structured implementation with:
- Z_REGISTER is properly defined in zbuild.h as either
registeror empty based on compiler support - Alignment checks are handled correctly with proper fallbacks for unaligned access
- The implementation includes proper architecture-specific optimizations (ARM CRC32 intrinsics)
- The code includes proper boundary checks and handles different data sizes appropriately
The implementation follows best practices for ARM architecture:
- Uses ARM-specific CRC32 intrinsics when available
- Handles alignment requirements properly with progressive width increases (8-bit → 16-bit → 32-bit → 64-bit)
- Falls back gracefully when alignment requirements aren't met
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Checking references to Z_REGISTER macros:
rg -A 5 'Z_REGISTER'
Length of output: 3103
Script:
#!/bin/bash
# Check for alignment and optimization related macros/checks
rg -A 5 'UNALIGNED|ALIGNED|Z_TARGET_CRC'
# Check for architecture-specific configurations
cat arch/arm/crc32_armv8.c
Length of output: 20488
arch/arm/Makefile.in (2)
53-57: Verify parallel or incremental builds.
Check that any parallel building behavior is unaffected by the new ARMV8 object targets. Some build systems can get confused by changed target names if not properly declared as dependencies or prerequisites.
11-11: Rename Makefile variable usage consistently.
Now that ACLEFLAG is replaced with ARMV8FLAG, ensure that references to ACLEFLAG are removed entirely throughout the repository, including documentation.
arch/arm/arm_functions.h (1)
22-23: Validate the condition for ARM_CRC32 availability.
If you define ARM_CRC32 in one place and rely on __ARM_FEATURE_CRC32 in others, ensure they line up consistently in various compilers.
test/benchmarks/benchmark_crc32.cc (1)
65-66: LGTM! Consistent renaming from ACLE to ARMv8
The benchmark renaming aligns with the PR's objective to transition from ACLE to ARMv8 terminology while maintaining the same functionality.
win32/Makefile.a64 (1)
97-102: LGTM! Verify build configuration consistency
The changes to WFLAGS and OBJS align with the transition from ACLE to ARMv8 terminology and implementation.
✅ Verification successful
Build configuration is consistent across build systems
The verification confirms:
- No remaining references to
crc32_aclein any build files ARM_CRC32flag is consistently defined in CMakeLists.txt and Windows build filescrc32_armv8is properly integrated across all build systems:- CMake build system (CMakeLists.txt)
- Windows builds (Makefile.a64, Makefile.arm)
- Unix/autotools build (arch/arm/Makefile.in)
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify build configuration consistency across different build systems
# Check for any remaining references to crc32_acle in build files
rg "crc32_acle" -g "CMakeLists.txt" -g "Makefile*"
# Check for consistent usage of ARM_CRC32 flag
rg "ARM_CRC32" -g "CMakeLists.txt" -g "Makefile*"
# Verify that all build systems include the new crc32_armv8 object
rg "crc32_armv8" -g "CMakeLists.txt" -g "Makefile*"
Length of output: 1037
win32/Makefile.arm (2)
43-43: LGTM: Configuration flag follows existing pattern.
The addition of WITH_ARMV8 flag follows the same pattern as other architecture-specific flags like WITH_NEON and WITH_ARMV6.
101-103: LGTM: Build configuration properly handles ARMv8 support.
The conditional block correctly:
- Adds ARM_CRC32 flag when WITH_ARMV8 is defined
- Includes crc32_armv8.obj in the build
test/CMakeLists.txt (1)
194-194: LGTM: Test comment updated to reflect ARMv8 terminology.
The comment update from crc32_acle() to crc32_armv8() maintains consistency with the architectural changes.
functable.c (1)
186-190: Verify the impact of ARM_CRC32 macro replacement.
The changes correctly:
- Replace ARM_ACLE with ARM_CRC32 macro
- Use proper feature detection via cf.arm.has_crc32
- Assign crc32_armv8 function pointer
Let's verify the consistency of this change across the codebase:
✅ Verification successful
The ARM_CRC32 macro replacement is consistent across the codebase
The verification shows:
- ARM_ACLE references only remain in build configuration files (configure, CMakeLists.txt) and header guards
- ARM_CRC32 is properly defined in all relevant build files
- crc32_acle references only exist in ABI test files which are historical snapshots
- crc32_armv8 implementation and usage is consistent across the codebase
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the transition from ARM_ACLE to ARM_CRC32 and related function changes
# Check for any remaining ARM_ACLE references
echo "Checking for remaining ARM_ACLE references..."
rg "ARM_ACLE"
# Check for ARM_CRC32 usage consistency
echo "Checking ARM_CRC32 usage..."
rg "ARM_CRC32"
# Check for crc32_acle references
echo "Checking for remaining crc32_acle references..."
rg "crc32_acle"
# Check for crc32_armv8 implementation
echo "Checking crc32_armv8 implementation..."
rg "crc32_armv8"
Length of output: 4038
.github/workflows/configure.yml (1)
57-62: LGTM! Verify impact on existing build configurations.
The job renamings and configure argument updates from --without-acle to --without-armv8 are consistent with the PR objectives.
Run the following script to verify the impact on existing build configurations:
Also applies to: 85-90
✅ Verification successful
Build configurations are consistent across workflows
The verification shows that:
- No remaining references to "ACLE" exist in the workflow files
- ARMv8-related configurations are consistently used across both
configure.ymlandcmake.yml - The naming convention and flags are aligned:
- Configure workflow uses
--without-armv8 - CMake workflow uses
-DWITH_ARMV8=OFF
- Configure workflow uses
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining references to ACLE in build configurations
# and verify the new ARMv8 configurations.
# Check for any remaining ACLE references in workflow files
echo "Checking for remaining ACLE references in workflow files..."
rg -i "acle" .github/workflows/
# Check for consistency in ARMv8 related configurations
echo "Checking ARMv8 related configurations..."
rg -i "armv8" .github/workflows/
Length of output: 1209
test/test_crc32.cc (1)
215-216: LGTM! Test case updated correctly.
The test case has been properly updated to use the new ARMv8 nomenclature while maintaining the same functionality and CPU feature check.
README.md (1)
23-23: LGTM! Documentation updates are accurate and consistent.
The changes to both the features section and build options table accurately reflect the transition from ACLE to ARMv8 while maintaining consistency with the documentation style.
Also applies to: 207-207
cmake/detect-intrinsics.cmake (1)
28-42: LGTM! The ARM architecture handling is robust.
The implementation correctly:
- Handles both 32-bit and 64-bit ARM architectures
- Uses appropriate assembly syntax for each architecture (
%w0for aarch64 vs%0for arm32) - Properly checks for intrinsics support using standard headers
.github/workflows/cmake.yml (1)
170-175: LGTM! CI configuration updated correctly.
The workflow changes properly reflect the transition from ACLE to ARMv8:
- Job names updated appropriately
- Build flags changed from
WITH_ACLE=OFFtoWITH_ARMV8=OFF - CodeCov identifiers updated to maintain coverage tracking
Also applies to: 197-201
CMakeLists.txt (2)
106-106: LGTM! Build option updated appropriately.
The WITH_ARMV8 option correctly replaces the previous ACLE option while maintaining the same functionality.
1388-1388: LGTM! Feature information updated correctly.
The feature information properly reflects the new ARMv8 support in the build system.
configure (3)
95-95: LGTM: Variable initialization follows established pattern
The buildarmv8 flag is correctly initialized with other architecture flags, maintaining consistency with the codebase's configuration system.
1996-1996: LGTM: Build system integration is consistent
The ARMv8 flag is correctly integrated into both the configure log and architecture-specific Makefile, following the same pattern as other architecture-specific flags.
Also applies to: 2137-2137
1779-1792: Approve with verification: ARMv8 feature enablement looks good
The implementation correctly enables ARMv8 features based on available support. However, let's verify the existence of the referenced object files.
✅ Verification successful
ARMv8 CRC32 implementation is properly integrated
The verification confirms that:
- The source file
arch/arm/crc32_armv8.cexists and will generate the required object files - The implementation is properly integrated into the function table system via
functable.c - The function is correctly declared in
arm_functions.hand can be set as the native implementation - Architecture detection is handled in both
cmake/detect-arch.candarch/arm/arm_features.c
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that the ARMv8 object files exist in the source tree
# Check for the source file that generates crc32_armv8.o
echo "Checking for ARMv8 implementation file..."
fd "crc32_armv8.c" --type f
# Check for references to ARMv8 in other architecture files
echo "Checking for other ARMv8 references..."
rg "armv8" --type c
Length of output: 803
| #elif defined(ARM_NOCHECK_CRC32) | ||
| return 1; | ||
| #else | ||
| return 0; |
There was a problem hiding this comment.
💡 Codebase verification
Issue found: ARM_NOCHECK_CRC32 is used but not defined
The verification reveals that while ARM_NOCHECK_CRC32 is used in arm_features.c, it's not defined anywhere in the codebase. Other ARM-related NOCHECK macros (like ARM_NOCHECK_SIMD and ARM_NOCHECK_NEON) exist, but ARM_NOCHECK_CRC32 is missing its definition. This could lead to unexpected behavior.
arch/arm/arm_features.c: UsesARM_NOCHECK_CRC32in conditional compilation- The codebase uses
ARM_CRC32for feature detection, but the corresponding NOCHECK macro is not defined
🔗 Analysis chain
LGTM! Verify the macro definition consistency
The change from ARM_NOCHECK_ACLE to ARM_NOCHECK_CRC32 is consistent with the PR's objectives.
Let me gather more information about the macro definitions in the codebase.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Verify that ARM_NOCHECK_CRC32 is defined consistently across the codebase
# and that there are no lingering references to ARM_NOCHECK_ACLE
# Check for any remaining references to ARM_NOCHECK_ACLE
rg "ARM_NOCHECK_ACLE"
# Check where ARM_NOCHECK_CRC32 is defined/used
rg "ARM_NOCHECK_CRC32"
Length of output: 101
Script:
#!/bin/bash
# Check for any macro definitions or configurations related to ARM CRC32
rg -A 3 "#define.*ARM.*CRC32|#ifdef.*ARM.*CRC32|#ifndef.*ARM.*CRC32"
# Check build configuration files for these macros
fd "CMakeLists.txt|configure|Makefile|*.cmake" --exec rg "ARM.*CRC32|NOCHECK"
# Check header files for macro definitions
fd -e h --exec rg "ARM.*CRC32|NOCHECK"
Length of output: 1976
There was a problem hiding this comment.
@nmoinvaz, understood. Let me know if there’s anything else you’d like me to check or clarify.
|
This PR looks pretty complete. I like the naming changes as they are more accurate. |
| | WITH_PCLMULQDQ | | Build with PCLMULQDQ intrinsics | ON | | ||
| | WITH_VPCLMULQDQ | --without-vpclmulqdq | Build with VPCLMULQDQ intrinsics | ON | | ||
| | WITH_ACLE | --without-acle | Build with ACLE intrinsics | ON | | ||
| | WITH_ARMV8 | --without-armv8 | Build with ARMv8 intrinsics | ON | |
There was a problem hiding this comment.
nitpick: One space too much in description field.
@ccawley2011 Could you add the requested backwards compatibility for CMake and Configure? No need to document the obsolete parameter, just enable the correct setting and print a warning that the old-name parameter has been deprecated and will be removed in the future. |
7cef706 to
f0a93c5
Compare
OK, done. |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
CMakeLists.txt (1)
719-734: Inline ASM Fallback and Feature Check for ARMv8 Intrinsics
The new block checking for ARMv8 support is implemented cleanly. It calls check_armv8_compiler_flag() and conditionally adds ARM_CRC32 definitions and source files if inline assembly or intrinsics are available. One suggestion is to add a status message (or warning) in the fallback paths (both in the “else()” after the if(HAVE_ARMV8_INLINE_ASM OR HAVE_ARMV8_INTRIN) and in the outer else block) to facilitate troubleshooting when the feature is disabled due to missing compiler support. For example:+ else()
+ message(STATUS "ARMv8 intrinsics not supported by the compiler; disabling WITH_ARMV8 fallback")
+ set(WITH_ARMV8 OFF)
+ endif()This would make diagnosing build issues easier.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
.github/workflows/cmake.yml(2 hunks).github/workflows/configure.yml(2 hunks)CMakeLists.txt(4 hunks)README.md(2 hunks)arch/arm/Makefile.in(3 hunks)arch/arm/acle_intrins.h(1 hunks)arch/arm/arm_features.c(1 hunks)arch/arm/arm_functions.h(2 hunks)arch/arm/crc32_armv8.c(1 hunks)cmake/detect-intrinsics.cmake(3 hunks)configure(10 hunks)functable.c(1 hunks)test/CMakeLists.txt(1 hunks)test/benchmarks/benchmark_crc32.cc(1 hunks)test/test_crc32.cc(3 hunks)win32/Makefile.a64(1 hunks)win32/Makefile.arm(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
- README.md
- functable.c
- arch/arm/arm_features.c
- test/CMakeLists.txt
- test/benchmarks/benchmark_crc32.cc
- .github/workflows/configure.yml
- arch/arm/Makefile.in
- win32/Makefile.a64
- .github/workflows/cmake.yml
🧰 Additional context used
📓 Learnings (1)
test/test_crc32.cc (1)
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:0-0
Timestamp: 2024-12-22T20:40:03.280Z
Learning: In the crc32_align test class (test/test_crc32.cc), the alignment offset parameter is always guaranteed to be non-negative because the test parameters are controlled and it never makes sense to have a negative offset.
⏰ Context from checks skipped due to timeout of 90000ms (55)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS Clang ASAN (ARM64)
- GitHub Check: Windows ClangCl Win64
- GitHub Check: Windows ClangCl Win32
- GitHub Check: Windows MSVC 2019 v140 Win64
- GitHub Check: Ubuntu Clang Debug
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
🔇 Additional comments (14)
arch/arm/crc32_armv8.c (2)
1-8: LGTM! The file header and preprocessor directive changes look good.The transition from
ARM_ACLEtoARM_CRC32better reflects the actual functionality being used.
12-75: LGTM! The CRC32 implementation is well-optimized.The function efficiently handles:
- Different input sizes with specialized processing
- Memory alignment for optimal performance
- Proper use of ARMv8 CRC32 intrinsics
arch/arm/arm_functions.h (2)
22-24: LGTM! Function declaration is consistent with implementation.The declaration properly reflects the renamed function and updated preprocessor directive.
59-62: LGTM! Native function mapping is correctly updated.The native CRC32 function is properly mapped to the new
crc32_armv8implementation.arch/arm/acle_intrins.h (3)
11-17: LGTM! Preprocessor directives and target definitions are well-structured.The code properly handles different architectures and compiler configurations.
18-42: LGTM! AArch64 inline assembly implementations are correct.The implementations:
- Use proper
crc32b,crc32h,crc32w, andcrc32xinstructions- Have correct register constraints
- Include volatile keyword for proper optimization
43-64: LGTM! Non-AArch64 implementations handle CRC32 correctly.The implementations:
- Use appropriate assembly syntax for 32-bit ARM
- Correctly implement 64-bit CRC32 using two 32-bit operations
win32/Makefile.arm (2)
43-43: LGTM! New flag definition is properly added.The
WITH_ARMV8flag is correctly defined for conditional compilation.
101-104: LGTM! Conditional compilation section is correctly updated.The section properly:
- Checks for
WITH_ARMV8- Sets appropriate flags
- Includes the correct object file
test/test_crc32.cc (1)
200-250: LGTM! Test case renames align with ARMv8 transition.The test case renames from ACLE to ARMv8 are consistent with the broader architectural transition while maintaining the same test coverage and functionality.
cmake/detect-intrinsics.cmake (1)
4-57: LGTM! Comprehensive ARMv8 support detection.The implementation provides thorough checks for ARMv8 support through:
- Multiple compiler flag variants
- Both intrinsics and inline assembly support
- Proper handling of 32-bit and 64-bit ARM architectures
configure (2)
95-96: LGTM! Clean transition with backward compatibility.The changes properly handle the ACLE to ARMv8 transition:
- Renames buildacle to buildarmv8
- Maintains backward compatibility by supporting --without-acle
- Adds clear comments indicating backward compatibility support
Also applies to: 201-203
1126-1192: LGTM! Robust ARMv8 detection implementation.The implementation provides comprehensive ARMv8 support detection:
- Checks for various compiler flags
- Tests both intrinsics and inline assembly support
- Properly handles different ARM architectures
CMakeLists.txt (1)
106-112: Backward Compatibility for ARMv8 Option
The newly added option for WITH_ARMV8 is well integrated, and the block that checks if WITH_ACLE is defined to assign its value to WITH_ARMV8 correctly maintains backward compatibility. Ensure that any legacy builds relying on WITH_ACLE behave as expected.
f0a93c5 to
369b43a
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
CMakeLists.txt (2)
105-112: Backward Compatibility for ARMv8 Option:
The changes introduce a new optionWITH_ARMV8(line 106) for enabling ARMv8 CRC32 intrinsics. The backward compatibility block (lines 109–112) resetsWITH_ARMV8to the value of the legacyWITH_ACLEif it is defined. This ensures that users of the old option aren’t immediately broken.
Consider adding an explicit deprecation notice or comment indicating thatWITH_ACLEis now deprecated, so that users can gradually migrate toWITH_ARMV8.
719-728: Inline ASM Fallback for ARMv8 Intrinsics:
This block (lines 719–728) callscheck_armv8_compiler_flag()and, if either inline assembly or intrinsic support is detected, defines-DARM_CRC32and sets up the associated source file (crc32_armv8.c) with the proper compile flags. In the fallback branch (lines 731–735),WITH_ARMV8is disabled when the required support isn’t found.
The implementation is clear and functionally sound. For improved diagnostics, consider adding a verbose message or comment when falling back (i.e. whenWITH_ARMV8is set to OFF) to aid future debugging.Also applies to: 731-735
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
.github/workflows/cmake.yml(2 hunks).github/workflows/configure.yml(2 hunks)CMakeLists.txt(4 hunks)README.md(2 hunks)arch/arm/Makefile.in(3 hunks)arch/arm/acle_intrins.h(1 hunks)arch/arm/arm_features.c(1 hunks)arch/arm/arm_functions.h(2 hunks)arch/arm/crc32_armv8.c(1 hunks)cmake/detect-intrinsics.cmake(3 hunks)configure(9 hunks)functable.c(1 hunks)test/CMakeLists.txt(1 hunks)test/benchmarks/benchmark_crc32.cc(1 hunks)test/test_crc32.cc(3 hunks)win32/Makefile.a64(1 hunks)win32/Makefile.arm(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (12)
- README.md
- test/CMakeLists.txt
- win32/Makefile.arm
- test/benchmarks/benchmark_crc32.cc
- win32/Makefile.a64
- functable.c
- arch/arm/Makefile.in
- .github/workflows/configure.yml
- test/test_crc32.cc
- arch/arm/arm_features.c
- arch/arm/crc32_armv8.c
- .github/workflows/cmake.yml
⏰ Context from checks skipped due to timeout of 90000ms (61)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC (ARM64)
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC (ARM64)
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC (ARM64)
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC (ARM64)
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC (ARM64)
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC (ARM64)
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
- GitHub Check: Windows ClangCl Win32
- GitHub Check: EL9 Clang S390X DFLTCC MSAN
- GitHub Check: macOS GCC
- GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
- GitHub Check: macOS GCC Symbol Prefix & Compat
- GitHub Check: macOS GCC Symbol Prefix (ARM64)
🔇 Additional comments (11)
arch/arm/arm_functions.h (1)
22-24: LGTM! The function declaration changes look good.The renaming from
crc32_acletocrc32_armv8and the updated conditional compilation directive fromARM_ACLEtoARM_CRC32are consistent with the PR's goal to transition from ACLE to ARMv8.arch/arm/acle_intrins.h (4)
11-17: LGTM! The conditional compilation setup is correct.The code properly sets up the target flags for CRC support based on the architecture.
18-42: LGTM! The AArch64 inline assembly implementations are correct.The code provides efficient inline assembly implementations for CRC32 operations on AArch64:
__crc32bfor byte operations__crc32hfor halfword operations__crc32wfor word operations__crc32dfor doubleword operations
44-64: LGTM! The non-AArch64 inline assembly implementations are correct.The code provides equivalent inline assembly implementations for 32-bit ARM architectures.
62-64: LGTM! The non-AArch64 64-bit operation is efficiently implemented.The
__crc32dimplementation for non-AArch64 cleverly uses two__crc32wcalls to handle 64-bit input.cmake/detect-intrinsics.cmake (3)
4-57: LGTM! The ARMv8 compiler flag detection is comprehensive.The macro correctly checks for various ARMv8 compiler flags in order of preference:
-march=armv8-a+crc-march=armv8-a+crc+simd-Wa,-march=armv8-a+crc-Wa,-march=armv8-a+crc+simd
156-175: LGTM! The ARMv8 inline assembly check is thorough.The code properly verifies inline assembly support for both AArch64 and non-AArch64 architectures.
177-191: LGTM! The ARMv8 intrinsics check is well-implemented.The code verifies compiler support for ARMv8 intrinsics using a simple test case.
configure (2)
95-95: LGTM! The backward compatibility handling is well-implemented.The code maintains backward compatibility by:
- Adding
buildarmv8variable- Updating help text to document the new
--without-armv8option- Supporting the old
--without-acleoption for backward compatibilityAlso applies to: 169-169, 201-203
1126-1192: LGTM! The ARMv8 intrinsics detection is thorough.The code properly checks for:
- Compiler support for various ARMv8 flags
- ARMv8 inline assembly support
- ARMv8 intrinsics support
CMakeLists.txt (1)
1402-1405: Feature Information Update for ARMv8:
The addition of the feature info entry forWITH_ARMV8(line 1402) clearly communicates that ARMv8 CRC32 intrinsics support is enabled. This aligns well with the new option and the fallback implementation.
Ensure that the documentation and any user-visible build messages mention this new flag and its behavior (including fallback conditions) for clarity.
The configure checks have also been modified to better match the ARMv6 code.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores