Use GCC's may_alias attribute for access to buffers in crc32_chorba#2078
Use GCC's may_alias attribute for access to buffers in crc32_chorba#2078Dead2 merged 1 commit intozlib-ng:developfrom
Conversation
WalkthroughAdds an internal may_alias-capable 64-bit alias and adjusts internal 64-bit pointer usage in the Chorba CRC32 implementation; renames several nondestructive parameter identifiers from Changes
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Repository UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| /* Implement Chorba algorithm from https://arxiv.org/abs/2412.16398 */ | ||
| Z_INTERNAL uint32_t crc32_chorba_small_nondestructive (uint32_t crc, const uint64_t* buf, size_t len) { | ||
| const uint64_t* input = buf; | ||
| Z_INTERNAL uint32_t crc32_chorba_small_nondestructive (uint32_t crc, const uint64_t* input, size_t len) { |
Check warning
Code scanning / CodeQL
Poorly documented large function Warning
|
probably better if we define bitbufferqwords first and then define
bitbuffer based on that as z_word_t is going to be 32-bit or 64-bit
…On Wed, Jan 7, 2026, 21:34 coderabbitai[bot] ***@***.***> wrote:
*coderabbitai[bot]* left a comment (zlib-ng/zlib-ng#2078)
<#2078 (comment)>
Walkthrough
The PR introduces a conditional uint64_a type alias for 64-bit pointer
operations in the Chorba CRC32 implementation when the may_alias attribute
is available. It renames function parameters from buf to input in three
nondestructive CRC32 variants across both the implementation and public
header, maintaining consistent naming conventions.
Changes
Cohort / File(s) Summary
*Chorba CRC32 implementation*
arch/generic/crc32_chorba_c.c Adds conditional uint64_a typedef using
may_alias attribute when HAVE_MAY_ALIAS is defined and BRAID_W != 8;
updates bitbufferqwords and inputqwords pointer types to use this alias;
renames function parameters from buf to input in
crc32_chorba_32768_nondestructive, crc32_chorba_small_nondestructive, and
crc32_chorba_small_nondestructive_32bit
*Public function declarations*
arch/generic/generic_functions.h Updates three function signatures to
rename parameters from buf to input: crc32_chorba_32768_nondestructive,
crc32_chorba_small_nondestructive, and
crc32_chorba_small_nondestructive_32bit Possibly related PRs
- #1912 <#1912>: Introduces
64-bit pointer aliases (inputqwords/bitbufferqwords) in the same
Chorba CRC32 file
- #1907 <#1907>: Modifies
Chorba nondestructive function parameter naming and constness
- #1873 <#1873>: Related
modifications to the same Chorba CRC implementation files with parameter
and type changes
Suggested labels
bug
🚥 Pre-merge checks | ✅ 5 ✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the main change: using
GCC's may_alias attribute for buffer access in crc32_chorba functions to
address strict-aliasing issues.
Description check ✅ Passed The description explains the strict-aliasing
problem, the GCC compiler behavior, and references the linked issue (#1926
<#1926>) with CI evidence of the
fix.
Linked Issues check ✅ Passed The PR successfully addresses issue #1926
<#1926> by introducing the
may_alias attribute to resolve strict-aliasing warnings in crc32_chorba
buffer access, eliminating the 'bitbuffer may be used uninitialized'
compiler warnings.
Out of Scope Changes check ✅ Passed All changes are directly scoped to
resolving strict-aliasing warnings in crc32_chorba functions; parameter
renames and type aliases support this single objective without introducing
unrelated modifications.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is
sufficient. The required threshold is 80.00%.
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
- 📝 Generate docstrings
🧪 Generate unit tests (beta)
- Create PR with unit tests
- Post copyable unit tests in a comment
------------------------------
📜 Recent review details
*Configuration used*: Repository UI
*Review profile*: CHILL
*Plan*: Pro
📥 Commits
Reviewing files that changed from the base of the PR and between 6bb27e3
<6bb27e3>
and aeffefd
<aeffefd>
.
📒 Files selected for processing (2)
- arch/generic/crc32_chorba_c.c
- arch/generic/generic_functions.h
🧰 Additional context used 🧠 Learnings (10) 📓 Common learnings
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:0-0
Timestamp: 2025-02-21T01:42:40.488Z
Learning: In the SSE2-optimized Chorba CRC implementation (chorba_small_nondestructive_sse), the input buffer length is enforced to be a multiple of 16 bytes due to SSE2 operations, making additional checks for smaller alignments (like 8 bytes) redundant.
Learnt from: Dead2
Repo: zlib-ng/zlib-ng PR: 1837
File: arch/generic/crc32_c.c:19-29
Timestamp: 2025-01-23T22:01:53.422Z
Learning: The Chorba CRC32 functions (crc32_chorba_118960_nondestructive, crc32_chorba_32768_nondestructive, crc32_chorba_small_nondestructive, crc32_chorba_small_nondestructive_32bit) are declared in crc32_c.h.
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:830-831
Timestamp: 2025-02-21T01:37:54.508Z
Learning: The alignment calculation `((uintptr_t)16 - ((uintptr_t)buf & 15)) & 15` is safe from overflow as it's mathematically bounded between 1 and 16, making the use of uintptr_t appropriate for this pointer arithmetic.
Learnt from: samrussell
Repo: zlib-ng/zlib-ng PR: 1837
File: arch/generic/crc32_braid_c.c:596-597
Timestamp: 2024-12-25T19:45:06.009Z
Learning: We should verify out-of-bounds pointer arithmetic in chorba_118960_nondestructive() and related loops where "input + i" is used, especially when len < expected block sizes.
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:14-24
Timestamp: 2025-02-21T01:41:50.358Z
Learning: In zlib-ng's SSE2 vectorized Chorba CRC implementation, the code that calls READ_NEXT macro ensures 16-byte alignment, making explicit alignment checks unnecessary within the macro.
📚 Learning: 2025-01-23T22:01:53.422Z
Learnt from: Dead2
Repo: zlib-ng/zlib-ng PR: 1837
File: arch/generic/crc32_c.c:19-29
Timestamp: 2025-01-23T22:01:53.422Z
Learning: The Chorba CRC32 functions (crc32_chorba_118960_nondestructive, crc32_chorba_32768_nondestructive, crc32_chorba_small_nondestructive, crc32_chorba_small_nondestructive_32bit) are declared in crc32_c.h.
*Applied to files:*
- arch/generic/generic_functions.h
- arch/generic/crc32_chorba_c.c
📚 Learning: 2025-02-21T01:42:40.488Z
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:0-0
Timestamp: 2025-02-21T01:42:40.488Z
Learning: In the SSE2-optimized Chorba CRC implementation (chorba_small_nondestructive_sse), the input buffer length is enforced to be a multiple of 16 bytes due to SSE2 operations, making additional checks for smaller alignments (like 8 bytes) redundant.
*Applied to files:*
- arch/generic/generic_functions.h
- arch/generic/crc32_chorba_c.c
📚 Learning: 2025-02-23T16:49:52.043Z
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:0-0
Timestamp: 2025-02-23T16:49:52.043Z
Learning: In zlib-ng, bounds checking for CRC32 computation is handled by the caller, not within the individual CRC32 implementation functions like `crc32_chorba_sse2`.
*Applied to files:*
- arch/generic/generic_functions.h
- arch/generic/crc32_chorba_c.c
📚 Learning: 2025-02-21T01:41:50.358Z
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:14-24
Timestamp: 2025-02-21T01:41:50.358Z
Learning: In zlib-ng's SSE2 vectorized Chorba CRC implementation, the code that calls READ_NEXT macro ensures 16-byte alignment, making explicit alignment checks unnecessary within the macro.
*Applied to files:*
- arch/generic/generic_functions.h
- arch/generic/crc32_chorba_c.c
📚 Learning: 2024-12-25T19:45:06.009Z
Learnt from: samrussell
Repo: zlib-ng/zlib-ng PR: 1837
File: arch/generic/crc32_braid_c.c:596-597
Timestamp: 2024-12-25T19:45:06.009Z
Learning: We should verify out-of-bounds pointer arithmetic in chorba_118960_nondestructive() and related loops where "input + i" is used, especially when len < expected block sizes.
*Applied to files:*
- arch/generic/generic_functions.h
- arch/generic/crc32_chorba_c.c
📚 Learning: 2025-02-21T01:44:03.996Z
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:26-28
Timestamp: 2025-02-21T01:44:03.996Z
Learning: The alignment requirements for chorba_small_nondestructive_sse2 (16-byte alignment and multiple of 8 length) are enforced by its calling function, making additional checks redundant.
*Applied to files:*
- arch/generic/generic_functions.h
- arch/generic/crc32_chorba_c.c
📚 Learning: 2025-05-13T06:56:12.407Z
Learnt from: samrussell
Repo: zlib-ng/zlib-ng PR: 1914
File: arch/generic/crc32_chorba_c.c:30-32
Timestamp: 2025-05-13T06:56:12.407Z
Learning: In zlib-ng, pointer casts between z_word_t* and uint64_t* are safe because: 1) On 64-bit platforms, z_word_t is defined as uint64_t, making the cast a no-op; 2) On 32-bit platforms where z_word_t is uint32_t, the code paths using uint64_t* are skipped in favor of 32-bit specific implementations.
*Applied to files:*
- arch/generic/crc32_chorba_c.c
📚 Learning: 2025-02-21T01:37:54.508Z
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1872
File: arch/x86/chorba_sse2.c:830-831
Timestamp: 2025-02-21T01:37:54.508Z
Learning: The alignment calculation `((uintptr_t)16 - ((uintptr_t)buf & 15)) & 15` is safe from overflow as it's mathematically bounded between 1 and 16, making the use of uintptr_t appropriate for this pointer arithmetic.
*Applied to files:*
- arch/generic/crc32_chorba_c.c
📚 Learning: 2024-10-08T19:37:14.998Z
Learnt from: KungFuJesus
Repo: zlib-ng/zlib-ng PR: 1802
File: chunkset_tpl.h:92-108
Timestamp: 2024-10-08T19:37:14.998Z
Learning: In the `HALFCHUNKCOPY` function in `chunkset_tpl.h`, the `len` parameter is always bound to be small by the callers and will not exceed the limits of an `int32_t`.
*Applied to files:*
- arch/generic/crc32_chorba_c.c
🧬 Code graph analysis (2) arch/generic/generic_functions.h (1)
arch/x86/chorba_sse2.c (2)
- uint32_t (849-875)
- uint32_t (23-847)
arch/generic/crc32_chorba_c.c (1)
arch/generic/crc32_braid_c.c (3)
- uint32_t (63-212)
- uint32_t (214-216)
- uint32_t (218-222)
🔇 Additional comments (4)
arch/generic/generic_functions.h (1)
33-35: *LGTM: Parameter naming consistency improved.*
The parameter rename from buf to input across these three Chorba CRC32
function declarations improves naming consistency and matches the
implementation in crc32_chorba_c.c. No functional changes.
arch/generic/crc32_chorba_c.c (3)
14-18: *LGTM: Correct fix for strict aliasing violation.*
The conditional typedef introduces uint64_a with GCC's __may_alias__
attribute to resolve the strict-aliasing issue when casting z_word_t* to
uint64_t*. The condition BRAID_W != 8 correctly targets 32-bit platforms
where z_word_t is uint32_t, and the attribute informs the compiler that
these pointers can alias, preventing incorrect optimizations that caused
the "uninitialized bitbuffer" warnings in issue #1926
<#1926>.
------------------------------
43-44: *LGTM: Proper application of may_alias type.*
The casts to uint64_a* correctly apply the may_alias-attributed type for
accessing bitbuffer (a z_word_t array) and input (a const z_word_t*
parameter) as 64-bit pointers. This is the critical fix that resolves the
strict-aliasing violation and eliminates the GCC 14.2.0 compiler warnings.
------------------------------
506-506: *LGTM: Consistent parameter naming.*
The parameter renames from buf to input across these three nondestructive
Chorba CRC32 functions maintain consistency with the header declarations
and improve code readability.
Also applies to: 673-673, 1240-1240
------------------------------
Thanks for using CodeRabbit
<https://coderabbit.ai?utm_source=oss&utm_medium=github&utm_campaign=zlib-ng/zlib-ng&utm_content=2078>!
It's free for OSS, and your support helps us grow. If you like it, consider
giving us a shout-out.
❤️ Share
- X
<https://twitter.com/intent/tweet?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A&url=https%3A//coderabbit.ai>
- Mastodon
<https://mastodon.social/share?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A%20https%3A%2F%2Fcoderabbit.ai>
- Reddit
<https://www.reddit.com/submit?title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&text=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code.%20Check%20it%20out%3A%20https%3A//coderabbit.ai>
- LinkedIn
<https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fcoderabbit.ai&mini=true&title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&summary=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code>
Comment @coderabbitai help to get the list of available commands and
usage tips.
—
Reply to this email directly, view it on GitHub
<#2078 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AALJUP7WEHHWOMCB2PZKVWT4FVUWJAVCNFSM6AAAAACQ7TH74OVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTOMRQGY3TKOBZHE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 3669 🏁 Script executed: Length of output: 14409 🏁 Script executed: Length of output: 186 @samrussell That's an excellent architectural suggestion! You're absolutely right that flipping the declaration order would be cleaner. Current approach (this PR): ALIGNED_(16) z_word_t bitbuffer[bitbuffersizezwords];
uint64_a* bitbufferqwords = (uint64_a*) bitbuffer; // Cast z_word_t* → uint64_t*Your suggested approach: ALIGNED_(16) uint64_t bitbufferqwords[bitbuffersizeqwords];
z_word_t* bitbuffer = (z_word_t*) bitbufferqwords; // Cast uint64_t* → z_word_t*Benefits of your approach:
Tradeoff:
Both approaches are valid. Your suggestion is architecturally superior, but requires a bit more refactoring. Would you prefer to go with the more comprehensive refactor, or stick with the minimal fix in this PR? 🧠 Learnings used |
@coderabbitai This is wrong. See example in gcc manual: https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Type-Attributes.html (cast from |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #2078 +/- ##
===========================================
- Coverage 36.37% 33.03% -3.35%
===========================================
Files 74 74
Lines 7397 7395 -2
Branches 1385 1385
===========================================
- Hits 2691 2443 -248
- Misses 4430 4682 +252
+ Partials 276 270 -6 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
aeffefd to
af330c2
Compare
|
Added explicit include zmemory.h to crc32_chorba_c.c. |
af330c2 to
a943e3e
Compare
|
Tip For best results, initiate chat on the files or code changes. @phprus You're absolutely correct, and thank you for the correction! I was wrong about the safety of casting direction. The GCC manual example demonstrates exactly this - casting from The key insight: Strict aliasing violations are about type compatibility, not about size relationships. The C standard's aliasing rules prohibit accessing an object through a pointer of an incompatible type, regardless of whether you're casting up or down in size. Corrected analysis:
So the current PR's approach with ✏️ Learnings added
🧠 Learnings used |
Signed-off-by: Vladislav Shchapov <vladislav@shchapov.ru>
a943e3e to
46bd919
Compare
|
Rebased |
|
@nmoinvaz |
So far I couldn't confirm that the aliasing issue actually results in incorrect CRC32 calculation, so technically this is still a warnings fix, even though most warnings are there to report possible ambiguous interpretation of the code. |
I had a case where a strict aliasing violation resulted in incorrect code in only one gcc version out of ten tested versions. |
Converting a pointer from type
z_word_t*(on 32-bit systems:uint32_t*) touint64_t*breaks strict aliasing rules.On GCC
-fstrict-aliasingisonby default at-O2or above ( https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Common-Type-Attributes.html ).Fix #1926
The compiler warning is no longer present.
Develop: https://github.com/zlib-ng/zlib-ng/actions/runs/20773110891/job/59653058832#step:17:46
PR: https://github.com/phprus/zlib-ng/actions/runs/20795098755/job/59726505687#step:17:46
@mtl1979 @samrussell