Skip to content

Fix unaligned access in ACLE based crc32#1833

Merged
Dead2 merged 1 commit intozlib-ng:developfrom
KungFuJesus:fix_alignsan_acle
Dec 23, 2024
Merged

Fix unaligned access in ACLE based crc32#1833
Dead2 merged 1 commit intozlib-ng:developfrom
KungFuJesus:fix_alignsan_acle

Conversation

@KungFuJesus
Copy link
Copy Markdown
Collaborator

@KungFuJesus KungFuJesus commented Dec 21, 2024

This fixes a rightful complaint from the alignment sanitizer that we alias memory in an unaligned fashion. A nice added bonus is that this improves performance a tiny by on the larger buffers, perhaps due to loops that idiomatically decrement a count and increment a single buffer pointer rather than the maze of conditional pointer reassignments.

Summary by CodeRabbit

  • Bug Fixes

    • Improved buffer handling for CRC computation, ensuring accurate processing of data.
    • Enhanced testing for alignment issues in CRC32 calculations.
  • New Features

    • Streamlined the CRC calculation process by utilizing local variables for buffer data, enhancing performance.
    • Introduced a new test class to verify CRC32 functionality with varying alignment offsets.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 21, 2024

Walkthrough

The pull request modifies the crc32_acle.c file in the ARM architecture directory, specifically changing how data is accessed for CRC computation. The previous implementation, which used registered pointers for buffer referencing, has been updated to utilize local variables of types uint16_t, uint32_t, and uint64_t. This change eliminates pointer dereferencing, employing direct value assignment from the buffer. The core CRC computation logic remains unchanged, ensuring correct processing of buffer chunks and handling of any remaining bytes. Additionally, the test_crc32.cc file has been enhanced with new tests for alignment issues.

Changes

File Change Summary
arch/arm/crc32_acle.c Modified crc32_acle function:
- Replaced registered pointers with local variables
- Eliminated pointer dereferencing
- Maintained original chunk processing and buffer alignment checks
test/test_crc32.cc Enhanced testing framework:
- Added static array align_offsets
- Introduced crc32_align test class for alignment tests
- Updated INSTANTIATE_TEST_SUITE_P for new tests
- Added TEST_CRC32_ALIGN macro for streamlined test creation

Possibly related PRs

Suggested labels

optimization

Suggested reviewers

  • Dead2

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 19b558c and 90722f4.

📒 Files selected for processing (2)
  • arch/arm/crc32_acle.c (2 hunks)
  • test/test_crc32.cc (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • arch/arm/crc32_acle.c
🧰 Additional context used
📓 Learnings (1)
test/test_crc32.cc (2)
Learnt from: phprus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:204-216
Timestamp: 2024-12-22T18:12:46.391Z
Learning: Use uma única alocação de buffer para testar vários deslocamentos (offsets) em testes de alinhamento, garantindo que todas as posições sejam efetivamente cobertas.
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:0-0
Timestamp: 2024-12-22T20:40:03.280Z
Learning: In the crc32_align test class (test/test_crc32.cc), the alignment offset parameter is always guaranteed to be non-negative because the test parameters are controlled and it never makes sense to have a negative offset.
🔇 Additional comments (3)
test/test_crc32.cc (3)

235-237: LGTM! Comprehensive alignment test coverage

The array covers all possible misalignment cases for up to 8-byte aligned accesses, which is appropriate for testing ARM ACLE CRC32 implementation.


239-250: Address static analysis warning about unused variable

The static analysis tool reports that align_offsets is never read. However, this is a false positive as the array is used indirectly through the test parameter generator.

Let's verify the array usage:


200-214: ⚠️ Potential issue

Improve memory safety and error handling in alignment test class

The current implementation has several potential issues:

  1. Memory leak if CRC32 call throws an exception
  2. No validation of the offset parameter
  3. No protection against integer overflow in buffer size calculation

Consider this safer implementation:

 class crc32_align : public ::testing::TestWithParam<int> {
 public:
     void hash(int param, crc32_func crc32) {
+        ASSERT_GE(param, 0) << "Alignment offset must be non-negative";
+        
+        size_t size;
+        if (__builtin_add_overflow(128, param, &size)) {
+            FAIL() << "Buffer size would overflow";
+            return;
+        }
+        
         uint8_t *buf = (uint8_t*)zng_alloc(sizeof(uint8_t) * (128 + param));
-        if (buf != NULL) {
-            (void)crc32(0, buf + param, 128);
-        } else {
-            FAIL();
-        }
+        ASSERT_NE(buf, nullptr) << "Memory allocation failed";
+        
+        try {
+            (void)crc32(0, buf + param, 128);
+        } catch (...) {
+            zng_free(buf);
+            throw;
+        }
         zng_free(buf);
     }
 };
⛔ Skipped due to learnings
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:0-0
Timestamp: 2024-12-22T20:40:03.280Z
Learning: In the crc32_align test class (test/test_crc32.cc), the alignment offset parameter is always guaranteed to be non-negative because the test parameters are controlled and it never makes sense to have a negative offset.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
arch/arm/crc32_acle.c (2)

49-52: Performance optimization through simplified pointer handling

The main loop's changes offer potential performance benefits:

  1. Single pointer increment and length decrement instead of multiple conditional reassignments
  2. Safe memory access through memcpy for 8-byte chunks
  3. Natural alignment handling for optimal CPU performance

This aligns with the PR's goal of performance improvement for larger buffers.

Consider adding a size threshold for small buffers to bypass the alignment handling overhead when processing very small inputs.


Line range hint 14-66: Excellent architectural improvements for CRC32 ACLE implementation

The changes successfully address both objectives:

  1. Alignment Safety: Eliminates unaligned memory access through consistent use of memcpy and proper variable declarations
  2. Performance: Simplifies pointer handling and maintains efficient processing of larger buffers while ensuring proper alignment

The implementation maintains a good balance between safety and performance.

Consider adding performance benchmarks to the test suite to validate the performance improvements, especially for larger buffers.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04d1b75 and 0eeb75c.

📒 Files selected for processing (1)
  • arch/arm/crc32_acle.c (2 hunks)
🔇 Additional comments (3)
arch/arm/crc32_acle.c (3)

14-16: LGTM: Good alignment-safe variable declarations

The introduction of properly-typed local variables (buf2, buf4, buf8) ensures natural alignment for different access sizes, which is crucial for the subsequent memcpy operations.


56-66: LGTM: Robust handling of remaining bytes

The implementation properly handles remaining bytes with:

  • Consistent use of memcpy for safe memory access
  • Correct progression from larger (4-byte) to smaller (2-byte) chunks
  • Proper buffer pointer advancement

33-43: Improved alignment handling with safe memory access

The use of memcpy for different sized chunks (2 and 4 bytes) properly handles potential unaligned access while maintaining the original alignment progression. This change effectively addresses the alignment sanitizer concerns.

Let's verify the alignment assumptions:

✅ Verification successful

Alignment handling is correctly implemented across architectures

The verification confirms that the alignment handling in arch/arm/crc32_acle.c is consistent with best practices and aligns with other architecture implementations:

  • The ARM implementation correctly uses memcpy for all data sizes (2, 4, and 8 bytes) with proper alignment checks
  • Similar alignment-aware implementations are found in other architectures:
    • S390 uses explicit alignment masks and vector attributes
    • x86 implementations handle alignment through SIMD-specific approaches
    • The generic implementation uses safe memory access patterns
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for other potential unaligned access patterns in the codebase
rg -l "(__packed__|__attribute__\(\(packed\)\))" | while read file; do
  echo "=== $file ==="
  rg "struct.*(__packed__|__attribute__\(\(packed\)\))" "$file" -A 5
done

Length of output: 62


Script:

#!/bin/bash
# Let's check for alignment-related patterns and memory access in the CRC implementation
rg -B 2 -A 2 "(memcpy|aligned|unaligned|__packed__|__attribute__)" arch/arm/crc32_acle.c

# Also check for similar CRC implementations in other architectures
fd "crc32.*\.[ch]$" | while read -r file; do
  echo "=== $file ==="
  rg -B 2 -A 2 "(memcpy|aligned|unaligned|__packed__|__attribute__)" "$file"
done

Length of output: 6751

@ccawley2011
Copy link
Copy Markdown
Contributor

The code is supposed to ensure that the pointers are aligned by performing smaller loads until alignment is reached, so if that isn't happening it's likely that there's something else wrong.

Using memcpy will likely hurt performance on operating systems where unaligned access is forbidden, since it'll need to load each word one byte at a time before combining them.

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

The code is supposed to ensure that the pointers are aligned by performing smaller loads until alignment is reached, so if that isn't happening it's likely that there's something else wrong.

Using memcpy will likely hurt performance on operating systems where unaligned access is forbidden, since it'll need to load each word one byte at a time before combining them.

It doesn't though, as the compiler optimizes memcpys of fixed lengths to call builtins that translate the calls to direct unaligned loads on architectures that support it (which ARM does). This is more to ensure that we're calling well defined behavior within C. The sanitizer detected unaligned access at runtime, so the ladder of ifs had a lapse in proper alignment checks. Actually I think I see where the alignment checks are failing because the & trick for powers of 2 requires a subtraction of one. This is being done for the first one but not subsequent ones.

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Now the access should actually be aligned to the data type being read (for the slight benefit of speed on any particularly ARM chip).

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 21, 2024

Codecov Report

Attention: Patch coverage is 82.75862% with 5 lines in your changes missing coverage. Please review.

Project coverage is 82.39%. Comparing base (87d8e95) to head (90722f4).
Report is 1 commits behind head on develop.

Files with missing lines Patch % Lines
test/test_crc32.cc 44.44% 1 Missing and 4 partials ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1833      +/-   ##
===========================================
- Coverage    82.40%   82.39%   -0.02%     
===========================================
  Files          136      136              
  Lines        11021    11035      +14     
  Branches      2799     2805       +6     
===========================================
+ Hits          9082     9092      +10     
  Misses        1239     1239              
- Partials       700      704       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ccawley2011
Copy link
Copy Markdown
Contributor

It doesn't though, as the compiler optimizes memcpys of fixed lengths to call builtins that translate the calls to direct unaligned loads on architectures that support it (which ARM does).

The behaviour on ARMv6 and newer is dependent on how the operating system configures alignment exceptions - if they're enabled (or ARMv4/v5 behaviour is enabled on ARMv6), applications and libraries will need to be compiled without unaligned access, and the behaviour observed in PR #1548 will occur instead. RISC OS is an example of a platform that both supports ARMv8 with CRC extensions and has alignment exceptions enabled by default.

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

KungFuJesus commented Dec 21, 2024

Hmm, so it sounds like the may_alias attribute is needed for those cases. I can write this so that it just dereferences a pointer but it's hard to imagine a case where the compiler wouldn't treat that the same way as us calling a fixed sized memcpy as it sees we're assigning the pointer from something that only has 1 byte alignment. FWIW, with the full test corpora this passes without memcpy and simply just deref'ing the pointers as the respective types, so you shouldn't be seeing unaligned access regardless. The memcpys are more there as a guard rail (though it sounds like one you can't rely on if you're on RISC OS, lest you want to compile your binaries with forced memcpy's there).

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

@ccawley2011 It's now memcpy free. We'll have to rely on the alignment sanitizer to police it here.

@nmoinvaz nmoinvaz added bug Architecture Architecture specific labels Dec 22, 2024
@Dead2
Copy link
Copy Markdown
Member

Dead2 commented Dec 22, 2024

Aarch64 rpi5

Before PR

crc32/acle/1              2.50 ns         2.50 ns    279907009
crc32/acle/8              4.17 ns         4.17 ns    167944184
crc32/acle/12             4.17 ns         4.17 ns    167945907
crc32/acle/16             4.58 ns         4.58 ns    152674019
crc32/acle/32             6.67 ns         6.67 ns    104957581
crc32/acle/64             8.34 ns         8.34 ns     83971953
crc32/acle/512            35.1 ns         35.1 ns     19920130
crc32/acle/4096            226 ns          226 ns      3092795
crc32/acle/32768          1720 ns         1720 ns       406891
crc32/acle/262144        14683 ns        14683 ns        47675
crc32/acle/4194304      240527 ns       240526 ns         2905

After PR

crc32/acle/1              2.50 ns         2.50 ns    279910077
crc32/acle/8              4.17 ns         4.17 ns    167945750
crc32/acle/12             4.17 ns         4.17 ns    167946288
crc32/acle/16             4.58 ns         4.58 ns    152674642
crc32/acle/32             6.67 ns         6.67 ns    104964779
crc32/acle/64             8.34 ns         8.34 ns     83969360
crc32/acle/512            35.7 ns         35.7 ns     19554344
crc32/acle/4096            226 ns          226 ns      3092795
crc32/acle/32768          1720 ns         1720 ns       406898
crc32/acle/262144        14666 ns        14665 ns        47730
crc32/acle/4194304      240605 ns       240604 ns         2906

Performance here seems to be pretty much identical.

Copy link
Copy Markdown
Member

@Dead2 Dead2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@phprus
Copy link
Copy Markdown
Contributor

phprus commented Dec 22, 2024

Unaligned access is not fixed

Ubuntu GCC AARCH64 No NEON UBSAN (https://github.com/zlib-ng/zlib-ng/actions/runs/12447135765/job/34750296246?pr=1833):

/home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:40:18: runtime error: load of misaligned address 0x00550046cc52 for type 'uint32_t', which requires 4 byte alignment

@Dead2
Copy link
Copy Markdown
Member

Dead2 commented Dec 22, 2024

Unaligned access is not fixed

Ubuntu GCC AARCH64 No NEON UBSAN (zlib-ng/zlib-ng/actions/runs/12447135765/job/34750296246?pr=1833):

/home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:40:18: runtime error: load of misaligned address 0x00550046cc52 for type 'uint32_t', which requires 4 byte alignment

Thanks for noticing, we have had many months of failing CIs due to github not supporting s390x themselves, so we have gotten too used to CI always failing.

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Ugh, I ran through the whole data test corpus without issue but alignment is just kind of a chance thing if we can't generate unaligned input. Maybe I'll need to test it that way, forcing unaligned input.

Comment thread arch/arm/crc32_acle.c Outdated
Comment thread arch/arm/crc32_acle.c Outdated
@phprus
Copy link
Copy Markdown
Contributor

phprus commented Dec 22, 2024

Test for unaligned access (pseudocode):

char buf[32];
size_t len = sizeof(buf);
for (size_t i = 0; i < len; i++)
{
    buf[i] = 'a' + (i % 26);
}

for (size_t offset = 0; offset < sizeof(uint64_t); offset++)
{
    for (size_t current_len = 1; current_len < (current_len - offset); current_len++)
    {
        c = crc(0, buf + offset, current_len);
        ignore(c);
    }
}

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Test for unaligned access (pseudocode):

char buf[32];
size_t len = sizeof(buf);
for (size_t i = 0; i < len; i++)
{
    buf[i] = 'a' + (i % 26);
}

for (size_t offset = 0; offset < sizeof(uint64_t); offset++)
{
    for (size_t current_len = 1; current_len < (current_len - offset); current_len++)
    {
        c = crc(0, buf + offset, current_len);
        ignore(c);
    }
}

Yep that's more or less how I tested it, I just did it in the benchmark code where we start with a 64 byte aligned address and walked values 1-7 for that.

@phprus
Copy link
Copy Markdown
Contributor

phprus commented Dec 22, 2024

Yep that's more or less how I tested it, I just did it in the benchmark code where we start with a 64 byte aligned address and walked values 1-7 for that.

Maybe add this test to zlib-ng?

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Rebased to go all green.

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Yep that's more or less how I tested it, I just did it in the benchmark code where we start with a 64 byte aligned address and walked values 1-7 for that.

Maybe add this test to zlib-ng?

I mean my test consisted of running that in conjunction with ubsan. If not compiled with ubsan it's not much of a test but I suppose I could write it as at least one runner will run it with that. I've never written gtests before, though.

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Ok, I added a gtest specifically for this.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/test_crc32.cc (1)

Line range hint 204-247: Consider adding more comprehensive alignment test cases.

While the current implementation tests basic alignment scenarios, consider enhancing the test coverage:

  1. Test with different buffer sizes
  2. Test with different CRC initial values
  3. Test edge cases around alignment boundaries

Example enhancement:

struct align_test_param {
    int offset;
    size_t size;
    uint32_t initial_crc;
};

static const align_test_param align_test_cases[] = {
    {1, 128, 0},
    {2, 64, 0xffffffff},
    {4, 256, 0x12345678},
    // Add more test cases
};

class crc32_align : public ::testing::TestWithParam<align_test_param> {
    // Update implementation accordingly
};
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d042ddc and 32b2652.

📒 Files selected for processing (2)
  • arch/arm/crc32_acle.c (2 hunks)
  • test/test_crc32.cc (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • arch/arm/crc32_acle.c
🔇 Additional comments (4)
test/test_crc32.cc (4)

29-31: LGTM: Alignment test values cover essential cases.

The array includes offsets 1-7, which effectively tests various alignment boundaries for different data types (16-bit, 32-bit, 64-bit).


219-219: LGTM: Test suite registration is correct.

The alignment test cases are properly registered with the test framework using the alignment offsets.


230-237: LGTM: Test macro follows the same pattern as TEST_CRC32.

The new alignment test macro correctly mirrors the structure of the existing TEST_CRC32 macro, maintaining consistency in the codebase.


247-247: LGTM: ACLE alignment test is properly guarded.

The alignment test for ACLE is correctly placed within the ARM_ACLE conditional compilation block and uses the appropriate CPU feature flag.

Comment thread test/test_crc32.cc
Comment on lines +204 to +214
/* Specifically to test where we had dodgy alignment in the acle CRC32
* function. All others are either byte level access or use intrinsics
* that work with unaligned access */
class crc32_align : public ::testing::TestWithParam<int> {
public:
void hash(int param, crc32_func crc32) {
uint8_t *buf = (uint8_t*)zng_alloc(sizeof(uint8_t) * (128 + param));
if (buf != NULL) {
(void)crc32(0, buf + param, 128);
}
zng_free(buf);
}
};
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot Dec 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider adding buffer initialization and size validation.

While the test class correctly allocates memory for alignment testing, there are a few improvements to consider:

  1. The allocated buffer is uninitialized, which might lead to undefined behavior
  2. There's no validation that param is non-negative
  3. Memory allocation failure isn't properly handled

Consider this improved implementation:

 class crc32_align : public ::testing::TestWithParam<int> {
 public:
     void hash(int param, crc32_func crc32) {
+        ASSERT_GE(param, 0) << "Alignment offset must be non-negative";
         uint8_t *buf = (uint8_t*)zng_alloc(sizeof(uint8_t) * (128 + param));
-        if (buf != NULL) {
-            (void)crc32(0, buf + param, 128);
-        }
+        ASSERT_NE(buf, nullptr) << "Memory allocation failed";
+        // Initialize buffer to avoid undefined behavior
+        memset(buf, 0xAA, 128 + param);
+        (void)crc32(0, buf + param, 128);
         zng_free(buf);
     }
 };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/* Specifically to test where we had dodgy alignment in the acle CRC32
* function. All others are either byte level access or use intrinsics
* that work with unaligned access */
class crc32_align : public ::testing::TestWithParam<int> {
public:
void hash(int param, crc32_func crc32) {
uint8_t *buf = (uint8_t*)zng_alloc(sizeof(uint8_t) * (128 + param));
if (buf != NULL) {
(void)crc32(0, buf + param, 128);
}
zng_free(buf);
}
};
/* Specifically to test where we had dodgy alignment in the acle CRC32
* function. All others are either byte level access or use intrinsics
* that work with unaligned access */
class crc32_align : public ::testing::TestWithParam<int> {
public:
void hash(int param, crc32_func crc32) {
ASSERT_GE(param, 0) << "Alignment offset must be non-negative";
uint8_t *buf = (uint8_t*)zng_alloc(sizeof(uint8_t) * (128 + param));
ASSERT_NE(buf, nullptr) << "Memory allocation failed";
// Initialize buffer to avoid undefined behavior
memset(buf, 0xAA, 128 + param);
(void)crc32(0, buf + param, 128);
zng_free(buf);
}
};

Copy link
Copy Markdown
Contributor

@phprus phprus Dec 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use single buffer for all offsets to ensure that all addresses are checked.

zng_alloc guarantees alignment. So everything will work. I made a mistake.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phprus Entendido! A abordagem mais simples seria alocar um buffer grande o suficiente para o maior deslocamento antes de todos os testes, usar esse mesmo buffer para cada offset e liberar no final. Segue um exemplo de como você pode implementar isso usando o fixture de teste:


#include <gtest/gtest.h>

class crc32_align : public ::testing::TestWithParam {
protected:
static inline uint8_t *sBuffer = nullptr;
// Ajuste este valor de acordo com o valor máximo de offset que deseja testar.
// Supondo que o maior offset seja 7, precisamos 128 + 7 = 135 bytes.
static const int kBufferSize = 128 + 7;

static void SetUpTestSuite() {
    sBuffer = (uint8_t*)zng_alloc(sizeof(uint8_t) * kBufferSize);
    ASSERT_NE(sBuffer, nullptr) << "Memory allocation failed";
    // Inicializa o buffer para evitar comportamentos indefinidos
    memset(sBuffer, 0xAA, kBufferSize);
}

static void TearDownTestSuite() {
    zng_free(sBuffer);
    sBuffer = nullptr;
}

void hash(int param, crc32_func crc32) {
    // Ainda assim podemos garantir que param >= 0
    ASSERT_GE(param, 0) << "Alignment offset must be non-negative";
    // Usa o mesmo buffer alocado, variando apenas o início.
    (void)crc32(0, sBuffer + param, 128);
}

};

INSTANTIATE_TEST_SUITE_P(crc32_alignment, crc32_align, testing::ValuesIn(align_offsets));

#define TEST_CRC32_ALIGN(name, func, support_flag)
TEST_P(crc32_align, name) {
if (!(support_flag)) {
GTEST_SKIP();
return;
}
hash(GetParam(), func);
}


Dessa forma, usamos um único buffer para todos os offsets, garantindo que cada endereço seja testado. Se você tiver mais de um conjunto de offsets maior que 7, basta ajustar kBufferSize apropriadamente. Se surgir alguma dúvida adicional, é só avisar!


✏️ Learnings added
Learnt from: phprus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:204-216
Timestamp: 2024-12-22T18:12:46.241Z
Learning: Use uma única alocação de buffer para testar vários deslocamentos (offsets) em testes de alinhamento, garantindo que todas as posições sejam efetivamente cobertas.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually care what the memory contents are, we are simply testing varying alignments against the alignment sanitizer, here. Additionally, if an allocation somehow fails we have bigger problems and it will crash, causing a failure.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment thread test/test_crc32.cc Fixed
@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Ok, I think this is now good to go.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38cd582 and 19b558c.

📒 Files selected for processing (2)
  • arch/arm/crc32_acle.c (2 hunks)
  • test/test_crc32.cc (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • arch/arm/crc32_acle.c
🧰 Additional context used
📓 Learnings (1)
test/test_crc32.cc (1)
Learnt from: phprus
PR: zlib-ng/zlib-ng#1833
File: test/test_crc32.cc:204-216
Timestamp: 2024-12-22T18:12:46.391Z
Learning: Use uma única alocação de buffer para testar vários deslocamentos (offsets) em testes de alinhamento, garantindo que todas as posições sejam efetivamente cobertas.
🔇 Additional comments (2)
test/test_crc32.cc (2)

233-235: LGTM! Comprehensive alignment test coverage

The array covers all possible misalignment scenarios (1-7 bytes) which is perfect for testing unaligned access cases.


237-248: LGTM! Well-structured test implementation

The test macro and instantiation follow good testing practices and maintain consistency with existing code patterns.

Let's verify the usage of align_offsets array to address the static analysis warning:

Comment thread test/test_crc32.cc
This fixes a rightful complaint from the alignment sanitizer that we
alias memory in an unaligned fashion. A nice added bonus is that this
improves performance a tiny bit on the larger buffers, perhaps due to
loops that idiomatically decrement a count and increment a single buffer
pointer rather than the maze of conditional pointer reassignments.

While here, let's write a unit test just for this. Since this is the only
variant that accesses memory in a potentially unaligned fashion that doesn't
explicitly go byte by byte or use intrinsics that don't require alignment,
we'll enable it only for this function for now. Adding more tests later if
need be should be possible. For everything else not crc, we're relying on
ubsan to hopefully catch things by chance.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Architecture Architecture specific bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants