Fix "RLE" compression with big endian architectures by KungFuJesus · Pull Request #1832 · zlib-ng/zlib-ng

KungFuJesus · 2024-12-21T15:12:38Z

This was missed in #1831. The RLE methods compare a string of bytes directly with itself to directly derive a simple run length encoding. They use similar but not identical methods to compare256. This needs a similar endianness check at compile time to know which compare bit count to use (leading or trailing).

Summary by CodeRabbit

New Features
- Introduced a new header file for enhanced functionality in comparing 256-byte arrays.
- Added multiple functions for optimized byte and integer comparisons, supporting different data alignments and sizes.

This was missed in zlib-ng#1831. The RLE methods compare a string of bytes directly with itself to directly derive a simple run length encoding. They use similar but not identical methods to compare256. This needs a similar endianness check at compile time to know which compare bit count to use (leading or trailing).

coderabbitai · 2024-12-21T15:12:45Z

Walkthrough

The pull request introduces a new header file compare256_rle.h that provides optimized functions for comparing 256-byte arrays using run-length encoding (RLE). The implementation includes multiple comparison functions targeting different integer sizes and alignment scenarios. These functions are designed to efficiently compare byte arrays by leveraging built-in compiler functions and handling various data alignment conditions.

Changes

File	Change Summary
`compare256_rle.h`	- Added new header file with RLE comparison functions - Implemented `compare256_rle_c` for byte-by-byte comparison - Added unaligned comparison functions for 16, 32, and 64-bit integers - Included `zendian.h` header

Sequence Diagram

sequenceDiagram
    participant Caller
    participant compare256_rle_func
    participant Input Arrays

    Caller->>compare256_rle_func: Call with src0, src1
    compare256_rle_func->>Input Arrays: Compare bytes
    compare256_rle_func-->>Caller: Return matching byte length

Possibly related PRs

Make big endians first class citizens again #1831: Contains similar unaligned comparison functions for 16, 32, and 64-bit integers

Suggested labels

optimization, Architecture

Suggested reviewers

nmoinvaz
Dead2

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

compare256_rle.h (1)

126-130: Validate the 64-bit endianness handling.
Similar to the 32-bit version, this code properly uses __builtin_ctzll and __builtin_clzll for little- and big-endian systems. As a best practice, verify that the code compiles and runs on both big-endian and little-endian systems, possibly through continuous integration or an emulator-based test.

Would you like help setting up an emulator-based or cross-compilation build pipeline to test big-endian architectures automatically?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04d1b75 and 56780d0.

📒 Files selected for processing (1)

compare256_rle.h (3 hunks)

🔇 Additional comments (2)

compare256_rle.h (2)

92-96: Double-check built-in usage for big-endian.
The usage of __builtin_ctz vs __builtin_clz for big-endian is correct and ensures that we detect the first differing byte on both architectures. However, confirm that additional alternative branches (e.g., #elif) are discarded properly for other architectures that might not be strictly little- or big-endian.

8-8: Confirm necessity of 'zendian.h' include.
This header seems essential for detecting and handling byte endianness, but please verify if any part of this file depends on macros or definitions from 'zendian.h'. If it's only needed for the LITTLE_ENDIAN or BIG_ENDIAN macros, confirm that they're defined if the compiler doesn't, or provide a fallback.

Run the following shell script to check references of endian macros across the repository and confirm usage:

✅ Verification successful

Based on the search results, I can generate the final response as we have sufficient information to verify the necessity of the 'zendian.h' include.

'zendian.h' include is necessary - used for endianness-dependent code
The file actively uses endianness macros from zendian.h for conditional compilation. Specifically:

Uses BYTE_ORDER == LITTLE_ENDIAN checks for different byte counting implementations
Contains endianness-specific code paths using __builtin_ctz/__builtin_clz for both 32-bit and 64-bit operations
The header provides essential endianness detection and fallback definitions across different platforms

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify usage of endian macros across the repository, ensuring 'zendian.h' is required.

# Searching for occurrences of LITTLE_ENDIAN, BIG_ENDIAN, or zendian.h usage.
rg -A 3 -i 'LITTLE_ENDIAN|BIG_ENDIAN|zendian\.h'

Length of output: 11908

codecov · 2024-12-21T15:41:34Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.39%. Comparing base (04d1b75) to head (56780d0).
Report is 1 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1832      +/-   ##
===========================================
+ Coverage    82.10%   82.39%   +0.29%     
===========================================
  Files          136      136              
  Lines        10969    11020      +51     
  Branches      2716     2799      +83     
===========================================
+ Hits          9006     9080      +74     
+ Misses        1266     1241      -25     
- Partials       697      699       +2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

phprus · 2024-12-21T15:48:50Z

CI errors (Ubuntu GCC AARCH64 No NEON UBSAN):

76: /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52:13: runtime error: load of misaligned address 0x005502d11024 for type 'const uint64_t', which requires 8 byte alignment
76: 0x005502d11024: note: pointer points here
76:   65 0a 6f 74 68 65 72 20  68 61 6e 64 2c 20 73 75  63 68 20 62 65 61 75 74  79 20 77 6f 75 6c 64 20
76:               ^ 
76:     #0 0x55000be4c4 in crc32_acle /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52
76:     #1 0x55000d8aec in crc32_fold_copy_c /home/runner/work/zlib-ng/zlib-ng/arch/generic/crc32_fold_c.c:16
76:     #2 0x5500069218 in zng_read_buf /home/runner/work/zlib-ng/zlib-ng/deflate.c:1201
76:     #3 0x55000825b4 in deflate_stored /home/runner/work/zlib-ng/zlib-ng/deflate_stored.c:160
76:     #4 0x5500060ecc in zng_deflate /home/runner/work/zlib-ng/zlib-ng/deflate.c:1040
76:     #5 0x5500054764 in gz_comp /home/runner/work/zlib-ng/zlib-ng/gzwrite.c:125
76:     #6 0x5500055ab8 in gz_write /home/runner/work/zlib-ng/zlib-ng/gzwrite.c:207
76:     #7 0x5500055f10 in zng_gzwrite /home/runner/work/zlib-ng/zlib-ng/gzwrite.c:254
76:     #8 0x5500048e20 in gz_compress /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:144
76:     #9 0x5500047ac0 in main /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:330
76:     #10 0x55028473f8  (/lib/libc.so.6+0x273f8)
76:     #11 0x55028474c8 in __libc_start_main (/lib/libc.so.6+0x274c8)
76:     #12 0x550004896c in _start (/home/runner/work/zlib-ng/zlib-ng/minigzip+0x4896c)
76: 
76: qemu: uncaught target signal 6 (Aborted) - core dumped
74: /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52:13: runtime error: load of misaligned address 0x005502d39a1c for type 'const uint64_t', which requires 8 byte alignment
74: 0x005502d39a1c: note: pointer points here
74:   61 74 2e 20 4c 69 64 64  79 20 77 61 73 20 61 73  6b 69 6e 67 0a 71 75 65  73 74 69 6f 6e 73 20 61
74:               ^ 
73: -- Decompress qemu-aarch64;-L;/usr/aarch64-linux-gnu/;/home/runner/work/zlib-ng/zlib-ng/minigzip;-d;-c
73: --   Input: /home/runner/work/zlib-ng/zlib-ng/test/Testing/Temporary/minigzip-book1-R-0ThCpV.gz
73: --   Output: /home/runner/work/zlib-ng/zlib-ng/test/Testing/Temporary/minigzip-book1-R-0ThCpV
74:     #0 0x55000be4c4 in crc32_acle /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52
74:     #1 0x55000d8aec in crc32_fold_copy_c /home/runner/work/zlib-ng/zlib-ng/arch/generic/crc32_fold_c.c:16
74:     #2 0x5500085a48 in inf_chksum_cpy /home/runner/work/zlib-ng/zlib-ng/inflate.c:31
74:     #3 0x5500085a48 in updatewindow /home/runner/work/zlib-ng/zlib-ng/inflate.c:343
74:     #4 0x550008abe4 in zng_inflate /home/runner/work/zlib-ng/zlib-ng/inflate.c:1183
74:     #5 0x550004e404 in gz_decomp /home/runner/work/zlib-ng/zlib-ng/gzread.c:178
74:     #6 0x550004fd74 in gz_fetch /home/runner/work/zlib-ng/zlib-ng/gzread.c:231
74:     #7 0x5500050ab4 in gz_read /home/runner/work/zlib-ng/zlib-ng/gzread.c:310
74:     #8 0x550005101c in zng_gzread /home/runner/work/zlib-ng/zlib-ng/gzread.c:366
74:     #9 0x5500048c50 in gz_uncompress /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:161
74:     #10 0x55000479cc in main /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:326
74:     #11 0x55028473f8  (/lib/libc.so.6+0x273f8)
74:     #12 0x55028474c8 in __libc_start_main (/lib/libc.so.6+0x274c8)
74:     #13 0x550004896c in _start (/home/runner/work/zlib-ng/zlib-ng/minigzip+0x4896c)
74: 
74: qemu: uncaught target signal 6 (Aborted) - core dumped

KungFuJesus · 2024-12-21T15:51:35Z

Yes, those are not new, they were introduced in a PR that re-enabled the alignment sanitizer. A subsequent PR is likely to follow to fix that one, but it's unrelated to this.

ccawley2011 · 2024-12-21T17:04:03Z

compare256_rle.h

        diff = sv ^ mv;
        if (diff) {
+#if BYTE_ORDER == LITTLE_ENDIAN
            uint32_t match_byte = __builtin_ctz(diff) / 8;


Would it be worth using macros for this so that the endianness checks (and the builtin availability checks) can be performed in one place?

As of right now this is only ever used in two files in 2 functions. Maybe it'd be worth a second PR, though.

coderabbitai bot reviewed Dec 21, 2024

View reviewed changes

Dead2 approved these changes Dec 21, 2024

View reviewed changes

ccawley2011 reviewed Dec 21, 2024

View reviewed changes

This was referenced Dec 21, 2024

Enable unaligned memory access on big endian PowerPC #1690

Closed

Set OPTIMAL_CMP for 32-bit PowerPC #1834

Merged

nmoinvaz approved these changes Dec 21, 2024

View reviewed changes

Dead2 merged commit 90913e8 into zlib-ng:develop Dec 22, 2024

This was referenced Dec 22, 2024

Continued cleanup of old UNALIGNED_OK checks #1830

Merged

Fix unaligned access in ACLE based crc32 #1833

Merged

Remove no longer needed endianness check for compare256_bench #1836

Closed

Dead2 mentioned this pull request Dec 31, 2024

2.2.3 Release #1838

Merged

This was referenced Jan 27, 2025

Chorba #1837

Merged

Add uncompress benchmark #1860

Merged

Fix an unfortunate bug with Visual Studio 2015 #1862

Merged

coderabbitai bot mentioned this pull request Feb 5, 2025

Fix shift overflow in inflate and send_code. #1859

Merged

coderabbitai bot mentioned this pull request Oct 10, 2025

Don't build C-fallback functions that never get used on x86_64 #1984

Merged

coderabbitai bot mentioned this pull request Jan 17, 2026

Replace conditional byte swapping with portable host/LE conversion #2105

Merged

coderabbitai bot mentioned this pull request Jan 26, 2026

Refactor compare256 with portable ctz functions #2128

Merged

This was referenced Mar 11, 2026

Zlib-ng-compat vs Zlib #2216

Open

Add back zng_clz for big-endian and use macro for compare256 #2229

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix "RLE" compression with big endian architectures#1832

Fix "RLE" compression with big endian architectures#1832
Dead2 merged 1 commit intozlib-ng:developfrom
KungFuJesus:fix_big_endian_rle

KungFuJesus commented Dec 21, 2024 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 21, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Uh oh!

codecov bot commented Dec 21, 2024 •

edited

Loading

Uh oh!

phprus commented Dec 21, 2024

Uh oh!

KungFuJesus commented Dec 21, 2024

Uh oh!

ccawley2011 Dec 21, 2024

Uh oh!

KungFuJesus Dec 21, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

KungFuJesus commented Dec 21, 2024 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Possibly related PRs

Suggested labels

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Dec 21, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

phprus commented Dec 21, 2024

Uh oh!

KungFuJesus commented Dec 21, 2024

Uh oh!

ccawley2011 Dec 21, 2024

Choose a reason for hiding this comment

Uh oh!

KungFuJesus Dec 21, 2024

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

KungFuJesus commented Dec 21, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 21, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

codecov bot commented Dec 21, 2024 •

edited

Loading