Skip to content

Fix "RLE" compression with big endian architectures#1832

Merged
Dead2 merged 1 commit intozlib-ng:developfrom
KungFuJesus:fix_big_endian_rle
Dec 22, 2024
Merged

Fix "RLE" compression with big endian architectures#1832
Dead2 merged 1 commit intozlib-ng:developfrom
KungFuJesus:fix_big_endian_rle

Conversation

@KungFuJesus
Copy link
Copy Markdown
Collaborator

@KungFuJesus KungFuJesus commented Dec 21, 2024

This was missed in #1831. The RLE methods compare a string of bytes directly with itself to directly derive a simple run length encoding. They use similar but not identical methods to compare256. This needs a similar endianness check at compile time to know which compare bit count to use (leading or trailing).

Summary by CodeRabbit

  • New Features
    • Introduced a new header file for enhanced functionality in comparing 256-byte arrays.
    • Added multiple functions for optimized byte and integer comparisons, supporting different data alignments and sizes.

This was missed in zlib-ng#1831. The RLE methods compare a string of bytes
directly with itself to directly derive a simple run length encoding.
They use similar but not identical methods to compare256. This needs
a similar endianness check at compile time to know which compare bit
count to use (leading or trailing).
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Dec 21, 2024

Walkthrough

The pull request introduces a new header file compare256_rle.h that provides optimized functions for comparing 256-byte arrays using run-length encoding (RLE). The implementation includes multiple comparison functions targeting different integer sizes and alignment scenarios. These functions are designed to efficiently compare byte arrays by leveraging built-in compiler functions and handling various data alignment conditions.

Changes

File Change Summary
compare256_rle.h - Added new header file with RLE comparison functions
- Implemented compare256_rle_c for byte-by-byte comparison
- Added unaligned comparison functions for 16, 32, and 64-bit integers
- Included zendian.h header

Sequence Diagram

sequenceDiagram
    participant Caller
    participant compare256_rle_func
    participant Input Arrays

    Caller->>compare256_rle_func: Call with src0, src1
    compare256_rle_func->>Input Arrays: Compare bytes
    compare256_rle_func-->>Caller: Return matching byte length
Loading

Possibly related PRs

Suggested labels

optimization, Architecture

Suggested reviewers

  • nmoinvaz
  • Dead2

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
compare256_rle.h (1)

126-130: Validate the 64-bit endianness handling.
Similar to the 32-bit version, this code properly uses __builtin_ctzll and __builtin_clzll for little- and big-endian systems. As a best practice, verify that the code compiles and runs on both big-endian and little-endian systems, possibly through continuous integration or an emulator-based test.

Would you like help setting up an emulator-based or cross-compilation build pipeline to test big-endian architectures automatically?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04d1b75 and 56780d0.

📒 Files selected for processing (1)
  • compare256_rle.h (3 hunks)
🔇 Additional comments (2)
compare256_rle.h (2)

92-96: Double-check built-in usage for big-endian.
The usage of __builtin_ctz vs __builtin_clz for big-endian is correct and ensures that we detect the first differing byte on both architectures. However, confirm that additional alternative branches (e.g., #elif) are discarded properly for other architectures that might not be strictly little- or big-endian.


8-8: Confirm necessity of 'zendian.h' include.
This header seems essential for detecting and handling byte endianness, but please verify if any part of this file depends on macros or definitions from 'zendian.h'. If it's only needed for the LITTLE_ENDIAN or BIG_ENDIAN macros, confirm that they're defined if the compiler doesn't, or provide a fallback.

Run the following shell script to check references of endian macros across the repository and confirm usage:

✅ Verification successful

Based on the search results, I can generate the final response as we have sufficient information to verify the necessity of the 'zendian.h' include.

'zendian.h' include is necessary - used for endianness-dependent code
The file actively uses endianness macros from zendian.h for conditional compilation. Specifically:

  • Uses BYTE_ORDER == LITTLE_ENDIAN checks for different byte counting implementations
  • Contains endianness-specific code paths using __builtin_ctz/__builtin_clz for both 32-bit and 64-bit operations
  • The header provides essential endianness detection and fallback definitions across different platforms
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify usage of endian macros across the repository, ensuring 'zendian.h' is required.

# Searching for occurrences of LITTLE_ENDIAN, BIG_ENDIAN, or zendian.h usage.
rg -A 3 -i 'LITTLE_ENDIAN|BIG_ENDIAN|zendian\.h'

Length of output: 11908

@codecov
Copy link
Copy Markdown

codecov bot commented Dec 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.39%. Comparing base (04d1b75) to head (56780d0).
Report is 1 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1832      +/-   ##
===========================================
+ Coverage    82.10%   82.39%   +0.29%     
===========================================
  Files          136      136              
  Lines        10969    11020      +51     
  Branches      2716     2799      +83     
===========================================
+ Hits          9006     9080      +74     
+ Misses        1266     1241      -25     
- Partials       697      699       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@phprus
Copy link
Copy Markdown
Contributor

phprus commented Dec 21, 2024

CI errors (Ubuntu GCC AARCH64 No NEON UBSAN):

76: /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52:13: runtime error: load of misaligned address 0x005502d11024 for type 'const uint64_t', which requires 8 byte alignment
76: 0x005502d11024: note: pointer points here
76:   65 0a 6f 74 68 65 72 20  68 61 6e 64 2c 20 73 75  63 68 20 62 65 61 75 74  79 20 77 6f 75 6c 64 20
76:               ^ 
76:     #0 0x55000be4c4 in crc32_acle /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52
76:     #1 0x55000d8aec in crc32_fold_copy_c /home/runner/work/zlib-ng/zlib-ng/arch/generic/crc32_fold_c.c:16
76:     #2 0x5500069218 in zng_read_buf /home/runner/work/zlib-ng/zlib-ng/deflate.c:1201
76:     #3 0x55000825b4 in deflate_stored /home/runner/work/zlib-ng/zlib-ng/deflate_stored.c:160
76:     #4 0x5500060ecc in zng_deflate /home/runner/work/zlib-ng/zlib-ng/deflate.c:1040
76:     #5 0x5500054764 in gz_comp /home/runner/work/zlib-ng/zlib-ng/gzwrite.c:125
76:     #6 0x5500055ab8 in gz_write /home/runner/work/zlib-ng/zlib-ng/gzwrite.c:207
76:     #7 0x5500055f10 in zng_gzwrite /home/runner/work/zlib-ng/zlib-ng/gzwrite.c:254
76:     #8 0x5500048e20 in gz_compress /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:144
76:     #9 0x5500047ac0 in main /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:330
76:     #10 0x55028473f8  (/lib/libc.so.6+0x273f8)
76:     #11 0x55028474c8 in __libc_start_main (/lib/libc.so.6+0x274c8)
76:     #12 0x550004896c in _start (/home/runner/work/zlib-ng/zlib-ng/minigzip+0x4896c)
76: 
76: qemu: uncaught target signal 6 (Aborted) - core dumped
74: /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52:13: runtime error: load of misaligned address 0x005502d39a1c for type 'const uint64_t', which requires 8 byte alignment
74: 0x005502d39a1c: note: pointer points here
74:   61 74 2e 20 4c 69 64 64  79 20 77 61 73 20 61 73  6b 69 6e 67 0a 71 75 65  73 74 69 6f 6e 73 20 61
74:               ^ 
73: -- Decompress qemu-aarch64;-L;/usr/aarch64-linux-gnu/;/home/runner/work/zlib-ng/zlib-ng/minigzip;-d;-c
73: --   Input: /home/runner/work/zlib-ng/zlib-ng/test/Testing/Temporary/minigzip-book1-R-0ThCpV.gz
73: --   Output: /home/runner/work/zlib-ng/zlib-ng/test/Testing/Temporary/minigzip-book1-R-0ThCpV
74:     #0 0x55000be4c4 in crc32_acle /home/runner/work/zlib-ng/zlib-ng/arch/arm/crc32_acle.c:52
74:     #1 0x55000d8aec in crc32_fold_copy_c /home/runner/work/zlib-ng/zlib-ng/arch/generic/crc32_fold_c.c:16
74:     #2 0x5500085a48 in inf_chksum_cpy /home/runner/work/zlib-ng/zlib-ng/inflate.c:31
74:     #3 0x5500085a48 in updatewindow /home/runner/work/zlib-ng/zlib-ng/inflate.c:343
74:     #4 0x550008abe4 in zng_inflate /home/runner/work/zlib-ng/zlib-ng/inflate.c:1183
74:     #5 0x550004e404 in gz_decomp /home/runner/work/zlib-ng/zlib-ng/gzread.c:178
74:     #6 0x550004fd74 in gz_fetch /home/runner/work/zlib-ng/zlib-ng/gzread.c:231
74:     #7 0x5500050ab4 in gz_read /home/runner/work/zlib-ng/zlib-ng/gzread.c:310
74:     #8 0x550005101c in zng_gzread /home/runner/work/zlib-ng/zlib-ng/gzread.c:366
74:     #9 0x5500048c50 in gz_uncompress /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:161
74:     #10 0x55000479cc in main /home/runner/work/zlib-ng/zlib-ng/test/minigzip.c:326
74:     #11 0x55028473f8  (/lib/libc.so.6+0x273f8)
74:     #12 0x55028474c8 in __libc_start_main (/lib/libc.so.6+0x274c8)
74:     #13 0x550004896c in _start (/home/runner/work/zlib-ng/zlib-ng/minigzip+0x4896c)
74: 
74: qemu: uncaught target signal 6 (Aborted) - core dumped

@KungFuJesus
Copy link
Copy Markdown
Collaborator Author

Yes, those are not new, they were introduced in a PR that re-enabled the alignment sanitizer. A subsequent PR is likely to follow to fix that one, but it's unrelated to this.

diff = sv ^ mv;
if (diff) {
#if BYTE_ORDER == LITTLE_ENDIAN
uint32_t match_byte = __builtin_ctz(diff) / 8;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth using macros for this so that the endianness checks (and the builtin availability checks) can be performed in one place?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of right now this is only ever used in two files in 2 functions. Maybe it'd be worth a second PR, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants