Skip to content

Inline the CHUNKSIZE function#1974

Merged
Dead2 merged 1 commit intozlib-ng:developfrom
ccawley2011:chunksize-inline
Oct 7, 2025
Merged

Inline the CHUNKSIZE function#1974
Dead2 merged 1 commit intozlib-ng:developfrom
ccawley2011:chunksize-inline

Conversation

@ccawley2011
Copy link
Copy Markdown
Contributor

@ccawley2011 ccawley2011 commented Oct 2, 2025

Summary by CodeRabbit

  • Refactor

    • Simplified chunk-size handling across all platforms by consolidating logic and removing architecture-specific paths. Reduces initialization overhead and potential inconsistencies without changing user-facing behavior or APIs.
  • Performance

    • Streamlined checks may yield minor improvements in consistency and micro-performance during decompression, with no functional impact.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Oct 2, 2025

Walkthrough

This PR removes the chunksize function and related indirections across architectures, eliminates the functable entry and inflate_state field for chunksize, and replaces usages with a static inline CHUNKSIZE() returning sizeof(chunk_t). Call sites in inflate/init paths are adjusted accordingly.

Changes

Cohort / File(s) Summary
Arch function headers: remove chunksize declarations and macro aliasing
arch/arm/arm_functions.h, arch/generic/generic_functions.h, arch/loongarch/loongarch_functions.h, arch/power/power_functions.h, arch/riscv/riscv_functions.h, arch/x86/x86_functions.h
Delete public chunksize_* declarations (NEON, C, LSX/LASX, POWER8, RVV, SSE2/AVX2/AVX512) and remove DISABLE_RUNTIME_CPU_DETECTION aliasing of native_chunksize.
Functable API: remove chunksize
functable.c, functable.h
Remove functable member (*chunksize)(void), the stub, and all assignments/arch-specific wiring.
Inflate state/init: drop state->chunksize
inflate.h, inflate.c, infback.c
Remove inflate_state.chunksize field and initialization via FUNCTABLE_CALL(chunksize)() in inflateInit2 and inflateBackInit.
Inffast template: use CHUNKSIZE()
inffast_tpl.h
Replace dist >= state->chunksize with dist >= CHUNKSIZE().
Chunkset template: define inline CHUNKSIZE()
chunkset_tpl.h
Change CHUNKSIZE() to static inline size_t CHUNKSIZE(void) returning sizeof(chunk_t).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Inflate as inflate/inflateBack
  participant Inffast as inffast_tpl
  participant Chunkset as CHUNKSIZE()

  Caller->>Inflate: initialize/execute
  note over Inflate: state->chunksize removed
  Inflate->>Inffast: decode and copy
  Inffast->>Chunkset: CHUNKSIZE()
  Chunkset-->>Inffast: sizeof(chunk_t)
  alt dist >= CHUNKSIZE()
    Inffast->>Inffast: use CHUNK-copy path
  else dist < CHUNKSIZE()
    Inffast->>Inffast: use short copy path
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

optimization, cleanup

Suggested reviewers

  • Dead2
  • KungFuJesus

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly conveys the primary change of the PR by indicating that the CHUNKSIZE function has been inlined, matching the conversion in chunkset_tpl.h and the removal of related API stubs. It is concise, precise, and gives a teammate scanning history a clear understanding of the main refactoring being applied.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d7db7a and ae40786.

📒 Files selected for processing (13)
  • arch/arm/arm_functions.h (0 hunks)
  • arch/generic/generic_functions.h (0 hunks)
  • arch/loongarch/loongarch_functions.h (0 hunks)
  • arch/power/power_functions.h (0 hunks)
  • arch/riscv/riscv_functions.h (0 hunks)
  • arch/x86/x86_functions.h (0 hunks)
  • chunkset_tpl.h (1 hunks)
  • functable.c (0 hunks)
  • functable.h (0 hunks)
  • infback.c (0 hunks)
  • inffast_tpl.h (1 hunks)
  • inflate.c (0 hunks)
  • inflate.h (0 hunks)
💤 Files with no reviewable changes (11)
  • inflate.h
  • arch/arm/arm_functions.h
  • arch/riscv/riscv_functions.h
  • arch/generic/generic_functions.h
  • functable.c
  • functable.h
  • arch/x86/x86_functions.h
  • arch/power/power_functions.h
  • infback.c
  • inflate.c
  • arch/loongarch/loongarch_functions.h
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2024-10-04T03:17:24.773Z
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1778
File: chunkset_tpl.h:164-165
Timestamp: 2024-10-04T03:17:24.773Z
Learning: In `chunkset_tpl.h`, using `goto` in the `CHUNKMEMSET` function aids the compiler in inlining the function, so it should be retained.

Applied to files:

  • chunkset_tpl.h
  • inffast_tpl.h
📚 Learning: 2024-10-17T03:29:32.723Z
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1802
File: chunkset_tpl.h:79-84
Timestamp: 2024-10-17T03:29:32.723Z
Learning: In `chunkset_tpl.h`, within the `GET_CHUNK_MAG` function, `chunk_rem` is always less than or equal to the size of `chunk_t`, ensuring that casting from `uint64_t` to `uint32_t` does not lead to data loss.

Applied to files:

  • chunkset_tpl.h
📚 Learning: 2024-10-07T22:00:02.180Z
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1802
File: chunkset_tpl.h:92-108
Timestamp: 2024-10-07T22:00:02.180Z
Learning: In the `HALFCHUNKCOPY` function in `chunkset_tpl.h`, the `len` parameter is always bound to be small by the callers and will not exceed the limits of an `int32_t`.

Applied to files:

  • chunkset_tpl.h
📚 Learning: 2024-10-29T02:22:52.846Z
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1805
File: inffast_tpl.h:257-262
Timestamp: 2024-10-29T02:22:52.846Z
Learning: In `inffast_tpl.h`, when AVX512 is enabled, the branch involving `chunkcopy_safe` is intentionally eliminated to optimize performance.

Applied to files:

  • inffast_tpl.h
📚 Learning: 2024-10-07T21:23:13.401Z
Learnt from: KungFuJesus
PR: zlib-ng/zlib-ng#1802
File: chunkset_tpl.h:135-135
Timestamp: 2024-10-07T21:23:13.401Z
Learning: In the `CHUNKMEMSET` function within `chunkset_tpl.h`, extra bounds checks are avoided to maintain performance in critical code sections. Branching is minimized to prevent negative impacts on speculative execution. The variable `len` is enforced with `safelen` early on.

Applied to files:

  • inffast_tpl.h
📚 Learning: 2025-06-10T07:38:03.297Z
Learnt from: mtl1979
PR: zlib-ng/zlib-ng#1921
File: arch/riscv/chunkset_rvv.c:103-104
Timestamp: 2025-06-10T07:38:03.297Z
Learning: In RISC-V chunkset_rvv.c CHUNKCOPY function, when dist < sizeof(chunk_t), the vl variable intentionally becomes 0, causing the while loop to not execute. This is correct behavior because copying full chunks is not safe when the distance is smaller than chunk size, and the function appropriately falls back to memcpy for handling remaining bytes.

Applied to files:

  • inffast_tpl.h
🧬 Code graph analysis (2)
chunkset_tpl.h (1)
functable.c (1)
  • chunksize_stub (354-357)
inffast_tpl.h (1)
chunkset_tpl.h (1)
  • CHUNKSIZE (9-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (87)
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: macOS GCC Symbol Prefix & Compat (ARM64)
  • GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v141 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v141 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
  • GitHub Check: Lint
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v141 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
  • GitHub Check: Lint
  • GitHub Check: macOS GCC UBSAN (ARM64)
  • GitHub Check: Windows MSVC 2022 v140 Win32
  • GitHub Check: Windows MSVC 2022 v142 Win64
  • GitHub Check: Windows MSVC 2022 v141 Win64
  • GitHub Check: Windows MSVC 2022 v142 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win32
  • GitHub Check: Windows MSVC 2022 v143 Win64 Native Instructions (AVX)
  • GitHub Check: EL9 Clang S390X DFLTCC MSAN
  • GitHub Check: Ubuntu GCC AARCH64 No NEON UBSAN
🔇 Additional comments (2)
chunkset_tpl.h (1)

9-11: LGTM! Improved type correctness and performance.

Converting CHUNKSIZE to a static inline function with size_t return type is more type-correct (sizeof returns size_t) and eliminates runtime indirection through the functable. The change is safe since sizeof(chunk_t) is always small and well within the range of any integer type used in comparisons.

inffast_tpl.h (1)

275-275: LGTM! Eliminates runtime indirection.

Replacing state->chunksize with CHUNKSIZE() removes a runtime memory load and replaces it with a compile-time constant, improving performance while maintaining identical logic. The condition correctly determines when to use CHUNKCOPY (for large distances) vs CHUNKMEMSET (for pattern repetition).


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Member

@nmoinvaz nmoinvaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codecov
Copy link
Copy Markdown

codecov bot commented Oct 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.53%. Comparing base (7d7db7a) to head (ae40786).
⚠️ Report is 1 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1974      +/-   ##
===========================================
+ Coverage    80.86%   81.53%   +0.67%     
===========================================
  Files          161      161              
  Lines        13741    13726      -15     
  Branches      3117     3113       -4     
===========================================
+ Hits         11111    11192      +81     
+ Misses        1603     1475     -128     
- Partials      1027     1059      +32     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Dead2 Dead2 merged commit 2801a6f into zlib-ng:develop Oct 7, 2025
156 checks passed
@ccawley2011 ccawley2011 deleted the chunksize-inline branch October 7, 2025 13:11
@Dead2 Dead2 mentioned this pull request Nov 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants