Minor optimization of insert_string#1951

Merged
Dead2 merged 1 commit into develop from insert_string_opt on Aug 20, 2025

Conversation

@Dead2 (Member) commented Aug 16, 2025

Remove the s-> indirection penalty from the loop in insert_string for a very minor speedup in compression.
Also some minor code cleanup.

Summary by CodeRabbit

  • Refactor
    • Internal performance tweaks to string processing (reduced unnecessary initialization and streamlined local variable usage).
    • Micro-optimizations to reduce memory accesses and improve runtime efficiency during intensive operations.
    • No changes to public APIs or observable behavior; backward compatible.

@coderabbitai (bot, Contributor) commented Aug 16, 2025

Walkthrough

Removed the initialization of the hash variable declared by a macro in insert_string.c, and introduced local aliases in insert_string_tpl.h (pointers headp and prevp for head and prev, plus a cached copy of w_mask), replacing direct s-> accesses with these locals. No public APIs or semantics were changed.
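As a rough illustration of the aliasing described in the walkthrough, the C sketch below contrasts a loop that reads head, prev, and w_mask through s-> on every iteration with one that loads them into locals first. The mini_state struct, mini_hash, and the insert_via_* and variants_agree functions are hypothetical stand-ins written for this example, not zlib-ng's actual code; only the names headp, prevp, and w_mask are taken from the PR.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t Pos;

/* Hypothetical, cut-down stand-in for deflate_state: only the fields
 * this sketch needs. */
typedef struct {
    const uint8_t *window;
    Pos *head;
    Pos *prev;
    unsigned int w_mask;
} mini_state;

#define MINI_HASH_BITS 8u
#define MINI_HASH_SIZE (1u << MINI_HASH_BITS)

/* Illustrative multiplicative hash of 4 bytes (assumption, not zlib-ng's). */
static uint32_t mini_hash(const uint8_t *p) {
    uint32_t val;
    memcpy(&val, p, sizeof(val));
    return (val * 2654435761u) >> (32 - MINI_HASH_BITS);
}

/* Baseline: every iteration goes through s-> for head, prev and w_mask. */
void insert_via_state(mini_state *s, uint32_t str, uint32_t count) {
    assert(s != NULL);
    for (uint32_t idx = str; idx < str + count; idx++) {
        uint32_t hm = mini_hash(s->window + idx);
        Pos head = s->head[hm];
        if (head != idx) {
            s->prev[idx & s->w_mask] = head;
            s->head[hm] = (Pos)idx;
        }
    }
}

/* Same logic, with the struct members loaded once before the loop. */
void insert_via_locals(mini_state *s, uint32_t str, uint32_t count) {
    assert(s != NULL);
    const uint8_t *window = s->window;
    Pos *headp = s->head;
    Pos *prevp = s->prev;
    const unsigned int w_mask = s->w_mask;

    for (uint32_t idx = str; idx < str + count; idx++) {
        uint32_t hm = mini_hash(window + idx);
        Pos head = headp[hm];
        if (head != idx) {
            prevp[idx & w_mask] = head;
            headp[hm] = (Pos)idx;
        }
    }
}

/* Runs both variants on the same input; returns 1 if the resulting
 * head and prev tables are identical, 0 otherwise. */
int variants_agree(void) {
    static const uint8_t data[] = "abcabcabcabcabcabc-abcabcabc";
    Pos head_a[MINI_HASH_SIZE] = {0}, prev_a[32] = {0};
    Pos head_b[MINI_HASH_SIZE] = {0}, prev_b[32] = {0};
    mini_state a = { data, head_a, prev_a, 31u };
    mini_state b = { data, head_b, prev_b, 31u };
    uint32_t count = (uint32_t)sizeof(data) - 4; /* keep 4-byte reads in bounds */

    insert_via_state(&a, 0, count);
    insert_via_locals(&b, 0, count);
    return memcmp(head_a, head_b, sizeof(head_a)) == 0 &&
           memcmp(prev_a, prev_b, sizeof(prev_a)) == 0;
}
```

Both variants are meant to produce identical tables; whether the second one is actually faster depends on the compiler and target, which is why the PR backs the change with benchmarks rather than reasoning alone.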

Changes

| Cohort / File(s) | Summary of Changes |
|---|---|
| Hash calc var declaration (insert_string.c) | Changed HASH_CALC_VAR_INIT from uint32_t h = 0 to uint32_t h (removed initialization). |
| Hash-chain insertion aliasing (insert_string_tpl.h) | In QUICK_INSERT_STRING reordered a local declaration; in INSERT_STRING added local aliases headp, prevp, w_mask and replaced s->head, s->prev, s->w_mask accesses with these aliases; logic unchanged. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

📥 Commits

Reviewing files that changed from the base of the PR and between 190df18 and e079160.

📒 Files selected for processing (2)
  • insert_string.c (1 hunks)
  • insert_string_tpl.h (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • insert_string_tpl.h
  • insert_string.c

@coderabbitai (bot, Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
insert_string_tpl.h (2)

59-76: Optional: Mirror the local pointer/mask caching in QUICK_INSERT_STRING for consistency

This function still dereferences s->head, s->prev, and s->w_mask directly. While it is not in a loop, it is still on a hot path, and aligning it with the pattern you used in INSERT_STRING may tighten the generated code slightly and improve consistency.

Apply this refactor within the shown range:

 Z_INTERNAL Pos QUICK_INSERT_STRING(deflate_state *const s, uint32_t str) {
     uint8_t *strstart = s->window + str + HASH_CALC_OFFSET;
+    Pos *headp = s->head;
+    Pos *prevp = s->prev;
+    unsigned int w_mask = s->w_mask;
     uint32_t val, hm;
     Pos head;

     HASH_CALC_VAR_INIT;
     HASH_CALC_READ;
     HASH_CALC(HASH_CALC_VAR, val);
     HASH_CALC_VAR &= HASH_CALC_MASK;
     hm = HASH_CALC_VAR;

-    head = s->head[hm];
+    head = headp[hm];
     if (LIKELY(head != str)) {
-        s->prev[str & s->w_mask] = head;
-        s->head[hm] = (Pos)str;
+        prevp[str & w_mask] = head;
+        headp[hm] = (Pos)str;
     }
     return head;
 }

89-92: Nit: Make w_mask const to express intent

w_mask is read-only. Declaring it const documents intent and can sometimes help the compiler.

-    unsigned int w_mask = s->w_mask;
+    const unsigned int w_mask = s->w_mask;

📥 Commits

Reviewing files that changed from the base of the PR and between 3a52db3 and 9d3ca79.

📒 Files selected for processing (2)
  • insert_string.c (1 hunks)
  • insert_string_tpl.h (3 hunks)
🔇 Additional comments (2)
insert_string.c (1)

15-15: LGTM: Dropping the zero-init avoids a redundant store

HASH_CALC_VAR_INIT no longer initializes h to 0. In this TU, h is always assigned by HASH_CALC(HASH_CALC_VAR, val) before use, so this is safe and removes a needless instruction on hot paths.
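The reasoning above (declare without an initializer, then always assign before the first read) can be sketched as follows. The macro bodies here are simplified, hypothetical stand-ins modelled on the macro names mentioned in this PR, and hash_at is an invented wrapper; the real definitions in insert_string.c may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical macro bodies; only the names come from the PR. */
#define HASH_CALC_VAR        h
#define HASH_CALC_VAR_INIT   uint32_t h            /* previously: uint32_t h = 0 */
#define HASH_CALC_READ       memcpy(&val, strstart, sizeof(val))
#define HASH_CALC(h, val)    h = ((val) * 2654435761u) >> 16
#define HASH_CALC_MASK       0xFFFFu

/* Hypothetical wrapper showing the order of declaration, write and read. */
uint32_t hash_at(const uint8_t *strstart) {
    uint32_t val;
    assert(strstart != NULL);

    HASH_CALC_VAR_INIT;               /* declaration only, no store emitted */
    HASH_CALC_READ;                   /* val is written here */
    HASH_CALC(HASH_CALC_VAR, val);    /* h is written here, before any read */
    HASH_CALC_VAR &= HASH_CALC_MASK;  /* first read of h comes after the write */
    return HASH_CALC_VAR;
}
```

Dropping the initializer is only safe while every use of the macro set assigns h unconditionally before reading it; a HASH_CALC variant that accumulates into h (for example with ^= or +=) would need the zero initialization back.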

insert_string_tpl.h (1)

102-106: LGTM: Local aliasing for head/prev/mask removes s-> redirection in the loop

The switch to headp, prevp, and w_mask inside the loop is correct and functionally identical. This should slightly reduce pointer chasing and improve instruction cache friendliness in this hot path.

@codecov (bot) commented Aug 16, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@3a52db3). Learn more about missing BASE report.
⚠️ Report is 3 commits behind head on develop.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #1951   +/-   ##
==========================================
  Coverage           ?   81.93%           
==========================================
  Files              ?      162           
  Lines              ?    13926           
  Branches           ?     3122           
==========================================
  Hits               ?    11410           
  Misses             ?     1518           
  Partials           ?      998           

@Dead2 force-pushed the insert_string_opt branch from 9d3ca79 to 190df18 on August 16, 2025 22:18
@Dead2 (Member, Author) commented Aug 16, 2025

Develop 14 Aug 2025

 Level   Comp   Comptime min/avg/max/stddev  Decomptime min/avg/max/stddev  Compressed size
 1     54.185%  0.0783/0.0786/0.0787/0.0001    0.0302/0.0310/0.0310/0.0002        8,526,745
 2     43.871%  0.1284/0.1291/0.1293/0.0002    0.0304/0.0305/0.0305/0.0000        6,903,702
 3     42.388%  0.1561/0.1562/0.1564/0.0001    0.0292/0.0293/0.0293/0.0000        6,670,239
 4     41.647%  0.1795/0.1799/0.1801/0.0002    0.0285/0.0286/0.0286/0.0000        6,553,746
 5     41.216%  0.1945/0.1949/0.1953/0.0002    0.0283/0.0284/0.0285/0.0000        6,485,936
 6     41.038%  0.2402/0.2407/0.2411/0.0002    0.0280/0.0281/0.0281/0.0000        6,457,827
 7     40.778%  0.3374/0.3378/0.3381/0.0002    0.0283/0.0283/0.0284/0.0000        6,416,941
 8     40.704%  0.4448/0.4453/0.4457/0.0002    0.0283/0.0284/0.0284/0.0000        6,405,249
 9     40.409%  0.5249/0.5256/0.5260/0.0002    0.0276/0.0276/0.0276/0.0000        6,358,951

 avg1  42.915%                       0.2542                         0.0289
 tot                                68.6448                         7.8051       60,779,336

   text    data     bss     dec     hex filename
 159302    1344       8  160654   2738e libz-ng.so.2

compress_bench/compress_bench/1                 3759 ns         3759 ns       745247
compress_bench/compress_bench/8                 4012 ns         4012 ns       698459
compress_bench/compress_bench/16                4204 ns         4204 ns       668137
compress_bench/compress_bench/32                4527 ns         4527 ns       619999
compress_bench/compress_bench/64                4863 ns         4863 ns       575942
compress_bench/compress_bench/512               4898 ns         4898 ns       572839
compress_bench/compress_bench/4096              5445 ns         5445 ns       512955
compress_bench/compress_bench/32768            10006 ns        10006 ns       280370

PR opt insert_string

 Level   Comp   Comptime min/avg/max/stddev  Decomptime min/avg/max/stddev  Compressed size
 1     54.185%  0.0783/0.0785/0.0786/0.0001    0.0302/0.0310/0.0310/0.0002        8,526,745
 2     43.871%  0.1288/0.1291/0.1294/0.0002    0.0304/0.0305/0.0305/0.0000        6,903,702
 3     42.388%  0.1555/0.1559/0.1561/0.0002    0.0292/0.0293/0.0293/0.0000        6,670,239
 4     41.647%  0.1786/0.1792/0.1793/0.0002    0.0285/0.0286/0.0286/0.0000        6,553,746
 5     41.216%  0.1937/0.1941/0.1944/0.0002    0.0284/0.0284/0.0285/0.0000        6,485,936
 6     41.038%  0.2396/0.2401/0.2405/0.0002    0.0281/0.0281/0.0282/0.0000        6,457,827
 7     40.778%  0.3363/0.3368/0.3371/0.0002    0.0283/0.0283/0.0284/0.0000        6,416,941
 8     40.704%  0.4440/0.4446/0.4449/0.0002    0.0283/0.0284/0.0284/0.0000        6,405,249
 9     40.409%  0.5246/0.5253/0.5256/0.0002    0.0276/0.0276/0.0276/0.0000        6,358,951

 avg1  42.915%                       0.2537                         0.0289
 tot                                68.5058                         7.8050       60,779,336

   text    data     bss     dec     hex filename
 159310    1344       8  160662   27396 libz-ng.so.2
 
compress_bench/compress_bench/1                 3736 ns         3736 ns       749449
compress_bench/compress_bench/8                 3987 ns         3987 ns       701651
compress_bench/compress_bench/16                4157 ns         4158 ns       672641
compress_bench/compress_bench/32                4485 ns         4485 ns       623899
compress_bench/compress_bench/64                4806 ns         4806 ns       582874
compress_bench/compress_bench/512               4849 ns         4849 ns       577258
compress_bench/compress_bench/4096              5354 ns         5354 ns       521563
compress_bench/compress_bench/32768             9852 ns         9852 ns       283820

Deflatebench shows about 0.2% speedup.
compress_bench shows about 0.6% to 1.7% speedup.
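For readers who want to re-derive these percentages from the tables, a minimal C sketch follows. The function names speedup_percent and print_compress_bench_speedups are made up for this example; the numbers are copied from the compress_bench rows above.

```c
#include <assert.h>
#include <stdio.h>

/* Relative speedup in percent of `after` over `before`.
 * Both arguments are times, so smaller is better. */
double speedup_percent(double before, double after) {
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}

/* Prints the per-size speedup for the compress_bench rows quoted above. */
void print_compress_bench_speedups(void) {
    static const int sizes[] = {1, 8, 16, 32, 64, 512, 4096, 32768};
    static const double develop_ns[] = {3759, 4012, 4204, 4527, 4863, 4898, 5445, 10006};
    static const double pr_ns[]      = {3736, 3987, 4157, 4485, 4806, 4849, 5354, 9852};

    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
        printf("%6d: %.2f%%\n", sizes[i],
               speedup_percent(develop_ns[i], pr_ns[i]));
    }
}
```

Applying speedup_percent to the deflatebench totals (68.6448 before, 68.5058 after) gives roughly 0.2%, in line with the figure quoted above.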

@nmoinvaz (Member) left a comment

LGTM. It might be a good idea to add a comment about the indirection, like you did in your most recent PR.

@Dead2 force-pushed the insert_string_opt branch from 190df18 to e079160 on August 18, 2025 20:54
@Dead2 merged commit 65eec04 into develop on Aug 20, 2025
285 of 292 checks passed
@Dead2 deleted the insert_string_opt branch on August 23, 2025 17:22
@Dead2 mentioned this pull request on Nov 5, 2025