Unroll some of the adler checksum for avx2 #1949
On a meager U-class Haswell chip:
Before
After

On HEDT-class hardware (Cascade Lake X):
Before
After
Similar to what's done for vmx, avx512, and sse4, let's unroll some of this checksum since it's a commutative checksum. We take advantage of ILP and do more intermediate sums before rolling them back together for the finalization of the checksum.
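The idea of keeping several independent intermediate sums and folding them together at the end can be illustrated with a scalar sketch. This is not the PR's AVX2 code: the function names `adler32_simple` and `adler32_unrolled`, the `CHUNK` size, and the two-accumulator split are all assumptions chosen for illustration. It relies on the fact that, over a chunk of `n` bytes, a byte at index `i` contributes once to the low sum and `n - i` times to the high sum, so the partial sums can be accumulated in any order.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u /* largest prime below 2^16 */
#define CHUNK 4096u       /* hypothetical chunk size for this sketch */

/* Reference version: every step depends on the previous one. */
static uint32_t adler32_simple(uint32_t adler, const uint8_t *buf, size_t len) {
    uint32_t a = adler & 0xffff, b = adler >> 16;
    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER_BASE;
        b = (b + a) % ADLER_BASE;
    }
    return (b << 16) | a;
}

/* Unrolled by two: even and odd bytes feed separate accumulators,
 * so the two dependency chains can execute in parallel (ILP). */
static uint32_t adler32_unrolled(uint32_t adler, const uint8_t *buf, size_t len) {
    uint64_t a = adler & 0xffff, b = adler >> 16;
    while (len > 0) {
        size_t n = len < CHUNK ? len : CHUNK;
        uint64_t s1_0 = 0, s1_1 = 0; /* plain byte sums */
        uint64_t s2_0 = 0, s2_1 = 0; /* position-weighted byte sums */
        size_t i = 0;
        for (; i + 1 < n; i += 2) {
            s1_0 += buf[i];
            s2_0 += (uint64_t)(n - i) * buf[i];
            s1_1 += buf[i + 1];
            s2_1 += (uint64_t)(n - i - 1) * buf[i + 1];
        }
        if (i < n) { /* odd-length tail */
            s1_0 += buf[i];
            s2_0 += (uint64_t)(n - i) * buf[i];
        }
        /* Roll the intermediate sums back together; b uses the old a. */
        b = (b + (uint64_t)n * a + s2_0 + s2_1) % ADLER_BASE;
        a = (a + s1_0 + s1_1) % ADLER_BASE;
        buf += n;
        len -= n;
    }
    return (uint32_t)((b << 16) | a);
}
```

Both functions should return the same value for any input. A SIMD version applies the same recombination across vector lanes instead of two scalar accumulators.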
d6bb724 to 352fcec
Nice work!
Tested on i7-11700K, compiled without AVX512* to force the AVX2 code path. Deflatebench differences (using minideflate) were negligible and within measurement error.

Develop
PR

No idea how the PR one results in a smaller compiled code size. compress_bench seems not to have changed much at all.