@jqnatividad
Collaborator

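BLAKE3 is a valid spdx::algorithm. Unlike SHA-256, whose Merkle-Damgård construction processes a single input strictly block by block, BLAKE3 hashes its input as a tree of chunks, so one file can be hashed in parallel across threads.

On large files and multi-core machines, this makes it much faster than SHA-256.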
https://crates.io/crates/blake3

see DOI-DO/dcat-us#225

much much faster than single threaded sha256
including latest qsv-tuned csv fork with more inlines
to be initially used with `describegpt`, and possibly for stats_cache
@jqnatividad requested a review from Copilot on October 25, 2025, 11:17
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds a new BLAKE3 file hashing helper function to leverage BLAKE3's parallelizable hashing capabilities, providing significantly faster performance compared to SHA256 for file hashing operations. The implementation uses memory mapping and Rayon-based parallel processing for optimal performance.

Key Changes

  • Added hash_blake3_file() function that uses BLAKE3's optimized memory-mapped and parallel hashing
  • Added blake3 dependency with rayon and mmap features enabled
  • Included comprehensive test coverage with correctness tests and benchmark tests for various file sizes

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `src/util.rs` | Added `hash_blake3_file()` function with extensive tests for correctness and performance benchmarking |
| `Cargo.toml` | Added `blake3` dependency with `rayon` and `mmap` features |

@jqnatividad merged commit 3dbe53c into master on Oct 25, 2025
16 of 17 checks passed
@jqnatividad deleted the blake3-hashing-helper branch on October 25, 2025, 12:07