feat: expose skip_sha256 parameter in Python upload API #705
Merged
fb9dd90 to f9aeea5
…rings

Move the bool-to-Sha256Policy conversion to the Python boundary (hf_xet) and accept Sha256Policy directly in data_client and clean_file. This eliminates duplicated conversion logic, adds validation in the Rust layer, and makes clean_file consistent with clean_bytes.

- Add Clone/Copy derives and from_skip/from_hex helpers on Sha256Policy
- Change upload_bytes_async to accept Sha256Policy instead of bool
- Change upload_async to accept Vec<Sha256Policy> instead of Option<Vec<String>> + bool
- Change clean_file to accept Sha256Policy instead of impl AsRef<str>
- Update all callers: hf_xet, git_xet, xet_pkg, migration_tool, test_utils
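The conversion this commit describes can be modeled in a few lines of Python. This is only an illustrative sketch, not the hf_xet or Rust code: the `Sha256Policy` dataclass, its `kind` values, and `resolve_policies` are hypothetical stand-ins, and both the 64-character hex check and the rule that supplied hashes take precedence over `skip_sha256` are assumptions rather than behavior confirmed by this PR.

```python
import string
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Sha256Policy:
    """Hypothetical Python model of a per-file SHA-256 policy."""

    kind: str  # one of "compute", "skip", "provided" (assumed variant names)
    hex_digest: Optional[str] = None

    @classmethod
    def from_skip(cls, skip: bool) -> "Sha256Policy":
        # True means "do not hash"; False means "hash as usual".
        return cls("skip") if skip else cls("compute")

    @classmethod
    def from_hex(cls, hex_digest: str) -> "Sha256Policy":
        # Assumed validation: a SHA-256 digest is 64 hexadecimal characters.
        if len(hex_digest) != 64 or any(c not in string.hexdigits for c in hex_digest):
            raise ValueError(f"not a SHA-256 hex digest: {hex_digest!r}")
        return cls("provided", hex_digest.lower())


def resolve_policies(
    n_files: int,
    skip_sha256: bool = False,
    sha256s: Optional[List[str]] = None,
) -> List[Sha256Policy]:
    """Turn the two user-facing arguments into one policy per file."""
    if sha256s is not None:
        # Assumption: supplied hashes win, and there must be one per file.
        if len(sha256s) != n_files:
            raise ValueError("sha256s must contain one entry per file")
        return [Sha256Policy.from_hex(h) for h in sha256s]
    return [Sha256Policy.from_skip(skip_sha256) for _ in range(n_files)]
```

Doing this once at the boundary means everything downstream sees a single per-file type instead of a bool plus an optional list of strings, which is the duplication the commit message says it removes.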
e8f063c to 3524451
Mirror the sha256s support from upload_files() to upload_bytes(), allowing callers to pass pre-computed SHA-256 hashes for byte uploads. Also aligns upload_bytes_async to accept Vec<Sha256Policy> (per-file) like upload_async.
Collaborator
Thank you! Will unlock huggingface/huggingface_hub#3900 and huggingface/huggingface_hub#3876
seanses approved these changes Mar 12, 2026
Summary
Add skip_sha256 and sha256s parameters to upload_bytes() Python binding for per-file SHA-256 policies:

- skip_sha256: bool = False - Skip SHA-256 computation entirely (sets Sha256Policy::Skip)
- sha256s: Optional[List[str]] = None - Provide pre-computed SHA-256 hashes (companion to existing parameter on upload_files())

Changes

Python binding changes:

- skip_sha256 + sha256s params to upload_bytes() / upload_files()

Internal refactoring:

- Clone/Copy derives + from_skip() / from_hex() helpers to Sha256Policy
- upload_bytes_async, upload_async, clean_file to use Vec<Sha256Policy>
- git_xet, xet_pkg, migration tool, tests

Motivation

huggingface_hub already knows whether SHA-256 is required. This change enables skipping expensive computation when unnecessary, or passing pre-computed hashes for bulk operations.

Companion to #678.
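As a rough illustration of the pre-computed path, a caller might prepare the `sha256s` list with the standard library before uploading. The helper name `precompute_sha256s` is hypothetical, and the `upload_bytes()` call itself is left out because its other arguments are not shown on this page.

```python
import hashlib
from typing import List


def precompute_sha256s(payloads: List[bytes]) -> List[str]:
    """Return one lowercase hex SHA-256 digest per payload, in order."""
    return [hashlib.sha256(payload).hexdigest() for payload in payloads]


payloads = [b"first blob", b"second blob"]
sha256s = precompute_sha256s(payloads)

# `sha256s` now lines up index-for-index with `payloads`, which is the
# shape the summary describes for the sha256s parameter (one hash per file).
```

If the hashes are not needed at all, the summary's `skip_sha256=True` would presumably be the alternative to computing this list.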