Closed
Description
We have a lot of internal logic for hashing inputs, and it would be useful to expose some of it to users (e.g. https://stackoverflow.com/questions/72177022/how-to-get-hash-of-string-column-in-polars-or-pyarrow).
The HashBatch method in key_hash.h (not quite merged, but close) is likely to be the most performant option. However, it sacrifices some uniqueness of hashes for the sake of performance, so we should make sure to document these trade-offs.
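To make the request concrete, here is a minimal sketch of what a user-facing, element-wise hash over a string column might look like. It is not Arrow's HashBatch and does not use any Arrow API: the function name `hash_string_column` is hypothetical, the column is modeled as a plain Python list with `None` for nulls, and BLAKE2b truncated to 64 bits is an assumed stand-in for whatever hash the kernel would actually use.

```python
import hashlib
from typing import List, Optional


def hash_string_column(values: List[Optional[str]]) -> List[Optional[int]]:
    """Return one unsigned 64-bit hash per element; nulls (None) stay null.

    Hypothetical helper for illustration only, not an Arrow function.
    """
    hashes: List[Optional[int]] = []
    for value in values:
        if value is None:
            # Propagate nulls instead of hashing them.
            hashes.append(None)
            continue
        # BLAKE2b with an 8-byte digest yields a 64-bit value per string.
        digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
        hashes.append(int.from_bytes(digest, "little"))
    return hashes


column = ["apple", None, "banana", "apple"]
hashed = hash_string_column(column)
```

Equal inputs always map to equal hashes, but a 64-bit output can still collide for distinct inputs. A faster, non-cryptographic hash such as the one proposed here would make that kind of trade-off more pronounced, which is why it should be documented.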
Reporter: Weston Pace / @westonpace
Related issues:
- [C++][Compute] Add scalar_hash function (duplicates)
Note: This issue was originally created as ARROW-16513. Please see the migration documentation for further details.