The ORA_HASH function in Oracle SQL is a powerful function used to compute a consistent "hash value" (a number) for any given expression.
Its main purpose is to help you divide data into random, evenly distributed groups (or "buckets"). This is extremely useful for generating a random sample of your data (e.g., "get a random 10% of all users") or for dividing large data sets for parallel processing.
What is the ORA_HASH Function in Oracle?
The ORA_HASH(expr, [max_bucket], [seed_value]) function takes an expression (like a user ID, a name, or any combination of columns) and generates a large number.
The key feature is the max_bucket argument. By setting this, you tell the function to return a number between 0 and max_bucket. This effectively and (mostly) evenly distributes all your rows into a set number of buckets.
expr: The value you want to hash (e.g.,user_id).max_bucket: The highest bucket number you want. If you setmax_bucketto9, the function will return a number between0and9, creating 10 buckets.seed_value(Optional): A number you can change to get a different set of hash results for the same data.
ORA_HASH Function Syntax
The syntax for ORA_HASH is as follows:
ORA_HASH(expr [, max_bucket] [, seed_value])
Let's break that down:
expr: The expression or column to hash.[max_bucket](Optional): An integer from 0 to 4294967295 that defines the highest bucket number to return. This is the most important argument for sampling.[seed_value](Optional): An integer to "seed" the hash. Changing this changes the results.
Oracle ORA_HASH Function Examples
Here are two practical examples of how to use ORA_HASH.
Example 1: Creating a 10% Random Sample using ORA_HASH
This is the most common use case. If you want a random 10% sample of your data, you can create 10 buckets (numbered 0-9) and then select all the rows from just one of those buckets (e.g., bucket 0).
Query: (This query hashes the customer_id and puts each row into one of 10 buckets (0-9). It then selects only the rows that landed in bucket 0.)
SELECT
customer_id,
cust_last_name
FROM
customers
WHERE
ORA_HASH(customer_id, 9) = 0; -- 9 is the max bucket, giving 10 total buckets
Result: (The query will return approximately 10% of all customers, randomly distributed.)
Example 2: Distributing Data into 4 Groups using ORA_HASH
This example shows how to divide data into a set number of groups. We'll use ORA_HASH to assign a "Group Number" from 0 to 3 to each employee.
Query:
-- We set max_bucket to 3, which creates 4 buckets: 0, 1, 2, and 3.
SELECT
last_name,
ORA_HASH(employee_id, 3) AS "Group_Number"
FROM
employees
WHERE ROWNUM <= 8; -- Just show the first 8 for this example
Result: (Each employee is randomly but consistently assigned to a group between 0 and 3.)
LAST_NAME Group_Number
------------------------- ------------
King 1
Kochhar 3
De Haan 0
Hunold 1
Ernst 0
Lorentz 0
Pataballa 1
Abel 2