public class KeySelector.GenericWordSelector extends Object implements KeySelector.WordSelector
This implementation uses a variable stride algorithm to extract k word positions from key data. It is suitable for general values of m (filter size) and k (number of hash functions), subject to the limits listed under Constraints.
Algorithm overview:
- Uses stride of (m-5) bits between word positions
- Processes key data in multi-byte chunks as needed
- Handles cases where words span across byte boundaries
- Employs lookup tables for efficient bit manipulation
The stride of (m-5) bits spreads the word positions across the key data and keeps them from overlapping with the 5-bit stride used by the bit selector.
Constraints:
- Maximum supported values: m=23, k=11 for 32-byte keys
- Formula constraint: ((5k + (k-1)(m-5)) / 8) + 2 ≤ keySizeInBytes
- Values of m or k that violate the formula may cause ArrayIndexOutOfBoundsException
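The formula constraint can be checked before choosing m and k. The following is a minimal sketch, not part of this class: the names WordSelectorBounds, requiredKeyBytes and fits are hypothetical, and the division by 8 is assumed to be integer division.

```java
// Hypothetical helper, not part of KeySelector: evaluates the documented
// constraint ((5k + (k-1)(m-5)) / 8) + 2 <= keySizeInBytes.
final class WordSelectorBounds {

    private WordSelectorBounds() {
    }

    // Number of key bytes the formula requires for the given m and k.
    // Assumption: the division by 8 truncates (integer division).
    static int requiredKeyBytes(int m, int k) {
        int bits = 5 * k + (k - 1) * (m - 5);
        return bits / 8 + 2;
    }

    // True when a key of keySizeInBytes bytes satisfies the constraint.
    static boolean fits(int m, int k, int keySizeInBytes) {
        return requiredKeyBytes(m, k) <= keySizeInBytes;
    }
}
```

Under these assumptions, m=23 and k=11 require 31 bytes and so fit a 32-byte key, while m=23 and k=12 require 34 bytes and do not.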
Thread Safety: This implementation is thread-safe as it operates only on method parameters and local variables.
| Constructor and Description |
|---|
| GenericWordSelector(): Default constructor. |
| Modifier and Type | Method and Description |
|---|---|
| void | getWordSelectors(byte[] b, int offset, int length, int[] wordOffset): Extracts k word offsets from key data using a variable stride algorithm. |
public void getWordSelectors(byte[] b, int offset, int length, int[] wordOffset)
This method processes the key data to calculate word positions for Bloom filter operations. Each word offset represents the starting bit position for one of the k hash functions.
Algorithm details:
- Calculates stride as (m-5) bits between word positions
- Processes key data in chunks spanning multiple bytes when needed
- Uses bit manipulation to extract word-aligned bit groups
- Handles edge cases where words cross byte boundaries
Specified by: getWordSelectors in interface KeySelector.WordSelector
Parameters:
- b - key data as byte array
- offset - starting position within key array
- length - number of bytes to process
- wordOffset - output array of length k to store calculated word offsets
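The stride-based extraction described under Algorithm details can be illustrated with a simple bit reader. This is a sketch under stated assumptions, not the implementation of this class. It assumes that each word selector is an (m-5)-bit field, that fields are packed back to back starting at a caller-supplied bit position, and that bits are read most significant first. The names BitFieldSketch, readBits and fillWordOffsets are hypothetical. It reads one bit at a time for clarity, whereas the documented implementation works on multi-byte chunks with lookup tables.

```java
// Illustrative only: extracts fixed-width bit fields at a fixed stride.
// The layout (MSB-first, field width equal to the stride) is an assumption.
final class BitFieldSketch {

    private BitFieldSketch() {
    }

    // Reads `width` bits starting at absolute bit position `bitPos`,
    // most significant bit first. Fields may cross byte boundaries.
    static int readBits(byte[] b, int bitPos, int width) {
        int value = 0;
        for (int i = 0; i < width; i++) {
            int pos = bitPos + i;
            // Select the byte, then the bit inside it (bit 7 is read first).
            int bit = (b[pos >>> 3] >>> (7 - (pos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    // Fills wordOffset with wordOffset.length fields of (m-5) bits each,
    // spaced (m-5) bits apart, starting at bit position startBit.
    static void fillWordOffsets(byte[] b, int startBit, int m, int[] wordOffset) {
        int stride = m - 5;
        for (int i = 0; i < wordOffset.length; i++) {
            wordOffset[i] = readBits(b, startBit + i * stride, stride);
        }
    }
}
```

With the two bytes 0xAB and 0xCD, m=9 and an output array of length 4, this sketch yields the four 4-bit fields 0xA, 0xB, 0xC and 0xD. Reading 8 bits from bit position 4 yields 0xBC, a field that spans both bytes.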