Make bloom filter more effective. #193

dbaarda · 2020-04-30T04:53:06Z

Use the upper bits of the hash as the bloom filter index, instead of reusing the lower bits that serve as the hashtable index.

This means the bloom filter and the hashtable no longer share an index, so a bloom filter hit does not always start the hashtable probe at an entry known to be non-empty. This is similar to using a "different hash" for each "k" of a bloom filter. A hashtable probe now has a chance of immediately hitting an empty entry, significantly reducing the number of weaksum compares.
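As a rough illustration of the index split, the sketch below derives both indexes from one hash. The helper names `table_index` and `bloom_index` and the `bits` parameter (log2 of the table size) are hypothetical and are not taken from librsync.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helpers for illustration only, not librsync's API. */
#define HASH_BITS ((unsigned)(sizeof(unsigned) * CHAR_BIT))

/* Hashtable slot: the low `bits` bits of the hash. */
unsigned table_index(unsigned hash, unsigned bits)
{
    assert(bits >= 1 && bits < HASH_BITS);
    return hash & ((1u << bits) - 1u);
}

/* Bloom filter bit: the high `bits` bits of the hash. */
unsigned bloom_index(unsigned hash, unsigned bits)
{
    assert(bits >= 1 && bits < HASH_BITS);
    return hash >> (HASH_BITS - bits);
}
```

Two hashes that collide in the hashtable (same low bits) will usually land on different bloom filter bits, so a bloom hit no longer implies that the first probed slot is occupied.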

Also add tmask and bshift attributes to hashtable_t so we don't need to recalculate them for every probe.

Mix the hash's upper 16 bits with the lower 16 bits as a different hash for the bloom filter. This means bloom filter hits don't necessarily align with occupied hashtable entries, possibly avoiding multiple hashtable probes to detect a hashtable miss.
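The exact mixing function is not spelled out here. One plausible form, shown purely as an assumption, is an xor-fold of the upper half into the lower half. The names `mix16` and `bloom_index_mixed` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mix (not taken from the commit): xor-fold the upper 16 bits
 * of the hash into the lower 16 bits. */
uint32_t mix16(uint32_t hash)
{
    return (hash ^ (hash >> 16)) & 0xFFFFu;
}

/* Bloom bit index for a filter of `nbits` bits, where `nbits` is a
 * power of two no larger than 65536. */
uint32_t bloom_index_mixed(uint32_t hash, uint32_t nbits)
{
    assert(nbits != 0 && nbits <= 0x10000u && (nbits & (nbits - 1u)) == 0);
    return mix16(hash) & (nbits - 1u);
}
```

With a mix like this, two hashes that share their low bits (and therefore a hashtable slot) can still map to different bloom filter bits.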

Add and use a tmask attribute for getting the hashtable index from a hash. Add and use a bshift attribute for getting the bloom filter bit index from a hash. This means we use the upper bits of the hash instead of bit-mixing and masking.
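A minimal sketch of caching both values at setup time follows. Only the field names `tmask` and `bshift` come from the text above. The `demo_table_t` type, its `bloom` field, and the `demo_*` functions are assumptions made for illustration, not the real `hashtable_t` layout.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Illustrative type; only tmask and bshift are named in the PR text. */
typedef struct demo_table {
    unsigned tmask;        /* mask giving the hashtable index */
    unsigned bshift;       /* right shift giving the bloom bit index */
    unsigned char *bloom;  /* one bloom bit per table slot (assumed) */
} demo_table_t;

/* Set up a table of 2^bits slots; tmask and bshift are computed once. */
int demo_init(demo_table_t *t, unsigned bits)
{
    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);

    assert(bits >= 1 && bits < width);
    t->tmask = (1u << bits) - 1u;
    t->bshift = width - bits;
    /* 2^bits bloom bits, rounded up to whole bytes. */
    t->bloom = calloc((((size_t)1 << bits) + 7) / 8, 1);
    return t->bloom != NULL;
}

/* Hashtable slot from the low bits of the hash. */
unsigned demo_table_index(const demo_table_t *t, unsigned hash)
{
    return hash & t->tmask;
}

/* Record a hash in the bloom filter using its high bits. */
void demo_bloom_add(demo_table_t *t, unsigned hash)
{
    unsigned bit = hash >> t->bshift;
    t->bloom[bit / 8] |= (unsigned char)(1u << (bit % 8));
}

/* Returns 0 if the hash is definitely absent, 1 if it may be present. */
int demo_bloom_may_contain(const demo_table_t *t, unsigned hash)
{
    unsigned bit = hash >> t->bshift;
    return (t->bloom[bit / 8] >> (bit % 8)) & 1;
}

void demo_free(demo_table_t *t)
{
    free(t->bloom);
    t->bloom = NULL;
}
```

Each probe then costs one mask and one shift, with no per-probe recomputation from the table size.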

With a minimum hashtable size of 1 and an empty file, the mask is empty (zero bits), so bshift equals the full bit width of the unsigned hash. In C, shifting by the full width of the type is undefined behavior, so the minimum hashtable size is changed to 2.
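The sizing constraint can be sketched as follows. `demo_size` is a hypothetical helper, not the library's function: it rounds a requested size up to a power of two, never below 2, so that the derived shift count always stays below the width of `unsigned`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical sizing helper. Rounds `want` up to a power of two, never
 * below 2, and returns the table size. *tmask and *bshift receive the
 * derived mask and shift. */
unsigned demo_size(unsigned want, unsigned *tmask, unsigned *bshift)
{
    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    unsigned size = 2, bits = 1;

    /* Stop growing before `size` could overflow. */
    while (size < want && bits < width - 1) {
        size <<= 1;
        bits++;
    }
    *tmask = size - 1u;
    *bshift = width - bits;
    /* bits >= 1, so the shift count is always smaller than the width. */
    assert(*bshift < width);
    return size;
}
```

Starting from a size of 1 instead would give `bits == 0` and a shift count equal to the full width, which is the undefined case described above.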

dbaarda added 3 commits April 30, 2020 10:05

Change minimum hashtable size to 2.

e8328e9

dbaarda merged commit 9cf9ca3 into librsync:master Apr 30, 2020

dbaarda deleted the opt/hashtable3 branch May 14, 2020 16:04
