Skip to content

pateldivyesh1323/bloom_filter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Bloom Filter Dictionary Lookup

A simple implementation of a Bloom Filter data structure in C for fast dictionary word lookups.

What is a Bloom Filter?

A Bloom Filter is a space-efficient probabilistic data structure that tells you if an element is "definitely not in a set" or "possibly in a set". It can have false positives but never false negatives.

Features

  • Fast word existence checking
  • Memory efficient (uses ~1.4 MB for 1M+ words)
  • 1% false positive rate
  • Uses djb2 and sdbm hash functions

Files

  • bloom_filter.c - Main implementation
  • bloom_filter.h - Header file with definitions
  • words.txt - Dictionary file (~21MB)

How to Build

gcc bloom_filter.c -lm -o bloom_filter

How to Use

Run the program:

./bloom_filter

Then type words to check if they exist in the dictionary:

Enter a word to search in dictionary: hello
(True Positive) Word hello exist.
Definition: used as a greeting or to begin a phone conversation
Time taken: 0.001234 seconds

Enter a word to search in dictionary: asdfgh
(True Negative) Word asdfgh does not exist.
Time taken: 0.000567 seconds

Results Explained

  • True Positive: Word exists in both bloom filter and dictionary
  • False Positive: Bloom filter says yes, but word not actually in dictionary (rare)
  • True Negative: Word definitely doesn't exist

Technical Details

  • Expected elements: 1,000,000
  • False positive rate: 1% (0.01)
  • Hash functions: 7 (calculated automatically)
  • Memory usage: ~1.4 MB

How it Works

  1. Dictionary words are loaded and hashed into a bit array
  2. When you search a word, it's hashed the same way
  3. If any required bits are 0, the word definitely doesn't exist
  4. If all required bits are 1, the word probably exists (checked against actual dictionary)

This makes lookups extremely fast while using minimal memory!

About

Implementation of Bloom filter in C

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages