Bloom filters are
space-efficient probablistic data structures used to test whether an element
is a member of a set.
They're surprisingly simple: take an array of m
bits, and for up to n different elements, either test or set
k bits using positions chosen using hash functions. If all
bits are set, the element probably already exists, with a false positive
rate of p; if any of the bits are not set, the element
certainly does not exist.
Bloom filters find a wide range of uses, including tracking which
articles you've read,
speeding up Bitcoin clients,
detecting malicious web sites,
and improving the performance of caches.
This page will help you choose an optimal size for your filter, or explore
how the different parameters interact.
n = 1,000,000
p = 0 (1 in 99,938,173,200,325,296,128)
m = 95,850,584 (11.43MiB)
k = 66