Bloom Filters – Much, much more than a space efficient hashmap

fastfilter_cpp

2 244 5.5 C++

Fast Approximate Membership Filters (C++)

There are many alternatives to Bloom filters, but some variants of Bloom filters are still competitive. I'm one of the authors of some benchmarks for filters: https://github.com/FastFilter/fastfilter_cpp (this is based on the cuckoo filter benchmark) and https://github.com/FastFilter/fastfilter_java
For static sets (where you construct the filter once and then only use it for lookups), blocked Bloom filters have the fastest lookups. They do need a bit more space (maybe 10% more than regular Bloom filters). Binary fuse filters (which are new) and xor filters are also very fast. Cuckoo filters, ribbon filters, and regular Bloom filters are a bit slower.
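To illustrate why blocked Bloom filters are fast for lookups, here is a minimal sketch, not the fastfilter_cpp implementation. The class name, the 64-bit block size, and the use of blake2b as the hash are all assumptions made for illustration; real implementations typically size a block to a CPU cache line. The point is that every key touches exactly one block, so a lookup needs a single memory access.

```python
import hashlib


class BlockedBloomFilter:
    """Toy blocked Bloom filter: all k bits of a key live in one 64-bit block."""

    def __init__(self, num_blocks, k=4):
        # k is limited to 10 because each bit position consumes 6 hash bits
        # out of the 64 bits reserved for that purpose.
        if not 1 <= k <= 10:
            raise ValueError("k must be between 1 and 10")
        self.num_blocks = num_blocks
        self.k = k
        self.blocks = [0] * num_blocks  # each entry is used as a 64-bit word

    def _block_and_mask(self, key):
        # One 128-bit hash per key (hash choice is an assumption of this sketch).
        digest = hashlib.blake2b(key.encode("utf-8"), digest_size=16).digest()
        h = int.from_bytes(digest, "little")
        # Low 64 bits choose the block.
        block = (h & 0xFFFFFFFFFFFFFFFF) % self.num_blocks
        # High 64 bits choose k bit positions (6 bits each) inside that block.
        bits = h >> 64
        mask = 0
        for i in range(self.k):
            mask |= 1 << ((bits >> (6 * i)) & 63)
        return block, mask

    def add(self, key):
        block, mask = self._block_and_mask(key)
        self.blocks[block] |= mask

    def contains(self, key):
        # A single block is read, regardless of k.
        block, mask = self._block_and_mask(key)
        return (self.blocks[block] & mask) == mask
```

Because keys are spread unevenly over the blocks, some blocks fill up more than others, which is one intuition for why this layout needs somewhat more space than a regular Bloom filter at the same false-positive rate.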
For dynamic sets (where you can add and remove entries later), the fastest (again for lookup) is probably the "succinct counting blocked Bloom filter" (no paper yet for this). It combines a blocked Bloom filter with a counting Bloom filter, so lookup is identical to the blocked Bloom filter. Next come cuckoo filters, and then counting Bloom filters.
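For readers unfamiliar with how a filter can support removal, here is a minimal sketch of a plain counting Bloom filter. This is the simple textbook variant, not the succinct counting blocked Bloom filter mentioned above, and the class name, hash choice, and double-hashing scheme are assumptions made for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Toy counting Bloom filter: counters instead of bits, so keys can be removed."""

    def __init__(self, num_counters, k=4):
        self.num_counters = num_counters
        self.k = k
        self.counters = [0] * num_counters

    def _positions(self, key):
        # Derive k positions from two 64-bit hashes (double hashing).
        digest = hashlib.blake2b(key.encode("utf-8"), digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_counters for i in range(self.k)]

    def add(self, key):
        for p in self._positions(key):
            self.counters[p] += 1

    def remove(self, key):
        # Only valid for keys that were added before; removing an absent key
        # would corrupt the counters and could cause false negatives.
        for p in self._positions(key):
            self.counters[p] -= 1

    def contains(self, key):
        return all(self.counters[p] > 0 for p in self._positions(key))
```

Lookup still only asks "is every position non-zero?", which is why a counting variant can share its lookup path with the bit-based filter it is built on. The cost is space: each position needs a counter rather than a single bit.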

fastfilter_java

1 235 5.9 Java

Fast Approximate Membership Filters (Java)


xor_singleheader

2 329 6.1 C

Header-only binary fuse and xor filter library

Yes. Some variants of ribbon filters are very fast, and other variants (which are a bit slower, though) need very little space: only a few percent more than the theoretical lower bound.
For static sets, ribbon filters and binary fuse filters (e.g. here: https://github.com/FastFilter/xor_singleheader) are very competitive. Both are based on recent (2019 and newer) theoretical work by Stefan Walzer, e.g. https://arxiv.org/pdf/1907.04749.pdf
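To put "a few percent more than the theoretical lower bound" in context, the following sketch computes that lower bound and the space of an optimally tuned regular Bloom filter for a given false-positive rate. The function names are made up for this example. The formulas are the standard ones: any approximate membership filter needs at least log2(1/eps) bits per key, and a Bloom filter with the optimal number of hash functions needs log2(1/eps) / ln 2 bits per key.

```python
import math


def lower_bound_bits(eps):
    """Information-theoretic minimum bits per key for false-positive rate eps."""
    return math.log2(1.0 / eps)


def bloom_bits(eps):
    """Bits per key for a regular Bloom filter with the optimal number of hashes."""
    return math.log2(1.0 / eps) / math.log(2)


def overhead(bits_per_key, eps):
    """Fractional space overhead relative to the lower bound (0.10 means 10%)."""
    return bits_per_key / lower_bound_bits(eps) - 1.0


if __name__ == "__main__":
    for eps in (0.01, 1 / 256, 0.001):
        lb = lower_bound_bits(eps)
        bf = bloom_bits(eps)
        print(f"eps={eps:.5f}  lower bound={lb:.2f}  Bloom={bf:.2f}  "
              f"overhead={overhead(bf, eps):.1%}")
```

A regular Bloom filter comes out at roughly 44% above the lower bound regardless of eps, which is the gap that the space-efficient ribbon variants close to a few percent. The `overhead` helper can be applied to measured bits-per-key figures from a benchmark to compare other filters on the same scale.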
