bloom_cpp VS BuRR

Compare bloom_cpp vs BuRR and see what are their differences.

CodeRabbit: AI Code Reviews for Developers
Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.
coderabbit.ai
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
bloom_cpp BuRR
1 2
3 42
- -
5.3 5.3
over 1 year ago 6 days ago
C++ C++
MIT License Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

bloom_cpp

Posts with mentions or reviews of bloom_cpp. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-03-17.

BuRR

Posts with mentions or reviews of BuRR. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-22.
  • Show HN: B-field, a probabilistic key-value data structure (`rust-bfield`)
    5 projects | news.ycombinator.com | 22 May 2024
    Very interesting and I'll have to read more to understand how it fully works, but _initially_ the space requirements doesn't seem too impressive? Am I missing something here? Maybe the solution here is more flexible?

    One alternative approach for many of these problems is to start with a perfect minimal hash function which hashes your key into a unique number [0, N) and then have a packed array of size N where each element is of a fixed byte size. To look up the value you first use the hash function to get an index, and then you look up in the packed array. There's also no error rate here: This is exact.

    PTHash (https://arxiv.org/abs/2104.10402) needs roughly ~4 bits per element to create a perfect minimal hash function.

    > Store 1 billion web URLs and assign each of them one of a small number of categories values (n=8) in 2.22Gb (params include ν=8, κ=1, =0.1%; 19 bits per element)

    Assuming that "n=8" here means "8 bits" we need 1GB (8 bits * billion) to represent all of the values, and then 500 MB for the hash function (4 bits * billion).

    I also don't quite understand what "2.22Gb" here refers to. 19 bits * billion = 2.357 SI-giga bytes = 19 SI-giga bits = 2.212 gibi bytes.

    > Store 1 billion DNA or RNA k-mers ("ACGTA...") and associate them with any of the ~500k bacterial IDs current described by NCBI in 6.93Gb (ν=62, κ=4, =0.1%; 59 bits per element)

    "~500k bacterial ID" can be represented with 19 bits. 1 billion of these take ~2.3GB, and then we have the additional 500MB for the perfect hash function.

    Another data structure which is even more fine-tuned for this problem space is Bumped Ribbon Retrieval (https://arxiv.org/abs/2109.01892) where they have <1% overhead over just storing the plain bit values.

  • Ask HN: What are some 'cool' but obscure data structures you know about?
    54 projects | news.ycombinator.com | 21 Jul 2022
    If you enjoyed XOR filters, you might also like ribbon filters, something that I had the pleasure of working on last year. They share the basic idea of using a system of linear equations, but instead of considering 3 random positions per key, the positions to probe are narrowly concentrated along a ribbon with a typical width of 64. This makes them far more cache-efficient to construct and query.

    By purposefully overloading the data structure by a few per cent and bumping those items that cannot be inserted as a result of this overloading to the next layer (making this a recursive data structure), we can achieve almost arbitrarily small space overheads: <1% is no problem for the fast configurations, and <0.1% overhead with around 50% extra runtime cost. This compares to around 10% for XOR filters and ≥ 44% for Bloom filters.

    In fact, I'm going to present them at a conference on Monday - the paper is already out: https://drops.dagstuhl.de/opus/volltexte/2022/16538/pdf/LIPI... and the implementation is at https://github.com/lorenzhs/BuRR/. I hope this isn't too much self-promotion for HN, but I'm super hyped about this :)

What are some alternatives?

When comparing bloom_cpp and BuRR you can also consider the following projects:

munt-official - Munt is a witness-secured decentralized network for payments, digital assets, finance and more

AspNetCoreDiagnosticScenarios - This repository has examples of broken patterns in ASP.NET Core applications

Th3inspector - Th3Inspector 🕵️ Best Tool For Information Gathering 🔎

RVS_Generic_Swift_Toolbox - A Collection Of Various Swift Tools, Like Extensions and Utilities

iRingo - 解锁完整的 Apple功能和集成服务

clojure - The Clojure programming language

CodeRabbit: AI Code Reviews for Developers
Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.
coderabbit.ai
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured

Did you know that C++ is
the 7th most popular programming language
based on number of references?