BuRR

Bumped Ribbon Retrieval and Approximate Membership Query (by lorenzhs)

BuRR Alternatives

Similar projects and alternatives to BuRR

  1. CPython

    1,451 BuRR VS CPython

    The Python programming language

  2. Nutrient

    Nutrient - The #1 PDF SDK Library. Bad PDFs = bad UX. Slow load times, broken annotations, clunky UX frustrates users. Nutrient’s PDF SDKs gives seamless document experiences, fast rendering, annotations, real-time collaboration, 100+ features. Used by 10K+ devs, serving ~half a billion users worldwide. Explore the SDK for free.

    Nutrient logo
  3. TablaM

    154 BuRR VS TablaM

    The practical relational programing language for data-oriented applications

  4. clojure

    109 BuRR VS clojure

    The Clojure programming language

  5. entt

    79 BuRR VS entt

    Gaming meets modern C++ - a fast and reliable entity component system (ECS) and much more

  6. AspNetCoreDiagnosticScenarios

    This repository has examples of broken patterns in ASP.NET Core applications

  7. pyroscope

    Discontinued Continuous Profiling Platform. Debug performance issues down to a single line of code [Moved to: https://github.com/grafana/pyroscope] (by pyroscope-io)

  8. Caffeine

    51 BuRR VS Caffeine

    A high performance caching library for Java

  9. CodeRabbit

    CodeRabbit: AI Code Reviews for Developers. Revolutionize your code reviews with AI. CodeRabbit offers PR summaries, code walkthroughs, 1-click suggestions, and AST-based analysis. Boost productivity and code quality across all major languages with each PR.

    CodeRabbit logo
  10. ann-benchmarks

    Benchmarks of approximate nearest neighbor libraries in Python

  11. RoaringBitmap

    A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others

  12. multiversion-concurrency-control

    Discontinued Implementation of multiversion concurrency control, Raft, Left Right concurrency Hashmaps and a multi consumer multi producer Ringbuffer, concurrent and parallel load-balanced loops, parallel actors implementation in Main.java, Actor2.java and a parallel interpreter

  13. FusionCache

    FusionCache is an easy to use, fast and robust hybrid cache with advanced resiliency features.

  14. RVS_Generic_Swift_Toolbox

    A Collection Of Various Swift Tools, Like Extensions and Utilities

  15. minisketch

    Minisketch: an optimized library for BCH-based set reconciliation

  16. bloom_cpp

    Bloom Filter implemention in C++

  17. t-digest

    9 BuRR VS t-digest

    A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means

  18. PSI

    3 BuRR VS PSI

    Private Set Intersection Cardinality protocol based on ECDH and Bloom Filters (by OpenMined)

  19. asami

    6 BuRR VS asami

    A graph store for Clojure and ClojureScript

  20. sdsl-lite

    5 BuRR VS sdsl-lite

    Succinct Data Structure Library 2.0

  21. rust-bfield

    B-field implementation in Rust

  22. SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better BuRR alternative or higher similarity.

BuRR discussion

Log in or Post with

BuRR reviews and mentions

Posts with mentions or reviews of BuRR. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-05-22.
  • Show HN: B-field, a probabilistic key-value data structure (`rust-bfield`)
    5 projects | news.ycombinator.com | 22 May 2024
    Very interesting and I'll have to read more to understand how it fully works, but _initially_ the space requirements doesn't seem too impressive? Am I missing something here? Maybe the solution here is more flexible?

    One alternative approach for many of these problems is to start with a perfect minimal hash function which hashes your key into a unique number [0, N) and then have a packed array of size N where each element is of a fixed byte size. To look up the value you first use the hash function to get an index, and then you look up in the packed array. There's also no error rate here: This is exact.

    PTHash (https://arxiv.org/abs/2104.10402) needs roughly ~4 bits per element to create a perfect minimal hash function.

    > Store 1 billion web URLs and assign each of them one of a small number of categories values (n=8) in 2.22Gb (params include ν=8, κ=1, =0.1%; 19 bits per element)

    Assuming that "n=8" here means "8 bits" we need 1GB (8 bits * billion) to represent all of the values, and then 500 MB for the hash function (4 bits * billion).

    I also don't quite understand what "2.22Gb" here refers to. 19 bits * billion = 2.357 SI-giga bytes = 19 SI-giga bits = 2.212 gibi bytes.

    > Store 1 billion DNA or RNA k-mers ("ACGTA...") and associate them with any of the ~500k bacterial IDs current described by NCBI in 6.93Gb (ν=62, κ=4, =0.1%; 59 bits per element)

    "~500k bacterial ID" can be represented with 19 bits. 1 billion of these take ~2.3GB, and then we have the additional 500MB for the perfect hash function.

    Another data structure which is even more fine-tuned for this problem space is Bumped Ribbon Retrieval (https://arxiv.org/abs/2109.01892) where they have <1% overhead over just storing the plain bit values.

  • Ask HN: What are some 'cool' but obscure data structures you know about?
    54 projects | news.ycombinator.com | 21 Jul 2022
    If you enjoyed XOR filters, you might also like ribbon filters, something that I had the pleasure of working on last year. They share the basic idea of using a system of linear equations, but instead of considering 3 random positions per key, the positions to probe are narrowly concentrated along a ribbon with a typical width of 64. This makes them far more cache-efficient to construct and query.

    By purposefully overloading the data structure by a few per cent and bumping those items that cannot be inserted as a result of this overloading to the next layer (making this a recursive data structure), we can achieve almost arbitrarily small space overheads: <1% is no problem for the fast configurations, and <0.1% overhead with around 50% extra runtime cost. This compares to around 10% for XOR filters and ≥ 44% for Bloom filters.

    In fact, I'm going to present them at a conference on Monday - the paper is already out: https://drops.dagstuhl.de/opus/volltexte/2022/16538/pdf/LIPI... and the implementation is at https://github.com/lorenzhs/BuRR/. I hope this isn't too much self-promotion for HN, but I'm super hyped about this :)

Stats

Basic BuRR repo stats
2
40
4.7
2 months ago

Sponsored
Nutrient - The #1 PDF SDK Library
Bad PDFs = bad UX. Slow load times, broken annotations, clunky UX frustrates users. Nutrient’s PDF SDKs gives seamless document experiences, fast rendering, annotations, real-time collaboration, 100+ features. Used by 10K+ devs, serving ~half a billion users worldwide. Explore the SDK for free.
nutrient.io

Did you know that C++ is
the 7th most popular programming language
based on number of references?