critbit
RoaringBitmap
Metric | critbit | RoaringBitmap
---|---|---
Mentions | 3 | 24
Stars | 330 | 3,390
Growth | - | 0.9%
Activity | 0.0 | 8.5
Last commit | over 2 years ago | 14 days ago
Language | C | Java
License | - | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
critbit
-
Ask HN: What are some 'cool' but obscure data structures you know about?
> Good use-case: routing. Say you have a list of 1 million IPs that are [deny listed].
Apparently, bloom filters make for lousy IP membership checks, read: https://blog.cloudflare.com/when-bloom-filters-dont-bloom/
CritBit Trie [0] and possibly Allotment Routing Table (ART) [1] are better suited for IPs.
[0] https://github.com/agl/critbit
[1] https://web.archive.org/web/20210720162224/https://www.harig...
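To make the crit-bit suggestion concrete, here is a minimal sketch of a crit-bit set for exact IPv4 membership checks. It assumes 32-bit integer keys; `CritBitSet`, `_Node`, and `ip_to_int` are hypothetical names for illustration and are not the API of the C library linked above.

```python
import ipaddress


def ip_to_int(text):
    """Convert a dotted IPv4 string to a 32-bit integer (helper for this sketch)."""
    return int(ipaddress.IPv4Address(text))


class _Node:
    """Internal node: remembers which bit distinguishes its two subtrees."""
    __slots__ = ("bit", "child")

    def __init__(self, bit):
        self.bit = bit              # critical bit index, 0 = most significant
        self.child = [None, None]   # each child is a _Node or an int leaf


class CritBitSet:
    WIDTH = 32  # assumption: IPv4 only

    def __init__(self):
        self.root = None  # None (empty), an int leaf, or a _Node

    def _bit(self, key, i):
        return (key >> (self.WIDTH - 1 - i)) & 1

    def _closest(self, key):
        # Follow only the critical bits; ends at the single candidate leaf.
        node = self.root
        while isinstance(node, _Node):
            node = node.child[self._bit(key, node.bit)]
        return node

    def __contains__(self, key):
        return self.root is not None and self._closest(key) == key

    def add(self, key):
        if self.root is None:
            self.root = key
            return
        diff = self._closest(key) ^ key
        if diff == 0:
            return  # already present
        crit = self.WIDTH - diff.bit_length()  # first differing bit from the top
        # Walk again and stop where the new node belongs (bits stay ordered).
        parent, side, node = None, 0, self.root
        while isinstance(node, _Node) and node.bit < crit:
            parent, side = node, self._bit(key, node.bit)
            node = node.child[side]
        new = _Node(crit)
        direction = self._bit(key, crit)
        new.child[direction] = key
        new.child[1 - direction] = node
        if parent is None:
            self.root = new
        else:
            parent.child[side] = new


deny = CritBitSet()
for ip in ["10.0.0.1", "10.0.0.2", "192.168.1.1", "8.8.8.8"]:
    deny.add(ip_to_int(ip))
```

A lookup inspects at most one bit per internal node and finishes with a single full-key comparison, so there are no false positives of the kind a Bloom filter produces.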
-
Rethink-app: DNS over HTTPS, firewall, and connection tracker for Android
developer here
I'd imagine the app should work over IPv6-only networks thanks to 464xlat. I may be wrong, because I've never tested it on an IPv6-only network.
The reason for IPv6 is twofold:
1. Firewall today simply stores classless IP address rules as strings in a SQLite table fronted by an LFU cache backed by a typical hash-map. With IPv6, I'd imagine, this won't scale. So, we need a more economical in-memory data-structure (like a crit-bit trie [0] or ART tree [1]).
2. Apparently LwIP has problems with HappyEyeballs (I personally never saw it, but got a couple of reports from users that it was an unrecoverable error once the connectivity was lost, and the firewall had to be restarted). We're in the process of replacing LwIP with gvisor/netstack now [2], just to get IPv6 support back on track.
[0] https://github.com/agl/critbit
[1] http://www.hariguchi.org/art/art.pdf
[2] https://github.com/celzero/firestack/issues/3
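As a rough illustration of how classless rules could live in a trie instead of a table of strings, the sketch below does longest-prefix matching with a plain binary trie. It is not the Rethink/firestack implementation; `PrefixTrie` and the rule values are hypothetical, and a crit-bit or ART structure would compress the one-node-per-bit paths that this version spells out.

```python
import ipaddress


class PrefixTrie:
    """Binary trie over CIDR prefixes with longest-prefix-match lookup."""

    def __init__(self):
        # One trie per address family so IPv4 and IPv6 bits never mix.
        self.roots = {4: {}, 6: {}}

    def insert(self, cidr, rule):
        net = ipaddress.ip_network(cidr)
        bits = int(net.network_address)
        width = net.max_prefixlen  # 32 for IPv4, 128 for IPv6
        node = self.roots[net.version]
        for i in range(net.prefixlen):
            bit = (bits >> (width - 1 - i)) & 1
            node = node.setdefault(bit, {})
        node["rule"] = rule

    def lookup(self, addr):
        ip = ipaddress.ip_address(addr)
        bits, width = int(ip), ip.max_prefixlen
        node, best = self.roots[ip.version], None
        for i in range(width):
            best = node.get("rule", best)  # remember the deepest rule so far
            node = node.get((bits >> (width - 1 - i)) & 1)
            if node is None:
                return best
        return node.get("rule", best)


rules = PrefixTrie()
rules.insert("10.0.0.0/8", "deny")
rules.insert("10.1.0.0/16", "allow")
rules.insert("2001:db8::/32", "deny")
```

With these rules, `rules.lookup("10.1.2.3")` resolves to the more specific `/16` entry, while other `10.x` addresses fall back to the `/8` entry.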
- Critbit Trees in C(WEB)
RoaringBitmap
-
Iterating over Bit Sets Quickly
I was recently reading about Roaring https://roaringbitmap.org/ which is a highly optimized compressed bitset implementation. I recommend reading about it if you are interested in this sort of thing. The talk at https://roaringbitmap.org/talks/ is especially good.
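For readers who want a feel for how the compression works, here is a toy sketch of the container idea behind Roaring: 32-bit values are split on their upper 16 bits, sparse chunks are kept as sorted arrays, and a chunk switches to a fixed 8 KiB bitmap once it holds more than 4096 values. `TinyRoaring` is a hypothetical name; the real libraries also use run containers and a sorted key array rather than a dict, which this sketch omits.

```python
import bisect

ARRAY_MAX = 4096       # above this, an array container becomes a bitmap
BITMAP_BYTES = 8192    # 2**16 bits


class TinyRoaring:
    def __init__(self):
        # high 16 bits -> sorted list of low 16-bit values, or a bytearray bitmap
        self.containers = {}

    def add(self, x):
        hi, lo = x >> 16, x & 0xFFFF
        c = self.containers.get(hi)
        if c is None:
            self.containers[hi] = [lo]
        elif isinstance(c, list):
            i = bisect.bisect_left(c, lo)
            if i == len(c) or c[i] != lo:
                c.insert(i, lo)
                if len(c) > ARRAY_MAX:
                    # Dense chunk: convert the sorted array to a bitmap.
                    bitmap = bytearray(BITMAP_BYTES)
                    for v in c:
                        bitmap[v >> 3] |= 1 << (v & 7)
                    self.containers[hi] = bitmap
        else:
            c[lo >> 3] |= 1 << (lo & 7)

    def __contains__(self, x):
        hi, lo = x >> 16, x & 0xFFFF
        c = self.containers.get(hi)
        if c is None:
            return False
        if isinstance(c, list):
            i = bisect.bisect_left(c, lo)
            return i < len(c) and c[i] == lo
        return bool(c[lo >> 3] & (1 << (lo & 7)))


r = TinyRoaring()
r.add(5)
r.add(70000)
for v in range(5000):          # dense chunk, ends up as a bitmap container
    r.add((3 << 16) + v)
```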
- Roaring Bitmaps
- Roaring bitmaps are compressed bitmaps, can be 100x faster
-
What feature would you like to remove in C++26?
However, I would love compressed (not just packed) bitsets too, which is something different to me. I would make it another class with a similar interface, based on something like roaring. It doesn't need to be in the standard, but it would be nice if the API was such that one could easily swap implementations.
-
Jaccard Index
As an aside, if you find yourself having to compute them on the fly, know that the Roaring Bitmaps libraries are the way to go [1]. The bitmaps are compressed, and can be streamed directly into SIMD computations (batching XORs and popcnts 256 bits wide!). The Jaccard index is just intersection_len / union_len [2] away.
[1] https://roaringbitmap.org/
[2] https://roaringbitmap.readthedocs.io/en/latest/#roaringbitma...
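The intersection-over-union formula can be sketched with plain Python integers standing in for uncompressed bitsets. This is not the Roaring API; `to_bitset`, `popcount`, and `jaccard` are hypothetical helpers, and returning 1.0 for two empty sets is a convention assumed here.

```python
def to_bitset(values):
    """Pack non-negative integers into one Python int used as a bitset."""
    bits = 0
    for v in values:
        bits |= 1 << v
    return bits


def popcount(x):
    """Number of set bits."""
    return bin(x).count("1")


def jaccard(a, b):
    """|A AND B| / |A OR B| for two bitsets."""
    union = popcount(a | b)
    if union == 0:
        return 1.0  # assumed convention: two empty sets are identical
    return popcount(a & b) / union


a = to_bitset([1, 2, 3, 5])
b = to_bitset([2, 3, 8])
similarity = jaccard(a, b)  # 2 shared elements out of 5 distinct ones
```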
-
Looking for fast, space-efficient key-lookup
Use a two stage approach, with a bloom/cuckoo filter stored as a https://roaringbitmap.org/ in memory. Then a secondary key/value store on disk (bolt or anything else).
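A minimal sketch of that two-stage lookup, under simplifying assumptions: the Bloom filter's bits live in a plain `bytearray` rather than a Roaring bitmap, and an in-memory dict stands in for the on-disk key/value store. `BloomFilter`, `TwoStageStore`, and `disk_reads` are hypothetical names.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits          # assumed to be a multiple of 8
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key):
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


class TwoStageStore:
    def __init__(self):
        self.filter = BloomFilter()
        self.disk = {}        # stand-in for the on-disk key/value store
        self.disk_reads = 0   # counts how often stage two is reached

    def put(self, key, value):
        self.filter.add(key)
        self.disk[key] = value

    def get(self, key):
        if not self.filter.might_contain(key):
            return None       # definite miss: the disk is never touched
        self.disk_reads += 1  # possible hit (or a filter false positive)
        return self.disk.get(key)


store = TwoStageStore()
store.put("alpha", 1)
store.put("beta", 2)
```

The filter can report false positives but never false negatives, so a miss in stage one is final, while a hit still has to be confirmed by the second store.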
-
BitSet Vs BigInteger
As an aside, if you're dealing with large bit sets, you might also want to evaluate Roaring Bitmaps.
-
Negative Incentives in Academic Research
Sidetracking the conversation a bit: what a coincidence that the author (Lemire) is also represented on today's #1 "Ask HN: What are some cool but obscure data structures you know about?" as he is the main contributor of RoaringBitmap https://github.com/RoaringBitmap/RoaringBitmap and one of the main authors of the data structure.
- Ask HN: What are some 'cool' but obscure data structures you know about?
- Roaring bitmaps: A better compressed bitset
What are some alternatives?
flatbuffers - An implementation of the flatbuffers protocol in Haskell.
HyperMinHash-java - Union, intersection, and set cardinality in loglog space
tables - Deprecated because of
lucene - Apache Lucene open-source search software
rethink-app - DNS over HTTPS / DNS over Tor / DNSCrypt client, WireGuard proxifier, firewall, and connection tracker for Android.
CQEngine - Ultra-fast SQL-like queries on Java collections
semantic-source - Parsing, analyzing, and comparing source code across many languages
Primes - Prime Number Projects in C#/C++/Python
nextstep-plist - Parser and printer for NextStep style plist files
Feign - Feign makes writing java http clients easier
data-treify - Reify a recursive data structure into an explicit graph.
maven-compiler-plugin - Apache Maven Compiler Plugin