asami
RoaringBitmap
Metric | asami | RoaringBitmap
---|---|---
Mentions | 6 | 24
Stars | 626 | 3,388
Growth | 0.6% | 1.7%
Activity | 0.0 | 8.5
Last commit | about 2 years ago | 7 days ago
Language | Clojure | Java
License | Eclipse Public License 1.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
asami
- Ask HN: What are some 'cool' but obscure data structures you know about?
- Ask HN: Why are relational DBs are the standard instead of graph-based DBs?
Unlike some other commenters, I agree that graph models are usually a better fit for most data than relational models. There's been some interesting work in recent years developing this idea: in the Clojure world there's Datomic, XTDB, and a host of competitors, all of which build on work from Semantic Web/SPARQL/triplestores and logic programming. Some are even intended to be used as primary datastores: they support some amount of schema and constraints, have well-defined consistency and ACID guarantees, etc. This makes them unlike graph databases like Neo4J and others, which fill an architectural role more like Elasticsearch as a read-optimization tool. Here's an interesting talk making a case for triple-based databases.
- Introduction to the Asami Graph Database
- How to query Datomic, Datascript, Asami, or other graph databases
Despite the documentation that exists, I've heard from many people who are confused about how to query Datomic, Datascript, Asami, or other graph databases. So I've made an attempt at explaining it: https://github.com/threatgrid/asami/wiki/Introduction
- Introduction (To Graph Databases)
- Asami
The first Graph implementation for Asami was a simple in-memory data structure, described in my ClojureD talk. The code for this appears in asami.index. This file started much smaller (as referenced above), but has since expanded to meet the needs of extended functionality, such as transactions and transitive closure operations.
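The in-memory graph and the triple-pattern queries mentioned in the excerpts above can be illustrated with a toy triple store. This is a minimal Python sketch under stated assumptions, not Asami's code (Asami is written in Clojure) and not the API of any of the databases named above. The names `TripleIndex`, `add`, `match`, and `friends_and_cities`, as well as the sample data, are hypothetical.

```python
class TripleIndex:
    """Toy in-memory triple store (illustrative only, not asami.index)."""

    def __init__(self):
        # Nested dicts: subject -> predicate -> set of objects.
        self.spo = {}

    def add(self, s, p, o):
        """Store one (subject, predicate, object) triple."""
        self.spo.setdefault(s, {}).setdefault(p, set()).add(o)

    def match(self, s=None, p=None, o=None):
        """Yield every stored triple matching the pattern; None is a wildcard."""
        subjects = [s] if s is not None else list(self.spo)
        for subj in subjects:
            preds = self.spo.get(subj, {})
            for pred in ([p] if p is not None else list(preds)):
                for obj in preds.get(pred, ()):
                    if o is None or obj == o:
                        yield (subj, pred, obj)


def friends_and_cities(db, person):
    """Join two patterns on a shared variable: whom `person` knows, and where they live."""
    return sorted(
        (friend, city)
        for _, _, friend in db.match(s=person, p="knows")
        for _, _, city in db.match(s=friend, p="lives-in")
    )


# Hypothetical sample data.
db = TripleIndex()
db.add("alice", "knows", "bob")
db.add("bob", "knows", "carol")
db.add("bob", "lives-in", "Oslo")
db.add("carol", "lives-in", "Lima")

print(friends_and_cities(db, "alice"))  # [('bob', 'Oslo')]
```

The nested loop in `friends_and_cities` plays the role that a shared variable plays in a Datalog-style query with two patterns: the object bound by the first pattern becomes the subject of the second. A real store would typically keep additional indexes so that patterns with an unbound subject do not require a full scan.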
RoaringBitmap
- Iterating over Bit Sets Quickly
I was recently reading about Roaring (https://roaringbitmap.org/), which is a highly optimized compressed bitset implementation. I recommend reading about it if you are interested in this sort of thing. The talk at https://roaringbitmap.org/talks/ is especially good.
- Roaring Bitmaps
- Roaring bitmaps are compressed bitmaps, can be 100x faster
- What feature would you like to remove in C++26?
However, I would love compressed (not just packed) bitsets too, which is something different to me. I would make it another class with a similar interface, based on something like roaring. It doesn't need to be in the standard, but it would be nice if the API was such that one could easily swap implementations.
- Jaccard Index
As an aside, if you find yourself having to compute them on the fly, know that the Roaring Bitmaps library is the way to go [1]. The bitmaps are compressed, and can be streamed directly into SIMD computations (batching XORs and popcnts 256 bits wide!). The Jaccard index is just intersection_len / union_len [2] away.
[1] https://roaringbitmap.org/
[2] https://roaringbitmap.readthedocs.io/en/latest/#roaringbitma...
- Looking for fast, space-efficient key-lookup
Use a two-stage approach, with a bloom/cuckoo filter stored as a Roaring bitmap (https://roaringbitmap.org/) in memory. Then use a secondary key/value store on disk (bolt or anything else).
- BitSet Vs BigInteger
As an aside, if you're dealing with large bit sets, you might also want to evaluate Roaring Bitmaps.
- Negative Incentives in Academic Research
Sidetracking the conversation a bit: what a coincidence that the author (Lemire) is also represented in today's #1, "Ask HN: What are some cool but obscure data structures you know about?", as he is the main contributor to RoaringBitmap (https://github.com/RoaringBitmap/RoaringBitmap) and one of the main authors of the data structure.
- Ask HN: What are some 'cool' but obscure data structures you know about?
- Roaring bitmaps: A better compressed bitset
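The Jaccard index formula quoted in one of the comments above (intersection size divided by union size) can be sketched with plain Python integers acting as uncompressed bitsets. This is only a stand-in to show the arithmetic: it does not use any Roaring library, and it has none of the compression or SIMD behaviour described in the comment. The helper names `to_bitset`, `popcount`, and `jaccard` are hypothetical.

```python
def to_bitset(values):
    """Pack non-negative integers into one Python int, one bit per value."""
    bits = 0
    for v in values:
        bits |= 1 << v
    return bits


def popcount(bits):
    """Count the set bits (the cardinality of the set)."""
    return bin(bits).count("1")


def jaccard(a, b):
    """Jaccard index of two bitsets: |A AND B| / |A OR B|."""
    union = popcount(a | b)
    if union == 0:
        return 1.0  # assumption: two empty sets are treated as identical
    return popcount(a & b) / union


a = to_bitset([1, 2, 3, 5, 8])
b = to_bitset([2, 3, 5, 7])
print(jaccard(a, b))  # 3 shared values out of 6 distinct values -> 0.5
```

A compressed bitmap library would replace the integer operations with its own AND, OR, and cardinality calls; the ratio itself stays the same.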
What are some alternatives?
datascript - Immutable database and Datalog query engine for Clojure, ClojureScript and JS
HyperMinHash-java - Union, intersection, and set cardinality in loglog space
crux - General purpose bitemporal database for SQL, Datalog & graph queries. Backed by @juxt [Moved to: https://github.com/xtdb/xtdb]
lucene - Apache Lucene open-source search software
datahike - A durable Datalog implementation adaptable for distribution.
CQEngine - Ultra-fast SQL-like queries on Java collections
datalevin - A simple, fast and versatile Datalog database
Primes - Prime Number Projects in C#/C++/Python
Apache AGE - Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. [Moved to: https://github.com/apache/age]
Feign - Feign makes writing java http clients easier
naga - Datalog based rules engine
maven-compiler-plugin - Apache Maven Compiler Plugin