RoaringBitmap
Deeplearning4j
 | RoaringBitmap | Deeplearning4j
---|---|---
 | 24 | 13
Stars | 3,388 | 13,427
Growth | 1.7% | 0.5%
Activity | 8.5 | 5.8
 | 9 days ago | 3 days ago
Language | Java | Java
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
RoaringBitmap
-
Iterating over Bit Sets Quickly
I was recently reading about Roaring (https://roaringbitmap.org/), which is a highly optimized compressed bitset implementation. I recommend reading about it if you are interested in this sort of thing. The talk at https://roaringbitmap.org/talks/ is especially good.
- Roaring Bitmaps
- Roaring bitmaps are compressed bitmaps, can be 100x faster
-
What feature would you like to remove in C++26?
However, I would love compressed (not just packed) bitsets too, which is something different to me. I would make it another class with a similar interface, based on something like Roaring. It doesn't need to be in the standard, but it would be nice if the API were such that one could easily swap implementations.
-
Jaccard Index
As an aside, if you find yourself having to compute them on the fly, know that the Roaring Bitmaps libraries are the way to go [1]. The bitmaps are compressed and can be streamed directly into SIMD computations (batching XORs and popcnts 256 bits wide!). The Jaccard index is just intersection_len / union_len [2] away.
[1] https://roaringbitmap.org/
[2] https://roaringbitmap.readthedocs.io/en/latest/#roaringbitma...
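The Jaccard index mentioned in this post is the size of the intersection divided by the size of the union. As a minimal sketch, the Java snippet below computes it with `java.util.BitSet` standing in for a Roaring bitmap, so that it runs on the JDK alone. The class name `JaccardSketch` and the method `jaccard` are hypothetical, and returning 1.0 for two empty sets is an assumed convention rather than anything the post specifies.

```java
import java.util.BitSet;

// Hypothetical helper for illustration only: java.util.BitSet stands in
// for a compressed (Roaring) bitmap so the sketch needs no dependencies.
final class JaccardSketch {

    // Jaccard index = |A AND B| / |A OR B|.
    // Assumption: two empty sets are treated as identical (index 1.0).
    static double jaccard(BitSet a, BitSet b) {
        BitSet intersection = (BitSet) a.clone(); // clone so inputs stay untouched
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionLen = union.cardinality();
        if (unionLen == 0) {
            return 1.0;
        }
        return (double) intersection.cardinality() / unionLen;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet();
        a.set(1); a.set(2); a.set(3);
        BitSet b = new BitSet();
        b.set(2); b.set(3); b.set(4);
        // 2 shared elements out of 4 distinct ones
        System.out.println(jaccard(a, b)); // prints 0.5
    }
}
```

Cloning keeps both inputs unmodified at the cost of two temporary bit sets; a compressed bitmap library would typically offer a way to get the two cardinalities without materializing the results.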
-
Looking for fast, space-efficient key-lookup
Use a two-stage approach, with a bloom/cuckoo filter stored as a Roaring bitmap (https://roaringbitmap.org/) in memory, followed by a secondary key/value store on disk (bolt or anything else).
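The two-stage idea in this answer can be sketched as follows: an in-memory Bloom filter rejects most absent keys, and only the keys that pass it reach the slower store. This is a rough illustration under stated assumptions, not the poster's implementation: a `java.util.BitSet` stands in for the Roaring bitmap, a `HashMap` stands in for the on-disk key/value store, and the class `TwoStageLookup`, its methods, and the double-hashing scheme are all hypothetical choices.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every name here is illustrative.
final class TwoStageLookup {
    private final BitSet bits;      // stage 1: Bloom filter bits (stand-in for a Roaring bitmap)
    private final int numBits;
    private final int numHashes;
    // Stage 2: stand-in for an on-disk key/value store.
    private final Map<String, String> store = new HashMap<>();
    int storeReads = 0;             // how many lookups actually reached stage 2

    TwoStageLookup(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1;   // force an odd, non-zero step
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void put(String key, String value) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
        store.put(key, value);
    }

    String get(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return null;        // filter says "definitely absent": skip the store
            }
        }
        storeReads++;               // filter says "maybe present": consult the store
        return store.get(key);      // can still be null on a false positive
    }

    public static void main(String[] args) {
        TwoStageLookup lookup = new TwoStageLookup(1024, 3);
        lookup.put("alpha", "1");
        System.out.println(lookup.get("alpha"));
        System.out.println(lookup.get("beta"));
        System.out.println("store reads: " + lookup.storeReads);
    }
}
```

The filter can report false positives but never false negatives, so a miss in stage 1 is safe to trust, while a hit still has to be confirmed against the store.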
-
BitSet Vs BigInteger
As an aside, if you're dealing with large bit sets, you might also want to evaluate Roaring Bitmaps.
-
Negative Incentives in Academic Research
Sidetracking the conversation a bit: what a coincidence that the author (Lemire) is also represented in today's #1 post, "Ask HN: What are some cool but obscure data structures you know about?", as he is the main contributor to RoaringBitmap (https://github.com/RoaringBitmap/RoaringBitmap) and one of the main authors of the data structure.
- Ask HN: What are some 'cool' but obscure data structures you know about?
- Roaring bitmaps: A better compressed bitset
Deeplearning4j
- Deeplearning4j Suite Overview
- Java for ML?
-
Best way to combine Python and Java?
Have you considered migrating off Python and just using JVM ML libraries then? I hear good things about Deeplearning4j, but there are quite a few.
-
Anybody here using Java for machine learning?
I've gone to the linux workflow as directed in the docs and reconstructed the maven command line:
-
Data Science Competition
DL4J
-
Java Matrix Benchmark is Updated! See how linear algebra libraries compare for speed
Hey folks, just letting you know we see this thread and I appreciate you guys running these benchmarks. I'm not seeing any of your posts on our forums. I think I saw a notification from our examples but we do not actually monitor that. Please use: https://community.konduit.ai/ or at least the main repo dl4j issues: https://github.com/eclipse/deeplearning4j/issues and you'll get a lot more visibility. Thanks!
-
Does Java has similar project like this one in C#? (ml, data)
Also, the website is now redirected to: https://deeplearning4j.konduit.ai/
-
If it gets better w age, will java become compatible for machine learning and data science?
On top of this, several popular projects have been built, including tensorflow-java and our project, Eclipse Deeplearning4j: https://github.com/eclipse/deeplearning4j
-
Matrices multiplication benchmark: Apache math vs colt vs ejml vs la4j vs nd4j
Nd4j is actively developed. The latest commit was 6 hours ago. Nd4j is part of Deeplearning4j, which is now owned by Eclipse (but the main contributors are from a company): https://github.com/eclipse/deeplearning4j/tree/master/nd4j
What are some alternatives?
HyperMinHash-java - Union, intersection, and set cardinality in loglog space
Deep Java Library (DJL) - An Engine-Agnostic Deep Learning Framework in Java
lucene - Apache Lucene open-source search software
Weka
CQEngine - Ultra-fast SQL-like queries on Java collections
tensorflow - An Open Source Machine Learning Framework for Everyone
Primes - Prime Number Projects in C#/C++/Python
Smile - Statistical Machine Intelligence & Learning Engine
Feign - Feign makes writing java http clients easier
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
maven-compiler-plugin - Apache Maven Compiler Plugin
Apache Mahout - Mirror of Apache Mahout