-
cleora
Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.
-
LAGraph
This is a library plus a test harness for collecting algorithms that use the GraphBLAS. For test coverage reports, see https://graphblas.org/LAGraph/ . Documentation: https://lagraph.readthedocs.org
Besides, they implemented a fast C++ version of the code that works for much larger graphs. If one searches for ProNE's implementation, they would (hypothetically) find the scikit-style wrapper instead of the fully functional release. It reminds me of the situation with HOPE, when the authors of one survey "implemented" it as a naive SVD (https://github.com/palash1992/GEM/blob/master/gem/embedding/hope.py#L68) instead of the Jacobi-Davidson generalized solver described in the paper (and literally with code released!!). In the end, I would assume that poor paper was cited less because of that repackaging effort.
First, I compared the speed (on a 6-core Mac, once, not scientific benchmarking, beware) of your library and a 3-year-old standalone implementation that I remember linking to you when you were posting about your library here (1 year ago? idk): https://github.com/xgfs/node2vec-c . The wall-clock timings are 17min 48s for your library and 4min 34s for the above code. That's on the (in?)famous BlogCatalog data, since I had that lying around. Note that there is also a node2vec implementation in SNAP and countless more on GitHub. Is there any benchmark showing your version is faster than them?
Thanks for raising so many interesting points about model performance and complexity. In this context, I think our newly released graph embedding library - Cleora - might be of interest: https://github.com/Synerise/cleora Cleora has some nice performance-wise properties:
I work on GraphBLAS, primarily on its LAGraph library and on tutorials. In the last few years, the GraphBLAS community has made a lot of progress on more efficient sparse matrix algorithms and porting graph algorithms to linear algebra – I hope LAGraph can play the role of a more efficient NetworkX in the future. The output of most LAGraph algorithms is a bunch of vectors/matrices so piping these into machine learning algorithms should be possible (and probably more efficient than using other representations).
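To make the "graph algorithms as linear algebra" idea above concrete, here is a minimal sketch of breadth-first search written as repeated vector-matrix products over a Boolean semiring. It is plain Python for illustration only and does not use the GraphBLAS or LAGraph API; the function name bfs_levels and the dict-of-sets matrix layout are assumptions made for this example.

```python
# Sketch: BFS levels via repeated "vector times matrix" steps over a
# Boolean semiring (OR as addition, AND as multiplication).
# Illustrative only: this is NOT the GraphBLAS or LAGraph API.

def bfs_levels(adj, source):
    """Return {node: level} for every node reachable from source.

    adj is a sparse Boolean adjacency matrix stored as {row: set(columns)}
    (a hypothetical layout chosen for this sketch).
    """
    levels = {source: 0}      # the output vector of BFS levels
    frontier = {source}       # sparse Boolean vector: current frontier
    depth = 0
    while frontier:
        depth += 1
        # frontier^T * A: OR together the rows selected by the frontier.
        reached = set()
        for u in frontier:
            reached |= adj.get(u, set())
        # Mask out nodes that already have a level assigned.
        frontier = reached.difference(levels)
        for v in frontier:
            levels[v] = depth
    return levels


graph = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}, 4: set()}
levels = bfs_levels(graph, 0)
print(levels)
```

The result is a plain vector of levels indexed by node, which is the kind of output that can be passed on to a machine learning pipeline as a feature column.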
Related posts
-
Ask HN: AI to study my DSL and then output it?
-
Cleora - an ultra fast graph embedding tool written in Rust
-
Cleora.ai - open source general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data - new updates
-
[R] Cleora: A Simple, Strong and Scalable Graph Embedding Scheme