The Hunt for the Missing Data Type

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • GraphBLAS

    SuiteSparse:GraphBLAS: graph algorithms in the language of linear algebra. For production: (default) STABLE branch. Code development: ask me for the right branch before submitting a PR. video intro: https://youtu.be/Tj5y6d7FegI .

  • I think one of the elements that the author is missing here is that graphs are sparse matrices, and thus can be expressed with Linear Algebra. They mention adjacency matrices, but not sparse adjacency matrices, or incidence matrices (which can express multigraphs and hypergraphs).
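    As a rough illustration of the incidence-matrix point, here is a minimal pure-Python sketch. It does not use GraphBLAS; the hypergraph, the node and edge names, and the helper functions are all hypothetical. Only the nonzero entries of the matrix are stored, keyed by (node, edge), so a parallel edge or a hyperedge joining three nodes is just another column.

```python
from collections import defaultdict

# Hypothetical example data: each edge may join any number of nodes.
hyperedges = {
    "e0": ["a", "b"],        # ordinary edge
    "e1": ["b", "c", "d"],   # hyperedge joining three nodes
    "e2": ["a", "b"],        # parallel edge, which makes this a multigraph
}

# Sparse incidence matrix: incidence[(node, edge)] == 1 means the node lies on the edge.
incidence = {
    (node, edge): 1
    for edge, nodes in hyperedges.items()
    for node in nodes
}


def row_sums(matrix):
    """Sum each row: the number of edges touching each node (its degree)."""
    sums = defaultdict(int)
    for (node, _edge), value in matrix.items():
        sums[node] += value
    return dict(sums)


def column_sums(matrix):
    """Sum each column: the number of nodes joined by each edge."""
    sums = defaultdict(int)
    for (_node, edge), value in matrix.items():
        sums[edge] += value
    return dict(sums)


degrees = row_sums(incidence)
edge_sizes = column_sums(incidence)
```

    An adjacency matrix has no natural place for "e1", since one entry relates exactly two nodes; in the incidence form it is simply a column with three nonzeros.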

    Linear Algebra is how almost all academic graph theory is expressed, and large chunks of machine learning and AI research are expressed in this language as well. There was a recent thread here about PageRank and how it's really an eigenvector problem over a matrix. The reality is that all graphs are matrices; they're typically sparse ones.
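    To make the PageRank remark concrete, here is a minimal power-iteration sketch in pure Python. It is an illustration only, not the code from the thread mentioned above: the `pagerank` function, the tiny link graph, and the damping value of 0.85 are assumptions, and the sketch assumes every node has at least one outgoing link and appears as a key. Repeatedly multiplying a rank vector by the damped link matrix converges toward that matrix's dominant eigenvector.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by power iteration.

    links maps each node to the list of nodes it links to. This sketch
    assumes no dangling nodes (every list is non-empty).
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # One sparse matrix-vector product: each node hands an equal
        # share of its damped rank to every node it links to.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for source, targets in links.items():
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
top_page = max(ranks, key=ranks.get)
```

    The inner loops only visit existing links, which is the sparse part: the work per iteration is proportional to the number of edges, not to the square of the number of nodes.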

    One question you might ask is, why would I do this? Why not just write my graph algorithms as a function that traverses nodes and edges? And one of the big answers is, parallelism. How are you going to do it? Fork a thread at each edge? Use a thread pool? What if you want to do it on CUDA too? Now you have many problems. How do you know how to efficiently schedule work? By treating graph traversal as a matrix multiplication, you just say Ax = b, and let the library figure it out on the specific hardware you want to target.
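    A minimal sketch of that idea, in pure Python rather than GraphBLAS: breadth-first search written as a repeated "frontier vector times adjacency matrix" product, where multiplication is replaced by AND and addition by OR. The `bfs_levels` function and the sample graph are hypothetical; a real GraphBLAS implementation would hand the product to the library instead of looping in Python.

```python
def bfs_levels(adjacency, source):
    """Return the BFS depth of every node reachable from source.

    adjacency maps each node to the set of its out-neighbours, i.e. the
    nonzero columns of that node's row in a sparse adjacency matrix.
    """
    levels = {source: 0}
    frontier = {source}          # sparse boolean vector of active nodes
    depth = 0
    while frontier:
        depth += 1
        # Vector-matrix product over the (OR, AND) semiring:
        # a node is reached if any frontier node has an edge to it.
        reached = set()
        for node in frontier:
            reached |= adjacency.get(node, set())
        # Mask out nodes that were already visited.
        frontier = {node for node in reached if node not in levels}
        for node in frontier:
            levels[node] = depth
    return levels


# Hypothetical directed graph.
adjacency = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}, 4: set()}
levels = bfs_levels(adjacency, 0)
```

    Each pass of the `while` loop is one multiplication. Nothing in the algorithm says how the rows are processed, which is what leaves a library free to schedule that work across threads or a GPU.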

    Here, for example, is a recent question on the NetworkX repo about how to find the boundary of a triangular mesh. It's a single line of GraphBLAS if you consider the graph as a matrix:

    https://github.com/networkx/networkx/discussions/7326
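    The GraphBLAS one-liner itself is in the linked discussion and is not reproduced here. As a hedged pure-Python sketch of the underlying idea, assume the mesh is given as a list of vertex triples: accumulate a sparse, symmetric "edge count" matrix whose entry for (u, v) is the number of triangles containing that edge, then keep the entries equal to 1. The `boundary_edges` function and the sample mesh are hypothetical.

```python
from collections import Counter
from itertools import combinations


def boundary_edges(triangles):
    """Return the edges that belong to exactly one triangle.

    Each triangle is a triple of vertex ids. Edges are stored as sorted
    pairs, i.e. the upper triangle of a symmetric edge-count matrix.
    """
    edge_count = Counter()
    for triangle in triangles:
        for u, v in combinations(sorted(triangle), 2):
            edge_count[(u, v)] += 1
    # Interior edges are shared by two triangles; boundary edges by one.
    return {edge for edge, count in edge_count.items() if count == 1}


# Hypothetical mesh: two triangles sharing the edge (1, 2).
triangles = [(0, 1, 2), (1, 3, 2)]
boundary = boundary_edges(triangles)
```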

    This brings a very powerful language to the table: Linear Algebra, a language spoken by every scientist, engineer, mathematician and researcher on the planet. By treating graphs like matrices, graph algorithms become expressible as mathematical formulas. For example, neural networks are graphs of adjacent layers, and the operation used to traverse from layer to layer is matrix multiplication. This generalizes to all matrices.
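    The neural network remark can be sketched the same way. In this minimal pure-Python example, each weight matrix is read as the weighted adjacency matrix of the bipartite graph between two adjacent layers, and moving from one layer to the next is a vector-matrix product. The `layer_forward` function and the weights are hypothetical, and activation functions and biases are left out to keep the graph reading visible.

```python
def layer_forward(weights, activations):
    """Propagate activations across one layer.

    weights[i][j] is the weight of the edge from input unit i to output
    unit j, so each output sums the weighted values arriving on its edges.
    """
    n_out = len(weights[0])
    return [
        sum(activations[i] * weights[i][j] for i in range(len(activations)))
        for j in range(n_out)
    ]


# Hypothetical network: 3 inputs -> 2 hidden units -> 1 output.
weights_1 = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0]]
weights_2 = [[1.0], [-1.0]]

hidden = layer_forward(weights_1, [1.0, 2.0, 3.0])
output = layer_forward(weights_2, hidden)
```

    A zero weight is a missing edge, so a sparsely connected layer is a sparse matrix in exactly the sense used for graphs above.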

    There is a lot of very new and powerful research and development going on around sparse graphs with linear algebra in the GraphBLAS API standard, and its best reference implementation, SuiteSparse:GraphBLAS:

    https://github.com/DrTimothyAldenDavis/GraphBLAS

    SuiteSparse provides highly optimized, parallel sparse matrix multiplication with CPU and GPU support. This is relevant because traversing graph edges IS matrix multiplication when you realize that graphs are matrices.

    Recently NetworkX has grown the ability to have different "graph engine" backends, and one of the first to be developed uses the python-graphblas library that binds to SuiteSparse. I'm not a direct contributor to that particular work, but as I understand it there have been great results.

  • petgraph

    Graph data structure library for Rust.

  • I used to think that, since graphs are such a broad data structure that can be represented in different ways depending on requirements, it just made more sense to implement them at a domain-ish level.

    Then I saw Petgraph [0], which was the first time I had really looked at a generic graph library. It's very interesting, but I have still implemented graphs at a domain level.

    [0] https://github.com/petgraph/petgraph

  • NetworkX

    Network Analysis in Python

  • hoogle

    Haskell API search engine

  • KLighD

    KIELER Lightweight Diagrams

  • >Graph drawing tools

    It's hard

    Graphviz-like generic graph-drawing library. More options, more control.

    https://eclipse.dev/elk/

    Experiments by the same team responsible for the development of ELK, at Kiel University

    https://github.com/kieler/KLighD

    Kieler project wiki

    https://rtsys.informatik.uni-kiel.de/confluence/display/KIEL...

    Constraint-based graph drawing libraries

    https://www.adaptagrams.org/

    JS implementation

    https://ialab.it.monash.edu/webcola/

    Some cool stuff:

    HOLA: Human-like Orthogonal Network Layout

    https://ialab.it.monash.edu/~dwyer/papers/hola2015.pdf

    Confluent Graphs demos: makes edges more readable.

    https://www.aviz.fr/~bbach/confluentgraphs/

    Stress-Minimizing Orthogonal Layout of Data Flow Diagrams with Ports

    https://arxiv.org/pdf/1408.4626.pdf

    Improved Optimal and Approximate Power Graph Compression for Clearer Visualisation of Dense Graphs

    https://arxiv.org/pdf/1311.6996v1.pdf

  • LAGraph

    This is a library plus a test harness for collecting algorithms that use the GraphBLAS. For test coverage reports, see https://graphblas.org/LAGraph/ . Documentation: https://lagraph.readthedocs.org

  • > you probably want more specialised tools like BLAS/LAPACK

    The GraphBLAS and LAGraph are sparse matrix optimized libraries for this exact purpose:

    https://github.com/DrTimothyAldenDavis/GraphBLAS

    https://github.com/GraphBLAS/LAGraph/

  • arborescence

    Generic graph library

  • - no edge type is imposed by the library, although it does provide the basic tail-head-pair structure as a utility.

    [1] https://github.com/qbit86/arborescence

    [2] https://github.com/qbit86/arborescence/tree/develop/src/Arbo...

  • chatgpt-ifplatform

    I used ChatGPT-4 to design and code an Interactive Fiction platform on top of .NET/C#.

  • It's really an experimental endeavor. I have a Github repo (https://github.com/ChicagoDave/chatgpt-ifplatform), but I'm still changing things all the time. It's very volatile.

    Finding the balance between OO principles, Fluid coding capabilities, separating the data, grammar, parser, and world model, and then constructing a standard IF library of common IF "things" is like juggling 20 kittens and 10 chainsaws.

    Some things are confounding. For example, do I define a container with a boolean property on an object, or is a container a subclass of the base Thing? How does that extend to the underlying graph data store? What will queries look like, and which solution is more meaningful to authors?

    Seriously, 95% of the fun is figuring all of these things out.

  • Gephi

    Gephi - The Open Graph Viz Platform

  • The following are not exactly what you have asked for.

    https://gephi.org/ This implements lots of graph visualization algorithms.

    https://strlen.com/treesheets/ Excel for tree data.
