Graphony | snakelog | |
---|---|---|
1 | 2 | |
5 | 9 | |
- | - | |
- | 0.0 | |
over 2 years ago | about 1 year ago | |
Python | Python | |
Apache License 2.0 | - |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Graphony
-
The "missing" graph datatype already exists. It was invented in the '70s
> Abstractions should be seen as models. They are always wrong, but they are sometimes useful. (And sometimes not.)
George Box was very specifically talking about statistical models when he coined that aphorism. Matrices are linear algebra and graphs are graph theory, I find it hard to think they are not correct and useful models.
> A forward traversal >v of node v enters from the left, reads the sequence stored in the node, and exits from the right. A reverse traversal I'm not an expert in this field but I'm guessing you're talking about De Bruijn graphs, which can be very elegantly modeled with incidence matrices, here's an example of one using the GraphBLAS that downloads data from BioPython, loads it into incidence matrices and graphs it. This is just a simple example, SuiteSparse can handle many billions of edges:
https://github.com/Graphegon/Graphony?tab=readme-ov-file#exa...
Traversing bidirectionally is quite easy, the upper triangle of a matrix are the directed outgoing edges, and the lower triangle are the incoming. This style of "push/pull" optimization is common in the GraphBLAS.
> In a good graph representation, you can do this by maintaining a small state that does not grow significantly with the length of the context or the number of underlying paths.
Again if I understand you correctly, in the GraphBLAS this is accomplished by using accumulators and masks. During traversal data can be accumulated, with a stock operator or one you define, into a vector or matrix, and that object can be used to efficiently mask subsequent computations to avoid unnecessary work or determine when you've reached a termination condition.
> Matrices don't feel like a good abstraction for graphs like this.
Mathematically, graphs and matrices are isomorphic. Regardless of algorithm or storage format like edge lists, tuples or CSR, every graph is a matrix, and vice versa. And if you have a matrix, you have linear algebra to operate on it.
Some people don't like Linear Algebra as a graph abstraction, so I guess for them it is "not good", but on the other hand, it's Linear Algebra and Graph Theory, whose roots date back to the 2nd century BC, forward through great minds like Descartes and Euler, permeating every kind of math, science, physics and engineering discipline humans have ever created. That's a strong argument for its goodness.
Now it is entirely possible, likely even, that the current SuiteSparse implementation doesn't have exactly the tool needed or maybe not the precise best storage format, but these missing pieces do not invalidate the underlying mathematical foundation that it's based on.
snakelog
-
The "missing" graph datatype already exists. It was invented in the '70s
You can without too much work transpile datalog to SQL. SQL does have such strong support that it is useful https://github.com/philzook58/snakelog or perhaps just do it manually https://www.philipzucker.com/tiny-sqlite-datalog/
-
Show HN: Cozo – new Graph DB with Datalog, embedded like SQLite, written in Rust
Very cool! I love the sqlite install everywhere model.
Could you compare use case with Souffle? https://souffle-lang.github.io/
I'd suggest putting the link to the docs more prominently on the github page
Is the "traditional" datalog `path(x,z) :- edge(x,y), path(y,z).` syntax not pleasant to the modern eye? I've grown to rather like it. Or is there something that syntax can't do?
I've been building a Datalog shim layer in python to bridge across a couple different datalog systems https://github.com/philzook58/snakelog (including a datalog built on top of the python sqlite bindings), so I should look into including yours
What are some alternatives?
pycozo - The Python client and Jupyter helper for CozoDB
cozo - A transactional, relational-graph-vector database that uses Datalog for query. The hippocampus for AI!