ldbc_snb_datagen_spark
kuzu
Metric | ldbc_snb_datagen_spark | kuzu
---|---|---
Mentions | 5 | 11
Stars | 165 | 1,052
Growth | 1.8% | 9.7%
Activity | 3.7 | 9.9
Last commit | 15 days ago | 4 days ago
Language | Java | C++
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ldbc_snb_datagen_spark
-
Benchgraph Backstory: The Untapped Potential
Because of the size, complexity, and feedback from the community, we decided to add a larger dataset. So the next dataset should be large, more complex, and recognizable. The choice was easy here; the industry-leading benchmark group Linked Data Benchmark Council (LDBC), which Memgraph is a part of, has open-sourced the datasets for benchmarking. The exact dataset is the social network dataset. It is a synthetically generated dataset representing a social network. It is used in LDBC audited benchmarks, SNB Interactive, and SNB Business Intelligence benchmarks. Keep in mind that this is NOT an official implementation of an LDBC benchmark; the open-source dataset is being used as a basis for benchmarks, and it will be used for our in-house testing process and improving Memgraph.
-
Postgres: The Graph Database You Didn't Know You Had
I designed and maintain several graph benchmarks in the Linked Data Benchmark Council, including workloads aimed at databases [1]. We make no restrictions on implementations; they can use any query language, such as Cypher or SQL.
In our last benchmark aimed at analytical systems [2], we found that SQL queries using WITH RECURSIVE can work for expressing reachability and even weighted shortest path queries. However, formulating an efficient algorithm yields very complex SQL queries [3] and their execution requires a system with a sophisticated optimizer such as Umbra developed at TU Munich [4]. Industry SQL systems are not yet at this level but they may attain that sometime in the future.
Another direction to include graph queries in SQL is the upcoming SQL/PGQ (Property Graph Queries) extension. I'm involved in a project at CWI Amsterdam to incorporate this language into DuckDB [5].
[1] https://ldbcouncil.org/benchmarks/snb/
[2] https://www.vldb.org/pvldb/vol16/p877-szarnyas.pdf
[3] https://github.com/ldbc/ldbc_snb_bi/blob/main/umbra/queries/...
[4] https://umbra-db.com/
[5] https://www.cidrdb.org/cidr2023/slides/p66-wolde-slides.pdf
- Bullshit Graph Database Performance Benchmarks
-
From Data Preprocessing to Using Graph Database
Pull the source code from https://github.com/ldbc/ldbc_snb_datagen/tree/stable. To generate data for scale factors 1 to 1000, use the stable branch.
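The Postgres discussion quoted above notes that WITH RECURSIVE can express reachability queries. A minimal sketch of that idea follows. It uses Python's built-in sqlite3 module as a stand-in rather than Postgres or Umbra, and the `knows` table, the `reachable` helper, and the sample edges are hypothetical names chosen for illustration, not part of any LDBC implementation.

```python
import sqlite3


def reachable(edges, start):
    """Return the set of nodes reachable from `start` over directed edges.

    `edges` is an iterable of (src, dst) integer pairs (hypothetical data).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE knows (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO knows VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?                      -- seed: the start node itself
            UNION                         -- UNION (not UNION ALL) drops duplicates,
                                          -- so the recursion terminates on cycles
            SELECT k.dst
            FROM knows AS k
            JOIN reach AS r ON k.src = r.node
        )
        SELECT node FROM reach
        """,
        (start,),
    ).fetchall()
    conn.close()
    return {row[0] for row in rows}


if __name__ == "__main__":
    sample_edges = [(1, 2), (2, 3), (3, 1), (4, 5)]
    print(sorted(reachable(sample_edges, 1)))
```

This covers plain reachability only. The weighted shortest path queries mentioned in the same comment need considerably more involved SQL, as the linked reference queries show.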
kuzu
- Unum: Vector Search engine in a single file
-
Building a New Database Management System in Academia
These two posts[2,3] explain where we are from and where we're going, if anyone is interested.
[1]: https://github.com/kuzudb/kuzu
-
Graph Database Community
Hi u/kyleireddit, I want to encourage you to try out KuzuDB: https://github.com/kuzudb/kuzu, which we are actively developing. One of our goals is to help educate developers on where graph DBMSs can offer value, so if you join our Slack channel and ask questions about graph DBMSs, my students and I can answer some of them.
- Kùzu: an in-process property graph database management system (GDBMS)
-
Best free graph database for order of 500 million nodes
Then you can try Kùzu: https://github.com/kuzudb/kuzu. It should do quite well. We are new but actively developing the system and would love to help you when you are prototyping your application.
- KùzuDB – In-Memory Graph Database
-
PageRank Algorithm for Graph Databases
Not sqlite, but kuzu ( https://github.com/kuzudb/kuzu ) is an interesting project in this space. Fairly new, but already quite impressive IMHO.
-
CIDR 2023 Database Conference from Memgraph’s Perspective
I already mentioned the Kùzu folks. They are doing an outstanding job of explaining what they do. Just follow their website 😀 They presented the KùzuDB paper, which brings interesting concepts to graph query execution: factorization, S-Join, and ASP-Join.
- Bullshit Graph Database Performance Benchmarks
- What Every Competent Graph DBMS Should Do
What are some alternatives?
Apache AGE - Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL.
Memgraph - Open-source graph database, tuned for dynamic analytics environments. Easy to adopt, scale and own.
ldbc_snb_bi - Reference implementations for the LDBC Social Network Benchmark's Business Intelligence (BI) workload
SimSIMD - Up to 200x Faster Inner Products and Vector Similarity — for Python, JavaScript, Rust, and C, supporting f64, f32, f16 real & complex, i8, and binary vectors using SIMD for both x86 AVX2 & AVX-512 and Arm NEON & SVE 📐
benchgraph
ustore - Multi-Modal Database replacing MongoDB, Neo4J, and Elastic with 1 faster ACID solution, with NetworkX and Pandas interfaces, and bindings for C 99, C++ 17, Python 3, Java, GoLang 🗄️
arcadedb - ArcadeDB Multi-Model Database, one DBMS that supports SQL, Cypher, Gremlin, HTTP/JSON, MongoDB and Redis. ArcadeDB is a conceptual fork of OrientDB, the first Multi-Model DBMS. ArcadeDB supports Vector Embeddings.
NetworkX - Network Analysis in Python
simple-graph - This is a simple graph database in SQLite, inspired by "SQLite as a document database"
mutable - A Database System for Research and Fast Prototyping
nebula-docker-compose - Docker compose for Nebula Graph