Postgres: The Graph Database You Didn't Know You Had

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • Apache AGE

    Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. (by apache)

  • Of course a relational DB can act like a graph DB. It's just not as efficient, because of how things are stored and queried. It would be great to have a graph DB plugin (and I found one: https://github.com/apache/age).

  • clair

    Vulnerability Static Analysis for Containers

  • It scaled well compared to a naive graph abstraction implemented outside the database, but when performance wasn't great, it REALLY wasn't great. We ended up throwing it out in later versions to try and get more consistent performance.

    I've since worked on SpiceDB[1], which takes the traditional design approach for graph databases and simply treats Postgres as a triple store, and that scales far better. IME, if you need a graph, you probably want to use a database optimized for graph access patterns. Most general-purpose graph databases are just bags of optimizations for common traversals.

    [0]: https://github.com/quay/clair

    [1]: https://github.com/authzed/spicedb
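To make the "relational store as triple store" idea in the comment above concrete, here is a minimal sketch. It is not SpiceDB's schema or API: the `triples` table, the `member_of` relation, and the helper names are hypothetical, and SQLite stands in for Postgres so the example is self-contained. The database only answers indexed single-hop lookups, while the traversal runs in application code.

```python
import sqlite3
from collections import deque


def build_store(triples):
    """Create an in-memory table of (subject, relation, object) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE triples (
            subject  TEXT NOT NULL,
            relation TEXT NOT NULL,
            object   TEXT NOT NULL,
            PRIMARY KEY (subject, relation, object)
        )
        """
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn


def reachable(conn, start, relation, target):
    """Breadth-first search; each hop is one primary-key prefix lookup."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        rows = conn.execute(
            "SELECT object FROM triples WHERE subject = ? AND relation = ?",
            (node, relation),
        ).fetchall()
        for (neighbor,) in rows:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


# Hypothetical sample data: group membership edges.
conn = build_store(
    [
        ("alice", "member_of", "eng"),
        ("eng", "member_of", "staff"),
        ("bob", "member_of", "sales"),
    ]
)
print(reachable(conn, "alice", "member_of", "staff"))
print(reachable(conn, "bob", "member_of", "staff"))
```

Keeping the traversal outside the database means the query planner only ever sees simple point lookups, which is one plausible reason for the more consistent performance the comment describes.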

  • spicedb

    Open Source, Google Zanzibar-inspired permissions database to enable fine-grained access control for customer applications

  • ldbc_snb_bi

    Reference implementations for the LDBC Social Network Benchmark's Business Intelligence (BI) workload

  • I designed and maintain several graph benchmarks in the Linked Data Benchmark Council, including workloads aimed at databases [1]. We make no restrictions on implementations; they can use any query language, such as Cypher or SQL.

    In our last benchmark aimed at analytical systems [2], we found that SQL queries using WITH RECURSIVE can work for expressing reachability and even weighted shortest path queries. However, formulating an efficient algorithm yields very complex SQL queries [3] and their execution requires a system with a sophisticated optimizer such as Umbra developed at TU Munich [4]. Industry SQL systems are not yet at this level but they may attain that sometime in the future.

    Another direction to include graph queries in SQL is the upcoming SQL/PGQ (Property Graph Queries) extension. I'm involved in a project at CWI Amsterdam to incorporate this language into DuckDB [5].

    [1] https://ldbcouncil.org/benchmarks/snb/

    [2] https://www.vldb.org/pvldb/vol16/p877-szarnyas.pdf

    [3] https://github.com/ldbc/ldbc_snb_bi/blob/main/umbra/queries/...

    [4] https://umbra-db.com/

    [5] https://www.cidrdb.org/cidr2023/slides/p66-wolde-slides.pdf
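As a small illustration of the WITH RECURSIVE approach mentioned in the comment above, the following sketch computes hop-bounded reachability. It is not one of the LDBC benchmark queries: the `knows` table, its column names, and the sample rows are assumptions, and SQLite is used as a stand-in engine so the example runs without a server.

```python
import sqlite3

# Recursive CTE: start from one person and repeatedly follow "knows" edges.
# UNION removes duplicate (person_id, hops) rows, and the hop bound
# guarantees termination even when the graph contains cycles.
REACHABLE_SQL = """
WITH RECURSIVE reach(person_id, hops) AS (
    SELECT :start, 0
    UNION
    SELECT k.person2_id, r.hops + 1
    FROM reach AS r
    JOIN knows AS k ON k.person1_id = r.person_id
    WHERE r.hops < :max_hops
)
SELECT person_id, MIN(hops) AS hops
FROM reach
GROUP BY person_id
ORDER BY person_id
"""


def reachable_within(conn, start, max_hops):
    """Return (person_id, fewest hops) for everyone reachable from start."""
    params = {"start": start, "max_hops": max_hops}
    return conn.execute(REACHABLE_SQL, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE knows (person1_id INTEGER, person2_id INTEGER)")
# Hypothetical directed edges: a cycle 1 -> 2 -> 3 -> 1 and a separate 4 -> 5.
conn.executemany(
    "INSERT INTO knows VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 1), (4, 5)],
)

print(reachable_within(conn, start=1, max_hops=5))
```

Even this simple query revisits nodes through the cycle until the hop bound cuts it off, which hints at why efficient formulations become complex and depend heavily on the optimizer.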

  • ldbc_snb_datagen_spark

    Synthetic graph generator for the LDBC Social Network Benchmark, running on Spark

  • quine

    Quine • a streaming graph • https://quine.io • Discord: https://discord.gg/GMhd8TE4MR

  • Re [5]'s assertion under "blunders" about the diminished use cases post SQL/PGQ, what do you think of something like Quine?

    https://github.com/thatdot/quine

    Their claim to fame is progressive incremental computation - each node is an actor responding to events -- and I'm not sure how a relational db could do that and match the latencies. That use case is pretty much pattern matching and forensics and stuff like that.

    https://docs.quine.io/core-concepts/architecture.html
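The "each node responds to events" idea in the comment above can be sketched roughly as follows. This is not Quine's API or implementation: the `Node` and `StreamingGraph` classes are hypothetical, the calls are synchronous rather than real actor messages, and the only pattern matched is a two-hop path. The point is that each new edge triggers work only at its two endpoints instead of re-running a query over the whole graph.

```python
class Node:
    """Holds local state and reacts to edge events that touch this node."""

    def __init__(self, name):
        self.name = name
        self.incoming = set()
        self.outgoing = set()

    def on_in_edge(self, src):
        """New edge src -> self: it completes src -> self -> dst paths."""
        if src in self.incoming:
            return []
        self.incoming.add(src)
        return [(src, self.name, dst) for dst in sorted(self.outgoing)]

    def on_out_edge(self, dst):
        """New edge self -> dst: it completes src -> self -> dst paths."""
        if dst in self.outgoing:
            return []
        self.outgoing.add(dst)
        return [(src, self.name, dst) for src in sorted(self.incoming)]


class StreamingGraph:
    """Routes each incoming edge event to the two nodes it touches."""

    def __init__(self):
        self.nodes = {}

    def _node(self, name):
        if name not in self.nodes:
            self.nodes[name] = Node(name)
        return self.nodes[name]

    def add_edge(self, src, dst):
        """Return only the two-hop paths newly created by this edge."""
        matches = self._node(dst).on_in_edge(src)
        matches += self._node(src).on_out_edge(dst)
        return matches


graph = StreamingGraph()
for edge in [("a", "b"), ("b", "c"), ("c", "d")]:
    print(edge, "->", graph.add_edge(*edge))
```

A relational engine could maintain the same result with an incrementally updated join, which is the direction the Materialize reply further down points to.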

  • materialize

    The data warehouse for operational workloads. (by MaterializeInc)

  • Quine seems to be an interesting project but I'm not too familiar with it. Its main feature, evaluating complex multi-way join queries on incoming streaming data, exists in the relational world as the Materialize database which leverages differential dataflow for computation.

    Quine uses Cypher so expressing path queries can be done with the concise Kleene-star syntax, e.g. (p1:Person)-[:knows*]-(p2:Person).

    Materialize is getting support for WITH RECURSIVE and WITH MUTUALLY RECURSIVE (their own SQL extension that fixes some issues of WITH RECURSIVE):

    https://github.com/MaterializeInc/materialize/issues/11176

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
