Scalable PostgreSQL Connection Pooler

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • odyssey

    Scalable PostgreSQL connection pooler

  • Is the Dockerfile not production-ready? https://github.com/yandex/odyssey/blob/master/docker/Dockerf...

    I mean, I see valgrind, vim, etc. in there. This would be a very fat image.

    It seems that way - https://github.com/yandex/odyssey/issues/29#issuecomment-764...

    >But it's hard to "bless" some image as "official": no one from the active contributors uses Odyssey in Docker to maintain it thoroughly.

    OTOH pgbouncer Docker images are rock-solid in production, and they are updated very quickly to track upstream.

    e.g. the Bitnami ones - https://hub.docker.com/r/bitnami/pgbouncer/ which also have CVE security scans https://quay.io/repository/bitnami/pgbouncer?tab=tags

  • spqr

    Stateless Postgres Query Router.

  • Transaction poolers look at the ReadyForQuery packet and its "in transaction" property, like this [0]. All you need is to pin the server connection on a new ParameterStatus [1] packet for "SET search_path" instead of ReadyForQuery.

    [0] https://github.com/pg-sharding/spqr/blob/358f816cd8a964a9c9e...
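
    The two packet types mentioned above are easy to recognize on the wire. Below is a minimal, hypothetical parser sketch (not spqr's actual code) for the PostgreSQL v3 backend protocol; note that stock PostgreSQL only sends ParameterStatus for parameters marked GUC_REPORT, and search_path is not among them by default, so pinning on it implies a patched or specially configured server:

```python
import struct

def parse_backend_message(buf: bytes):
    """Parse one PostgreSQL backend message (protocol v3).

    Wire format: a one-byte type tag, then a 4-byte big-endian length
    that counts itself but not the tag, then the message body.
    """
    tag = buf[0:1]
    (length,) = struct.unpack("!I", buf[1:5])
    body = buf[5:1 + length]

    if tag == b"Z":  # ReadyForQuery
        # Body is one status byte: 'I' idle, 'T' in a transaction,
        # 'E' in a failed transaction. A transaction pooler may only
        # detach the server connection when the status is 'I'.
        return ("ReadyForQuery", body[:1].decode())
    if tag == b"S":  # ParameterStatus
        # Body is two NUL-terminated strings: parameter name and value.
        name, value, _ = body.split(b"\x00", 2)
        return ("ParameterStatus", name.decode(), value.decode())
    return ("Other", tag.decode())

# ReadyForQuery with status 'T' (inside a transaction):
print(parse_backend_message(b"Z" + struct.pack("!I", 5) + b"T"))
# ParameterStatus as it might look for search_path (illustrative values):
body = b"search_path\x00myschema\x00"
print(parse_backend_message(b"S" + struct.pack("!I", 4 + len(body)) + body))
```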

  • pgbouncer-fast-switchover

    Adds query routing and rewriting extensions to pgbouncer

  • sgr

    sgr (command line client for Splitgraph) and the splitgraph Python library

  • We are building a solution for this problem at Splitgraph [0] – it sounds like we could probably help with your use case. You can get it working yourself with our open source code [1], but our (private beta, upcoming public) SaaS service will put all your schemas on a more scalable “data delivery network,” which, incidentally, happens to be implemented with PgBouncer + rewriting + ephemeral instances. On a local engine (just a Postgres DB managed by the Splitgraph client, with some extras added), there is no PgBouncer; we use Foreign Data Wrappers to accomplish the same.

    On Splitgraph, every dataset – and every version of every dataset – has an address. Think of it like tagged Docker images. The address either points to an immutable “data image” (in which case we can optionally download objects required to resolve a query on-the-fly, although loading up-front is possible too) or to a live data source (in which case we proxy directly to it via FDW translation). This simple idea of _addressable data products_ goes a long way – for example, it means that computing a diff is now as simple as joining across two tables (one with the previous version, one with the new).

    Please excuse the Frankenstein marketing site – we’re in the midst of redesign / rework of info architecture while we build out our SaaS product.

    Feel free to reach out if you’ve got questions. And if you have a business case, we have spots available in our private pilot. My email is in my profile – mention HN :)

    [0] https://www.splitgraph.com/connect

    [1] examples: https://github.com/splitgraph/splitgraph/tree/master/example...
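
    The "diff is a join" idea above can be sketched with two snapshot tables (table names, columns, and data here are hypothetical): rows present only in the new version are inserts, rows only in the old version are deletes, and key matches with differing values are updates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE v1 (id INTEGER PRIMARY KEY, price REAL);  -- previous version
    CREATE TABLE v2 (id INTEGER PRIMARY KEY, price REAL);  -- new version
    INSERT INTO v1 VALUES (1, 9.99), (2, 5.00), (3, 1.25);
    INSERT INTO v2 VALUES (1, 9.99), (2, 6.00), (4, 7.50);
""")

# Diff = join on the key. (Two LEFT JOINs emulate a FULL OUTER JOIN so
# this also runs on SQLite versions older than 3.39.)
diff = conn.execute("""
    SELECT v2.id AS id,
           CASE WHEN v1.id IS NULL THEN 'insert' ELSE 'update' END AS change
    FROM v2 LEFT JOIN v1 ON v1.id = v2.id
    WHERE v1.id IS NULL OR v1.price <> v2.price
    UNION ALL
    SELECT v1.id, 'delete'
    FROM v1 LEFT JOIN v2 ON v2.id = v1.id
    WHERE v2.id IS NULL
    ORDER BY id
""").fetchall()
print(diff)  # [(2, 'update'), (3, 'delete'), (4, 'insert')]
```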

  • pgagroal

    High-performance connection pool for PostgreSQL

  • I'd be happy to help tune Odyssey for anyone who will benchmark both poolers (in fact there's only one number to tune: the number of worker processes... well, maybe pool_size too).

    pgagroal claims performance superiority over all poolers [0]. I doubt that Odyssey was used in transaction pooling mode in those experiments.

    [0] https://github.com/agroal/pgagroal/blob/master/doc/PERFORMAN...

  • goofys

    a high-performance, POSIX-ish Amazon S3 file system written in Go

  • > We've had some ideas around using this for distributed querying: in our case, each node responsible for a given partition of a dataset would be able to download just the objects in that partition on the fly (through constraint pruning), so we wouldn't need to knowingly seed each worker with data.

    IMHO, if you're going to do this, I'd recommend not doing this in Postgres itself, but rather doing it at the filesystem level. It's effectively just a tiered-storage read-through cache, and filesystems have those all figured out already.

    You know how pgBackRest does "partial restore" (https://pgbackrest.org/user-guide.html#restore/option-db-inc...), by making all the heap files seem to be there, but actually they're empty sparse files that just happen to have the right allocated length to make PG happy?

    Imagine taking one of the object-storage FUSE filesystems, e.g. https://github.com/kahing/goofys, and modding it so that it represents all not-yet-fetched files under readdir(2) with an equivalent representation.

    Then just make your pg_base dir an overlayfs mount for:

    • top layer: tmpfs (only necessary if you don't give temp tables their own tablespace)
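
    The sparse-placeholder trick described above takes only a few lines; this sketch (file name and 1 GiB size are illustrative; heap segments default to at most 1 GiB) creates a file with the right apparent size but, on filesystems that support holes, no allocated data blocks:

```python
import os
import tempfile

# Create a placeholder with the right apparent length but no allocated
# data blocks, in the spirit of pgBackRest's partial restore.
path = os.path.join(tempfile.mkdtemp(), "16384")  # hypothetical relfilenode
size = 1 << 30  # 1 GiB, the default maximum heap segment size

with open(path, "wb") as f:
    f.truncate(size)  # extends the file with a hole, not real blocks

st = os.stat(path)
print(st.st_size)          # apparent size: 1073741824
print(st.st_blocks * 512)  # bytes actually allocated: ~0 on ext4/xfs/tmpfs
```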

  • catfs

    Cache AnyThing filesystem written in Rust
