All the ways to capture changes in Postgres

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • maxwell

    Maxwell's daemon, a mysql-to-json kafka producer

  • I concur for anything that's not auditing. I've implemented a similar project using MySQL and Maxwell's Daemon[0]. But it's a hack—really, events should be generated by the services and applications and use a dedicated event bus.

    That being said, in this "stream changes from the DB" space there's also Debezium[1]. I haven't worked with it but it looks pretty slick.

    [0] https://maxwells-daemon.io/

  • debezium

    Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

  • walex

    Postgres change events (CDC) in Elixir

  • If you're using Elixir, check out https://github.com/cpursley/walex

    I've actually been thinking about turning this idea into a product where you can just point it at your postgres database and select the tables you want to listen to (with filters, like you describe). And have that forwarded to a webhook (with possibility of other protocols).

    I'd love to hear folks thoughts on that (and if it would be something people would pay for).

  • temporal_tables

    Postgresql temporal_tables extension in PL/pgSQL, without the need for external c extension. (by nearform)

  • I enjoyed this blog. I think it provides a great succinct overview of various approaches native to Postgres.

    For the "capture changes in an audit table" section, I've had good experiences at a previous company with the Temporal Tables pattern. Unlike other major RDBMS vendors, it's not built into Postgres itself, but there's a simple pattern [1] you can leverage with a SQL function.

    This allows you to see a table's state as of a specific point in time. Some sample use cases:

    - "What was this user's configuration on Aug 12?"

    - "How many records were unprocessed at 11:55pm last night?"

    - "Show me the diff on feature flags between now and a week ago"

    [1]: https://github.com/nearform/temporal_tables

  • temporal_tables

    Temporal Tables PostgreSQL Extension

  • There is also the temporal_tables extension.

    [0] https://github.com/arkhipov/temporal_tables

  • talk-transcripts

    Transcripts of Clojure-related talks

  • Using triggers + history tables (aka audit tables) is the right answer 98% of the time. Just do it. If you're not already doing it, start today. It is a proven technique, in use for _over 30 years_.

    Here's a quick rundown of how to do it generically https://gist.github.com/slotrans/353952c4f383596e6fe8777db5d... (trades off space efficiency for "being easy").

    It's great if you can store immutable data. Really, really great. But you _probably_ have a ton of mutable data in your database and you are _probably_ forgetting a ton of it every day. Stop forgetting things! Use history tables.

    cf. https://github.com/matthiasn/talk-transcripts/blob/master/Hi...

    Do not use Papertrail or similar application-space history tracking libraries/techniques. They are slow, error-prone, and incapable of capturing any DB changes that bypass your app stack (which you probably have, and should). Worth remembering that _any_ attempt to capture an "updated" timestamp from your app is fundamentally incorrect, because each of your webheads has its own clock. Use the database clock! It's the only one that's correct!

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • realtime

    Broadcast, Presence, and Postgres Changes via WebSockets

  • Yo :D This is what Supabase Realtime does!

    https://github.com/supabase/realtime

    Spin up a Supabase database and then subscribe to changes with WebSockets.

    You can play with it here once you have a db: https://realtime.supabase.com/inspector/new

  • pg-event-proxy-example

    Discontinued Send NOTIFY and WAL events from PostgreSQL to upstream services (amqp / redis / mqtt)

  • It exists [0] but does not seem to be that interesting to users

    [0] https://github.com/subzerocloud/pg-event-proxy-example

  • connectors

    Connectors for capturing data from external data sources (by estuary)

  • No. We implemented our own [1] for a few reasons:

    * Scaling well to multi-TB DBs without pinning the write-ahead log (potentially filling your DB's disk) while the backfill is happening. Instead, our connector constantly reads the WAL and works well in setups like Supabase that have very restrictive WAL sizes (1GB iirc).

    * Incremental fault-tolerant backfills that can be stopped and resumed at will.

    * Being able to offer "precise" captures which are logically consistent in terms of the sequence of create/update/delete events.

    The last one becomes really interesting when paired with REPLICA IDENTITY FULL, because you can feed it into an incremental computation (perhaps differential dataflow) for streaming updates of a continuous computation.

    Our work is based off of the Netflix DBLog paper, which we took and ran with.

    [1] https://github.com/estuary/connectors/tree/main/source-postg...

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts