I concur for anything that's not auditing. I've implemented a similar project using MySQL and Maxwell's Daemon[0]. But it's a hack; really, events should be generated by the services and applications themselves and flow through a dedicated event bus.
That being said, in this "stream changes from the DB" space there's also Debezium[1]. I haven't worked with it but it looks pretty slick.
[0] https://maxwells-daemon.io/
If you're using Elixir, check out https://github.com/cpursley/walex
I've actually been thinking about turning this idea into a product where you can just point it at your Postgres database and select the tables you want to listen to (with filters, like you describe), and have that forwarded to a webhook (with the possibility of other protocols).
I'd love to hear folks' thoughts on that (and whether it would be something people would pay for).
I enjoyed this blog. I think it provides a great, succinct overview of the various approaches native to Postgres.
For the "capture changes in an audit table" section, I've had good experiences at a previous company with the Temporal Tables pattern. Unlike other major RDBMS vendors, it's not built into Postgres itself, but there's a simple pattern [1] you can leverage with a SQL function.
This allows you to see a table's state as of a specific point in time. Some sample use cases:
- "What was this user's configuration on Aug 12?"
- "How many records were unprocessed at 11:55pm last night?"
- "Show me the diff on feature flags between now and a week ago"
[1]: https://github.com/nearform/temporal_tables
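To make the pattern above concrete, here is a rough sketch of how the nearform temporal_tables approach is wired up. The table and column names are placeholders, and the `versioning()` PL/pgSQL function is the one provided by the linked repo; consult its README for the exact signature and setup.

```sql
-- Hypothetical table; sys_period tracks each row's validity window.
CREATE TABLE subscriptions (
  name       text NOT NULL,
  state      text NOT NULL,
  sys_period tstzrange NOT NULL DEFAULT tstzrange(current_timestamp, null)
);

-- History table with the same shape, receiving superseded row versions.
CREATE TABLE subscriptions_history (LIKE subscriptions);

-- versioning() is supplied by the nearform temporal_tables project.
CREATE TRIGGER versioning_trigger
BEFORE INSERT OR UPDATE OR DELETE ON subscriptions
FOR EACH ROW EXECUTE PROCEDURE
  versioning('sys_period', 'subscriptions_history', true);

-- "As of" query: which rows were live at a given instant?
SELECT * FROM subscriptions_history
WHERE sys_period @> '2023-08-12 00:00:00+00'::timestamptz;
```

The range-containment query (`@>`) is what makes the point-in-time questions in the comment above ("what was the state on Aug 12?") a simple SELECT.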
There is also the original temporal_tables C extension [0].
[0] https://github.com/arkhipov/temporal_tables
Using triggers + history tables (aka audit tables) is the right answer 98% of the time. Just do it. If you're not already doing it, start today. It is a proven technique, in use for _over 30 years_.
Here's a quick rundown of how to do it generically https://gist.github.com/slotrans/353952c4f383596e6fe8777db5d... (trades off space efficiency for "being easy").
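A minimal sketch of the generic approach (not the linked gist verbatim): one shared history table that stores every affected row as jsonb, trading space for simplicity. Table and trigger names are placeholders.

```sql
-- One history table for everything; row images stored as jsonb.
CREATE TABLE history (
  id         bigserial   PRIMARY KEY,
  table_name text        NOT NULL,
  op         text        NOT NULL,  -- 'INSERT' / 'UPDATE' / 'DELETE'
  changed_at timestamptz NOT NULL DEFAULT clock_timestamp(),  -- database clock, per the point below
  row_data   jsonb       NOT NULL
);

CREATE OR REPLACE FUNCTION record_history() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'DELETE' THEN
    INSERT INTO history (table_name, op, row_data)
    VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(OLD));
  ELSE
    INSERT INTO history (table_name, op, row_data)
    VALUES (TG_TABLE_NAME, TG_OP, to_jsonb(NEW));
  END IF;
  RETURN NULL;  -- AFTER trigger: return value is ignored
END;
$$ LANGUAGE plpgsql;

-- Attach to any table you want audited (my_table is a placeholder):
CREATE TRIGGER my_table_history
AFTER INSERT OR UPDATE OR DELETE ON my_table
FOR EACH ROW EXECUTE FUNCTION record_history();
```

Because the trigger fires inside the same transaction as the change, the history row commits (or rolls back) atomically with the change itself, which is something application-space trackers can't guarantee.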
It's great if you can store immutable data. Really, really great. But you _probably_ have a ton of mutable data in your database and you are _probably_ forgetting a ton of it every day. Stop forgetting things! Use history tables.
cf. https://github.com/matthiasn/talk-transcripts/blob/master/Hi...
Do not use Papertrail or similar application-space history tracking libraries/techniques. They are slow, error-prone, and incapable of capturing any DB changes that bypass your app stack (which you probably have, and should). Worth remembering that _any_ attempt to capture an "updated" timestamp from your app is fundamentally incorrect, because each of your webheads has its own clock. Use the database clock! It's the only one that's correct!
Yo :D This is what Supabase Realtime does!
https://github.com/supabase/realtime
Spin up a Supabase database and then subscribe to changes with WebSockets.
You can play with it here once you have a db: https://realtime.supabase.com/inspector/new
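For context on the database side of this: in a default Supabase setup, tables are broadcast once they are added to the `supabase_realtime` logical-replication publication. The table name below is a placeholder; see the Supabase Realtime docs for the current setup steps.

```sql
-- Opt a table into change broadcasting (todos is a placeholder name).
ALTER PUBLICATION supabase_realtime ADD TABLE todos;
```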
It exists [0] but does not seem to have attracted much user interest.
[0] https://github.com/subzerocloud/pg-event-proxy-example
No. We implemented our own [1] for a few reasons:
* Scaling well to multi-TB DBs without pinning the write-ahead log (potentially filling your DB's disk) while the backfill is happening. Instead, our connector constantly reads the WAL and works well in setups like Supabase that have very restrictive WAL sizes (1GB iirc).
* Incremental fault-tolerant backfills that can be stopped and resumed at will.
* Being able to offer "precise" captures which are logically consistent in terms of the sequence of create/update/delete events.
The last one becomes really interesting when paired with REPLICA IDENTITY FULL, because you can feed it into an incremental computation (perhaps differential dataflow) for streaming updates of a continuous computation.
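For anyone unfamiliar with the setting mentioned above: by default, Postgres logical replication only includes the primary key in the old-row image for updates and deletes. REPLICA IDENTITY FULL changes that (the table name below is a placeholder).

```sql
-- With FULL, UPDATE and DELETE records in the WAL carry the complete
-- old row image rather than just the primary key columns.
ALTER TABLE my_table REPLICA IDENTITY FULL;
```

That full before-image is what lets a downstream consumer compute exact deltas (e.g. for differential dataflow) without re-reading the table.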
Our work is based on the Netflix DBLog paper, which we took and ran with.
[1] https://github.com/estuary/connectors/tree/main/source-postg...