pgsink vs DataflowTemplates

pgsink	Project	DataflowTemplates
5	Mentions	4
76	Stars	1,089
-	Growth	1.6%
0.0	Activity	9.8
about 1 year ago	Latest Commit	5 days ago
Go	Language	Java
MIT License	License	Apache License 2.0

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

pgsink

Posts with mentions or reviews of pgsink. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-10.

GitHub - go-jet/jet: Type safe SQL builder with code generation and automatic query result data mapping
2 projects | /r/golang | 10 Jan 2023

This is a really awesome project. I’ve used it on https://github.com/lawrencejones/pgsink to generate type safe bindings to the Postgres catalog tables, along with a few of the tables the project maintains itself.

Trade-offs from using ULIDs at incident.io
2 projects | /r/programming | 3 Jan 2023

pgx is really good: it's what I used to write logical decoders in https://github.com/lawrencejones/pgsink

A modern data stack for startups
4 projects | dev.to | 21 Apr 2022

It used to be that companies would write their own hacky scripts to perform this extraction - I've had terrible incidents caused by ETL database triggers in the past, and even built a few generic ETL tools myself.

Sync Postgres to BigQuery, possible? How?
3 projects | /r/bigquery | 5 Apr 2021

Ask HN: Show me your Half Baked project
154 projects | news.ycombinator.com | 9 Jan 2021

Postgres change-capture device that supports high-throughput and low-latency capture to a variety of sinks (at first release, just Google BigQuery):
https://github.com/lawrencejones/pgsink
I know there's Debezium and Netflix's DBLog, but this project aims to be much simpler.
Forget about Kafka and any other dependency: just point it at Postgres, and your data will be pushed into BigQuery. And for people with highly performance-sensitive databases, the read workload has been designed with Postgres efficiency in mind.
I'm hoping pgsink could be a gateway drug to get small companies up and running with a data warehouse. If your datastore of choice is Postgres, it's a huge help to replicate everything into an analytics datastore. A similar tool has helped my company extract expensive work out of our primary database, which is super useful for scaling.
The project is 90% there, about 10hrs and some testing away from being useable. Once there, I'll be hitting up some start-up friends and seeing if they want to give it a whirl.

DataflowTemplates

Posts with mentions or reviews of DataflowTemplates. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-04-05.

Which Database to use for rest api
1 project | /r/googlecloud | 25 Jun 2022

Google provides a Dataflow template for copying from BigQuery to Datastore, see this Stack Overflow answer.

Sync Postgres to BigQuery, possible? How?
3 projects | /r/bigquery | 5 Apr 2021

New to GCP - need help designing pipeline from production Heroku Postgres to BigQuery
1 project | /r/dataengineering | 2 Apr 2021

Ah, looks like the template default appends new rows. If I want to overwrite the table, looks like I might be able to just change this line in the template code to WRITE_TRUNCATE (see here). Cool!

Tricky Dataflow ep.1 : Auto create BigQuery tables in pipelines
1 project | dev.to | 3 Feb 2021

However, learning to use Apache Beam, which is the open source framework behind Dataflow, is no bed of roses: The official documentation is sparse, GCP-provided templates don't work out-of-the-box, and the Javadoc is, well, a javadoc.

What are some alternatives?

When comparing pgsink and DataflowTemplates you can also consider the following projects:

pastty - Copy and paste across devices

janusgraph - JanusGraph: an open-source, distributed graph database

dupver - Deduplicating VCS for large binary files in Go

professional-services - Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

debezium-examples - Examples for running Debezium (Configuration, Docker Compose files etc.)

yauaa - Yet Another UserAgent Analyzer

xact - Model based design for developers

dbt-metabase - dbt + Metabase integration

migrate - Database migrations. CLI and Golang library.

thgtoa - The Hitchhiker’s Guide to Online Anonymity

bigquery-utils - Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.
