corp vs pgsink
| | corp | pgsink |
|---|---|---|
| Mentions | 12 | 5 |
| Stars | 413 | 76 |
| Growth | -0.2% | - |
| Activity | 4.6 | 0.0 |
| Latest commit | 18 days ago | about 1 year ago |
| Language | - | Go |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
corp
- Are there database design Standards out there? As in, formal documents listing exact best practices for OLTP database design?
Here's one that covers some of your points and that I like in general: https://github.com/dbt-labs/corp/blob/main/dbt_style_guide.md Except instead of prefixing my table names with the processing stage, I keep them in schemas by processing stage (source, staging, analytics). So I can tell my analysts to look in the analytics schema for all the final tables, and they won't be bothered by intermediate models. The table names also have a precise structure that corresponds to our specific subject. (A sketch of this schema-per-stage layout appears after this list.)
- Looking to understand why the dbt style guide recommends using *all lower case* for keywords, field names, and function names?
- Best practices for data modeling with SQL and dbt
I find the content more or less ripped from dbt's own style guide.
- SQL Code Style Properties Questions
For anyone wondering, this is the dbt style guide I am referencing.
- A modern data stack for startups
While the tool choice is obvious, how to use dbt is going to be more controversial. There's a load of great resources on dbt best practices, but as you can see from my Slack questions, there's enough ambiguity to tie you up.
- Completed my first Data Engineering project with Kafka, Spark, GCP, Airflow, dbt, Terraform, Docker and more!
Just a slight critique, but I noticed some of the dbt models are a bit hard to read, especially your dim_users SCD2 model, which uses lots of nested subqueries and multiple columns on the same line. You may want to refer to this style guide from dbt Labs. I find CTEs are a lot easier to parse and read (see the CTE sketch after this list).
- What are some good resources for learning to write clean, production-quality code?
I really like this SQL style guide, and if you use dbt, the dbt style guide.
- How do you format your SQL queries?
I like this one from dbt very much.
- Where do you like to do the L of ELT? Python or DBT?
I recommend you write one. You can take inspiration from dbt's or GitLab's.
- Confused about benefits of CTE
I've seen the Fishtown Analytics coding conventions recommended a lot around here, but there are a few things about their recommendations on CTE use that confuse me.
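Since several of the mentions above come back to the style guide's CTE conventions, here is a minimal sketch of what that style looks like in practice: import CTEs up front, one logical step per CTE, a trailing final CTE, and lowercase keywords throughout (the same lowercase convention asked about earlier in this list). The tables and columns (source.users, source.orders, the counts model) are invented for illustration, not taken from the guide.

```sql
with

-- "Import" CTEs first: one per upstream table, selected in full
users as (
    select * from source.users
),

orders as (
    select * from source.orders
),

-- One logical step per CTE, named after what it produces
user_order_counts as (
    select
        users.id as user_id,
        count(orders.id) as order_count
    from users
    left join orders
        on orders.user_id = users.id
    group by 1
),

-- A trailing "final" CTE, so the model always ends with select * from final
final as (
    select
        user_id,
        order_count
    from user_order_counts
)

select * from final
```

Compared with nested subqueries, each step gets a name and can be read (and debugged) top to bottom, which is the readability point made in the project-critique comment above.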
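As for keeping models in schemas by processing stage (from the first comment in this list): in dbt this is typically done with the schema config rather than name prefixes. A hedged sketch, using a hypothetical stg_orders model and an app source that would need declaring in a sources YAML file; note that dbt's default generate_schema_name macro appends this custom schema to the target schema rather than using it verbatim.

```sql
-- models/staging/stg_orders.sql (hypothetical model name)
-- Route the model into a "staging" schema instead of prefixing its name.
{{ config(schema='staging', materialized='view') }}

select
    id as order_id,
    user_id,
    created_at
from {{ source('app', 'orders') }}  -- 'app' source assumed declared in YAML
```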
pgsink
- GitHub - go-jet/jet: Type safe SQL builder with code generation and automatic query result data mapping
This is a really awesome project. I’ve used it on https://github.com/lawrencejones/pgsink to generate type-safe bindings to the Postgres catalog tables, along with a few of the tables the project maintains itself.
- Trade-offs from using ULIDs at incident.io
pgx is really good: it's what I used to write logical decoders in https://github.com/lawrencejones/pgsink
- A modern data stack for startups
It used to be that companies would write their own hacky scripts to perform this extraction; I've had terrible incidents caused by ETL database triggers in the past, and even built a few generic ETL tools myself.
- Sync Postgres to BigQuery, possible? How?
- Ask HN: Show me your Half Baked project
Postgres change-capture device that supports high-throughput and low-latency capture to a variety of sinks (at first release, just Google BigQuery):
https://github.com/lawrencejones/pgsink
I know there's Debezium and Netflix's DBLog, but this project aims to be much simpler.
Forget about Kafka and any other dependency: just point it at Postgres, and your data will be pushed into BigQuery. And for people with highly performance-sensitive databases, the read workload has been designed with Postgres efficiency in mind.
I'm hoping pgsink could be a gateway drug to get small companies up and running with a data warehouse. If your datastore of choice is Postgres, it's a huge help to replicate everything into an analytics datastore. A similar tool has helped my company extract expensive work out of our primary database, which is super useful for scaling.
The project is 90% there, about 10hrs and some testing away from being usable. Once there, I'll be hitting up some start-up friends and seeing if they want to give it a whirl.
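The change capture described above is built on Postgres logical replication. Below is a minimal sketch of the underlying primitives using only stock Postgres functions; this is not pgsink's API, and test_decoding is just the built-in demo output plugin chosen for illustration. It assumes wal_level = logical and a role allowed to manage replication slots.

```sql
-- Create a logical replication slot with the built-in test_decoding plugin.
select * from pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- Make a change so there is something to decode.
create table widgets (id serial primary key, name text);
insert into widgets (name) values ('anvil');

-- Peek at the pending changes without consuming them...
select * from pg_logical_slot_peek_changes('demo_slot', null, null);

-- ...then consume them, which advances the slot.
select * from pg_logical_slot_get_changes('demo_slot', null, null);

-- Drop the slot when done; abandoned slots hold back WAL cleanup.
select pg_drop_replication_slot('demo_slot');
```

Slots are what keep the read workload cheap for the primary: the consumer streams decoded WAL instead of repeatedly querying tables, which is presumably the Postgres efficiency the author refers to.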
What are some alternatives?
nodejs-bigquery - Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
pastty - Copy and paste across devices
sql-style-guide - An opinionated guide for writing clean, maintainable SQL.
dupver - Deduplicating VCS for large binary files in Go
terraform - Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
DataflowTemplates - Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
streamify - A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
debezium-examples - Examples for running Debezium (Configuration, Docker Compose files etc.)
spark-bigquery-connector - BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
xact - Model based design for developers
dbt-metabase - dbt + Metabase integration
thgtoa - The Hitchhiker’s Guide to Online Anonymity