pgsink VS spec

Compare pgsink and spec to see how they differ.

pgsink

Logically replicate data out of Postgres into sinks (files, Google BigQuery, etc) (by lawrencejones)
             pgsink              spec
Mentions     5                   62
Stars        76                  8,648
Growth       -                   2.6%
Activity     0.0                 0.0
Last commit  about 1 year ago    3 months ago
Language     Go                  -
License      MIT License         GNU General Public License v3.0 only
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

pgsink

Posts with mentions or reviews of pgsink. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-01-10.
  • GitHub - go-jet/jet: Type safe SQL builder with code generation and automatic query result data mapping
    2 projects | /r/golang | 10 Jan 2023
    This is a really awesome project. I’ve used it on https://github.com/lawrencejones/pgsink to generate type safe bindings to the Postgres catalog tables, along with a few of the tables the project maintains itself.
  • Trade-offs from using ULIDs at incident.io
    2 projects | /r/programming | 3 Jan 2023
    pgx is really good: it's what I used to write logical decoders in https://github.com/lawrencejones/pgsink
  • A modern data stack for startups
    4 projects | dev.to | 21 Apr 2022
    It used to be that companies would write their own hacky scripts to perform this extraction - I've had terrible incidents caused by ETL database triggers in the past, and even built a few generic ETL tools myself.
  • Sync Postgres to BigQuery, possible? How?
    3 projects | /r/bigquery | 5 Apr 2021
  • Ask HN: Show me your Half Baked project
    154 projects | news.ycombinator.com | 9 Jan 2021
    Postgres change-capture device that supports high-throughput and low-latency capture to a variety of sinks (at first release, just Google BigQuery):

    https://github.com/lawrencejones/pgsink

    I know there's debezium and Netflix's dblog, but this project aims to be much simpler.

    Forget about kafka and any other dependency: just point it at Postgres, and your data will be pushed into BigQuery. And for people with highly-performance-sensitive databases, the read workload has been designed with Postgres efficiency in mind.

    I'm hoping pgsink could be a gateway drug to get small companies up and running with a data warehouse. If your datastore of choice is Postgres, it's a huge help to replicate everything into an analytics datastore. A similar tool has helped my company extract expensive work out of our primary database, which is super useful for scaling.

    The project is 90% there, about 10hrs and some testing away from being useable. Once there, I'll be hitting up some start-up friends and seeing if they want to give it a whirl.

spec

Posts with mentions or reviews of spec. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-11.
  • The UX of UUIDs
    10 projects | news.ycombinator.com | 11 Apr 2024
    Can use ULID to "fix" some issues

    https://github.com/ulid/spec

  • Ulid: Universally Unique Lexicographically Sortable Identifier
    1 project | news.ycombinator.com | 30 Mar 2024
  • Ask HN: Is it acceptable to use a date as a primary key for a table in Postgres?
    1 project | news.ycombinator.com | 28 Dec 2023
    Both ULID and UUID v7 have a time code component which can be extracted.

    It would be best for indexing to store the actual value in binary, though this is not strictly necessary: these later formats (unlike conventional UUIDs) use timestamp prefixes, so index entries cluster by creation time.

    https://uuid7.com/

    https://github.com/ulid/spec
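
The timestamp extraction mentioned in the comment above can be sketched as follows. This is a minimal illustration, not code from either project: the function names are hypothetical, and it assumes the layouts described in the ULID spec (first 10 Crockford base32 characters carry the 48-bit timestamp) and in UUIDv7 (top 48 bits carry the timestamp).

```python
import uuid
from datetime import datetime, timezone

# Crockford base32 alphabet used by ULID (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid: str) -> int:
    """Decode the first 10 characters of a ULID into Unix milliseconds."""
    value = 0
    for ch in ulid[:10].upper():
        value = value * 32 + CROCKFORD.index(ch)
    return value


def uuid7_timestamp_ms(u: uuid.UUID) -> int:
    """Return the top 48 bits of a UUIDv7, which hold Unix milliseconds."""
    return u.int >> 80


def to_datetime(timestamp_ms: int) -> datetime:
    """Convert Unix milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
```

For example, `to_datetime(ulid_timestamp_ms("01ARYZ6S41TSV4RRFFQ69G5FAV"))` yields a UTC datetime for the moment encoded in that ULID.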

  • Bye Sequence, Hello UUIDv7
    8 projects | news.ycombinator.com | 1 Oct 2023
    UUIDv7 is a nice idea, and should probably be what people use by default instead of UUIDv4.

    For the curious:

    * UUIDv4 are 128 bits long, 122 bits of which are random, with 6 bits used for the version and variant. Traditionally displayed as 32 hex characters with 4 dashes, so 36 characters in total, and compatible with anything that expects a UUID.

    * UUIDv7 are 128 bits long, 48 bits encode a unix timestamp with millisecond precision, 6 bits are for the version and variant, and 74 bits are random. You're expected to display them the same as other UUIDs, and they should be compatible with basically anything that expects a UUID. (Would be a very odd system that parses a UUID and throws an error because it doesn't recognise v7, but I guess it could happen, in theory?)

    * ULIDs (https://github.com/ulid/spec) are 128 bits long, 48 bits encode a unix timestamp with millisecond precision, 80 bits are random. You're expected to display them in Crockford's base32, so 26 alphanumeric characters. Compatible with almost everything that expects a UUID (since they're the right length). Spec has some dumb quirks if followed literally but thankfully they mostly don't hurt things.

    * KSUIDs (https://github.com/segmentio/ksuid) are 160 bits long, 32 bits encode a timestamp with second precision and a custom epoch of May 13th, 2014, and 128 bits are random. You're expected to display them in base62, so 27 alphanumeric characters. Since they're a different length, they're not compatible with UUIDs.
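
The ULID and UUIDv7 layouts listed above can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation: the function names and parameters are hypothetical, monotonic ordering within one millisecond is not handled, and the UUIDv7 bit positions assume the standard split of the 6 reserved bits into a 4-bit version and a 2-bit variant.

```python
import os
import time
import uuid

# Crockford base32 alphabet used by ULID (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
MASK_48 = (1 << 48) - 1


def _now_ms() -> int:
    return time.time_ns() // 1_000_000


def new_ulid(timestamp_ms=None, randomness=None) -> str:
    """48-bit timestamp followed by 80 random bits, as 26 base32 characters."""
    ts = (_now_ms() if timestamp_ms is None else timestamp_ms) & MASK_48
    rand = os.urandom(10) if randomness is None else randomness
    value = (ts << 80) | int.from_bytes(rand, "big")
    chars = []
    for _ in range(26):  # 26 * 5 = 130 bits; the top 2 bits are always zero
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def new_uuid7(timestamp_ms=None, randomness=None) -> uuid.UUID:
    """48-bit timestamp, 4-bit version, 12 random, 2-bit variant, 62 random."""
    ts = (_now_ms() if timestamp_ms is None else timestamp_ms) & MASK_48
    rand = int.from_bytes(os.urandom(10) if randomness is None else randomness, "big")
    rand_a = rand >> 68              # top 12 of the 80 random bits
    rand_b = rand & ((1 << 62) - 1)  # bottom 62 bits; 6 bits are discarded
    value = (ts << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)
```

Because both formats put the timestamp in the most significant bits, IDs created in later milliseconds sort after earlier ones, as text for ULIDs and as 128-bit values for UUIDv7.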

    I quite like KSUIDs; I think base62 is a smart choice. And while the timestamp portion is a trickier question, KSUIDs use 32 bits which, with second precision (more than good enough), means they won't overflow for well over a century. Whereas UUIDv7s use 48 bits, so even with millisecond precision (not needed) they won't overflow for something like 8000 years. We can argue whether 100 years is future proof enough (I'd argue it probably is), but 8000 years is just silly. Nobody will ever generate a compliant UUIDv7 where any of the first several bits aren't 0. The only downside to KSUIDs is the length isn't UUID compatible (and arguably, that they don't devote 6 bits to a compliant UUID version and variant).
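
The overflow estimates in the paragraph above follow from simple arithmetic, sketched here. The helper name is hypothetical, and a year is taken as 365.25 days.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_until_overflow(bits: int, ticks_per_second: int) -> float:
    """Years a counter of the given width lasts, measured from its epoch."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_YEAR


# KSUID: 32-bit timestamp, second precision, counted from its 2014 epoch.
ksuid_years = years_until_overflow(32, 1)

# UUIDv7 and ULID: 48-bit timestamp, millisecond precision, counted from 1970.
uuid7_years = years_until_overflow(48, 1000)

# A present-day millisecond timestamp (around 1.7e12) needs only 41 bits,
# so the top 7 bits of the 48-bit field are currently zero.
unused_leading_bits = 48 - (1_700_000_000_000).bit_length()
```

This gives roughly 136 years for the KSUID timestamp and roughly 8,900 years for the 48-bit millisecond timestamp, in line with the figures quoted in the comment.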

    Still feels like there's room for improvement, but for now I think I'd always pick UUIDv7 over UUIDv4 unless there's a very specific reason not to.

  • 50 years later, is Two-Phase Locking the best we can do?
    1 project | news.ycombinator.com | 30 Sep 2023
    I'd love for Postgres to adopt ULID as a first class variant of the same basic 128-bit wide binary optimized column type they use for UUIDs, but I don't expect they will; while it's "popular", it's not likely popular enough to have support for them to maintain it in the long run... Also the smart money ahead of time would have been for the ULID spec to sacrifice a few data bits to leave the version specifying sections of the bit field layout unused in the ULID binary spec (https://github.com/ulid/spec#binary-layout-and-byte-order) for the sake of future compatibility with "proper" UUIDs... Performing one big bulk bitfield modification to a PostgreSQL column would have been much less painful than re-computing appropriate UUIDv7s (or UUIDv8s for some reason) and then having to perform a primary key update on every row in the table.
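
The bit-field point in the comment above can be illustrated with a short sketch. The function names are hypothetical, and `stamp_as_uuid7` is only a demonstration of the idea: it overwrites 6 of the ULID's random bits, so it changes the identifier and is not something the ULID spec defines.

```python
import uuid

# Crockford base32 alphabet used by ULID (no I, L, O, U).
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_to_uuid(ulid: str) -> uuid.UUID:
    """Reinterpret a ULID's 128 bits as a UUID value (same width, same bytes)."""
    value = 0
    for ch in ulid.upper():
        value = value * 32 + CROCKFORD.index(ch)
    return uuid.UUID(int=value)


def stamp_as_uuid7(u: uuid.UUID) -> uuid.UUID:
    """Overwrite the version (bits 76-79) and variant (bits 62-63) fields.

    In a ULID these positions hold random data, so this alters the ID.
    """
    cleared = u.int & ~((0xF << 76) | (0b11 << 62))
    return uuid.UUID(int=cleared | (0x7 << 76) | (0b10 << 62))
```

The timestamp in the top 48 bits survives the stamping unchanged, which is why reserving those 6 bits up front would have made a later bulk conversion a pure bit-field update.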
  • FLaNK Stack Weekly for 12 September 2023
    26 projects | dev.to | 12 Sep 2023
  • You Don't Need UUID
    13 projects | news.ycombinator.com | 11 Sep 2023
  • UUID Collision
    1 project | news.ycombinator.com | 15 Aug 2023
  • Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs
    19 projects | news.ycombinator.com | 28 Jun 2023
    Many people had the same idea. For example ULID https://github.com/ulid/spec is more compact and stores the time so it is lexically ordered.
  • ULID: Universally Unique Lexicographically Sortable Identifier
    1 project | news.ycombinator.com | 26 Jun 2023

What are some alternatives?

When comparing pgsink and spec you can also consider the following projects:

pastty - Copy and paste across devices

dynamodb-onetable - DynamoDB access and management for one table designs with NodeJS

dupver - Deduplicating VCS for large binary files in Go

uuid6-ietf-draft - Next Generation UUID Formats

DataflowTemplates - Cloud Dataflow Google-provided templates for solving in-Cloud data tasks

kuuid - K-sortable UUID - roughly time-sortable unique id generator

debezium-examples - Examples for running Debezium (Configuration, Docker Compose files etc.)

python-ksuid - A pure-Python KSUID implementation

xact - Model based design for developers

ulid-lite - Generate unique, yet sortable identifiers

dbt-metabase - dbt + Metabase integration

shortuuid.rb - Convert UUIDs & numbers into space efficient and URL-safe Base62 strings, or any other alphabet.