PostgreSQL UUID vs. Serial vs. Identity

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

Our great sponsors
  • WorkOS - The modern identity platform for B2B SaaS
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • SaaSHub - Software Alternatives and Reviews
  • ksuid

    K-Sortable Globally Unique IDs

  • Yeah, just use a UUID unless the bits to store the UUID really are your driving limitation (they're not), having a UUID that is non-linear is almost always the most straight-forward option for identifying things, for the tradeoff of human readability (though you can get some of that back with prefixes and some other schemes). I'm not going to rehash the benefits that people have brought up for UUIDs, but they're in this thread. At this point what I'm concerned about is just... what is the best kind of UUID to use -- I've recently started using mostly v1 because time relationship is important to me (despite the unfortunate order issues) and v6[0] isn't quite so spread yet. Here's a list of other approaches out there worth looking at

    - isntauuid[1] (mentioned in this thread, I've given it a name here)

    - timeflake[2]

    - HiLo[3][4]

    - ulid[5]

    - ksuid[6] (made popular by segment.io)

    - v1-v6 UUIDs (the ones we all know and some love)

    - sequential interval based UUIDs in Postgres[7]

    Just add a UUID -- this almost surely isn't going to be what bricks your architecture unless you have some crazy high write use case like time series or IoT or something maybe.

    [0]: http://gh.peabody.io/uuidv6/

    [1]: https://instagram-engineering.com/sharding-ids-at-instagram-...

    [2]: https://github.com/anthonynsimon/timeflake

    [3]: https://en.wikipedia.org/wiki/Hi/Lo_algorithm

    [4]: https://www.npgsql.org/efcore/modeling/generated-properties....

    [5]: https://github.com/edoceo/pg-ulid

    [6]: https://github.com/segmentio/ksuid

    [7]: https://www.2ndquadrant.com/en/blog/sequential-uuid-generato...

  • spec

    The canonical spec for ulid

  • WorkOS

    The modern identity platform for B2B SaaS. The APIs are flexible and easy-to-use, supporting authentication, user identity, and complex enterprise features like SSO and SCIM provisioning.

    WorkOS logo
  • Hasura

    Blazing fast, instant realtime GraphQL APIs on your DB with fine grained access control, also trigger webhooks on database events.

  • I wouldn't say it's a particularly great upside, but I've found tooling doesn't tend to work as well with UUID keys as it does integer keys. E.g., Hasura won't map UUID keys to the GraphQL ID type [1], which makes working with them unnecessarily difficult. Arguably, the issue should be fixed with the tool (or a different tool chosen), but there's only so many battles to pick and sometimes that decision is out of your control.

    [1] -- https://github.com/hasura/graphql-engine/issues/3578

  • timeflake

    Timeflake is a 128-bit, roughly-ordered, URL-safe UUID.

  • Yeah, just use a UUID unless the bits to store the UUID really are your driving limitation (they're not), having a UUID that is non-linear is almost always the most straight-forward option for identifying things, for the tradeoff of human readability (though you can get some of that back with prefixes and some other schemes). I'm not going to rehash the benefits that people have brought up for UUIDs, but they're in this thread. At this point what I'm concerned about is just... what is the best kind of UUID to use -- I've recently started using mostly v1 because time relationship is important to me (despite the unfortunate order issues) and v6[0] isn't quite so spread yet. Here's a list of other approaches out there worth looking at

    - isntauuid[1] (mentioned in this thread, I've given it a name here)

    - timeflake[2]

    - HiLo[3][4]

    - ulid[5]

    - ksuid[6] (made popular by segment.io)

    - v1-v6 UUIDs (the ones we all know and some love)

    - sequential interval based UUIDs in Postgres[7]

    Just add a UUID -- this almost surely isn't going to be what bricks your architecture unless you have some crazy high write use case like time series or IoT or something maybe.

    [0]: http://gh.peabody.io/uuidv6/

    [1]: https://instagram-engineering.com/sharding-ids-at-instagram-...

    [2]: https://github.com/anthonynsimon/timeflake

    [3]: https://en.wikipedia.org/wiki/Hi/Lo_algorithm

    [4]: https://www.npgsql.org/efcore/modeling/generated-properties....

    [5]: https://github.com/edoceo/pg-ulid

    [6]: https://github.com/segmentio/ksuid

    [7]: https://www.2ndquadrant.com/en/blog/sequential-uuid-generato...

  • pg-ulid

    ULID Functions for PostgreSQL

  • Yeah, just use a UUID unless the bits to store the UUID really are your driving limitation (they're not), having a UUID that is non-linear is almost always the most straight-forward option for identifying things, for the tradeoff of human readability (though you can get some of that back with prefixes and some other schemes). I'm not going to rehash the benefits that people have brought up for UUIDs, but they're in this thread. At this point what I'm concerned about is just... what is the best kind of UUID to use -- I've recently started using mostly v1 because time relationship is important to me (despite the unfortunate order issues) and v6[0] isn't quite so spread yet. Here's a list of other approaches out there worth looking at

    - isntauuid[1] (mentioned in this thread, I've given it a name here)

    - timeflake[2]

    - HiLo[3][4]

    - ulid[5]

    - ksuid[6] (made popular by segment.io)

    - v1-v6 UUIDs (the ones we all know and some love)

    - sequential interval based UUIDs in Postgres[7]

    Just add a UUID -- this almost surely isn't going to be what bricks your architecture unless you have some crazy high write use case like time series or IoT or something maybe.

    [0]: http://gh.peabody.io/uuidv6/

    [1]: https://instagram-engineering.com/sharding-ids-at-instagram-...

    [2]: https://github.com/anthonynsimon/timeflake

    [3]: https://en.wikipedia.org/wiki/Hi/Lo_algorithm

    [4]: https://www.npgsql.org/efcore/modeling/generated-properties....

    [5]: https://github.com/edoceo/pg-ulid

    [6]: https://github.com/segmentio/ksuid

    [7]: https://www.2ndquadrant.com/en/blog/sequential-uuid-generato...

  • Dapper

    Dapper - a simple object mapper for .Net

  • The problem is when reading the data back with Dapper (writing works fine) and it has to parse the guid stored in sqlite back into a .net Guid. Thus I resorted to making the column type text and ToString() all the Guid's before writing, but that's pretty gross. With a normal string field, sqlite can set it as primary key, make unique indexes on it, use it as foreign keys. Thus my attempt to use my own generator the generate id's. So far it works great but not sure if it will be fine with 5 years of usage (basically on tables that keep growing over time, like a comments table).

    But of a rabbit hole: https://github.com/DapperLib/Dapper/pull/1082

  • Optimus

    🤖 Id obfuscation based on Knuth's multiplicative hashing method for PHP.

  • Yes, I completely forgot about it. I used it a few years ago, I tried also [1] which is integers instead of strings.

    [1] https://github.com/jenssegers/optimus

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • cuid

    Collision-resistant ids optimized for horizontal scaling and performance.

  • magit

    It's Magit! A Git Porcelain inside Emacs.

  • Yeah, a lot of it was a mix of the project, me not being good enough at C# in general to fix up the legacy adeptly, etc -- and I picked a bad time to do to it. You can't be good at everything, and I made a choice to just leave my C# skill where it's at.

    > I primarily use C# for api's and tooling (my own build server is code I wrote myself in C#, hosting my own git repo's (not using gitlab/bitbucket etc; rather bare repositories!), and my own nuget server). But for anything web/desktop related, meh skip. In C# api land, you can get such clean/unbloated code with full type safety and compiler checks, I love it. Most C# devs use/throw exceptions but they really muddy the water and make it feel like Java. My own code only have try/catch when talking to the outside world (harddrive, network, etc), the rest of the layers are clean and with minimal null checks (cause at those layers I never have nulls). Exceptions are expensive/slow too, so rather catch it as early as possible and return a Result object (not built in, my own envelope) instead of rethrowing multiple times and doing a crazy amount of null checks. So it is possible to write "clean" C# but most commercial projects I've seen is pretty darn ugly.

    That sounds interesting! Yeah I definitely prefer having types these days (basically don't pick languages without them anymore), and errors-as-values is one of the patterns that is a pearl of modern PL development. Languages that come out these days that still rely on exceptions are an instant turn off for me (ex. Dart). I remember something like 8-10 years ago having discussions around whether checked exceptions had meaning in Java and now that I look back all those discussions were so silly.

    > Just an fyi, I don't use appveyor/windows/powershell/Azure at all. I work primarily from Fedora with Jetbrains products, with most of my tooling being hand rolled. I use the Digital Ocean Api to manage all my boxes and use SSH.NET (nuget package that can open ssh tunnels in C#, I use it for all sorts of things, incl reaching my db servers (ssh.net + Npgsql)) and FluentFtp if needed. So I don't really touch Microsoft's ecosystem at all. My builder code calls the dotnet sdk and I copy the build artifacts around as needed - no third parties involved. In the beginning I stressed about going this way but I see now how great it is - no need to worry about CI minutes or paying Gitlab or Bitbucket, or dealing with 10 different api keys or dealing with Azure etc. So it's totally possible to have sanity - but to be honest I cannot take "my way" to work as everything is hand rolled - people want to use Gitlab/Bitbucket etc else they lose their minds (other developers seems to be my biggest enemy these days - everything needs to match to their exact pattern else they quit). So yeah, own clients = own stack.

    Wow thanks for this level of detail, I exclusively use linux on everything now, and it's great to know that this kind of setup is possible without resorting to a VM. As far as infrastructure/deployment goes, you manage your C# projects with C#, SSHing to them and doing stuff? What are the commands you run like? I'm not super familiar with easily calling C# from the command line, something like `dotnet run deploy.csharp`?

    SO I'm a huge proponent of CI (I think GitLab has the best CI out there) but it definitely is more complex than it should be. Setting up tokens, figuring out how things interact is really annoying 90% of the time, and the other 10% is bliss.

    I'm actually about to relaunch a product I was working on -- one that makes it cheaper to run CI ($10 for 1000 minutes is what GitLab charges and I'm going to offer $15 for unlimited minutes per dedicated vCore/2GB RAM). More to your problem though, I actually make all my stuff work with Makefiles (so much so that I even deploy over-complicated infrastructure you wouldn't like with Makefiles[0][1]), has that not worked for you as far as fixing CI inconsistency? I find that generally knowing that to build I just need to run `make ` (assuming you have the correct system-level libraries installed of course) has fixed most of my issues in this area.

    > On the visual studio side, the IDE has gotten slower and slower... I have a 6 core cpu, 32Gb ram and a Samsung nvme drive, RX580.. yet visual studio lags like crazy, intellisense take 2 minutes to update etc.. that's with only VS open and no extra extensions installed. It also crashes sometimes while doing nothing out of the ordinary. So not sure what the heck they are doing to the project at Microsoft. On the otherhand, the jetbrains products have super slow startup times (try 10 to 30 seconds for Rider and Datagrip) but once in, they work great!

    This is unfortunate, crazy wild conjecture but do you think it's possible visual studio will be subsumed into VS Code? And yeah what you're describing is why I live in emacs/vim (and even emacs is too slow sometimes). I know that I'm giving up a LOT of creature comforts but for me it fits like a worn glove. I'm also constantly impressed with JetBrains -- the IDE experience has generally been good for me as well (I've had to use it before because everything else was just too painful), and it seems like they just keep their heads down and put out good product.

    > Another cool thing to note, Gitkraken sees my bare git repositories as valid remotes (provided ssh keys are in place) and pushes to them without issue (and near instantly, bitbucket push takes 20s no matter what I do).

    Interesting, is bitbucket push having an issue with web requests or something I wonder, 20s is a long time to wait! Since I use emacs magit[2] works for me and it is amazing.

    [0]: https://vadosware.io/post/using-makefiles-and-envsubst-as-an...

    [1]: https://vadosware.io/post/setting-up-mailtrain-on-k8s/#step-...

    [2]: https://magit.vc/

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts