cockroach VS vitess

Compare cockroach and vitess to see how they differ.

cockroach

CockroachDB - the open source, cloud-native distributed SQL database. (by cockroachdb)

vitess

Vitess is a database clustering system for horizontal scaling of MySQL. (by vitessio)
              cockroach                                  vitess
Mentions      23                                         22
Stars         22,535                                     12,906
Growth        1.7%                                       2.8%
Activity      10.0                                       10.0
Last commit   6 days ago                                 7 days ago
Language      Go                                         Go
License       GNU General Public License v3.0 or later   Apache License 2.0
The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

cockroach

Posts with mentions or reviews of cockroach. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-11-30.
  • Composing generic data structures in go
    3 projects | dev.to | 30 Nov 2021
    Recently a colleague, Nathan, reflecting on CockroachDB, remarked (paraphrased from memory) that the key data structure is the interval btree. The story of Nathan’s addition of the first interval btree to cockroach and the power of copy-on-write data structures is worthy of its own blog post for another day. It’s Nathan’s hand-specialization of that data structure that provided the basis (and tests) for the generalization I’ll be presenting here. The reason for this specialization was as much for the performance wins of avoiding excessive allocations, pointer chasing, and cost of type assertions when using interface boxing.
  • Stacked changes: how FB and Google engineers stay unblocked and ship faster
    12 projects | news.ycombinator.com | 17 Nov 2021
    I'm surprised Reviewable[0] hasn't come up in this discussion. It does a great job of allowing stacked code reviews and even handles rebases nicely; the reviewer sees the diff between commit #1 and commit #1' (prime = after rebase).

    CockroachDB[1] has been using it since very early in the project.

    [0] https://reviewable.io/

    [1] https://github.com/cockroachdb/cockroach

  • 1 project | reddit.com/r/facepalm | 6 Nov 2021
    And even if you did want to run your database on a bunch of untrusted machines, a blockchain, being a linked list, is not a particularly efficient implementation. Its size increases linearly with the number of operations, which, for any rapid-fire application such as banking, means you have a tremendously inefficient marginal computational and storage cost per operation. You’d be considerably better off running something like Cockroach, or FoundationDB, or more ‘out-there’ offerings like Hypercore.
  • CockroachDB Grants and Schemas explained
    1 project | dev.to | 28 Aug 2021
    And here: https://github.com/cockroachdb/cockroach/issues/16790
  • Design to Duty: How we make architecture decisions at Adyen
    1 project | dev.to | 28 Jul 2021
    As you now know, we do not want to achieve this by restricting payments of some merchants to certain machines, as this would mean the machines are no longer linearly scalable. The information needs to be available locally, so we eventually decided on integrating Cockroach, a distributed database, with our PALs.
  • go startpack
    8 projects | dev.to | 15 Jul 2021
    CockroachDB (label: E-easy) The Scalable, Survivable, Strongly-Consistent SQL Database
  • The start of my journey learning Go. Any tips/suggestions would greatly appreciated!
    6 projects | reddit.com/r/golang | 29 Jun 2021
  • What is Cost-based Optimization?
    4 projects | dev.to | 2 Jun 2021
    In CockroachDB, the cost is an abstract 64-bit floating-point scalar value.
  • #30DaysofAppwrite : Appwrite’s building blocks
    3 projects | dev.to | 3 May 2021
    Appwrite uses MariaDB as the default database for project collections, documents, and all other metadata. Appwrite is agnostic to the database you use under the hood and support for more databases like Postgres, CockroachDB, MySQL and MongoDB is currently under active development! 😊
  • I am building a Serverless version of Redis - written in Rust
    7 projects | reddit.com/r/rust | 2 May 2021
    For me, if you look back to when Redis was designed, 11 years ago, it was before the Cloud was a thing. Since then, Cloud alternatives have appeared, and they are mostly proprietary. The idea of RedisLess is not to compete against a product that has existed for 11 years but to show a new path for building a system on top of an existing one. You can see RedisLess as an experiment: how do you build Cloud-native databases by taking advantage of existing solutions? TiDB, Yugabyte, and CockroachDB are great examples of being MySQL or PostgreSQL wire protocol compatible and providing a Cloud way of managing data.
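The cost-based optimization post listed above describes a plan's cost as an abstract 64-bit floating-point scalar. A minimal sketch of what that implies for plan selection follows. The `Cost` and `plan` types, the `cheapest` function, and the sample numbers are all hypothetical and are not taken from the CockroachDB source.

```go
package main

import "fmt"

// Cost is an abstract, unitless estimate; lower is better.
// The name is illustrative and not taken from the CockroachDB source.
type Cost float64

// plan pairs a hypothetical candidate plan with its estimated cost.
type plan struct {
	Name string
	Cost Cost
}

// cheapest returns the lowest-cost plan; ok is false for empty input.
func cheapest(plans []plan) (best plan, ok bool) {
	for i, p := range plans {
		if i == 0 || p.Cost < best.Cost {
			best = p
		}
	}
	return best, len(plans) > 0
}

func main() {
	// Made-up candidates and costs, for illustration only.
	candidates := []plan{
		{"full table scan", 1200.5},
		{"index scan", 87.25},
		{"index join", 140.0},
	}
	if best, ok := cheapest(candidates); ok {
		fmt.Printf("%s (cost %.2f)\n", best.Name, float64(best.Cost))
	}
}
```

Because the cost is a single scalar, comparing two plans reduces to one floating-point comparison, which keeps the search over candidate plans simple.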

vitess

Posts with mentions or reviews of vitess. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2021-11-16.
  • Transition to FAANG Interviews
    1 project | reddit.com/r/ExperiencedDevs | 28 Nov 2021
    GitHub pretty much did that. They use Vitess to scale MySQL.
  • LFX mentorship @ Vitess
    1 project | dev.to | 18 Nov 2021
    This summer I was fortunate enough to be selected as a Linux Foundation mentee for Vitess, which is a CNCF graduated project. According to vitess.io
  • PlanetScale Is Now GA
    4 projects | news.ycombinator.com | 16 Nov 2021
    - gh-ost

    I authored the original schema change tool, oak-online-alter-table https://shlomi-noach.github.io/openarkkit/oak-online-alter-t..., which is no longer supported, but thankfully I did invest some time in documenting how it works. Similarly, I co-designed and was the main author of gh-ost, https://github.com/github/gh-ost, as part of the database infrastructure team at GitHub. We developed gh-ost because the existing schema change tools could not cope with our particular workloads. Read this engineering blog: https://github.blog/2016-08-01-gh-ost-github-s-online-migrat... to get a better sense of what gh-ost is and how it works. I particularly suggest reading these:

    - https://github.com/github/gh-ost/blob/master/doc/cheatsheet....

    - https://github.com/github/gh-ost/blob/master/doc/cut-over.md

    - https://github.com/github/gh-ost/blob/master/doc/subsecond-l...

    - https://github.com/github/gh-ost/blob/master/doc/throttle.md

    - https://github.com/github/gh-ost/blob/master/doc/why-trigger...

    At PlanetScale I also integrated VReplication into the Online DDL flow. This comment is far too short to explain how VReplication works, but thankfully we again have some docs:

    - https://vitess.io/docs/user-guides/schema-changes/ddl-strate... (and really see entire page, there's comparison between the different tools)

    - https://vitess.io/docs/design-docs/vreplication/

    - or see this self tracking issue: https://github.com/vitessio/vitess/issues/8056#issue-8771509...

    Not to leave you with only a bunch of reading material, I'll answer some questions here:

    > Can you elaborate? How? Do they run on another servers? Or are they waiting on a queue change waiting to be applied? If they run on different servers, what they run there, since AFAIK the migration is only DDL, there's no data?

    All the schema change tools mentioned above work by creating a shadow (aka ghost) table on the same primary server where your original table is located. By carefully copying data from the original table while also tracking ongoing changes to it (whether by using triggers or by tailing the binary logs), and by using different techniques to mitigate conflicts between the two, the tools populate the shadow table with up-to-date data from your original table.

    This can take a long time, and requires an extra amount of space to accommodate the shadow table (both time and space are also required by "natural" ALTER TABLE implementations in DBs I'm aware of).

    With non-trigger solutions, such as gh-ost and VReplication, the tooling has almost complete control over the pace. Given load on the primary server or given increasing replication lag, it can choose to throttle or completely halt execution, to resume later on when load has subsided. We have used this technique specifically at GitHub to run the largest migrations on our busiest tables at any time of the week, including at peak traffic, and this has been shown to have little to no impact on production. Again, these techniques are universally used today by almost all large-scale MySQL players, including Facebook, Shopify, Slack, etc.

    > who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running, on the main server (apparently not) or on a shadow server (apparently either since migrations have no data)? Actually you mention "when your production traffic gets too high". What is "high", can you quantify?

    The tool (or Vitess if you will, or PlanetScale in our discussion) will throttle based on continuously collected metrics. The single most important metric is replication lag, and we found that it predicts load more than any other metric, by far. We throttle at 1sec replication lag. A secondary metric is the number of concurrently executing threads on the primary; this is more important for pt-online-schema-change, but for gh-ost and VReplication, given their nature of single-thread writes, we found that the metric is not very important to throttle on. It is also trickier since the threshold to throttle at depends on your time of day, particular expected workload, etc.

    > We run customers that do dozens to thousands of transactions per second. Is this high enough?

    The tooling is known to work well with these transaction rates. VReplication and gh-ost will add one more transaction at a time (well, two really, but the 2nd one is book-keeping and so low volume that we can neglect it); the transactions are intentionally kept small so as to not overload the transaction log or the MVCC mechanism; the rule of thumb is to only copy 100 rows at a time, so expect possibly millions of sequential small transactions on a billion-row table.

    > Will their migrations ever run, or will wait for very long periods of time, maybe forever?

    Sometimes, if the load is very high, migrations will throttle more. At other times, they will push as fast as they can while still keeping to a low replication lag threshold. In my experience a gh-ost or VReplication migration is normally good to run even at the busiest times. If a database system is such that it _always_ has substantial replication lag, such that a migration cannot complete in a timely manner, then I'd say the database system is beyond its own capacity anyway, and should be optimized/sharded/whatever.

    > How is this possible? Where the migration is running, then? A shadow table, shadow server... none?

    So I already mentioned the ghost table. And then, SELECTs are non-blocking on the original table.

    > What's cut-over?

    Cut-over is what we call the final step of the migration: flipping the tables. Specifically, moving away your original table, and renaming the ghost table in its place. This requires a metadata lock, and is the single most critical part of the schema migration, for any tooling involved. This is where something has to give. Tooling such as gh-ost and pt-online-schema-change acquires a metadata lock such that queries are blocked momentarily, until cut-over is complete. With very high load the app will feel it. With extremely high load the database may not be able to (or may not be configured to) accommodate so many blocked queries, and the app will see rejections. With low-volume load, apps may not even notice.

    I hope this helps. Obviously this comment cannot accommodate so much more, but hopefully the documentation links I provided are of help.
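The copy-and-throttle behaviour described in the comment above can be sketched roughly as follows. This is a simulation under stated assumptions, not the gh-ost or VReplication implementation: the 100-row chunk size and the 1-second lag threshold come from the comment, while `chunks`, `shouldThrottle`, `copyTable`, and the `readLag`, `copyChunk`, and `wait` callbacks are hypothetical names standing in for a real lag probe and real row-copy statements.

```go
package main

import (
	"fmt"
	"time"
)

// Assumed values taken from the comment above: copy about 100 rows per
// transaction and pause when replication lag reaches 1 second.
const (
	chunkSize    = 100
	lagThreshold = time.Second
)

// chunk is a half-open range [Start, End) of row positions to copy.
type chunk struct{ Start, End int }

// chunks splits totalRows into consecutive ranges of at most size rows.
func chunks(totalRows, size int) []chunk {
	var out []chunk
	for start := 0; start < totalRows; start += size {
		end := start + size
		if end > totalRows {
			end = totalRows
		}
		out = append(out, chunk{start, end})
	}
	return out
}

// shouldThrottle reports whether copying should pause for the given lag.
func shouldThrottle(lag time.Duration) bool {
	return lag >= lagThreshold
}

// copyTable copies every chunk in order, one small transaction each.
// readLag and copyChunk are hypothetical callbacks standing in for a real
// lag probe and a real row copy; wait is called while throttled.
// It returns how many times it had to wait.
func copyTable(totalRows int, readLag func() time.Duration, copyChunk func(chunk), wait func()) int {
	waits := 0
	for _, c := range chunks(totalRows, chunkSize) {
		for shouldThrottle(readLag()) {
			wait()
			waits++
		}
		copyChunk(c)
	}
	return waits
}

func main() {
	// Simulated lag readings: the second reading is above the threshold.
	lags := []time.Duration{200 * time.Millisecond, 1500 * time.Millisecond, 300 * time.Millisecond}
	i := 0
	readLag := func() time.Duration {
		l := lags[i%len(lags)]
		i++
		return l
	}
	copied := 0
	waits := copyTable(250, readLag, func(c chunk) { copied += c.End - c.Start }, func() {})
	fmt.Println("rows copied:", copied, "throttle waits:", waits)
}
```

Checking the lag before every chunk, rather than once per migration, is what lets the copy slow down or halt as soon as the replicas fall behind and resume once they catch up.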

  • Encrypting Postgres Data at Rest in Kubernetes
    2 projects | news.ycombinator.com | 31 Oct 2021
    I'm hoping these kinds of policies continue to be phased out.

    The Kubernetes world has changed a lot in the past few years in ways that make databases-in-k8s more appealing. Such as:

    - Kubernetes "eating the world", meaning some teams may not even have good options for databases outside k8s (particularly onprem).

    - Infrastructure-as-code being more prevalent. Since you already have to use k8s manifests for the rest of your app, adding another IaC tool to set up RDS may be undesirable.

    - The rise of microservices, where companies may have hundreds of services that need their own separate data stores (many of which don't see high enough traffic to justify the cost of a managed database service).

    - Excellent options like the bitnami helm charts: https://github.com/bitnami/charts or apparently Vitess (haven't used it myself): https://vitess.io/

    Obviously if the use-case is a few huge, highly-tuned, super-critical databases, managed database services are perfect for that. But IMO a blanket ban might be restricting adoption of some more modern development practices.

  • Comparing AWS's RDS and PlanetScale
    1 project | news.ycombinator.com | 5 Oct 2021
    This offering really isn't apples to apples. It would be better compared against AWS Aurora or https://vitess.io/
  • How to Deploy a Python Django Application using PlanetScale and Koyeb Serverless Platform
    2 projects | dev.to | 10 Sep 2021
    git clone https://github.com/vitessio/vitess.git ~/vitess
    cp -r ~/vitess/support/django/custom_db_backends ./
  • Which databases do you hate the least at scale?
    1 project | reddit.com/r/devops | 6 Sep 2021
    I can't say I loved it, but it worked extremely well. I haven't been working on that network in about 5 years. I hear they're finally out growing that setup and are looking to migrate to Vitess.
  • Moving away from MySQL, evaluating alternatives, picking a winner for our needs
    1 project | news.ycombinator.com | 10 Aug 2021
  • Steps to build Database System from scratch?
    4 projects | reddit.com/r/Database | 10 Aug 2021
    The query parser based on Vitess: https://github.com/vitessio/vitess
  • According to a survey by Red Hat, 80% of customers run databases in Kubernetes
    1 project | reddit.com/r/kubernetes | 28 Jul 2021

What are some alternatives?

When comparing cockroach and vitess you can also consider the following projects:

tidb - TiDB is an open source distributed HTAP database compatible with the MySQL protocol

supabase - The open source Firebase alternative. Follow to stay updated about our public Beta.

citus - Distributed PostgreSQL as an extension

Trino - Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

go-mysql-elasticsearch - Sync MySQL data into elasticsearch

yugabyte-db - The high-performance distributed SQL database for global, internet-scale apps.

migrate - Database migrations. CLI and Golang library.

dgraph - Native GraphQL Database with graph backend

kingshard - A high-performance MySQL proxy

InfluxDB - Scalable datastore for metrics, events, and real-time analytics

Tile38 - Real-time Geospatial and Geofencing