PlanetScale Is Now GA

Our great sponsors

InfluxDB - Power Real-Time Data Analytics at Scale

WorkOS - The modern identity platform for B2B SaaS

SaaSHub - Software Alternatives and Reviews

Our great sponsors

tengo

1 24 6.9 Go

Discontinued Go La Tengo: a MySQL automation library (by skeema)
gh-ost

32 11,982 7.4 Go

GitHub's Online Schema-migration Tool for MySQL

- gh-ost
I authored the original schema change tool, oak-online-alter-table https://shlomi-noach.github.io/openarkkit/oak-online-alter-t..., which is no longer supported, but thankfully I did invest some time in documenting how it works. Similarly, I co-designed and was the main author for gh-ost, https://github.com/github/gh-ost, as part of the database infrastructure team at GitHub. We developed gh-ost because the existing schema change tools could not cope with our particular workloads. Read this engineering blog: https://github.blog/2016-08-01-gh-ost-github-s-online-migrat... to get better sense of what gh-ost is and how it works. I in particular suggest reading these:
- https://github.com/github/gh-ost/blob/master/doc/cheatsheet....
- https://github.com/github/gh-ost/blob/master/doc/cut-over.md
- https://github.com/github/gh-ost/blob/master/doc/subsecond-l...
- https://github.com/github/gh-ost/blob/master/doc/throttle.md
- https://github.com/github/gh-ost/blob/master/doc/why-trigger...
At PlanetScale I also integrated VReplication into the Online DDL flow. This comment is far too short to explain how VReplication works, but thankfully we again have some docs:
- https://vitess.io/docs/user-guides/schema-changes/ddl-strate... (and really see entire page, there's comparison between the different tools)
- https://vitess.io/docs/design-docs/vreplication/
- or see this self tracking issue: https://github.com/vitessio/vitess/issues/8056#issue-8771509...
Not to leave you with only a bunch of reading material, I'll answer some questions here:
> Can you elaborate? How? Do they run on another servers? Or are they waiting on a queue change waiting to be applied? If they run on different servers, what they run there, since AFAIK the migration is only DDL, there's no data?
The way all schema change tools mentioned above work is by creating a shadow aka ghost table on the same primary server where your original table is located. By carefully both copying data from original table as well as tracking ongoing changes to the table (whether by utilizing triggers or by tailing the binary logs), and using different techniques to mitigate conflicts between the two, the tools populate the shadow table with up-to-date data from your original table.
This can take a long time, and requires an extra amount of space to accommodate the shadow table (both time and space are also required by "natural" ALTER TABLE implementations in DBs I'm aware of).
With non-trigger solutions, such as gh-ost and VReplication, the tooling have almost ocmplete control over the pace. Given load on the primary server or given increasing replication lag, they can choose to throttle or completely halt execution, to resume later on when load has subsided. We have used this technique specifically at GitHub to run the largest migrations on our busiest tables at any time of the week, including at peak traffic, and this has show to pose little to no impact to production. Again, these techniques are universally used today by almost all large scale MySQL players, including Facebook, Shopify, Slack, etc.
> who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running, on the main server (apparently not) or on a shadow server (apparently either since migrations have no data)? Actually you mention "when your production traffic gets too high". What is "high", can you quantify?
The tool (or Vitess if you will, or PlanetScale in our discussion) will throttle based on continuously collecting metrics. The single most important metric is replication lag, and we found that it predicts load more than any other matric, by far. We throttle at 1sec replication lag. A secondary metric is the number of concurrent executing threads on the primary; this is mroe improtant for pt-online-schema-change, but for gh-ost and VReplication, given their nature of single-thread writes, we found that the metric is not very important to throttle on. It is also trickier since the threshold to throttle at depends on your time of day, particular expected workload etc.
> We run customers that do dozens to thousands of transactions per second. Is this high enough?
The tooling are known to work well with these transaction rates. VReplication and gh-ost will add one more transaction at a time (well, two really, but 2nd one is book-keeping and so low volume that we can neglect it); the transactions are intentionally kept small so as to not overload the transaction log or the MVCC mechanism; rule of thumb is to only copy 100 rows at a time, so exepect possibly millions of sequential such small transaction on a billion row table.
> Will their migrations ever run, or will wait for very long periods of time, maybe forever?
Some times, if the load is so very high, migrations will throttle more. At other times, they will push as fast as they can while still keeping to low replication lag threshold. In my experience a gh-ost or vreplication migration is normally good to run even on the busiest times. If a database system is such that it _always_ has substantial replication lag, such that a migration cannot complete in a timely manner, then I'd say the database system is beyond its own capacity anyway, and should be optimized/sharded/whatever.
> How is this possible? Where the migration is running, then? A shadow table, shadow server... none?
So I already mentioned the ghost table. And then, SELECTs are non blocking on the original table.
> What's cut-over?
Cut-over is what we call the final step of the migration: flipping the tables. Specifically, moving away your original table, and renaming the ghost table in its place. This requires a metadata lock, and is the single most critical part of the schema migration, for any tooling involved. This is where something as to give. Tooling such as gh-ost and pt-online-schema-change acquire a metadata lock such that queries are blocked momentarily, until cut-over is complete. With very high load the app will feel it. With extremely high load the database may not be able to (or may not be configured to) accommodate so many blocked queries, and app will see rejections. For low volume load apps may not even notice.
I hope this helps. Obviously this comment cannot accommodate so much more, but hopefully the documentation links I provided are of help.
InfluxDB

www.influxdata.com
sponsored

Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
vitess

60 17,809 9.9 Go

Vitess is a database clustering system for horizontal scaling of MySQL.

- gh-ost
I authored the original schema change tool, oak-online-alter-table https://shlomi-noach.github.io/openarkkit/oak-online-alter-t..., which is no longer supported, but thankfully I did invest some time in documenting how it works. Similarly, I co-designed and was the main author for gh-ost, https://github.com/github/gh-ost, as part of the database infrastructure team at GitHub. We developed gh-ost because the existing schema change tools could not cope with our particular workloads. Read this engineering blog: https://github.blog/2016-08-01-gh-ost-github-s-online-migrat... to get better sense of what gh-ost is and how it works. I in particular suggest reading these:
- https://github.com/github/gh-ost/blob/master/doc/cheatsheet....
- https://github.com/github/gh-ost/blob/master/doc/cut-over.md
- https://github.com/github/gh-ost/blob/master/doc/subsecond-l...
- https://github.com/github/gh-ost/blob/master/doc/throttle.md
- https://github.com/github/gh-ost/blob/master/doc/why-trigger...
At PlanetScale I also integrated VReplication into the Online DDL flow. This comment is far too short to explain how VReplication works, but thankfully we again have some docs:
- https://vitess.io/docs/user-guides/schema-changes/ddl-strate... (and really see entire page, there's comparison between the different tools)
- https://vitess.io/docs/design-docs/vreplication/
- or see this self tracking issue: https://github.com/vitessio/vitess/issues/8056#issue-8771509...
Not to leave you with only a bunch of reading material, I'll answer some questions here:
> Can you elaborate? How? Do they run on another servers? Or are they waiting on a queue change waiting to be applied? If they run on different servers, what they run there, since AFAIK the migration is only DDL, there's no data?
The way all schema change tools mentioned above work is by creating a shadow aka ghost table on the same primary server where your original table is located. By carefully both copying data from original table as well as tracking ongoing changes to the table (whether by utilizing triggers or by tailing the binary logs), and using different techniques to mitigate conflicts between the two, the tools populate the shadow table with up-to-date data from your original table.
This can take a long time, and requires an extra amount of space to accommodate the shadow table (both time and space are also required by "natural" ALTER TABLE implementations in DBs I'm aware of).
With non-trigger solutions, such as gh-ost and VReplication, the tooling have almost ocmplete control over the pace. Given load on the primary server or given increasing replication lag, they can choose to throttle or completely halt execution, to resume later on when load has subsided. We have used this technique specifically at GitHub to run the largest migrations on our busiest tables at any time of the week, including at peak traffic, and this has show to pose little to no impact to production. Again, these techniques are universally used today by almost all large scale MySQL players, including Facebook, Shopify, Slack, etc.
> who will throttle, the migration? But what is the migration? Let's use my example: a column type change requires a table rewrite. So the table rewrite will throttle, i.e. slow down? But where is this table rewrite running, on the main server (apparently not) or on a shadow server (apparently either since migrations have no data)? Actually you mention "when your production traffic gets too high". What is "high", can you quantify?
The tool (or Vitess if you will, or PlanetScale in our discussion) will throttle based on continuously collecting metrics. The single most important metric is replication lag, and we found that it predicts load more than any other matric, by far. We throttle at 1sec replication lag. A secondary metric is the number of concurrent executing threads on the primary; this is mroe improtant for pt-online-schema-change, but for gh-ost and VReplication, given their nature of single-thread writes, we found that the metric is not very important to throttle on. It is also trickier since the threshold to throttle at depends on your time of day, particular expected workload etc.
> We run customers that do dozens to thousands of transactions per second. Is this high enough?
The tooling are known to work well with these transaction rates. VReplication and gh-ost will add one more transaction at a time (well, two really, but 2nd one is book-keeping and so low volume that we can neglect it); the transactions are intentionally kept small so as to not overload the transaction log or the MVCC mechanism; rule of thumb is to only copy 100 rows at a time, so exepect possibly millions of sequential such small transaction on a billion row table.
> Will their migrations ever run, or will wait for very long periods of time, maybe forever?
Some times, if the load is so very high, migrations will throttle more. At other times, they will push as fast as they can while still keeping to low replication lag threshold. In my experience a gh-ost or vreplication migration is normally good to run even on the busiest times. If a database system is such that it _always_ has substantial replication lag, such that a migration cannot complete in a timely manner, then I'd say the database system is beyond its own capacity anyway, and should be optimized/sharded/whatever.
> How is this possible? Where the migration is running, then? A shadow table, shadow server... none?
So I already mentioned the ghost table. And then, SELECTs are non blocking on the original table.
> What's cut-over?
Cut-over is what we call the final step of the migration: flipping the tables. Specifically, moving away your original table, and renaming the ghost table in its place. This requires a metadata lock, and is the single most critical part of the schema migration, for any tooling involved. This is where something as to give. Tooling such as gh-ost and pt-online-schema-change acquire a metadata lock such that queries are blocked momentarily, until cut-over is complete. With very high load the app will feel it. With extremely high load the database may not be able to (or may not be configured to) accommodate so many blocked queries, and app will see rejections. For low volume load apps may not even notice.
I hope this helps. Obviously this comment cannot accommodate so much more, but hopefully the documentation links I provided are of help.
pg_squeeze

2 400 8.3 C

A PostgreSQL extension for automatic bloat cleanup

> I am estimating that your database space isn't MySQL, which is just fine of course.
You are absolutely right :) My background is strongly on Postgres, you can see from my profile more information if you want to.
So yes, I apologize if some of my questions are not applying or become to obvious for cases that are MySQL-based. But for the most part, I believe principles of operation are the same.
> [other comments]
As mentioned, thank you very much for the detailed information. This completes the picture that I was looking for. I will definitely go in more detail for some of the links provided.
This principle of operation is not too different from something I proposed to a Postgres project some time ago (https://github.com/cybertec-postgresql/pg_squeeze/issues/18). This tool indeed is conceptually pretty similar. It's a shame that supporting schema changes is not part of their focus at this point. It wouldn't do throttling either, but it shouldn't be a difficult feature to add, I guess.
For other users here that may be interested in the Postgres world, there are two tools that perform similar operation (creating a shadow table and filling it in the background), but are both focused on rewriting the table to avoid bloat, rather than for doing a schema migration:
* pg_repack (https://reorg.github.io/pg_repack/): the most used one, relies on triggers

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project