GitLab team member here. Thanks for asking.
Incidents come in different types. For example, when an application bug or performance regression is discovered, the response can involve reverting MRs and rolling back releases. The Platform, Delivery group has top-level responsibility for ensuring continuous delivery of the GitLab application software to GitLab SaaS, https://about.gitlab.com/handbook/engineering/infrastructure...
Other incidents may involve hardware or infrastructure failures, or a combination of both, such as an infrastructure failure that renders GitLab application services unavailable. These require cross-functional collaboration among infrastructure, product, engineering, and other teams during the incident.
For a better understanding, it helps to review the incident management handbook: https://about.gitlab.com/handbook/engineering/infrastructure...
Additional helpful information:
- The GitLab.com SaaS production architecture is documented in https://about.gitlab.com/handbook/engineering/infrastructure...
- The Monitoring of GitLab.com handbook provides insights into monitoring workflows, incident management, SLAs, etc. https://about.gitlab.com/handbook/engineering/monitoring/
- Runbooks https://about.gitlab.com/handbook/engineering/infrastructure...
To learn more about the current incident discussed in this HN thread, follow the review issue at https://gitlab.com/gitlab-com/gl-infra/production/-/issues/1...
Lucky for you, I have a shitty, not-well-tested tool to do that.
https://github.com/tylerjgarland/git2git
lol. I used it to migrate my gitlab to github on the last outage.