Incident-response-docs Alternatives

Similar projects and alternatives to incident-response-docs based on common topics and language

kubernetes

Mentions: 661 | Stars: 106,923 | Activity: 10.0 | Language: Go

Production-Grade Container Scheduling and Management
dispatch

Mentions: 20 | Stars: 4,602 | Activity: 9.9 | Language: Python

All of the ad-hoc things you're doing to manage incidents today, done for you, and much more!
pagerduty2zabbix

Mentions: 1 | Stars: 4 | Activity: 7.9 | Language: Perl

Update Zabbix events with PagerDuty incident changes via WebHook (2-way ack).
postmortem-docs

Mentions: 2 | Stars: 66 | Activity: 0.0 | Language: Dockerfile

PagerDuty's Public Postmortem Documentation
security-training

Mentions: 2 | Stars: 402 | Activity: 0.0 | Language: Shell

Public version of PagerDuty's employee security training courses.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a better incident-response-docs alternative or higher similarity.

incident-response-docs reviews and mentions

Posts with mentions or reviews of incident-response-docs. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-22.

It's not always DNS – unless it is
2 projects | news.ycombinator.com | 22 Dec 2023

I can’t read the blog, but PagerDuty provides a good standard for handling incidents: https://response.pagerduty.com/
What's your incident response flow?
2 projects | /r/sre | 27 May 2023

If you’re after some general advice, PagerDuty’s response guide is evergreen content, as is our practical guide to incident management.
SRE - Process to handle incident management
1 project | /r/devops | 7 Sep 2022

PagerDuty has shared their process and has some great resources: https://response.pagerduty.com/
What happens if you cannot resolve the issue at hand?
1 project | /r/sysadmin | 25 Aug 2022
Launch HN: Rootly (YC S21) – Manage Incidents in Slack
2 projects | news.ycombinator.com | 7 Jun 2022

Cool, thanks for this view.
I'm also intrigued by the text in this launch announcement:
> Our focus in the early days was to build a hyper-opinionated product to help them follow what we believe are the best practices. Now our product direction is focused on configuration and flexibility: how can we plug Rootly into your already existing way of working and automate it? This has helped our larger enterprise customers be successful with their current processes being automated.
As I have gotten more experience managing complex incidents I've come around to the idea that having a standard process you follow for big issues is somewhat more important than what the process really is.
I loved the PagerDuty response documentation ( https://response.pagerduty.com/ ) not so much because of the specifics but because it suggests they have a culture where there is a well-understood protocol they always try to follow for big problems.
I think about archery and "shot grouping" - once you learn to always land in the same place, you can move your aim to start landing somewhere else.
A number of the things that I see as valuable in incident management involve having responders with a shared set of priorities. Tooling can influence how easy or hard some of these things are, but it's really up to the people to do things like:
* Actually finding and fixing the problem and being sure the fix worked
* Clearly communicating the current user impact to the people who care
* Figuring out who the right responders are, and getting them in the room quickly
* Making one production change at a time with the incident coordinator's signoff, so you know which one helped and when it happened
* Helping the rest of the organization learn from what happened (you may not know what there is to learn)
Do you see room for the tooling company to also provide best-practices training, mentorship, or other kinds of support? That stuff scales less well than a web app but is arguably more important to changing a company's culture in a way that gets better user outcomes.
Startup guide to incident management
1 project | dev.to | 16 Mar 2022

There's an enormous amount of content available for organisations looking to import 'gold standard' incident management best practices -- things like the PagerDuty Response site, the Atlassian incident management best practices, and the Google SRE book. All of these are fantastic resources for larger companies, but as a newly founded startup, you're left to figure out which bits are important and which bits you can defer until later on.
Diary of a First-Time On-Call Engineer
1 project | news.ycombinator.com | 14 Mar 2022

Career-long sysadmin/SRE/SRE Team Lead, here. I've worked at large shops (10-30 million end users) and some shops where 99.98% is the SLA to prevent millions of dollars of losses in supply chain.
First of all, I appreciated this diary because Anna took the task with a positive attitude and as a learning experience. Thanks for writing this. To see an old problem through new eyes is inspiring.
I have numerous, "hot-take" criticisms of your current organization's practices, but I'm not sure I have all the context yet. The one suggestion I will make is: if you're not already using it - clone https://github.com/pagerduty/incident-response-docs/ and modify it to meet your organization's needs. Then, have it blessed as policy by management and train SREs and Devs on it.
To the other comments: I see there's a lot of people here who say they'd never do the SRE job, or return to doing it. I'm not discounting your fear or feelings of burnout. Been there. But, hear me out:
DevOps is not just about CI/CD pipelines and monitoring and PagerDuty. It's about having a culture where developers don't throw operational or security poop over a wall of confusion at sysadmin types as well as at their peers. This kind of organizational dysfunction can be devastating to a business.
DevOps at its best is about empathy. One of the best places I ever worked was filled with developers who had true empathy. They realized that an error or omission in their work could wake up their Ops team at stupid o'clock in the morning, repeatedly - leading to all the things that drive SREs and on-call folks literally insane. They practised strict TDD.
These developers volunteered to be second-tier on call after the ops team did triage, out of the kindness of their hearts for their coworkers. Management also led a culture of defending time to find permanent solutions to drive measured improvements in SLI.
SRE isn't about waking up at stupid o'clock every night to press buttons. It's about having a culture of driving permanent fixes and compensating by using cost-effective and appropriate cloud architectures. It's also about leading the working agreements with engineering teams to do blameless post-incident retros together and making the work bring your teams closer instead of pushing them apart.
I can't help but take away that a lot of you feel like On-Call heroics are what SRE is about. It's more difficult than that, but also less stressful, and simultaneously more rewarding when you get it right.

Stats

Basic incident-response-docs repo stats

Mentions

Stars: 1,009

Activity: 3.0

Last Commit: 8 months ago

PagerDuty/incident-response-docs is an open source project licensed under the Apache License 2.0, which is an OSI-approved license.

The primary programming language of incident-response-docs is Dockerfile.
