Launch HN: Rootly (YC S21) – Manage Incidents in Slack

dispatch

20 4,602 9.9 Python

All of the ad-hoc things you're doing to manage incidents today, done for you, and much more!

The open source option from Netflix is quite popular too: https://github.com/Netflix/dispatch

incident-response-docs

8 1,009 3.0 Dockerfile

PagerDuty's Incident Response Documentation.

Cool, thanks for this view.
I'm also intrigued by the text in this launch announcement:
> Our focus in the early days was build a hyper opinionated product to help them follow what we believe are the best practices. Now our product direction is focused on configuration and flexibility, how can we plug Rootly into your already existing way of working and automate it. This has helped our larger enterprise customers be successful with their current processes being automated.
As I have gotten more experience managing complex incidents I've come around to the idea that having a standard process you follow for big issues is somewhat more important than what the process really is.
I loved the PagerDuty response documentation (https://response.pagerduty.com/) not so much because of the specifics but because it suggests they have a culture where there is a well-understood protocol they always try to follow for big problems.
I think about archery and "shot grouping" - once you learn to always land in the same place, you can move your aim to start landing somewhere else.
A number of the things that I see as valuable in incident management involve having responders with a shared set of priorities. Tooling can influence how easy or hard some of these things are, but it's really up to the people to do things like:
* Actually finding and fixing the problem and being sure the fix worked
* Clearly communicating the current user impact to the people who care
* Figuring out who the right responders are, and getting them in the room quickly
* Making one production change at a time with the incident coordinator's signoff, so you know which one helped and when it happened
* Helping the rest of the organization learn from what happened (you may not know what there is to learn)
Do you see room for the tooling company to also provide best-practices training, mentorship, or other kinds of support? That stuff scales less well than a web app but is arguably more important to changing a company's culture in a way that gets better user outcomes.

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

What's your incident response flow?
2 projects | /r/sre | 27 May 2023

SRE - Process to handle incident management
1 project | /r/devops | 7 Sep 2022

What happens if you cannot resolve the issue at hand?
1 project | /r/sysadmin | 25 Aug 2022

Startup guide to incident management
1 project | dev.to | 16 Mar 2022

PagerDuty Postmortem Handbook
1 project | news.ycombinator.com | 7 Dec 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
pagerduty incident-response Documentation oncall team-security
Post date: 7 Jun 2022
