CMOC Checklist
Shift Start
Declaring an Incident
Checklist for starting an incident:
- As the CMOC, post in #incident-management that you are the CMOC. Cross-post to #support_gitlab-com if needed.
- Create the production issue if possible. In Slack:
/start-incident
or, if you have an alert in alerts-general, click the Open Issue button in the thread.
- Create an incident in http://status.io - make sure you check the options to broadcast to Slack, Twitter, etc.
- If you don't have full specifics, create the incident in status.io anyway and send the first tweet with a more generic message such as "We are seeing elevated error rates on GitLab.com". It is better to post early with an "investigating" status than to wait 5 minutes to know more.
- Create a Google Doc from the shared template.
- Update Slack and the status.io incident with links to the issue and the Google Doc.
- Check with the incident team: are they all in the same channel, Google Doc, and Zoom call as needed? Coordinate and consolidate communication.
- Set a timer for 15 minutes to remind yourself to update status.io and tweet
- Start gathering an overall summary and write up an executive summary in the production issue or Google Doc for others in the company.
- Check in with the incident team:
  - Do they need more people or expertise? Broadcast and ask for help as soon as you know it is needed.
- Clear the deck - make sure other changes / teams know an incident is going on
- Clear the deck - cancel other meetings as needed.
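
The 15-minute update cadence from the checklist above can be tracked with a small helper. This is a minimal sketch, not part of any GitLab tooling: the names `UPDATE_INTERVAL`, `next_update_due`, and `is_update_overdue` are hypothetical, and the only value taken from the checklist is the 15-minute interval.

```python
from datetime import datetime, timedelta

# Cadence from the checklist: update status.io and tweet every 15 minutes.
UPDATE_INTERVAL = timedelta(minutes=15)


def next_update_due(last_update: datetime) -> datetime:
    """Return the time by which the next public update should go out."""
    return last_update + UPDATE_INTERVAL


def is_update_overdue(last_update: datetime, now: datetime) -> bool:
    """Return True once the update interval has elapsed since the last update."""
    return now >= next_update_due(last_update)


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    last_update = datetime(2024, 1, 1, 12, 0)
    now = datetime(2024, 1, 1, 12, 10)
    print("Next update due:", next_update_due(last_update))
    print("Overdue:", is_update_overdue(last_update, now))
```

In practice a phone or Slack reminder serves the same purpose; the point is to reset the clock every time an update is posted, not only when the incident is declared.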
Taking Ownership of an Incident
The CMOC