Incident Investigation

📘 Incident Investigation and all its features are available only on Standard and Enterprise plans.

Opsgenie’s Incident Investigation tool is a dashboard for tracking the code changes, in the form of deployments, that led up to an incident. You can drill down into each deployment to inspect the commits it included, then mark suspicious commits or whole deployments as potential causes of the incident. Quickly detecting problematic code changes, and identifying their authors so you can bring them into the resolution process, helps lower your Mean Time to Resolution.

Configuration

Incident Investigation requires some configuration before you can use it. Opsgenie helps you resolve incidents faster by connecting to your Bitbucket workspaces. To enable the Incident Investigation view, map the Bitbucket repositories in the workspaces linked to your Opsgenie account to the services you own. Once the repositories are mapped, you can track the deployments of code changes leading up to an incident on a service.

To enable the Incident Investigation feature, complete the following configuration:

  1. Link your Bitbucket workspaces to Opsgenie
  2. Map repositories under linked workspaces to your owned services

Once the configuration is complete, you can open the Incident Investigation view, which tracks all deployments made to the impacted services. The view is accessed from the incident’s detail page. Commits or entire deployments that you mark as potential causes in the Incident Investigation view are listed on that same detail page.
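Because the view is built around an incident's impacted services, it can be useful to check which services an incident lists before opening it. The following is a minimal sketch, not an official client. It assumes the Get Incident endpoint of Opsgenie's Incident API (`GET /v1/incidents/{identifier}` with a `GenieKey` authorization header) and an `impactedServices` field in the response; confirm both against the API reference for your account. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: default API host. EU-hosted accounts use a different host.
API_BASE = "https://api.opsgenie.com"


def build_incident_request(identifier: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a single incident."""
    safe_id = urllib.parse.quote(identifier, safe="")
    url = f"{API_BASE}/v1/incidents/{safe_id}"
    return urllib.request.Request(
        url, headers={"Authorization": f"GenieKey {api_key}"}
    )


def impacted_service_ids(payload: dict) -> list:
    """Extract impacted service IDs from a decoded response body.

    Assumption: the IDs live under data.impactedServices.
    Returns an empty list when the field is missing.
    """
    return list(payload.get("data", {}).get("impactedServices", []))


def fetch_impacted_service_ids(identifier: str, api_key: str) -> list:
    """Send the request and return the impacted service IDs."""
    request = build_incident_request(identifier, api_key)
    with urllib.request.urlopen(request) as response:
        return impacted_service_ids(json.load(response))
```

An incident that returns an empty list here has no impacted services attached, so there would be no deployments to investigate until services are added to it.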

Adding potential causes

  1. Connect your Bitbucket workspaces with Opsgenie.
  2. Map your Bitbucket repos to your Opsgenie services.
  3. Identify the impacted services and add them to your incident so that you can investigate the deployments made to them.
  4. Go to the incident’s detail page and click Investigate.
  5. Work on the deployment dashboard: select deployments and check the code changes on the impacted services or their related services.

The Deployment History Graph at the top of the Incident Investigation view presents all deployments and past incidents related to the impacted services and their related services. To learn more, see our document on using the Deployment History Graph of the Incident Investigation view.

  6. Select commits or the entire deployment as a potential cause from the deployment details panel at the bottom of the Deployment History Graph.
     To do that, hover over the deployment or commit and click the Select deployment option.
  7. Add the selected commits or deployments as potential causes by clicking the Add potential causes button at the bottom of the Incident Investigation view.
  8. After you add the deployments or commits as potential causes, they are listed on the incident details page.
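The workflow above can be pictured as a small data model: filter the deployments made to impacted services shortly before the incident, then mark a commit or a whole deployment as a potential cause. This is only an illustrative sketch and not how Opsgenie implements the view. All class and function names are hypothetical, and the 7-day lookback window is an assumption chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:
    deployment_id: str
    service: str
    deployed_at: datetime
    commits: tuple[str, ...] = ()


@dataclass
class Incident:
    created_at: datetime
    impacted_services: set[str]
    # Each entry is (deployment_id, commit); commit is None when the
    # whole deployment is marked as a potential cause.
    potential_causes: set[tuple[str, str | None]] = field(default_factory=set)


def deployments_leading_up_to(
    incident: Incident,
    deployments: list[Deployment],
    lookback: timedelta = timedelta(days=7),  # assumed window
) -> list[Deployment]:
    """Return deployments to impacted services inside the window, newest first."""
    start = incident.created_at - lookback
    hits = [
        d for d in deployments
        if d.service in incident.impacted_services
        and start <= d.deployed_at <= incident.created_at
    ]
    return sorted(hits, key=lambda d: d.deployed_at, reverse=True)


def add_potential_cause(
    incident: Incident, deployment: Deployment, commit: str | None = None
) -> None:
    """Mark one commit, or the whole deployment when commit is None."""
    if commit is not None and commit not in deployment.commits:
        raise ValueError(f"{commit} is not part of {deployment.deployment_id}")
    incident.potential_causes.add((deployment.deployment_id, commit))
```

Storing causes as `(deployment_id, commit)` pairs keeps a marked commit distinct from a marked deployment, mirroring the choice between selecting commits and selecting the entire deployment.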

Updating a potential cause

  1. After adding specific commits or whole deployments as potential causes, you can update them by clicking the Investigate button in the incident details.
  2. This opens the Incident Investigation view, where you can select deployments from the Deployment History Graph.

Deployment nodes that represent deployments, or deployments with commits, already added as potential causes are marked with orange dots. To learn more, see our document on using the Deployment History Graph and the Incident Investigation view.

  3. Add or remove a commit, or even the entire deployment, by hovering over the deployment details displayed below the Deployment History Graph.
  4. Click the Update potential causes option at the bottom of the Incident Investigation view.
  5. When you go back to the incident detail, you can see the updated list of potential causes.

Removing a potential cause

  1. To remove a potential cause, go to the Incident Investigation view from the incident detail, then click the relevant deployment in the Deployment History Graph.
  2. To remove a deployment, select it and click the Remove button on the summary panel. To remove a commit, hover over the relevant commit in the same panel and click the Remove button that appears.
  3. After making your removals, click the Update potential causes button at the bottom.

Redeploying selected deployments

Incident Investigation not only helps you detect the causes of what went wrong but also lets you remediate incidents through redeployments and rollbacks to a previous known stable state.

  1. On the Deployment History Graph of the Incident Investigation view, select the deployment node that you wish to redeploy.
  2. Click the Redeploy button in the Deployment summary panel.
  3. You are redirected to Bitbucket's redeployment modal, where you can review the details of the redeployment you're about to perform and decide whether to continue. Click the Redeploy button to proceed with the redeployment.
  4. Back in the Incident Investigation view, the redeployed deployment appears as a new node in the Deployment History Graph.

Updated 20 days ago

