Zebrium Integration

Zebrium uses unsupervised machine learning on logs and metrics to automatically catch the “leading edge” of critical application and system problems. This lets it find problems earlier than traditional monitoring and logging tools and shrinks the time to resolution. It is specifically designed to detect related problems that impact multiple services while minimizing alert noise.

What does Opsgenie offer Zebrium users?

Opsgenie provides a two-way integration with Zebrium. Use the Zebrium Integration to forward Zebrium incidents to Opsgenie. Opsgenie determines the right people to notify based on on-call schedules, notifies them via email, text messages (SMS), phone calls, and iOS & Android push notifications, and escalates alerts until they are acknowledged or closed.
This document describes the basic functionality of the integration, how to configure it, and details of the data exchanged between Opsgenie and Zebrium.

Functionality of the integration

  1. When an incident is created in Zebrium, it creates an alert in Opsgenie.
  2. If Send alert details to Zebrium for Opsgenie Alerts is enabled, Opsgenie sends alert details to Zebrium. Zebrium correlates those details with its Autonomous Incident Detection and Root Cause by looking across logs and metrics. The Opsgenie incident is then updated, via the Opsgenie API, with the Zebrium incident details and likely root cause.

Add Zebrium Integration in Opsgenie

  1. Please create an Opsgenie account if you haven't done so already.
  2. Go to Opsgenie's Zebrium Integration page.

🚧

For Free and Essentials plans, you can only add integrations from the Team Dashboard. Use the alternative instructions given below to add this integration.

  3. Specify who is notified of Zebrium notifications using the Teams field. Autocomplete suggestions are provided as you type.

📘

As an alternative to steps 2 and 3, add the integration from the Team Dashboard of the team that will own it. To add an integration directly to a team, navigate to the Team Dashboard and open the Integrations tab. Click Add Integration and select the integration that you would like to add.

  4. Copy the integration URL, which includes Opsgenie's endpoint and the API key as a query parameter. It looks something like: {{opsgenie URL}}?apiKey={{apiKey}}.
  5. Click Save Integration.
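
Before pasting the copied URL anywhere else, it can help to confirm that the apiKey query parameter survived the copy. The following is a minimal sketch, not part of either product: the function name `has_api_key`, the HTTPS requirement, and the `opsgenie.example` host are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlsplit


def has_api_key(url: str) -> bool:
    """Return True if url is an HTTPS URL carrying a non-empty apiKey parameter."""
    parts = urlsplit(url)
    # parse_qs drops blank values by default, so "apiKey=" yields no entry.
    keys = parse_qs(parts.query).get("apiKey", [])
    # Requiring HTTPS is an assumption of this sketch, not a documented rule.
    return parts.scheme == "https" and bool(parts.netloc) and any(keys)


# Placeholder URLs; substitute the value copied from the integration page.
print(has_api_key("https://opsgenie.example/zebrium?apiKey=placeholder-key"))
print(has_api_key("https://opsgenie.example/zebrium"))
```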

Configuration in Zebrium

  1. In Zebrium, click the settings icon in the top right corner, near your username.
  2. Select Outbound Alerts from the dropdown menu.
  3. Click the Create Outbound Alert button.
  4. Select Webhook from the dropdown.
  5. Select Alert On: zebrium_incident
  6. Paste the copied Opsgenie URL, including the apiKey parameter, into "Webhook Url". It looks something like: {{opsgenie URL}}?apiKey={{apiKey}}.
  7. Select Authentication: NONE and click Create.
  8. Your webhook is now created, and you will start receiving alerts in Opsgenie whenever an incident is created in Zebrium.
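
To see roughly what such a webhook call looks like, the sketch below builds an HTTP request using only the Python standard library. It assumes the webhook is delivered as a JSON body over POST, which this page does not state explicitly. The URL, the trimmed payload, and the name `build_webhook_request` are placeholders, and there is no guarantee that Opsgenie accepts a payload this small.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given webhook URL."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL and a heavily trimmed payload, for illustration only.
request = build_webhook_request(
    "https://opsgenie.example/zebrium?apiKey=placeholder-key",
    {"event_type": "zebrium_incident", "incident_id": "test-incident"},
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```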

Sample payload sent from Zebrium to Opsgenie

For more details on the Zebrium webhook payload, see the Zebrium docs.

{
  "event_type": "zebrium_incident",
  "customer_name": "Acme",
  "deployment_name": "mydeployment",
  "incident_group": "prod",
  "incident_id": "0005edb2-2a5d-65b0-0200-007000013af2",
  "incident_url": "https://portal03.zebrium.com/0/incidents/0005edb2-2a5d-65b0-0200-007000013af2",
  "incident_epoch": 1591419557878,
  "incident_epoch_ts": "2020-06-06T04:59:17.878000Z",
  "incident_local_timestamp": "2020-06-06T04:59:17.878000",
  "incident_local_utcoffset": "+0000",
  "incident_hallmark_event": {
    "event_uuid": "0005edb2-2a6a-f4b0-0200-007000013b46",
    "ze_uid": "123df45e6dcb56a",
    "event_text": "2020-06-06 04:59:18,718 WARN  [UpmScheduler:thread-1]  com.atlassian.upm.pac.PacClientImpl Update check request may take longer because of the number of add-ons",
    "host": "host005",
    "log_name": "bitbkt",
    "severity": "Warning",
    "severity_num": 4,
    "app": null,
    "container_name": null,
    "namespace_name": "default",
    "incident_group": "prod",
    "epoch": 1591419558718,
    "epoch_ts": "2020-06-06T04:59:18.718000Z",
    "local_timestamp": "2020-06-06T04:59:18.718000",
    "local_utcoffset": "+0000",
    "event_meta_data": {
        "host": "host005",
        "pod_name": "bitbucket_master_76de32ac-86d3"
    }
  },
  "incident_events": [
    {
      "event_uuid": "0005edb2-2a5d-65b0-0200-007000013af2",
      "ze_uid": "1267ad231d8df",
      "event_text": "2020-06-06 04:59:17,878 INFO  [spring-startup]  c.a.u.c.l.PluginSettingsAuditLogService Thu Jun 25 04:59:17 PDT 2019 Bitbucket: Successfully started the Universal Plugin Manager",
      "host": "host005",
      "log_name": "bitbkt",
      "severity": "Informational",
      "severity_num": 6,
      "app": null,
      "container_name": null,
      "namespace_name": "default",
      "incident_group": "prod",
      "epoch": 1591419557878,
      "epoch_ts": "2020-06-06T04:59:17.878000Z",
      "local_timestamp": "2020-06-06T04:59:17.878000",
      "local_utcoffset": "+0000",
      "event_meta_data": {
          "host": "host005",
          "pod_name": "bitbucket_master_76de32ac-86d3"
      }
    },
    {
      "event_uuid": "0005edb2-2a5d-65b0-0200-007000013af6",
      "ze_uid": "716def325f89",
      "event_text": "2020-06-06 04:59:17,878 INFO  [spring-startup]  c.a.p.c.p.l.ConnectPluginEnabledHandler Got the last lifecycle event... Time to get started!",
      "host": "host005",
      "log_name": "bitbkt",
      "severity": "Informational",
      "severity_num": 6,
      "app": null,
      "container_name": null,
      "namespace_name": "default",
      "incident_group": "prod",
      "epoch": 1591419557878,
      "epoch_ts": "2020-06-06T04:59:17.878000Z",
      "local_timestamp": "2020-06-06T04:59:17.878000",
      "local_utcoffset": "+0000",
      "event_meta_data": {
          "host": "host005",
          "pod_name": "bitbucket_master_76de32ac-86d3"
      }
    },
    {
      "event_uuid": "0005edb2-2a5d-7550-0200-007000013af9",
      "ze_uid": "128fde6ab4f567",
      "event_text": "2020-06-06 04:59:17,882 DEBUG [spring-startup]  c.a.b.i.m.u.DefaultMirrorService Validating that all configured mirror servers are still installed",
      "host": "host005",
      "log_name": "bitbkt",
      "severity": "Debug",
      "severity_num": 7,
      "app": null,
      "container_name": null,
      "namespace_name": "default",
      "incident_group": "prod",
      "epoch": 1591419557882,
      "epoch_ts": "2020-06-06T04:59:17.882000Z",
      "local_timestamp": "2020-06-06T04:59:17.882000",
      "local_utcoffset": "+0000",
      "event_meta_data": {
          "host": "host005",
          "pod_name": "bitbucket_master_76de32ac-86d3"
      }
    }
  ],
  "incident_stats": [
    {
      "stat_reason": "PEAK",
      "stat_name": "system_cpu",
      "stat_ct": 12
    }
  ]
}
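
A consumer of this payload will typically want a few fields rather than the whole document. The sketch below pulls out the incident ID, the start time, and the hallmark event's severity and host. It uses only field names that appear in the sample above. The function name `summarize_incident` is hypothetical, and treating `incident_epoch` as milliseconds since the Unix epoch is an inference from the sample values.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def summarize_incident(payload: dict) -> dict:
    """Reduce a Zebrium incident payload to a handful of headline fields."""
    hallmark = payload.get("incident_hallmark_event") or {}
    events = payload.get("incident_events") or []
    stats = payload.get("incident_stats") or []
    # incident_epoch is assumed to be in milliseconds (inferred from the sample).
    started = UNIX_EPOCH + timedelta(milliseconds=payload["incident_epoch"])
    return {
        "incident_id": payload["incident_id"],
        "started_at": started.isoformat(),
        "hallmark_severity": hallmark.get("severity"),
        "hallmark_host": hallmark.get("host"),
        "event_count": len(events),
        "stat_names": [stat["stat_name"] for stat in stats],
    }


# A cut-down version of the sample payload above.
sample = {
    "incident_id": "0005edb2-2a5d-65b0-0200-007000013af2",
    "incident_epoch": 1591419557878,
    "incident_hallmark_event": {"severity": "Warning", "host": "host005"},
    "incident_events": [{}, {}, {}],
    "incident_stats": [{"stat_reason": "PEAK", "stat_name": "system_cpu", "stat_ct": 12}],
}
print(summarize_incident(sample))
```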

Sample alert in Opsgenie

Configuring Opsgenie to Zebrium Integration (Optional)

To enable Opsgenie to send alert details to Zebrium, configure both Zebrium and Opsgenie.

In Zebrium

  1. In Zebrium, click the settings icon in the top right corner, near your username.
  2. Select Inbound Alerts from the dropdown.
  3. Click the Create Inbound Alert button.
  4. Select Opsgenie as the inbound alert type from the dropdown.
  5. Log in to your Opsgenie account and copy the API key from the Zebrium integration you created above. Return to your Zebrium account and paste the copied API key.
  6. For Opsgenie Data Center Region, choose one of the two options: US or EU.
    To locate the region, log in to your Opsgenie account and look at the URL in your browser’s address bar.
  7. Click the Create button.
  8. Copy the INBOUND WEBHOOK URL.
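
The region lookup in the steps above can be sketched as a small helper. It assumes that EU accounts are served from a hostname containing an `eu` label (for example `app.eu.opsgenie.com`) and that everything else is US. This page does not spell that convention out, so treat it as an assumption and confirm against your own address bar. The name `guess_region` is hypothetical.

```python
from urllib.parse import urlsplit


def guess_region(browser_url: str) -> str:
    """Guess the Opsgenie data center region from a browser URL."""
    host = urlsplit(browser_url).hostname or ""
    # Assumption: an "eu" label in the hostname marks the EU data center.
    return "EU" if "eu" in host.split(".") else "US"


print(guess_region("https://app.eu.opsgenie.com/alert/list"))
print(guess_region("https://app.opsgenie.com/alert/list"))
```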

In Opsgenie

  1. Go to the Opsgenie Integrations page and click the Zebrium integration to modify it.
  2. Make sure Send alert details to Zebrium for Opsgenie Alerts is enabled.
  3. Paste the INBOUND WEBHOOK URL into the Zebrium URL field.
  4. Click Save Integration.

Sample note added by Zebrium for an Incident in Opsgenie

Sample payload sent from Opsgenie to Zebrium

{
      "integrationType": "Zebrium",
      "integrationName": "Zebrium_TeamA",
      "alert": {
        "actions": ["alertActions"],
        "description": "description about the incident",
        "details": {
          "incident-alert-type": "Responder",
          "incident-id": "0391fdb9-e18a-45aa-adb7-eee2f301a01a"
        },
        "source": "source of alert",
        "message": "Zebrium Testing 1",
        "priority": "P3",
        "createdAt": "1615984035634",
        "responders": [
          {
            "name": "Responder1",
            "id": "fc68dad4-ea29-448d-b2c1-f69f341617b0",
            "type": "user"
          }
        ],
        "teams": ["list of Teams"],
        "tags": ["tag1","tag2"],
        "tinyId": "89",
        "alias": "0f9ffdb9-f18a-45aa-adb7-eee2f300a01a_e32a2e37-7520-4b33-80d2-18fa2aa619b4",
        "alertId": "63c048c0-fa6f-3fd6-9729-77b17a12b37a-1615984035634",
        "entity": "",
        "updatedAt": "1615984036313000000",
        "username": "System"
      },
      "action": "Create",
      "integrationId": "2b3c8f5b-f4b4-4f31-9701-202b9048237c",
      "source": {
        "name": "incidentSource",
        "type": "incident"
      }
}
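
In the sample above, `createdAt` and `updatedAt` are both strings, and their lengths suggest different units (13 digits versus 19 digits). The sketch below normalizes such values to UTC datetimes by guessing the unit from the magnitude. That guess is a heuristic inferred from this one sample, not documented behavior, and the name `parse_epoch_string` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_epoch_string(value: str) -> datetime:
    """Convert an epoch string in seconds, milliseconds, or nanoseconds to UTC."""
    number = int(value)
    if number >= 10**17:
        # Magnitude suggests nanoseconds; truncate to microseconds.
        micros = number // 1000
    elif number >= 10**11:
        # Magnitude suggests milliseconds.
        micros = number * 1000
    else:
        # Fall back to seconds.
        micros = number * 1_000_000
    return UNIX_EPOCH + timedelta(microseconds=micros)


# Values taken from the sample payload above.
created = parse_epoch_string("1615984035634")
updated = parse_epoch_string("1615984036313000000")
print(created.isoformat(), updated.isoformat())
```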


