Amazon CloudWatch Integration

Amazon CloudWatch provides monitoring for Amazon Web Services (AWS) and the applications that run on AWS. AWS CloudWatch triggers user-defined alarms which are then dispatched by OpsGenie.

CloudWatch metrics play a critical role in monitoring applications that run on AWS.

What does OpsGenie offer CloudWatch users?

A CloudWatch alarm watches a single metric over a specified time period and executes automated actions based on the value of the watched metric relative to a given threshold. OpsGenie acts as a dispatcher for these alarms: it determines the right people to notify based on on-call schedules, notifies them using email, text messages (SMS), phone calls, and iPhone & Android push notifications, and escalates alerts until they are acknowledged or closed. CloudWatch detects problems, and OpsGenie ensures the right people are working on them.

Functionality of the integration

  • When an alert is created in CloudWatch, an alert is created in OpsGenie automatically through the integration.
  • When the alert is closed in CloudWatch, the related alert is closed in OpsGenie automatically through the integration.

Add CloudWatch Integration in OpsGenie

  1. Please create an OpsGenie account if you haven't done so already.
  2. Go to OpsGenie's CloudWatch Integration page.
  3. Specify who is notified of CloudWatch alerts using the Teams field. Autocomplete suggestions are provided as you type.
  4. Copy the integration URL.
  5. Click Save Integration.

Set Up Alarms on CloudWatch

  1. Create an SNS topic with the name OpsGenie.
  2. Add an HTTPS subscription to the topic with the OpsGenie API endpoint. Use the URL provided on OpsGenie's Integration page for CloudWatch (as shown below).

https://api.opsgenie.com/v1/json/cloudwatch?apiKey=integrationApiKey

Once the SNS subscription to OpsGenie is configured successfully, a confirmation alert is created in OpsGenie.

  3. Create a CloudWatch alarm with any metric and select the SNS topic as the action.

Make sure the Alarm Name on CloudWatch includes the resource and metric names for uniqueness and ease of readability. Sample: SQS Queue chat_webhook messageCount Too High
Make sure that notifications for all three states, ALARM, OK, and INSUFFICIENT (named INSUFFICIENT_DATA in CloudWatch), are sent to OpsGenie. By default, notifications with state ALARM create alerts, and notifications with states OK and INSUFFICIENT close the matching alerts.
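The subscription and alarm steps above can be sketched as request parameters. This is a minimal illustration, not the integration's own tooling: the helper names `subscription_params` and `alarm_params` are hypothetical, the metric settings are borrowed from the sample payload later on this page, and the topic ARN and API key are supplied by the caller.

```python
from urllib.parse import urlencode

# Base endpoint taken from the integration URL shown above.
OPSGENIE_CLOUDWATCH_URL = "https://api.opsgenie.com/v1/json/cloudwatch"


def subscription_params(topic_arn: str, api_key: str) -> dict:
    """Arguments for an HTTPS subscription of the SNS topic to OpsGenie."""
    endpoint = f"{OPSGENIE_CLOUDWATCH_URL}?{urlencode({'apiKey': api_key})}"
    return {"TopicArn": topic_arn, "Protocol": "https", "Endpoint": endpoint}


def alarm_params(name: str, topic_arn: str) -> dict:
    """Arguments for an alarm that notifies the topic on every state change."""
    return {
        "AlarmName": name,
        # Example metric settings (assumed, mirroring the sample payload).
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Statistic": "Average",
        "Period": 900,
        "EvaluationPeriods": 1,
        "Threshold": 5.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # Send all three states to the same topic so that OpsGenie
        # can both create and close alerts.
        "AlarmActions": [topic_arn],             # state ALARM
        "OKActions": [topic_arn],                # state OK
        "InsufficientDataActions": [topic_arn],  # state INSUFFICIENT_DATA
    }
```

With boto3 installed, these dictionaries could be passed along as `boto3.client("sns").subscribe(**subscription_params(...))` and `boto3.client("cloudwatch").put_metric_alarm(**alarm_params(...))`. Keeping them as plain dictionaries makes it easy to check that no state has been left without an action.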

By default, the OpsGenie integration creates an alias for each alert in the form Region - AlarmName. This lets OpsGenie uniquely identify each CloudWatch alarm and create and close the corresponding alerts.
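As an illustration of why the alias pairs create and close notifications, the sketch below builds the alias from a parsed notification. The function name `alert_alias` is hypothetical, and the exact separator is assumed from the "Region - AlarmName" wording above.

```python
def alert_alias(message: dict) -> str:
    """Build the default alias described above from a parsed alarm message."""
    return f"{message['Region']} - {message['AlarmName']}"


# Two notifications for the same alarm: one opening, one recovering.
opened = {"Region": "US - N. Virginia", "AlarmName": "cpuUtilTest",
          "NewStateValue": "ALARM"}
recovered = {"Region": "US - N. Virginia", "AlarmName": "cpuUtilTest",
             "NewStateValue": "OK"}

# Both map to the same alias, so the OK notification can be matched
# to the alert that the ALARM notification created.
same_alert = alert_alias(opened) == alert_alias(recovered)
```

Because the alias depends only on the region and the alarm name, two alarms with the same name in the same region would share an alias, which is one more reason to keep alarm names unique.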

Test Alerts

  1. After setting up notifications for an alarm on CloudWatch, modify the alarm's threshold for testing purposes so that the alarm is triggered.
  2. When the alarm condition is met, AWS CloudWatch passes alarm details to OpsGenie, and OpsGenie notifies the specified users (recipients parameter) based on the recipients’ notification preferences. The OpsGenie alert contains all the relevant information provided by CloudWatch.
  3. Make sure to revert the CloudWatch alarm threshold to its original value once testing is complete.

Sample payload sent from CloudWatch

Create Alert payload:

{
  "Type": "Notification",
  "MessageId": "1cf7a0eb-4179-4181-b15b-ea22c5aa0280",
  "TopicArn": "arn:aws:sns:us-east-1:08931xxxxxx:CloudWatchHTTPAlarms",
  "Subject": "ALARM: \"cpuUtilTest\" in US - N. Virginia",
  "Message": "{\"AlarmName\":\"cpuUtilTest\",\"AlarmDescription\":\"testing alarms for cpu utilization\",\"AWSAccountId\":\"08931xxxxxx\",\"NewStateValue\":\"ALARM\",\"OldStateValue\":\"OK\",\"NewStateReason\":\"Threshold Crossed: 1 datapoint (5.199) was greater than or equal to the threshold (5.0).\",\"StateChangeTime\":\"2012-08-05T22:31:25.524+0000\",\"Region\":\"US - N. Virginia\",\"Trigger\":{\"MetricName\":\"CPUUtilization\",\"Namespace\":\"AWS/EC2\",\"Statistic\":\"AVERAGE\",\"Unit\":null,\"Dimensions\":[{\"name\":\"InstanceId\",\"value\":\"i-39e64c5f\"}],\"Period\":900,\"EvaluationPeriods\":1,\"ComparisonOperator\":\"GreaterThanOrEqualToThreshold\",\"Threshold\":5.0}}",
  "Timestamp": "2012-08-05T22:31:30.673Z",
  "SignatureVersion": "1",
  "Signature": "XrsO2wtE0b+ofOl1ZxxxxxxxxlimTUg+rV4U9RmNSSBEdlmyWvtGgpjebsmNv1wkjUsBQOJZjZnpZp5FBn6quAn3twNdRMmMLf15lv6ESbYFxxxxxxxx0vmjj/ZLwiH9Pr/cxVYxxxxxxxYn8w6g=",
  "SigningCertURL": "https://sns.us-east-1.amazonaws.com/SimpleNotificationService-f3ecfb7224c72xxxxxxxxx6de52f.pem",
  "UnsubscribeURL": "https://sns.us-east-1.amazonaws.com/?Action=Unsubscribe&SubscriptionArn=arn:aws:sns:us-east-1:08931xxxxxxxx:CloudWatchHTTPAlarms:1841c5ca-ddda-450e-bbfb-xxxxxxxxx"
}

This payload is parsed by OpsGenie as:

{
  "TopicArn": "arn:aws:sns:us-east-1:08931xxxxxxx:CloudWatchHTTPAlarms",
  "AlarmName": "cpuUtilTest",
  "Subject": "ALARM: \"cpuUtilTest\" in US - N. Virginia",
  "AlarmDescription": "testing alarms for cpu utilization",
  "NewStateReason": "Threshold Crossed: 1 datapoint (5.199) was greater than or equal to the threshold (5.0).",
  "NewStateValue": "ALARM",
  "OldStateValue": "OK",
  "StateChangeTime": "2012-08-05T22:31:25.524+0000",
  "Region": "US - N. Virginia",
  "Trigger": {
    "MetricName": "CPUUtilization",
    "Namespace": "AWS/EC2",
    "Statistic": "AVERAGE",
    "Unit": null,
    "Dimensions": [
      {
        "name": "InstanceId",
        "value": "i-39e64c5f"
      }
    ],
    "Period": 900,
    "EvaluationPeriods": 1,
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    "Threshold": 5
  }
}
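The two listings above differ mainly in that the alarm details arrive as a JSON string inside the Message field of the SNS envelope. A minimal sketch of that two-step decoding follows. The function name `parse_cloudwatch_notification` is hypothetical and this is not OpsGenie's actual parser. Unlike the listing above, it keeps every alarm field, including AWSAccountId.

```python
import json


def parse_cloudwatch_notification(body: str) -> dict:
    """Flatten an SNS notification carrying a CloudWatch alarm into one dict."""
    envelope = json.loads(body)
    # "Message" holds the alarm details as a JSON-encoded string,
    # so it needs a second decoding pass.
    alarm = json.loads(envelope["Message"])
    fields = {"TopicArn": envelope["TopicArn"], "Subject": envelope["Subject"]}
    fields.update(alarm)
    return fields


# Trimmed-down envelope with the same shape as the sample payload.
sample_body = json.dumps({
    "Type": "Notification",
    "TopicArn": "arn:aws:sns:us-east-1:08931xxxxxx:CloudWatchHTTPAlarms",
    "Subject": 'ALARM: "cpuUtilTest" in US - N. Virginia',
    "Message": json.dumps({
        "AlarmName": "cpuUtilTest",
        "NewStateValue": "ALARM",
        "OldStateValue": "OK",
        "Region": "US - N. Virginia",
        "Trigger": {"MetricName": "CPUUtilization", "Threshold": 5.0},
    }),
})

parsed = parse_cloudwatch_notification(sample_body)
print(parsed["AlarmName"], parsed["NewStateValue"])
```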

Our AWS CloudWatch Integration was updated on December 8, 2015 to cover CloudWatch events with an INSUFFICIENT state. Since this update, events with state ALARM create alerts, and events with states OK and INSUFFICIENT close the matching alerts by default. Therefore, please make sure that the CloudWatch events directed to OpsGenie also include the state INSUFFICIENT.

If you use advanced settings for the AWS CloudWatch integration, set up the filtering condition rules as follows so that the integration works as described above:

  • The filtering rule for Create Alert should match if NewStateValue is ALARM.
  • The filtering rule for Close Alert should match if NewStateValue is not ALARM.
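The two rules above amount to a single branch on NewStateValue. The sketch below states that logic in code. The function name `action_for` and the labels "create" and "close" are hypothetical stand-ins for the Create Alert and Close Alert actions, not OpsGenie identifiers.

```python
def action_for(new_state_value: str) -> str:
    """Pick the alert action for a notification, following the rules above."""
    # ALARM opens an alert; any other state (OK, INSUFFICIENT) closes it.
    return "create" if new_state_value == "ALARM" else "close"


actions = {state: action_for(state) for state in ("ALARM", "OK", "INSUFFICIENT")}
```

Matching Close Alert on "not ALARM", rather than on an explicit list of states, means that every state other than ALARM closes the alert.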
