Nagios XI Plugin

Opsgenie provides a Nagios XI integration package that uses the full capabilities of Opsgenie, including rich alerts with charts, automated closing of alerts, and bi-directional integration with Nagios. When installed and appropriately configured, the plugin creates Opsgenie alerts when a Nagios XI notification is sent and, using the Marid utility, attaches extra information (status, alert histogram, trends, etc.) related to the relevant host or service. This extra information can be viewed with the alert directly from the Opsgenie apps.

Installation

The steps below describe how to integrate Opsgenie and Nagios using the Opsgenie Nagios XI integration plugin. Note that you may need to slightly alter these instructions depending on your exact Linux distribution and your Nagios XI configuration.

📘

If you're using the Lamp-based Nagios XI plugin, back up all your configurations first. Then uninstall the old plugin and install the new one.

Prerequisites

Packages are provided for Red Hat and Debian based Linux distributions.

  • Red Hat based Linux distributions
  • Debian based Linux distributions

Download Opsgenie Nagios XI Plugin

For Red Hat Based Distributions

🚧

During upgrades, the rpm package does not overwrite your existing configuration. It saves the new default configuration file as opsgenie-integration.conf.rpmnew. You can find more information about rpm upgrade config file handling from here.

🚧

If you want to update from version 201X-XX-XX to 2.X.X, you must add the --force parameter. For example:

rpm -U --force opsgenie-integration-<your_version>.rpm

We suggest that you back up your configuration files before updating!
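
A minimal sketch of that backup step, assuming the configuration lives at /etc/opsgenie/conf/opsgenie-integration.conf (the path used later in this document). The backup_conf helper and its timestamp suffix are illustrative and not part of the plugin.

```shell
# backup_conf is a hypothetical helper, not shipped with the plugin.
# It copies a file next to itself with a timestamp suffix and prints
# the path of the copy.
backup_conf() {
    src="$1"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    dest="$src.$(date +%Y%m%d%H%M%S).bak"
    cp -p "$src" "$dest" && echo "$dest"
}

# Usage (as root, before upgrading):
#   backup_conf /etc/opsgenie/conf/opsgenie-integration.conf
#   rpm -U --force opsgenie-integration-<your_version>.rpm
```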

For Debian Based Distributions

Add NagiosXI Integration in OpsGenie

To add the Nagios XI integration in Opsgenie, go to the Opsgenie Nagios XI Integration page.

Click the "Save Integration" button to save the integration. An "API Key" is generated for the integration. Nagios XI uses this key to authenticate with Opsgenie and to specify the integration that should process Nagios XI alerts.


Opsgenie NagiosXI Plugin Configuration

The plugin uses a Go executable (included in the plugin as nagios2opsgenie) to create, acknowledge, and close alerts in Opsgenie. Nagios XI is configured to run this executable on events.

Configuration parameters (name, whether it is mandatory to fill, and description):

apiKey (mandatory): Copy the API key from the Nagios XI integration you created above. nagios2opsgenie uses this key to authenticate to Opsgenie. The API key also identifies the integration configuration that should be used to process alerts.
recipients (optional): Specifies who should be notified for the Nagios XI alerts. This field sets the default recipients value. Recipients can be modified to route different alerts to different people and teams on the Opsgenie Nagios XI integration, Advanced Settings page. Recipients can be users, groups, escalations, or schedules. If you did not set recipients in the integration, this field is required.
teams (optional): Specifies which teams should be notified for the Nagios XI alerts. This field sets the default teams value. It can be modified to route different alerts to different teams on the Opsgenie Nagios XI integration, Advanced Settings page.
tags (optional): Specifies the tags of the alert that is created in Opsgenie.
nagios_server (optional): Identifies the Nagios XI server in Opsgenie; only required when there are multiple Nagios XI servers. Opsgenie uses this field when sending actions executed by users (acknowledge, close, etc.) back to your Nagios XI servers via Marid.
viaMaridUrl (optional): Used to send alerts to Opsgenie through Marid. Enter the host and port of your running Marid instance.
  • Useful when the Nagios server has no internet connection but Marid does.
  • To use this feature, you must run the Marid provided with the Opsgenie Nagios plugin.
  • Marid must be running with its web server enabled (http or https configuration enabled).
  • Marid can run on a separate host; nagios2opsgenie and Marid communicate over basic HTTP.
  • Helps the Nagios server spend less time sending data to Opsgenie, because Marid performs the long-running task asynchronously.
logPath (optional): Specifies the full path of the log file. The default value is /var/log/opsgenie/nagios2opsgenie.log.
nagios2opsgenie.http.proxy.enabled (optional): Enables or disables the external proxy configuration. The default value is false.
nagios2opsgenie.http.proxy.host (optional): The host of the proxy.
nagios2opsgenie.http.proxy.port (optional): The port of the proxy.
nagios2opsgenie.http.proxy.scheme (optional): The proxy connection protocol. It may be http or https, depending on your proxy server. The default value is http.
nagios2opsgenie.http.proxy.username (optional): The proxy authentication username.
nagios2opsgenie.http.proxy.password (optional): The proxy authentication password.
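
Since apiKey is the only mandatory parameter, it can help to confirm that it is actually set before debugging anything else. The sketch below assumes the configuration file uses key=value lines (as the nagios2opsgenie.logger=warning example later in this document suggests); conf_value is a hypothetical helper, not part of the plugin.

```shell
# conf_value is a hypothetical helper: it prints the value of a key from a
# key=value file, ignoring commented-out lines.
conf_value() {
    # Escape dots so that keys such as nagios2opsgenie.http.proxy.host
    # are matched literally.
    key_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# Usage:
#   [ -n "$(conf_value apiKey /etc/opsgenie/conf/opsgenie-integration.conf)" ] \
#       || echo "apiKey is not set" >&2
```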

There are three ways to configure the Go executable:

  1. Configuring from the conf file: You can configure it from the /etc/opsgenie/conf/opsgenie-integration.conf file. Settings in the conf file override the settings made in the script.
  2. Configuring by using Go flags: You can configure it by adding flags to the command in the Nagios XI web interface through Configure -> Core Config Manager -> Commands. Use the -apiKey flag for your apiKey and the -ns flag for your nagios_server name. If you don't have multiple Nagios servers, you don't have to define the Nagios server. Flags override all the other configuration methods.

📘

If you want to send additional custom arguments, you can add them after the flags as: customArgName1 customArgValue1 customArgName2 customArgValue2
You can parse custom arguments by adding {{_payload.customArgName}} wherever you need it in the input fields.
For more information about using raw parameters, see the Dynamic Fields document.

  3. Configuring from the script: You can configure apiKey and nagios_server in the nagios2opsgenie.go script. If you use this option, you need to build the script again and put the new executable in the /usr/bin directory. You can find information about the location of nagios2opsgenie.go and how to build a Go script in the "Source" section.
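
To illustrate how the flags and custom arguments from option 2 fit together, the sketch below prints a command line instead of running it, so it can be reviewed before being pasted into a Nagios XI command definition. build_host_command is a hypothetical helper, the macro list is shortened for readability, and the custom argument shown in the usage line (environment production) is only an example.

```shell
# build_host_command is a hypothetical helper. It prints a host notification
# command that passes the API key and Nagios server name as flags, followed
# by any custom name/value pairs.
build_host_command() {
    api_key="$1"
    nagios_server="$2"
    shift 2
    # Nagios macros are printed literally; Nagios XI expands them at run time.
    printf '/usr/bin/nagios2opsgenie -apiKey="%s" -ns="%s" -entityType=host -t="$NOTIFICATIONTYPE$" -hn="$HOSTNAME$" -hs="$HOSTSTATE$"' \
        "$api_key" "$nagios_server"
    # Custom arguments go after the flags, as name/value pairs.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Usage:
#   build_host_command "<your_api_key>" server1 environment production
```

With this example, the custom value would be referenced in the integration's input fields as {{_payload.environment}}.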

Define Nagios XI Commands

  1. Create and configure host and service notification commands, with the following content from Nagios XI web interface through Configure -> Core Config Manager -> Commands.
  2. Define Host Command.
Command Name: notify-host-by-opsgenie
Command Type: misc command

Command Line:

/usr/bin/nagios2opsgenie -entityType=host -t="$NOTIFICATIONTYPE$" -ldt="$LONGDATETIME$" -hn="$HOSTNAME$" -hdn="$HOSTDISPLAYNAME$" -hal="$HOSTALIAS$" -haddr="$HOSTADDRESS$" -hs="$HOSTSTATE$" -hsi="$HOSTSTATEID$" -lhs="$LASTHOSTSTATE$" -lhsi="$LASTHOSTSTATEID$" -hst="$HOSTSTATETYPE$" -ha="$HOSTATTEMPT$" -mha="$MAXHOSTATTEMPTS$" -hei="$HOSTEVENTID$" -lhei="$LASTHOSTEVENTID$" -hpi="$HOSTPROBLEMID$" -lhpi="$LASTHOSTPROBLEMID$" -hl="$HOSTLATENCY$" -het="$HOSTEXECUTIONTIME$" -hd="$HOSTDURATION$" -hds="$HOSTDURATIONSEC$" -hdt="$HOSTDOWNTIME$" -hpc="$HOSTPERCENTCHANGE$" -hgn="$HOSTGROUPNAME$" -hgns="$HOSTGROUPNAMES$" -lhc="$LASTHOSTCHECK$" -lhsc="$LASTHOSTSTATECHANGE$" -lhu="$LASTHOSTUP$" -lhd="$LASTHOSTDOWN$" -lhur="$LASTHOSTUNREACHABLE$" -ho="$HOSTOUTPUT$" -lho="$LONGHOSTOUTPUT$" -hnu="$HOSTNOTESURL$" -hpd="$HOSTPERFDATA$"
  3. Define Service Command.
Command Name: notify-service-by-opsgenie
Command Type: misc command

Command Line:

/usr/bin/nagios2opsgenie -entityType=service -t="$NOTIFICATIONTYPE$" -ldt="$LONGDATETIME$" -hn="$HOSTNAME$" -hdn="$HOSTDISPLAYNAME$" -hal="$HOSTALIAS$" -haddr="$HOSTADDRESS$" -hs="$HOSTSTATE$" -hsi="$HOSTSTATEID$" -lhs="$LASTHOSTSTATE$" -lhsi="$LASTHOSTSTATEID$" -hst="$HOSTSTATETYPE$" -ha="$HOSTATTEMPT$" -mha="$MAXHOSTATTEMPTS$" -hei="$HOSTEVENTID$" -lhei="$LASTHOSTEVENTID$" -hpi="$HOSTPROBLEMID$" -lhpi="$LASTHOSTPROBLEMID$" -hl="$HOSTLATENCY$" -het="$HOSTEXECUTIONTIME$" -hd="$HOSTDURATION$" -hds="$HOSTDURATIONSEC$" -hdt="$HOSTDOWNTIME$" -hpc="$HOSTPERCENTCHANGE$" -hgn="$HOSTGROUPNAME$" -hgns="$HOSTGROUPNAMES$" -lhc="$LASTHOSTCHECK$" -lhsc="$LASTHOSTSTATECHANGE$" -lhu="$LASTHOSTUP$" -lhd="$LASTHOSTDOWN$" -lhur="$LASTHOSTUNREACHABLE$" -ho="$HOSTOUTPUT$" -lho="$LONGHOSTOUTPUT$" -hpd="$HOSTPERFDATA$" -s="$SERVICEDESC$" -sdn="$SERVICEDISPLAYNAME$" -ss="$SERVICESTATE$" -ssi="$SERVICESTATEID$" -lss="$LASTSERVICESTATE$" -lssi="$LASTSERVICESTATEID$" -sst="$SERVICESTATETYPE$" -sa="$SERVICEATTEMPT$" -msa="$MAXSERVICEATTEMPTS$" -siv="$SERVICEISVOLATILE$" -sei="$SERVICEEVENTID$" -lsei="$LASTSERVICEEVENTID$" -spi="$SERVICEPROBLEMID$" -lspi="$LASTSERVICEPROBLEMID$" -sl="$SERVICELATENCY$" -set="$SERVICEEXECUTIONTIME$" -sd="$SERVICEDURATION$" -sds="$SERVICEDURATIONSEC$" -sdt="$SERVICEDOWNTIME$" -spc="$SERVICEPERCENTCHANGE$" -sgn="$SERVICEGROUPNAME$" -sgns="$SERVICEGROUPNAMES$" -lsch="$LASTSERVICECHECK$" -lssc="$LASTSERVICESTATECHANGE$" -lsok="$LASTSERVICEOK$" -lsw="$LASTSERVICEWARNING$" -lsu="$LASTSERVICEUNKNOWN$" -lsc="$LASTSERVICECRITICAL$" -so="$SERVICEOUTPUT$" -lso="$LONGSERVICEOUTPUT$" -snu="$SERVICENOTESURL$" -spd="$SERVICEPERFDATA$"
  4. After adding the commands, don't forget to click Apply New Configuration for the changes to take effect.

Define Nagios XI Contacts

  1. Go to Configure -> Core Config Manager -> Contacts
  2. Click on Add New button
  3. Populate Common Settings as follows:
Contact Name: opsgenie
Description: Opsgenie Contact
Active: checked
  4. Populate Alert Settings as follows:
Host Notifications Enabled: checked
Host Notification Timeperiod: 24x7
Host Notification options: d, r
Manage Host Notification Commands: Add notify-host-by-opsgenie command to selected list
Service Notifications Enabled: checked
Service Notification Timeperiod: 24x7
Service Notification options: c, r
Manage Service Notification Commands: Add notify-service-by-opsgenie command to selected list
  5. Click the Save and Apply Configuration buttons.

📘

Please make sure that the contact is added to your Hosts and Services contact list.

If everything goes well, you will see alerts in Opsgenie for every notification created in Nagios XI. When the host or service comes back up, the alert in Opsgenie gets closed automatically as well.


If you run into problems, check the nagios2opsgenie logs and the Troubleshooting guide for common problems. If your problem persists, please don't hesitate to contact us.

Configure Opsgenie to Nagios XI Integration (Optional)

🚧

If you are using Opsgenie Edge Connector (OEC) instead of Marid, you can find the integration-specific script and its sample config from here. For more information about OEC, please refer to the OEC Integration documentation.

The plugin uses the Marid utility (included in the plugin) to enrich alerts when they are created and to update the state of alerts in Nagios XI when they are updated in Opsgenie. For example, when an alert is created in Opsgenie, Marid gets details (histogram, trends, etc.) from Nagios and attaches them to the alert. When users acknowledge an alert from their mobile devices using the Opsgenie app, the alert is acknowledged in Nagios XI, and when users add comments to alerts in Opsgenie, the comments are posted to Nagios XI as well. Marid subscribes to alert actions in Opsgenie and reflects these actions on Nagios XI using Nagios CGIs.

  1. To start Marid, run the following command: /etc/init.d/marid start
  2. To stop Marid, run the following command: /etc/init.d/marid stop

Marid is a Java application and therefore requires Java Runtime version 1.6 or later. Both OpenJDK and Oracle JVMs can be used.

📘

To use this feature, the "Send Alert Actions To Nagios XI" checkbox must be enabled in the Opsgenie Nagios XI Integration.

📘

Ensure that the JAVA_HOME environment variable is set. If it is not, you can set it by removing the comment character at the beginning of the following line in the /etc/opsgenie/profile file and setting JAVA_HOME to your JRE installation directory.
#JAVA_HOME=<path/to/JDK or JRE/install>
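
A quick way to check whether that line has been uncommented is sketched below. java_home_from_profile is a hypothetical helper, and the check that a java binary exists under bin/ is an assumption about a typical JRE layout rather than something the plugin requires.

```shell
# java_home_from_profile is a hypothetical helper: it prints the value of the
# first uncommented JAVA_HOME= line in the given profile file, if any.
java_home_from_profile() {
    sed -n 's/^[[:space:]]*JAVA_HOME=//p' "$1" | head -n 1
}

# Usage:
#   jh=$(java_home_from_profile /etc/opsgenie/profile)
#   if [ -z "$jh" ] || [ ! -x "$jh/bin/java" ]; then
#       echo "JAVA_HOME is unset or does not contain bin/java" >&2
#   fi
```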

To be able to execute actions in Nagios, Marid gets the configuration parameters from /etc/opsgenie/conf/opsgenie-integration.conf file.

Configuration Parameters

nagios.alert_histogram_image_url: Marid retrieves histogram images from Nagios XI using this URL. Replace localhost with your Nagios server address.
nagios.trends_image_url: Marid retrieves trends images from Nagios XI using this URL. Replace localhost with your Nagios server address.
nagios.command_url: URL used to update Nagios XI alerts when alerts are acknowledged, commented on, etc.
nagios.user: Credentials used to authenticate to the Nagios web server to get the Nagios histogram and trends images. Follow the steps below to get the credentials:
  • Go to the Nagios XI Admin >> Manage Components >> Backend API URL page.
  • Select a user from Account Selection and click Apply Settings.
  • Copy the ticket information from one of the listed Backend API URLs.
nagios.ticket: See the picture below for the Nagios ticket.
nagios.http.timeout: Timeout duration, in milliseconds, for getting the Nagios histogram and trends images.

Multiple Nagios Server Support

Marid can be configured to forward Opsgenie alert actions to multiple Nagios servers. To do so:

  • On each Nagios server, set the nagios_server config property in /etc/opsgenie/conf/opsgenie-integration.conf to a unique name.
  • On the Marid server, edit /etc/opsgenie/conf/opsgenie-integration.conf and add the command_url, user, and ticket configuration for each Nagios server.
  • opsgenie-integration.conf has a commented sample configuration for multiple Nagios servers:
#nagios.server1.command_url=http://nagiosHost:port/nagiosxi/includes/components/nagioscore/ui/cmd.php
#nagios.server1.alert_histogram_image_url=http://nagiosHost:port/nagiosxi/includes/components/nagioscore/ui/histogram.php
#nagios.server1.trends_image_url=http://nagiosHost:port/nagiosxi/includes/components/nagioscore/ui/trends.php
#nagios.server1.user=nagiosadmin
#nagios.server1.ticket=nagiosticket
#nagios.server1.http.timeout=30000
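
After uncommenting and duplicating that sample block for each server, it can be useful to list which server names Marid actually has a command_url for and compare them with the nagios_server values set on each Nagios host. The sketch below assumes the nagios.<name>.command_url naming shown in the sample; configured_servers is a hypothetical helper.

```shell
# configured_servers is a hypothetical helper: it prints the unique server
# names that have an uncommented nagios.<name>.command_url entry.
configured_servers() {
    sed -n 's/^nagios\.\([^.=]*\)\.command_url=.*/\1/p' "$1" | sort -u
}

# Usage:
#   configured_servers /etc/opsgenie/conf/opsgenie-integration.conf
```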

If you see "JAVA_HOME not defined" error in /var/log/opsgenie/nagios2opsgenie.log, you should define it in /etc/opsgenie/profile shell script.

If you use Marid, you will see rich alerts populated with host or service current status information in Opsgenie for every notification created in Nagios.


For more information refer to Marid Integration Server and Callbacks docs. Please do not hesitate to get in touch with any questions, issues, etc.

🚧

The NagiosXI integration package does not support SSL v1.0. If your NagiosXI server uses SSL v1.0, we suggest that you upgrade your SSL server.

FAQ and Troubleshooting

If you're having trouble getting the integration to work, please check if your problem is mentioned below, and follow our advice:

1- NagiosXI alerts are not getting created in Opsgenie:

Run the following test command from the shell. Check if the test alert is created in Opsgenie:

/usr/bin/nagios2opsgenie -entityType=host -t=PROBLEM -hs=DOWN -hn=test_host

If you're getting a "Trace/breakpoint trap" error: It means your nagios2opsgenie plugin isn't compatible with your server distribution. Follow the "Source and Recompiling nagios2opsgenie" section below and rebuild your nagios2opsgenie.go according to your specific server environment.

If the alert is created in Opsgenie: It means the integration is installed correctly. The problem might be that Nagios is not notifying the Opsgenie contact for alerts. Check your Nagios alert notifications log.

If not: Check the logs at /var/log/opsgenie/nagios2opsgenie.log. Look for the following errors in the log file:

  • If you're seeing "RestException[Could not authenticate.]" in the logs, it means Opsgenie couldn't identify your API key. Check if you've set the API key correctly, as explained in "Opsgenie NagiosXI Plugin Configuration" above.
  • If you're seeing "Could not execute this action with apiKey of [NagiosXI] integration" in the logs, you might have downloaded the wrong integration package. Make sure that you download the NagiosXI integration package, not Nagios or any other.
  • If you can't make sense of the problem, set the plugin's log level to debug, try again and send the logs to us at [email protected].
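
The checks in the list above can be scripted as a first pass over the log. This is only a sketch: diagnose_log and the labels it prints are hypothetical, and it looks for nothing beyond the two error strings quoted above.

```shell
# diagnose_log is a hypothetical helper: it classifies a nagios2opsgenie log
# by the two error messages described above.
diagnose_log() {
    log="$1"
    [ -r "$log" ] || { echo "no-log"; return 1; }
    if grep -q 'Could not authenticate' "$log"; then
        echo "api-key"          # API key missing or wrong
    elif grep -q 'Could not execute this action with apiKey' "$log"; then
        echo "wrong-package"    # package does not match the integration type
    else
        echo "unknown"          # raise the log level and inspect manually
    fi
}

# Usage:
#   diagnose_log /var/log/opsgenie/nagios2opsgenie.log
```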

If there is no /var/log/opsgenie/nagios2opsgenie.log file, or there are no logs in it, check the following:

  1. First, make sure the nagios user has permission to write to /var/log/opsgenie directory. The installation package should automatically do this for you. If you encounter problems, execute:
    chown -R nagios:opsgenie /var/log/opsgenie
  2. Now check your Nagios server logs at /usr/local/nagios/var/nagios.log. See if there are error logs regarding nagios2opsgenie, and contact us with them.

Setting nagios2opsgenie plugin's log level to DEBUG:

Change the line nagios2opsgenie.logger=warning to nagios2opsgenie.logger=debug in /etc/opsgenie/conf/opsgenie-integration.conf file.
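
The same change can be made from the shell. The sketch below assumes the nagios2opsgenie.logger line is already present and uncommented in the file; set_plugin_log_level is a hypothetical helper, and it keeps a .bak copy of the file so the change is easy to revert.

```shell
# set_plugin_log_level is a hypothetical helper: it rewrites the
# nagios2opsgenie.logger line in place and leaves a .bak copy of the file.
set_plugin_log_level() {
    level="$1"    # for example: debug or warning
    conf="$2"
    sed -i.bak "s/^nagios2opsgenie\.logger=.*/nagios2opsgenie.logger=$level/" "$conf"
}

# Usage:
#   set_plugin_log_level debug /etc/opsgenie/conf/opsgenie-integration.conf
```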

2- The NagiosXI alert is not acknowledged when you ack the alert at Opsgenie:

  1. First, check your alert logs.
  • If you don't see the "Posted [Acknowledge] action to NagiosXI.." log, it means Opsgenie didn't send the Acknowledge action to NagiosXI. Check your integration configuration, it might not have matched the alert action.
  • If you're seeing "Executed [Acknowledge] action via Marid with errors." log, it means the nagiosActionExecutor.groovy script in your Marid has encountered an error. Check the logs at /var/log/opsgenie/marid/script.log for error logs.
  • If you only see the "Posted [Acknowledge] action to NagiosXI.." log and no related log after that, it might mean Marid is having connection problems. Check the logs at /var/log/opsgenie/marid/Marid.log for error logs.
  2. If you can't make sense of the problem, set Marid's script log level to debug, try again and send the /var/log/opsgenie/marid/script.log file to us at [email protected].

Setting Marid's script log level to DEBUG:
Change the line log4j.logger.script=WARN, script to log4j.logger.script=DEBUG, script in /etc/opsgenie/marid/log.properties file. Then, restart Marid service.

3- Marid is causing a memory leak, or using up too much RAM:

Change the line log4j.rootLogger=WARN, marid to log4j.rootLogger=DEBUG, marid in the /etc/opsgenie/marid/log.properties file. Then restart the Marid service and send the /var/log/opsgenie/marid/Marid.log file to us at [email protected] so we can analyze further.

Configuring multiple Nagios Servers

📘

If you do not want to use multiple Nagios servers, make sure that the nagios_server variable is set to default in the configuration file.

If you do want to use multiple servers, set the nagios_server variable to the server's name (for example, nagios_server:server1) and configure the Nagios variables as follows:

nagios.server1.alert_histogram_image_url=http://nagiosHost:port/nagios/cgi-bin/histogram.cgi
nagios.server1.trends_image_url=http://nagiosHost:port/nagios/cgi-bin/trends.cgi
nagios.server1.command_url=http://nagiosHost:port/nagios/cgi-bin/cmd.cgi
nagios.server1.user=nagiosadmin
nagios.server1.password=admin
nagios.server1.http.timeout=30000

Source and Recompiling nagios2opsgenie

The executable nagios2opsgenie is located under /usr/bin/, and its source, nagios2opsgenie.go, is located under /etc/opsgenie/ and is also available in the GitHub OpsGenie Integration repository. If you wish to change the behavior of the executable, you can edit nagios2opsgenie.go and build it using:

go build nagios2opsgenie.go

To install Go, refer to http://golang.org/doc/install. Note that the executable in the plugin is built for linux/386 systems.
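
If the bundled linux/386 executable does not suit your server, the Go toolchain's standard GOOS and GOARCH environment variables can be used to build for another architecture. The sketch below maps common uname -m values to GOARCH names; goarch_for is a hypothetical helper and covers only a few machine types.

```shell
# goarch_for is a hypothetical helper: it maps a `uname -m` machine type to
# the corresponding Go GOARCH value.
goarch_for() {
    case "$1" in
        x86_64|amd64)        echo amd64 ;;
        i386|i486|i586|i686) echo 386 ;;
        aarch64|arm64)       echo arm64 ;;
        *) echo "unsupported machine type: $1" >&2; return 1 ;;
    esac
}

# Usage, from the directory that contains nagios2opsgenie.go:
#   GOOS=linux GOARCH="$(goarch_for "$(uname -m)")" go build nagios2opsgenie.go
#   cp nagios2opsgenie /usr/bin/nagios2opsgenie
```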