stream alert | Bashing Linux

Say you have a Nagios system monitoring everything already in your system and in addition to that you have a Graylog2 installation which parses logs from anywhere and provides you with invaluable feedback on what’s really going on in your system.
And then comes the problem, or one of them:

Graylo2 is not really good in sending alerts (or maybe it is?)
Nagios is already configured to send alerts and you would like to use the same contact groups for instance

The solution is below.

Design

Before making you read through the whole blog entry, I’ll just outline the solution I’ve chosen to implement and you can decide whether it’s good for you or not. Here it is in a nutshell:

An alert is being generated in Graylo2 in a configured stream
Graylog2 will use exec callback plugin to call an external alerting command, call it graylog2-alert.sh for instance
graylog2-alert.sh will push an alert using send_ncsa
Nagios parses the alert and notifies whoever is subscribed on that service

Pretty simple and bullet proof.

Graylog2 Configuration

I assume you already have Graylo2 fully configured, in this case download the wonderful exec callback plugin and place it under the plugin/alarm_callbacks directory (under the Graylog2 directory obviously).

Click configure and point it to /usr/local/sbin/graylog2-alert.sh

That’s it for now on the Graylog2 interface side.

NSCA – Nagios Service Check Acceptor

Properly configure NSCA to work in your nagios configuration. That means usually:

Opening port 5667 (or another port) on your nagios server
Choosing a password for symmetrical encryption on the nagios server and the NSCA clients
Starting the nsca daemon on the nagios server, so it will accept NSCA communications

Generally speaking configuring NSCA is out of the scope of this article and more information can be found here:
http://www.nsclient.org/nscp/wiki/doc/usage/nagios/nsca

That said, I’ll just mention that when everything works, you should be able to run successfully:

echo "HOSTNAME;SERVICE;2;Critical" | send_nsca -d ';' -H NAGIOS_HOSTNAME

graylog2-alert.sh

On the Graylog2 host, place the following file under /usr/local/sbin/graylog2-alert.sh:

#!/bin/bash

# nagios servers to notify
NAGIOS_SERVERS="NAGIOS_SERVER_1 NAGIOS_SERVER_2 NAGIOS_SERVER_3"
# add a link to the nagios message, so it's easy to access the interface
# on your mobile device once you get an alert
GL2_LINK="http://GRAYLOG_URL/streams"

main() {
	local tmp_file=`mktemp`
	local gl2_topic=`echo $GL2_TOPIC | cut -d'[' -f2 | cut -d']' -f1`
	echo `hostname`";Graylog2-$gl2_topic;2;$GL2_LINK $GL2_DESCRIPTION" > $tmp_file
	local nagios_server
	for nagios_server in $NAGIOS_SERVERS; do
		/usr/sbin/send_nsca -d ';' -H $nagios_server < $tmp_file
	done
	rm -f $tmp_file
}

main "$@"

This in combination of what we did before will fire alerts from Graylo2 -> exec callback plugin -> graylog2-alert.sh -> NSCA -> nagios server.

The nagios side

All you have left to do now is to define services for use with Graylog2 alerts. It is a rather straight forward service configuration for nagios, here is mine (generated puppet in case you wonder):

define service {
        service_description            Graylog2-STREAM_NAME
        host                           REPLACE_WITH_YOUR_GRAYLOG2_HOST
        use                            generic-service
        contact_groups                 Graylog2-STREAM_NAME
        passive_checks_enabled         1
        max_check_attempts             1
        # enable active checks only to reset the alarm
        active_checks_enabled          1
        check_command                  check_tcp!22
        normal_check_interval          10
        notification_interval          10
        # set the contact group
        contact_groups                 Graylog2-STREAM_NAME
        flap_detection_enabled         0
}

define contactgroup{
        contactgroup_name       Graylog2-STREAM_NAME
        alias                   Graylog2-STREAM_NAME
        members                 dan
}

We usually have a contact group per Graylog2 stream. We just associate developers with the topic that’s relevant to them.

Restart your nagios and you’re set. Don’t forget to also start nsca!!

Resetting the alert

Graylog2 and NSCA will never generate “positive” OK alerts, but only critical ones. So you need a mechanism to reset the alert every once in a while. If you will scroll up you will see that I check port 22 (SSH) on the Graylog2 host.
How often you ask?
When configuring a new stream in Graylog2, it is best if you match the Grace period in Graylog2 to the normal_check_interval in nagios. Which would guarantee the alert will be reset before a new one comes in.

Puppet

The whole shenanigans is obviously puppetized in our environment. Tailoring nagios to an environment is usually very different between environments so I have decided it is rather redundant to paste puppet recipes altogether.

I hope you can find this semi-tutorial helpful.

Bashing Linux Linux system administration, programming and everything that goes in between…

Archive for the ‘stream alert’ Tag

Graylog2 and Nagios Integration 1 comment