After spending some time tinkering with the Nagios plugin check_cluster2, which is written in C, and getting very frustrated with its limitations, I decided to rewrite the plugin from scratch. Yes, I know there is another Perl version of the plugin called check_cluster3.pl, but it doesn't do what I need either.
Here is how the script has to be used:
=begin text
usage: ./check_cluster.pl
--crit=
--data=
--info=
[--active] use active Nagios CGI links when displaying the info. Caution: the status log is limited in length, so the links may not fit into the CGI output field and may break the HTML. You can use the macros %H (host) and %S (service) in the link URI, e.g. "showdetails.php?host=%H&service=%S"
[--log=
[--help] display the usage message
example usage:
% check_cluster.pl -w 0 -c 1 -d 2 -d 0 -i 2:proxy1:PING -i 0:proxy2:PING
The check will return CRITICAL state, because the first data option (-d) indicates state 2 (=CRITICAL) and the critical threshold is 1. The command output will look like this:
Cluster-Check: 1/2 OK, 1 CRIT (w:0,c:1)
CRIT:proxy1:PING
The nagios check_command config for the check for this example would be:
check_cluster.pl -w 0 -c 1 \
-d $SERVICESTATE:proxy1:PING$ \
-d $SERVICESTATE:proxy2:PING$ \
-i $SERVICESTATE:proxy1:PING$:proxy1:PING \
-i $SERVICESTATE:proxy2:PING$:proxy2:PING
./check_cluster.pl version 1.00
COPYRIGHT (c) 2006 T.L.
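For illustration only, here is a minimal sketch of how the two thresholds turn the -d states into a single result. The helper name cluster_state is made up for this sketch and is not part of the plugin; it just mirrors the order of the checks (critical first, then warning, then unknown), and like the plugin it counts anything that is not 1, 2 or 3 as OK.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Nagios state codes, same values as in the plugin
use constant { OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3 };

# cluster_state() is a hypothetical helper for this sketch only.
# Arguments: warning threshold, critical threshold, list of member states.
sub cluster_state {
    my ($warn, $crit, @states) = @_;
    my ($warnings, $criticals, $unknowns) = (0, 0, 0);

    # count the non-OK member states
    for my $s (@states) {
        if    ($s == WARNING)  { $warnings++ }
        elsif ($s == CRITICAL) { $criticals++ }
        elsif ($s == UNKNOWN)  { $unknowns++ }
    }

    # a threshold only triggers if at least one member is in that state
    return CRITICAL if $criticals && $criticals >= $crit;
    return WARNING  if $warnings  && $warnings  >= $warn;
    return UNKNOWN  if $unknowns;
    return OK;
}

my @names = ("OK", "WARN", "CRIT", "ERR");
print $names[ cluster_state(0, 1, 1, 0) ], "\n";   # prints WARN
print $names[ cluster_state(0, 1, 2, 0) ], "\n";   # prints CRIT
print $names[ cluster_state(2, 2, 1, 0) ], "\n";   # prints OK
```

The first call corresponds to `-w 0 -c 1 -d 1 -d 0`, the second to `-w 0 -c 1 -d 2 -d 0`. The third shows that a single WARNING member stays below a warning threshold of 2, so the cluster is still reported as OK.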
And finally here comes the script:
=begin text
#!/usr/bin/perl
my $VERSION = "1.00";
use strict;
use warnings;
use Getopt::Long;

# Nagios plugin return codes
use constant OK       => 0;
use constant WARNING  => 1;
use constant CRITICAL => 2;
use constant UNKNOWN  => 3;

# command line options
my ($warnings, $criticals, @statusdata, @infodata, $log, $help);
my $active = "";

# short state names used in the output
my %state = (0 => "OK", 1 => "WARN", 2 => "CRIT", 3 => "ERR");

# forward declarations, so the subs can be called without parentheses
sub finish;
sub usage;
sub shortusage;
my $result = GetOptions(
    "warn=i"   => \$warnings,
    "crit=i"   => \$criticals,
    "data=s"   => \@statusdata,
    "info=s"   => \@infodata,
    "log=s"    => \$log,
    "active=s" => \$active,
    "help"     => \$help,
);
if ($help) { usage; }
# an empty array is false; defined(@array) is a fatal error in newer perls
if (!defined($warnings) || !defined($criticals) || !@statusdata || !$result) { shortusage; }
if (!$log) { $log = "Cluster-Check:"; }
my $gotwarnings = 0; my $gotcriticals = 0; my $gotunknowns = 0; my $gotok = 0; my $logmessage = ""; my @info = ();
foreach my $entry (@infodata) {
    my ($status, $host, $service) = split /:/, $entry, 3;
    if (!defined($status) || !($host && $service)) {
        finish UNKNOWN, "supplied infodata \"$entry\" does not match the spec: SERVICESTATE:HOST:SERVICE!";
    }
    # only entries in a non-OK state are listed
    if ($status) {
        if ($active) {
            # expand the %H and %S macros on a per-entry copy, so the
            # template stays intact for the following entries
            (my $link = $active) =~ s/\%H/$host/g;
            $link =~ s/\%S/$service/g;
            push @info, "$state{$status}:$host:$service";
        }
        else {
            push @info, "$state{$status}:$host:$service";
        }
    }
}
$logmessage = join ",", @info;
# count the member states; everything that is not 1, 2 or 3 counts as OK
foreach my $state (@statusdata) {
    if    ($state == WARNING)  { $gotwarnings++; }
    elsif ($state == CRITICAL) { $gotcriticals++; }
    elsif ($state == UNKNOWN)  { $gotunknowns++; }
    else                       { $gotok++; }
}
my @logstate;
if ($gotok)        { push @logstate, "$gotok/" . scalar(@statusdata) . " OK"; }
if ($gotwarnings)  { push @logstate, "$gotwarnings WARN"; }
if ($gotcriticals) { push @logstate, "$gotcriticals CRIT"; }
if ($gotunknowns)  { push @logstate, "$gotunknowns ERR"; }
my $logstate = join ", ", @logstate;
$logstate .= " (w:$warnings,c:$criticals)";
if ($gotcriticals && ($gotcriticals >= $criticals)) {
    finish CRITICAL, "$log $logstate\n$logmessage";
}
if ($gotwarnings && ($gotwarnings >= $warnings)) {
    finish WARNING, "$log $logstate\n$logmessage";
}
if ($gotunknowns) {
    finish UNKNOWN, "$log $logstate\n$logmessage";
}
# nothing happened
finish OK, "$log $logstate";
# print the message and exit with the given Nagios state
sub finish {
    my ($state, $message) = @_;
    print "$message\n";
    exit $state;
}
sub shortusage {
    print STDERR qq(usage: $0 [-wcdia]
$0 --help for more information
);
    exit UNKNOWN;
}
sub usage {
print STDERR qq(usage: $0
--crit=
--data=
--info=
[--active] use active Nagios CGI links when displaying the info. Caution: the status log is limited in length, so the links may not fit into the CGI output field and may break the HTML. You can use the macros \%H (host) and \%S (service) in the link URI, e.g. "showdetails.php?host=\%H&service=\%S"
[--log=
[--help] display the usage message
example usage:
% check_cluster.pl -w 0 -c 1 -d 2 -d 0 -i 2:proxy1:PING -i 0:proxy2:PING
The check will return CRITICAL state, because the first data option (-d) indicates state 2 (=CRITICAL) and the critical threshold is 1. The command output will look like this:
Cluster-Check: 1/2 OK, 1 CRIT (w:0,c:1)
CRIT:proxy1:PING
The nagios check_command config for the check for this example would be:
check_cluster.pl -w 0 -c 1 \\
-d \$SERVICESTATE:proxy1:PING\$ \\
-d \$SERVICESTATE:proxy2:PING\$ \\
-i \$SERVICESTATE:proxy1:PING\$:proxy1:PING \\
-i \$SERVICESTATE:proxy2:PING\$:proxy2:PING
$0 version $VERSION
COPYRIGHT (c) 2006 T.L.