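Overview

Environment for this article: Zabbix 3.4 with one Server, one Proxy, and one agent. The Proxy actively collects the agent's status and sends it to the Server with zabbix_sender.

First, make sure the server's BMC port is reachable over the network and that you have a management user name and password for it; the Proxy and the agent must also stay connected to the network. This article covers HP servers only; other vendors will be covered in later updates.

Installation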

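First enable the EPEL repository (several of the Perl packages below are typically provided by EPEL), then install the required packages:

rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
yum install perl-IO-Socket-SSL.noarch perl-XML-Simple.noarch perl-Class-Accessor perl-Config-Tiny.noarch perl-Monitoring-Plugin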

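The script source is pasted at the end of this article. Save it under /root as hp.pl without modifying its contents; permissions of -rw-r--r-- are sufficient, because it is run through perl.

Run the script

perl /root/hp.pl -u Admin -p aliyun -3 -v -t 60 -H 192.168.1.1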

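Options:

-u     user name

-p     password

-3     the target is an iLO3/iLO4 device

-v     verbose output (prints the raw XML reply)

-t     timeout, in seconds

-H     BMC management IP of the host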
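Besides the verbose XML, the plugin also reports an overall result (OK, WARNING, CRITICAL or UNKNOWN) through its exit status, following the usual Nagios plugin convention of 0 to 3. The following is a minimal sketch, not part of the original setup, of how that exit status could be turned back into a state name; the function name plugin_state is a hypothetical helper introduced here for illustration.

```shell
#!/bin/bash
# Map a Nagios-style plugin exit code to its state name.
# 0..3 are the conventional plugin return codes; anything else is
# treated as UNKNOWN here (an assumption of this sketch).
plugin_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical usage (host and credentials are placeholders):
#   perl /root/hp.pl -u Administrator -p '******' -3 -t 60 -H 192.168.1.1
#   plugin_state "$?"
```

This could be used to push a single overall health value to Zabbix in addition to the per-fan values described below.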

It returns output like the following (excerpt showing some of the fans):

               <ZONE VALUE = "System"/><LABEL VALUE = "Fan 1"/><STATUS VALUE = "Not Installed"/><SPEED VALUE = "0" UNIT="Percentage"/></FAN><FAN><ZONE VALUE = "System"/><LABEL VALUE = "Fan 2"/><STATUS VALUE = "Not Installed"/>
chunk: 1ff
chunk size: 511<SPEED VALUE = "0" UNIT="Percentage"/></FAN><FAN><ZONE VALUE = "System"/><LABEL VALUE = "Fan 3"/><STATUS VALUE = "OK"/><SPEED VALUE = "27" UNIT="Percentage"/></FAN><FAN><ZONE VALUE = "System"/><LABEL VALUE = "Fan 4"/>

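Here you can see each fan's number and running status; the current statuses are "Not Installed" and "OK". We can filter the output with grep and awk to extract the information we need.

Now write a script with the following content: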

#!/bin/bash
LSI_LOG=/tmp/hp.log
# Query the iLO in verbose mode and save the raw output
perl /root/hp.pl -u administrator -p aliyun -3 -v -t 60 -H 192.168.1.1 > "$LSI_LOG"
# Drop the chunk markers, keep the fan STATUS lines, and print only the status text
grep -v "^chunk" "$LSI_LOG" | grep -A 2 "Fan [1-9]" | grep "STATUS VALUE" | sed -e 's/"/ /g' | awk -F "/" '{print $1}' | awk -F "=" '{print $2}'

This produces the following:

  Not Installed Not Installed OK OK Not Installed OK OK OK
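The pipeline above prints only the status text, so the fan number is implied by position. As an alternative sketch (not the author's method), the function below keeps each fan label next to its status. It assumes the reply has the shape shown in the excerpt, with every fan wrapped in <FAN>...</FAN> and tags written as <LABEL VALUE = "..."/> and <STATUS VALUE = "..."/>; fan_states is a hypothetical name.

```shell
#!/bin/bash
# Read the verbose plugin output on stdin and print one "Fan N: status" line per fan.
fan_states() {
    grep -v '^chunk' | tr -d '\r\n' | awk '{
        # One array element per <FAN> ... </FAN> record
        n = split($0, fans, "</FAN>")
        for (i = 1; i <= n; i++) {
            # 16 = length of the prefix <LABEL VALUE = "
            if (!match(fans[i], /<LABEL VALUE = "Fan [0-9]+"/)) continue
            label = substr(fans[i], RSTART + 16, RLENGTH - 17)
            rest = substr(fans[i], RSTART + RLENGTH)
            # 17 = length of the prefix <STATUS VALUE = "
            if (!match(rest, /<STATUS VALUE = "[^"]*"/)) continue
            status = substr(rest, RSTART + 17, RLENGTH - 18)
            print label ": " status
        }
    }'
}

# Hypothetical usage with the log file written by the script above:
#   fan_states < /tmp/hp.log
```

Keeping the label makes it harder to send a value to the wrong item if a fan entry is missing from the reply.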

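Now write a loop that pushes each value to the Server with zabbix_sender. The options used are listed first, followed by the script.

zabbix_sender options:

-z   IP address of the Zabbix server (or proxy) that receives the data

-s   host name as configured in Zabbix

-k   key of the Zabbix item

-o   value to send for that key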

#!/bin/bash
LSI_LOG="/tmp/hp.log"
perl /root/hp.pl -u Administrator -p '******' -3 -v -t 60 -H 192.168.1.11 > "$LSI_LOG"
# get Fan state: one status per line in $LSI_LOG.temp
n=1
grep -v "^chunk" "$LSI_LOG" | grep -A 2 "Fan [1-9]" | grep "STATUS VALUE" | sed -e 's/"/ /g' | awk -F "/" '{print $1}' | awk -F "=" '{print $2}' > "$LSI_LOG.temp"
# push one value per fan; the item keys are fan1.state, fan2.state, ...
while read -r line
do
    /usr/bin/zabbix_sender -z 192.168.1.10 -s hp01 -k "fan${n}.state" -o "$line"
    ((n++))
    # wrap the counter after 12 fans
    if [ "$n" -eq 13 ]; then
        n=1
    fi
done < "$LSI_LOG.temp"
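The loop above starts one zabbix_sender process per fan. As a hedged alternative, zabbix_sender can also read a batch of "host key value" lines from a file or from standard input (-i, with - meaning stdin). The sketch below only builds those lines; build_sender_batch is a hypothetical helper, and the fanN.state key pattern, the host name hp01 and the server address are taken from the script above.

```shell
#!/bin/bash
# Turn one fan status per input line into zabbix_sender batch lines:
#   <host> <key> "<value>"
# Values are double-quoted because statuses such as "Not Installed" contain a space.
build_sender_batch() {
    local host="$1" n=1 state
    while IFS= read -r state; do
        # Trim the leading/trailing blanks left over from the sed/awk pipeline
        state="$(printf '%s' "$state" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
        [ -n "$state" ] || continue
        printf '%s fan%d.state "%s"\n' "$host" "$n" "$state"
        n=$((n + 1))
    done
}

# Hypothetical usage:
#   build_sender_batch hp01 < /tmp/hp.log.temp | zabbix_sender -z 192.168.1.10 -i -
```

Unlike the original loop, this sketch does not wrap the counter at 13; it simply numbers the fans in the order they appear.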

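Save the script as /usr/lib/zabbix/externalscripts/hp.sh and make it executable (chmod +x), since cron runs it directly.

Create a scheduled task with crontab -e that runs it every five minutes: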

*/5 * * * * /usr/lib/zabbix/externalscripts/hp.sh >/dev/null 2>&1

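Now add the corresponding items and keys in Zabbix (for example fan1.state, fan2.state, and so on). Because the values are pushed with zabbix_sender, the items must be of type "Zabbix trapper". Note that the -s argument (host name) passed to zabbix_sender in the script must match the host name configured in Zabbix; otherwise no data will be received.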

Check the latest data

Then create the corresponding triggers.

Script contents

#!/usr/bin/perl
# icinga: -epn
#
# check_ilo2_health.pl
# based on check_stuff.pl and locfg.pl
#
# Nagios plugin using the Nagios::Plugin module and the
# HP Lights-Out XML PERL Scripting Sample from
# ftp://ftp.hp.com/pub/softlib2/software1/pubsw-linux/p391992567/v60711/linux-LOsamplescripts3.00.0-2.tgz
# checks if all sensors are ok, returns warning on high temperatures and
# fan failures and critical on overall health failure
#
# Alexander Greiner-Baer <alexander.greiner-baer@web.de> 2007 - 2018
# Matthew Stier <Matthew.Stier@us.fujitsu.com> 2011
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
#
# Changelog:
# 1.62    Mon, 14 May 2018 19:05:22 +0200
#   retrieve firmware infos only when using --getinfos
# 1.61    Thu, 01 Jun 2017 20:05:04 +0200
#   fix for iLO4 2.50 link state when using --ignorelinkdown
# 1.60    Wed, 12 Aug 2015 18:20:13 +0200
#   provide --sslopts to override defaults settings
#   fix }; for GET_EVENT_LOG
#   applied patch from Rene Koch <rene.koch@siedl.net>:
#     handle missing values when using "-g"
#   CONTROLLER_STATUS not present on iLO4 anymore, use STATUS instead
#   put SSL_VERIFY_NONE in ''
# 1.59    Wed, 28 Jan 2015 18:56:26 +0100
#   fix chunk size handling
#   corrected HTTP/1.1 HOST Header
#   applied patch from Max Winterstein <winterstein@siriusonline.de>:
#     sslv3 support
#   add retries option
#   catch XMLin() errors
#   applied patch from Rene Koch <rene.koch@siedl.net>:
#     ignore battery not installed status (option "-x")
#     display server name (option "-g")
#     added warning for logical drive status "Degraded (Recovering)"
#     display system details (hardware model, serial number, SystemROM, iLO version)
#     display memory size and part number in case of memory failure
#     display hard disk model number in case of hard disk failure
#     display power supply part number in case of power supply failure
# 1.58    Thu, 08 Aug 2013 18:17:02 +0200
#   ignore network link down status (option "-i")
#   added ENCLOSURE_ADDR to drive bay label (bay numbering was inconsistent)
#   ignore spare drives
# 1.57    Fri, 17 May 2013 19:30:48 +0200
#   SSL_verify_mode SSL_VERIFY_NONE (IO::Socket::SSL changed default)
#   event log support for ilo2
#   disable embedded perl in icinga
# 1.56    Fri, 15 Mar 2013 20:47:13 +0100
#   applied patch from Niklas Edmundsson <Niklas.Edmundsson@hpc2n.umu.se>:
#     check processor and memory details
#   applied patches from Dragan Sekerovic <dragan.sekerovic@onestep2.at>:
#     add location label to temperature (option "-b")
#     support for checking event log (option "-l")
#     add iLO version to output
#   add 2 new values for power supply status
#   --
# 1.55    Sun, 05 Aug 2012 20:18:46 +0200
#   faulty drive (option "-c") exits now with CRITICAL instead of WARNING
#   applied patches from Niklas Edmundsson <Niklas.Edmundsson@hpc2n.umu.se>:
#     iLO4 RAID Controller Status
#     nodriveexit
#   add g6 drive status
#   overall health probes every element now
#   fixed bug with drive bay index
#   supports iLO3 with multiple backplanes
#   supports iLO4 disk check
#   Note: overall health may show drive/storage status, even without "-c"
#   --
# 1.54    Thu, 14 Jun 2012 21:36:40 +0200
#   applied fix for iLO4 from Niklas Edmundsson <Niklas.Edmundsson@hpc2n.umu.se>
#   --
# 1.53    Tue, 14 Feb 2012 19:47:40 +0100
#   added new disk bay variant
#   added power supply NOT APPLICABLE
#   --
# 1.52    Wed, 27 Jul 2011 20:46:14 +0200
#   fixed <LABEL VALUE = "Power Supplies"/> again
#   --
# 1.51    Mon, 25 Jul 2011 19:36:53 +0200
#   fixed bug with chunked replies by Matthew Stier
#   --
# 1.5     Sat, 16 Jul 2011 10:02:10 +0200
#    optimized by Matthew Stier
#   --
# 1.47    Thu, 14 Jul 2011 12:02:01 +0200
#   also print perfdata when temperature output is disabled
#   --
# 1.46    Wed, 06 Jul 2011 08:46:51 +0200
#   fixed bug with nagios embedded perl interpreter
#   --
# 1.45    Wed, 13 Oct 2010 22:17:01 +0200
#   new option "--ilo3"
#
#   "--checkdrives" enhancements
#
#   <LABEL VALUE = "Power Supplies"/> shows always "Failed" even when the power
#   supplies are redundant
#
#   improved "--fanredundancy" and "--powerredundancy"
#   --
# 1.44    Mon, 14 Dec 2009 20:11:37 +0100
#   new option "--checkdrives"
#   --
# 1.43    Mon, 17 Aug 2009 20:50:13 +0200
#   new option "--fanredundancy"
#
#   new option "--powerredundancy"
#   --
# 1.42          Mon, 17 Aug 2009 12:52:23 +0100
#   check power supply and fans redundancy
#               gcivitella@enter.it
#   --
# 1.41          Thu, 26 Jul 2007 17:42:36 +0200
#   perfdata label ist now quoted
#   --
# 1.4           Mon, 25 Jun 2007 09:45:52 +0200
#   check vrm and power supply
#
#   new option "--notemperatures"
#
#   new option "--perfdata"
#
#   some minor changes
#   --
# 1.3beta       Wed, 20 Jun 2007 09:57:46 +0200
#   do some error checking
#
#   new option "--inputfile"
#   read bmc output from file
#   --
# 1.2   Mon, 18 Jun 2007 09:33:17 +0200
#   new option "--skipsyntaxerrors"
#   ignores syntax errors in the xml output, maybe required by older firmwares
#
#   introduce a date to the changelog ;)
#   --
# 1.1   do not return warning if temperature status is n/a
#
#   add "<LOCFG VERSION="2.21" />" to get rid of the
#   "<INFORM>Scripting utility should be updated to the latest version.</INFORM>"
#   message
#   --
# 1     initial release

use strict;
use warnings;
use strict 'refs';

use Monitoring::Plugin;
use Sys::Hostname;
use IO::Socket::SSL;
use XML::Simple;

$Net::SSLeay::slowly = 5;

use vars qw($VERSION $PROGNAME  $verbose $warn $critical $timeout $result);
$VERSION = 1.62;

$PROGNAME = "check_ilo2_health";

# instantiate Nagios::Plugin
our $p = Monitoring::Plugin->new(
    usage => "Usage: %s [-H <host>] [ -u|--user=<USERNAME> ]
    [ -p|--password=<PASSWORD> ] [ -f|--inputfile=<filename> ]
    [ -a|--fanredundancy ] [ -c|--checkdrives ] [ -d|--perfdata ]
    [ -e|--skipsyntaxerrors ] [ -n|--notemperatures ] [ -3|--ilo3 ]
    [ -o|--powerredundancy ] [ -b|--locationlabel ] [ -l|--eventlogcheck]
    [ -i|--ignorelinkdown ] [ -x|--ignorebatterymissing ] [ -s|--sslv3 ]
    [ -t <timeout> ] [ -r <retries> ] [ -g|--getinfos ] [ --sslopts ]
    [ -v|--verbose ] ",
    version => $VERSION,
    blurb => 'This plugin checks the health status on a remote iLO2|3|4 device
and will return OK, WARNING or CRITICAL. iLO (integrated Lights-Out)
can be found on HP Proliant servers.'
);

$p->add_arg(
    spec => 'host|H=s',
    help =>
    qq{-H, --host=STRING
   Specify the host on the command line.},
);

# add all arguments
$p->add_arg(
    spec => 'user|u=s',
    help =>
    qq{-u, --user=STRING
   Specify the username on the command line.},
);

$p->add_arg(
    spec => 'password|p=s',
    help =>
    qq{-p, --password=STRING
   Specify the password on the command line.},
);

$p->add_arg(
    spec => 'inputfile|f=s',
    help =>
    qq{-f, --inputfile=STRING
   Read input from file.},
);

$p->add_arg(
    spec => 'fanredundancy|a',
    help =>
    qq{-a, --fanredundancy
   Check fan redundancy},
);

$p->add_arg(
    spec => 'checkdrives|c',
    help =>
    qq{-c, --checkdrives
   Check drive bays.},
);

$p->add_arg(
    spec => 'perfdata|d',
    help =>
    qq{-d, --perfdata
   Enable perfdata on output.},
);

$p->add_arg(
    spec => 'locationlabel|b',
    help =>
    qq{-b, --locationlabel
   Show temperature with location.},
);

$p->add_arg(
    spec => 'eventlogcheck|l',
    help =>
    qq{-l, --eventlogcheck
   Parse ILO eventlog for interesting events (f.e. broken memory).},
);

$p->add_arg(
    spec => 'skipsyntaxerrors|e',
    help =>
    qq{-e, --skipsyntaxerrors
   Skip syntax errors on older firmwares.},
);

$p->add_arg(
    spec => 'ignorebatterymissing|x',
    help =>
    qq{-x, --ignorebatterymissing
   Ignore Battery missing status.},
);

$p->add_arg(
    spec => 'ignorelinkdown|i',
    help =>
    qq{-i, --ignorelinkdown
   Ignore NIC Link Down status (iLO4).},
);

$p->add_arg(
    spec => 'notemperatures|n',
    help =>
    qq{-n, --notemperatures
   Disable temperature listing.},
);

$p->add_arg(
    spec => 'powerredundancy|o',
    help =>
    qq{-o, --powerredundancy
   Check power redundancy.},
);

$p->add_arg(
    spec => 'getinfos|g',
    help =>
    qq{-g, --getinfos
   Display additional infos like firmware version and servername. May need increased timeout.},
);

$p->add_arg(
    spec => 'ilo3|3',
    help =>
    qq{-3, --ilo3
   Check iLO3|4 device.},
);

$p->add_arg(
    spec => 'retries|r=i',
    help =>
    qq{-r, --retries=INTEGER
   Number of retries.},
);

$p->add_arg(
    spec => 'sslv3|s',
    help =>
    qq{-s, --sslv3
   Use sslv3 for connection.},
);

$p->add_arg(
    spec => 'sslopts=s',
    help =>
    qq{--sslopts
   Sets IO::Socket:SSL Options, defaults to 'SSL_verify_mode => SSL_VERIFY_NONE'.
   Some firmware may need --sslopts 'SSL_verify_mode => SSL_VERIFY_NONE, SSL_version => "TLSv1"'.},
);

# parse arguments
$p->getopts;

my $return = "OK";
my $message = "";
our $xmlinput = "";
our $isinput = 0;
our $drive_input = "";
our $is_drive_input = 0;
our $drive_xml_broken = 0;
our $client;
our $is_event_input = 0;
our $event_severity = "";
our $event_class = "";
our $event_description = "";
our %event_status;
my $host = $p->opts->host;
my $hostname = $p->opts->host;
my $username = $p->opts->user;
my $password = $p->opts->password;
my $inputfile = $p->opts->inputfile;
our $skipsyntaxerrors = defined($p->opts->skipsyntaxerrors) ? 1 : 0;
my $optfanredundancy = defined($p->opts->fanredundancy) ? 1 : 0;
my $optpowerredundancy = defined($p->opts->powerredundancy) ? 1 : 0;
my $notemperatures = defined($p->opts->notemperatures) ? 1 : 0;
our $optcheckdrives = defined($p->opts->checkdrives) ? 1 : 0;
my $optilo3 = defined($p->opts->ilo3) ? 1 : 0;
my $iloboardversion = defined($p->opts->ilo3) ? "ILO>=3" : "ILO2";
my $perfdata = defined($p->opts->perfdata) ? 1 : 0;
my $locationlabel = defined($p->opts->locationlabel) ? 1 : 0;
my $eventlogcheck = defined($p->opts->eventlogcheck) ? 1 : 0;
my $ignorelinkdown = defined($p->opts->ignorelinkdown) ? 1 : 0;
my $ignorebatterymissing = defined($p->opts->ignorebatterymissing) ? 1 : 0;
my $getinfos = defined($p->opts->getinfos) ? 1 : 0;
our %drives;
our $drive;
our $drivestatus;
our $product_name = "";
our $serial_number = "";
our $server_name = "";
my $retries=0;
my $xml;
my $sslv3 = defined($p->opts->sslv3) ? 1 : 0;
my $sslopts = 'SSL_verify_mode => SSL_VERIFY_NONE';
our @product;
our @serial;
our @sname;

$message = "(Board-Version: $iloboardversion) ";

unless ( ( defined($inputfile) ) ||
    ( defined($host) && defined($username) && defined($password) ) ) {
    $p->nagios_die("ERROR: Missing host, password and user.");
}

if ( defined ( $p->opts->retries ) ) {
    $retries = $p->opts->retries;
}

if ( defined ( $p->opts->sslopts ) ) {
    $sslopts = $p->opts->sslopts;
}

alarm $p->opts->timeout;

my $boundary;
our $sendsize;
my $localhost = hostname() || 'localhost';
print "hostname is $localhost\n" if ( $p->opts->verbose );

for (my $i=0;$i<=$retries;$i++) {
    print "retry: $i\n" if ( $p->opts->verbose );
    unless ( defined($inputfile) ) {
        # query code from locfg.pl
        # Set the default SSL port number if no port is specified
        $host .= ":443" unless ($host =~ m/:/);
        ## Open the SSL connection and the input file
        $client = new IO::Socket::SSL->new(PeerAddr => $host, eval $sslopts, $sslv3 ? ( SSL_version => 'SSLv3' ) : () );
        unless ( $client ) {
            $p->nagios_exit(
                return_code => "UNKNOWN",
                message => "ERROR: Failed to establish SSL connection with $host $! $SSL_ERROR."
            );
        }
        if ( $optilo3 ) {
            print "sending ilo3\n" if ( $p->opts->verbose );
            my $cmd = '<?xml version="1.0"?>';
            $cmd .= '<LOCFG VERSION="2.21" />';
            $cmd .= '<RIBCL VERSION="2.21">';
            $cmd .= '<LOGIN USER_LOGIN="'.$username.'" PASSWORD="'.$password.'">';
            $cmd .= '<SERVER_INFO MODE="read">';
            $cmd .= '<GET_EMBEDDED_HEALTH />';
            if ( $eventlogcheck ) { $cmd .= '<GET_EVENT_LOG />';}
            if ( $getinfos ) {
                $cmd .= '<GET_HOST_DATA />';
                $cmd .= '<GET_PRODUCT_NAME />';
                $cmd .= '<GET_SERVER_NAME />';
            }
            $cmd .= '</SERVER_INFO>';
            $cmd .= '</LOGIN>';
            $cmd .= '</RIBCL>';
            $cmd .= "\r\n";
            send_or_calculate(0,$cmd);
            send_to_client(0, "POST /ribcl HTTP/1.1\r\n");
            send_to_client(0, "HOST: $hostname\r\n");          # Mandatory for http 1.1
            send_to_client(0, "TE: chunked\r\n");
            send_to_client(0, "Connection: Close\r\n");         # Required
            send_to_client(0, "Content-length: $sendsize\r\n"); # Mandatory for http 1.1
            send_to_client(0, "\r\n");
            send_or_calculate(1,$cmd);  #Send it to iLO
        }
        else {
            # send xml to BMC
            print $client '<?xml version="1.0"?>' . "\r\n";
            print $client '<LOCFG VERSION="2.21" />' . "\r\n";
            print $client '<RIBCL VERSION="2.21">' . "\r\n";
            print $client '<LOGIN USER_LOGIN="'.$username.'" PASSWORD="'.$password.'">' . "\r\n";
            print $client '<SERVER_INFO MODE="read">' . "\r\n";
            print $client '<GET_EMBEDDED_HEALTH />' . "\r\n";
            if ( $eventlogcheck ) { print $client '<GET_EVENT_LOG />' . "\r\n"; }
            if ( $getinfos ) {
                print $client '<GET_HOST_DATA />' . "\r\n";
                print $client '<GET_PRODUCT_NAME />' . "\r\n";
                print $client '<GET_SERVER_NAME />' . "\r\n";
            }
            print $client '</SERVER_INFO>' . "\r\n";
            print $client '</LOGIN>' . "\r\n";
            print $client '</RIBCL>' . "\r\n";
        }
    }
    else {
        open($client,$inputfile) or $p->nagios_die("ERROR: $inputfile not found");
    }
    # retrieve data
    if ( $optilo3 && !$inputfile ) {
        read_chunked_reply();
    }
    else {
        while (my $ln = <$client>) {
            parse_reply($ln);
        }
        close $client;
    }
    # parse with XML::Simple
    if ( $xmlinput && $isinput == 0 ) {
        $xml = eval { XMLin($xmlinput, ForceArray => 1) };
        if ( $@ ) {
            if ( $i < $retries ) {
                next;
            }
            $p->nagios_exit(
                return_code => "UNKNOWN",
                message => "ERROR: $@"
            );
        }
        else {
            last;
        }
    }
    else {
        $p->nagios_exit(
            return_code => "UNKNOWN",
            message => "ERROR: No parseable output."
        );
    }
}

if ( $getinfos ) {
    $serial_number = "";
    if ( defined $serial[3] ) {
        $serial[3] =~ tr/ //ds ;
        $serial_number = "Serial: $serial[3]";
    }
    $server_name = "";
    $server_name = " - Servername: $sname[1]" if defined $sname[1];
    my $system_rom = undef;
    my $firmware_name = undef;
    my $firmware_version = undef;
    # loop through firmware hash
    foreach my $index (keys %{ $xml->{'FIRMWARE_INFORMATION'}[0] }) {
        if (defined $xml->{'FIRMWARE_INFORMATION'}[0]->{$index}[0]->{'FIRMWARE_NAME'}[0]->{'VALUE'}) {
            if ($xml->{'FIRMWARE_INFORMATION'}[0]->{$index}[0]->{'FIRMWARE_NAME'}[0]->{'VALUE'} eq "iLO") {
                $firmware_name    = 'iLO';
                $firmware_version = $xml->{'FIRMWARE_INFORMATION'}[0]->{$index}[0]->{'FIRMWARE_VERSION'}[0]->{'VALUE'};
            } elsif ($xml->{'FIRMWARE_INFORMATION'}[0]->{$index}[0]->{'FIRMWARE_NAME'}[0]->{'VALUE'} eq "HP ProLiant System ROM") {
                $system_rom = $xml->{'FIRMWARE_INFORMATION'}[0]->{$index}[0]->{'FIRMWARE_VERSION'}[0]->{'VALUE'};
            }
        }
    }
    my $product_name = undef;
    if (! defined $product[1]) {
        $product_name = "Unknown product";
    } else {
        $product_name = $product[1]
    }
    $serial_number = "Unknown serial" if ! defined $serial_number;
    $firmware_name = "iLO" if ! defined $firmware_name;
    $firmware_version = "Unknown firmware version" if ! defined $firmware_version;
    $server_name = "Unknown server name" if ! defined $server_name;
    if (defined $system_rom) {
        $message = "($product_name - SystemROM: $system_rom - $serial_number - $firmware_name FW $firmware_version" . "$server_name) ";
    } else {
        $message = "($product_name - $serial_number - $firmware_name FW $firmware_version" . "$server_name) ";
    }
}

my $drive_xml;
if ( $optcheckdrives && !$drive_xml_broken ) {
    if ( $drive_input && $is_drive_input == 0 ) {
        $drive_xml = eval { XMLin($drive_input, ForceArray => 1) };
        if ( $@ ) {
            $p->nagios_exit(
                return_code => "UNKNOWN",
                message => "ERROR: $@"
            );
        }
    }
    elsif ( ref $xml->{'STORAGE'}[0]->{'CONTROLLER'} ) {
        # iLO4 specific, no need for $drive_input
    }
    else {
        # No need to error out if host uncapable of checking drive status
        warn "No drive_input found" if ( $p->opts->verbose );
    }
}

my $temperatures = $xml->{'TEMPERATURE'}[0]->{'TEMP'};
my $backplanes = $drive_xml->{'BACKPLANE'};
my $raidcontroller = $xml->{'STORAGE'}[0]->{'CONTROLLER'};
my @checks;
push(@checks,$xml->{'FANS'}[0]->{'FAN'});
push(@checks,$xml->{'VRM'}[0]->{'MODULE'});
push(@checks,$xml->{'POWER_SUPPLIES'}[0]->{'SUPPLY'});
if($xml->{'PROCESSORS'}) {
    push(@checks,$xml->{'PROCESSORS'}[0]->{'PROCESSOR'});
}
my $memdetails;
if($xml->{'MEMORY'}) {
    $memdetails = $xml->{'MEMORY'}[0]->{'MEMORY_DETAILS'}[0];
}
my $health = $xml->{'HEALTH_AT_A_GLANCE'}[0];
my $label;
my $status;
my $temperature;
my $cautiontemp;
my $criticaltemp;

## check overall health status
my $componentstate;
foreach (keys %{$health}) {
    $componentstate = $health->{$_}[0]->{'STATUS'};
    if ( defined($componentstate) && ( $componentstate !~ m/^Ok$|^OTHER$|^NOT APPLICABLE$/i ) ) {
        if ($_ eq 'STORAGE') {
            if ( ref($raidcontroller) ) {
                # For iLO4 we can look at the raid controller to get a more detailed
                # status, so just log a WARNING unless we find something CRITICAL
                # later on.
                $return = "WARNING" unless ( $return eq "CRITICAL" );
            }
            else {
                $return = "CRITICAL";
            }
        }
        elsif ( ( $_ eq 'BATTERY' ) && $ignorebatterymissing && ( $componentstate =~ m/^Not Installed$/i ) ) {
            next;
        }
        elsif ( ( $_ eq 'NETWORK' ) && $ignorelinkdown && ( $componentstate =~ m/^Link Down$/i || $componentstate =~ m/^Degraded$/i ) ) {
            next;
        }
        else {
            $return = "CRITICAL";
        }
        $message .= "$_ $componentstate, ";
    }
}

if ( $optpowerredundancy ) {
    my $powerredundancy = $health->{'POWER_SUPPLIES'}[1]->{'REDUNDANCY'};
    if ( defined($powerredundancy) &&
        ( $powerredundancy !~ m/^Fully Redundant$|^REDUNDANT$|^NOT APPLICABLE$/i ) ) {
        $return = "CRITICAL";
        $message .= "Power supply $powerredundancy, ";
    }
}

if ( $optfanredundancy ) {
    my $fanredundancy = $health->{'FANS'}[1]->{'REDUNDANCY'};
    if ( defined($fanredundancy) &&
        ( $fanredundancy !~ m/^Fully Redundant$|^REDUNDANT$|^NOT APPLICABLE$/i ) ) {
        $return = "CRITICAL";
        $message .= "Fans $fanredundancy, ";
    }
}

# check fans, vrm and power supplies
foreach my $check ( @checks ) {
    if ( ref($check) ) {
        foreach my $item ( @$check ) {
            $label=$item->{'LABEL'}[0]->{'VALUE'};
            $status=$item->{'STATUS'}[0]->{'VALUE'};
            if ( defined($label) && defined($status) ) {
                # misleading output on some iLO3 shows always failed, skip it
                if ($label =~ m/^Power Supplies$/) {
                    next;
                }
                if ($label =~ m/^Power Supply/) {
                    # get details for power supplies
                    $label =~ s/ /_/g;
                    if ( ( $status !~ m"^Ok$|^Good|^n/a$|^Not Installed$|^Unknown$"i ) ) {
                        $return = "WARNING" unless ( $return eq "CRITICAL" );
                        if ( defined($item->{'MODEL'}[0]->{'VALUE'}) ) {
                            $message .= "$label is $status (ModelNumber: $item->{'MODEL'}[0]->{'VALUE'}) ";
                        }
                        else {
                            $message .= "$label is $status ";
                        }
                    }
                    next;
                }
                $label =~ s/ /_/g;
                if ( ( $status !~ m"^Ok$|^Good|^n/a$|^Not Installed$|^Unknown$"i ) ) {
                    $return = "WARNING" unless ( $return eq "CRITICAL" );
                    $message .= "$label: $status, ";
                }
            }
        }
    }
}

# check memory status (iLO4 only?)
if ( ref($memdetails) ) {
    foreach my $loc ( sort keys %{$memdetails} ) {
        foreach ( @{$memdetails->{$loc}} ) {
            $status = $_->{'STATUS'}[0]->{'VALUE'};
            if ( ( $status !~ m"^Ok$|^Good|^n/a$|^Not Present$"i ) ) {
                $return = "WARNING" unless ( $return eq "CRITICAL" );
                my $socket = $_->{'SOCKET'}[0]->{'VALUE'};
                my $size = $_->{'SIZE'}[0]->{'VALUE'};
                my $part = $_->{'PART'}[0]->{'NUMBER'};
                if ( defined $part ) {
                    # works only with new iLO4 firmware
                    $message .= "Mem $loc $socket: $status (Size: $size, PartNumber: $part), ";
                }
                else {
                    $message .= "Mem $loc $socket: $status (Size: $size), ";
                }
            }
        }
    }
}

# check newer drive bays (iLO3)
if ( ref($backplanes) ) {
    my $backplane = 0;
    foreach ( @{$backplanes} ) {
        if ( defined($_->{'ENCLOSURE_ADDR'}[0]->{'VALUE'} ) ) {
            $backplane = $_->{'ENCLOSURE_ADDR'}[0]->{'VALUE'};
        }
        else {
            $backplane++;
        }
        if ( $_->{'DRIVE_BAY'} ) {
            for ( my $i=0; $i<= $#{$_->{'DRIVE_BAY'}}; $i++ ) {
                $label=$backplane." ".$_->{'DRIVE_BAY'}[$i]->{'VALUE'};
                $status=$_->{'STATUS'}[$i]->{'VALUE'};
                $drives{$label}{'status'} = $status;
            }
        }
        if ( $_->{'DRIVE'} ) {
            for ( my $i=0; $i<= $#{$_->{'DRIVE'}}; $i++ ) {
                $label=$backplane." ".$_->{'DRIVE'}[$i]->{'BAY'};
                $status=$_->{'DRIVE_STATUS'}[$i]->{'VALUE'};
                $drives{$label}{'status'} = $status;
            }
        }
    }
}

# seems that iLO4 reads the state from the RAID controller, nice
if ( ref($raidcontroller) ) {
    foreach ( @{$raidcontroller} ) {
        my $ctrllabel = $_->{'LABEL'}[0]->{'VALUE'};
        my $ctrlstatus = $_->{'CONTROLLER_STATUS'}[0]->{'VALUE'} || $_->{'STATUS'}[0]->{'VALUE'};
        if($ctrlstatus ne 'OK') {
            $return = "CRITICAL";
            $message .= "SmartArray $ctrllabel Status: $ctrlstatus, ";
        }
        my $cachestatus = $_->{'CACHE_MODULE_STATUS'}[0]->{'VALUE'};
        if($cachestatus && $cachestatus ne 'OK') {
            # FIXME: There are probably other valid cache module states that
            #        needs to be excluded.
            $return = "CRITICAL";
            $message .= "SmartArray $ctrllabel Cache Status: $cachestatus, ";
        }
        foreach ( @{$_->{'DRIVE_ENCLOSURE'}} ) {
            my $enclabel = $_->{'LABEL'}[0]->{'VALUE'};
            my $encstatus = $_->{'STATUS'}[0]->{'VALUE'};
            my $encmodel = $_->{'MODEL_NUMBER'}[0]->{'VALUE'};
            if($encstatus ne 'OK') {
                $message .= "SmartArray $ctrllabel Enclosure $enclabel: $encstatus (ModelNumber: $encmodel) - check hardware status in OS, ";
                $return = "CRITICAL";
            }
        }
        foreach ( @{$_->{'LOGICAL_DRIVE'}} ) {
            my $ldlabel = $_->{'LABEL'}[0]->{'VALUE'};
            my $ldstatus = $_->{'STATUS'}[0]->{'VALUE'};
            if($ldstatus ne 'OK') {
                $message .= "SmartArray $ctrllabel LD $ldlabel: $ldstatus, ";
                if($ldstatus eq 'Degraded (Rebuilding)' || $ldstatus eq 'Degraded (Recovering)') {
                    $return = "WARNING" unless ( $return eq "CRITICAL" );
                }
                else {
                    $return = "CRITICAL";
                }
            }
            foreach ( @{$_->{'PHYSICAL_DRIVE'}} ) {
                $label = "$ctrllabel $_->{'LABEL'}[0]->{'VALUE'}";
                $status = $_->{'STATUS'}[0]->{'VALUE'};
                my $model = $_->{'MODEL'}[0]->{'VALUE'};
                $drives{$label}{'status'} = $status;
                $drives{$label}{'model'} = $model;
            }
        }
    }
}

# check drive bays
if ( $optcheckdrives ) {
    foreach ( sort keys(%drives) ) {
        if ( ( $drives{$_}{'status'} !~ m"^(Ok)$|^(n/a)$|^(Spare)$|^(Not Installed)|^(Not Present/Not Installed)$|^(spun down)$"i ) ) {
            $return = "CRITICAL";
            $message .= "$_: ".$drives{$_}{'status'};
            if (defined $drives{$_}{'model'}){
                $message .= " (Drive ModelNumber: " . $drives{$_}{'model'} ."), ";
            }
        }
    }
}

# check event logs
if ( $eventlogcheck ) {
    foreach ( keys %event_status ) {
        next if ( $event_status{$_} =~ m/Repaired/ );
        $message .= " $_:$event_status{$_} ";
        $return = "WARNING" unless ( $return eq "CRITICAL" );
    }
}

unless ( $message ) {
    $message .= "No faults detected, ";
}

# check temperatures
if ( ref($temperatures) ) {
    unless ( $notemperatures ) {
        $message .= "Temperatures: ";
    }
    foreach my $temp ( @$temperatures ) {
        $label=$temp->{'LABEL'}[0]->{'VALUE'};
        if ( $locationlabel && defined($temp->{'LOCATION'}[0]->{'VALUE'}) ) {
            $label .= " (" . $temp->{'LOCATION'}[0]->{'VALUE'} . ")";
        }
        $status=$temp->{'STATUS'}[0]->{'VALUE'};
        $temperature=$temp->{'CURRENTREADING'}[0]->{'VALUE'};
        if ( defined($label) && defined($status) && defined($temperature) ) {
            $label =~ s/ /_/g;
            unless ( ( $status =~ m"^Ok$|^n/a$|^Not Installed$"i ) ) {
                $return = "WARNING" unless ( $return eq "CRITICAL" );
                $message .= "$label ($status): $temperature, "
                    if ( $notemperatures );
            }
            unless ( ( $status =~ m"^n/a$|^Not Installed$"i ) )  {
                $message .= "$label ($status): $temperature, "
                    unless ( $notemperatures );
                if ( $perfdata ) {
                    $cautiontemp=$temp->{'CAUTION'}[0]->{'VALUE'};
                    $criticaltemp=$temp->{'CRITICAL'}[0]->{'VALUE'};
                    # Returned value can be 'N/A', enforce this being a number
                    if($cautiontemp && $cautiontemp !~ /^[0-9]+/) {
                        $cautiontemp=undef;
                    }
                    if($criticaltemp && $criticaltemp !~ /^[0-9]+/) {
                        $criticaltemp=undef;
                    }
                    if ( defined($cautiontemp) && defined($criticaltemp) ) {
                        $p->set_thresholds(
                            warning  => $cautiontemp,
                            critical => $criticaltemp,
                        );
                        my $threshold = $p->threshold;
                        # add perfdata
                        $p->add_perfdata(
                            label   => $label,
                            value   => $temperature,
                            uom     => "",
                            threshold => $threshold,
                        );
                    }
                }
            }
        }
        else {
            $message .= "no reading, ";
        }
    }
}

# strip trailing ","
$message =~ s/, $//;

$p->nagios_exit(
    return_code => $return,
    message => $message
);

# send_to_client, send_or_calculate and read_chunked_reply
# are adapted from locfg.pl

sub send_to_client
{
    my ($send, $cmd) = @_;
    print $cmd if ( $p->opts->verbose && length($cmd) < 1024 );
    print $client $cmd;
    $sendsize -= length($cmd) if ( $send );
}

sub send_or_calculate    # used for iLO 3 only
{
    $sendsize = 0;
    my ($send, $cmd) = @_;
    if ($send) {
        print $client $cmd;
    }
    $sendsize += length($cmd);
    print "size $sendsize\n" if ( $p->opts->verbose );
}

sub read_chunked_reply    # used for iLO 3 only
{
    my $ln = "";
    my $lp = "";
    my $hide = 1;
    my $chunk = 1;
    my $chunkSize;
    while( 1 ) {
        # Read a line
        $ln = <$client>;
        # Get length of line
        my $length =  length($ln);
        # Exit loop if zero
        if ( $length == 0 ) {
            if ( $p->opts->verbose ) {
                print "read_chunked_reply: read a zero-length line. Continue...\n";
            }
            last;
        }
        # Skip HTTP headers and first line of chunked responses
        if ( $hide ) {
            $hide = 0 if ( $ln =~ m/^\r\n$/ );
            print "Head: " . $ln if ( $p->opts->verbose );
            next;
        }
        # Get size of chunk
        if ( $chunk ) {
            print "chunk: " . $ln if ( $p->opts->verbose );
            $ln =~ s/\r|\n//g;
            $chunkSize = hex($ln);
            $chunk = 0;
            print "chunk size: $chunkSize\n" if ( $p->opts->verbose );
            next;
        }
        # Last Chunk
        if ( $chunkSize == 0 ) {
            print "read_chunked_reply: reach end of responses.\n" if ($p->opts->verbose);
            last;
        }
        # End of chunk, process incomplete line
        if ( $chunkSize < $length ) {
            $chunk = 1; # Next line, new chunk
            $hide = 0;  # Skip hide
            $lp .= substr($ln, 0, $chunkSize); # Truncate and append
        }
        # End of chunk, process complete line
        elsif ( $chunkSize == $length ) {
            $chunk = 1; # Next line, new chunk
            $hide = 1;  # Hide new chunk's first line
            $lp .= $ln; # Append line as-is
        }
        # Process line
        else {
            $chunkSize -= $length; # Decrement chunk size
            $lp .= $ln; # Append line as-is
        }
        # Skip incomplete line
        next unless ( $lp =~ m/\n$/ );
        # Parse complete line
        parse_reply($lp);
        # Line parsed, clear line
        $lp = "";
    }
    if ($client->error()) {
        print "Error: connection error " . $client->error() . "\n";
    }
}

sub parse_reply
{
    my ($line) = @_;
    $line =~ s/\r\n$/\n/;
    print $line if ( $p->opts->verbose );
    if ( $getinfos ) {
        # Prune all unnecessary lines
        $isinput = 1 if ( $line =~ m"<GET_EMBEDDED_HEALTH_DATA>|</DRIVES>" );
        $xmlinput .= $line if ( $isinput );
        $isinput = 0 if ( $line =~ m"</GET_EMBEDDED_HEALTH_DATA>|<DRIVES>" );
        $product_name = $line if ( $line =~ m"<PRODUCT_NAME VALUE" );
        $serial_number = $line if ( $line =~ m/FIELD NAME="Serial Number"/ );
        $server_name = $line if ( $line =~ m"<SERVER_NAME" );
        @product = split (/"/, $product_name)  if defined $product_name;
        @serial  = split (/"/, $serial_number) if defined $serial_number;
        @sname   = split (/"/, $server_name )  if defined $server_name;
    }
    else {
        # Prune all unnecessary lines
        $isinput = 1 if ( $line =~ m"<GET_EMBEDDED_HEALTH_DATA>|</DRIVES>|</FIRMWARE_INFORMATION>" );
        $xmlinput .= $line if ( $isinput );
        $isinput = 0 if ( $line =~ m"</GET_EMBEDDED_HEALTH_DATA>|<DRIVES>|<FIRMWARE_INFORMATION>" );
    }
    # drive check needs special handling
    # <DRIVES>
    #    <BACKPLANE>
    #       <FIRMWARE_VERSION VALUE="1.18"/>
    #       <ENCLOSURE_ADDR VALUE="224"/>
    #     <DRIVE_BAY VALUE = "1"/>
    #       <PRODUCT_ID VALUE = "EH0300FBQDD    "/>
    #       <STATUS VALUE = "Ok"/>
    #       <UID_LED VALUE = "Off"/>
    #     <DRIVE_BAY VALUE = "2"/>
    #       <PRODUCT_ID VALUE = "EH0300FBQDD    "/>
    #       <STATUS VALUE = "Fault"/>
    #       <UID_LED VALUE = "Off"/>
    #    </BACKPLANE>
    # </DRIVES>
    $is_drive_input = 1 if ( $line =~ m"<DRIVES>" );
    $drive_input .= $line if ( $is_drive_input );
    $is_drive_input = 0 if ( $line =~ m"</DRIVES>" );
    # because on many (older?) iLOs drive status is not XML
    if ($optcheckdrives) {
        if ( $line =~ m/<Drive Bay: / ) {
            $drive_xml_broken = 1;
            # <Drive Bay: "3"; status: "Smart Error"; uid led="Off"/>
            ( $drive, $drivestatus ) = ( $line =~
                m/Drive Bay: "(.*)"; status: "(.*)"; uid led: ".*"/ );
            if ( defined($drive) && defined($drivestatus) ) {
                $drives{$drive} = $drivestatus;
            }
        }
        if ( $line =~ m/<DRIVE BAY=".*" PRODUCT_ID="/ ) {
            $drive_xml_broken = 1;
            # <DRIVE BAY="3" PRODUCT_ID="N/A"STATUS="Smart Error" UID_LED="Off"/>
            ( $drive, $drivestatus ) = ( $line =~
                m/DRIVE BAY="(.*)" PRODUCT_ID=".*"STATUS="(.*)" UID_LED=".*"/ );
            if ( defined($drive) && defined($drivestatus) ) {
                $drives{$drive} = $drivestatus;
            }
        }
    }
    if ( $eventlogcheck ) {
        $is_event_input = 1 if ( $line =~ m"<EVENT" );
        if ( $is_event_input ) {
            if ( $line =~ m/SEVERITY="(.*?)"/ ) {
                $event_severity = $1;
                #print "SEV: $event_severity\n";
            }
            if ( $line =~ m/CLASS="(.*?)"/ ) {
                $event_class = $1;
                #print "CLASS: $event_class\n";
            }
            if ( $line =~ m/DESCRIPTION="(.*?)"/ ) {
                $event_description = $1;
                #print "DESCRIPTION: $event_description\n";
            }
        }
        $is_event_input = 0 if ( $is_event_input && $line =~ m"/>" );
        if ( $is_event_input == 0 && $event_class ) {
            if ( ($event_class !~ m/POST|Maintenance/) && ( $event_severity !~ m/Informational/) ) {
                $event_status{$event_description} = $event_severity;
                $event_class = "";
            }
        }
    }
    if ( $line =~ m/MESSAGE='(.*)'/ ) {
        my $msg = $1;
        if ( $msg =~ m/No error/i ) {
            # Skip
        }
        elsif ( $msg =~ m/Syntax error/i && $skipsyntaxerrors ) {
            # Skip
        }
        else {
            close $client;
            $p->nagios_exit(
                return_code => "UNKNOWN",
                message => "ERROR: $msg."
            );
        }
    }
}
