Graph of temperature from NagiosGraph

Cost-Effective Temperature Monitoring with Nagios: How-To

Download the plugin

This was my first plugin, and it’s written in Perl. It’s pretty simple: when called, it checks a Watchport temperature or combo temperature/humidity sensor for the temperature and returns both the Nagios state (OK, warning, critical, or unknown) and perfdata, for which I provide some map settings you can use to graph the output in NagiosGraph. This is also my first public tutorial, so please bear with me! If you’re experienced with Nagios, you can probably skip most of this guide.

This plugin can be called directly from your Nagios server; however, I currently run it on remote servers using NRPE (on Linux) and NSClient++ (on Windows). The basic order I follow is:

  1. Install the Perl interpreter and required libraries on the server that the sensor is plugged into.
  2. Install and configure either NRPE or NSClient++ on the server that the sensor is plugged into.
  3. Modify the commands config file in Nagios to add the temperature check command.
  4. Modify your config file of choice to add a service check for the temperature on the host you have the sensor connected to.
  5. Optional: modify your NagiosGraph map file to interpret and graph the perfdata returned by the plugin, and add the appropriate link to the service check. I am currently on a very old version of NagiosGraph, so the instructions for your version may be slightly different.

Step 3 is a one-time thing if you deploy multiple sensors, and steps 4 and 5 become much quicker after you have performed the initial setup. I personally just keep a single service for temperature sensors: add a host to the host_name directive and it’s checking the temperature.

Prerequisites:

There are several prerequisites for using this plugin and sensor. First, you’ll need a free USB port! Second, you’ll need to set up your environment to run the plugin. If you have experience with Perl on your OS of choice, this will be a breeze: you just need the interpreter and the serial port library. Keep in mind that the library is different for *nix and Windows. For now, let’s go over configuring the hosts that actually have the sensors attached, which most likely aren’t going to be your Nagios server.

Linux:

Running under Linux, you’ll need to have a Perl interpreter installed. For example, under Ubuntu, you can run

sudo apt-get install perl libdevice-serialport-perl nagios-nrpe-server

perl installs the Perl interpreter, easy enough, and libdevice-serialport-perl installs the library needed to talk to the serial port. You can also use CPAN to install Device::SerialPort, but CPAN isn’t covered in this guide. nagios-nrpe-server installs the Nagios Remote Plugin Executor (NRPE) daemon so that you can securely run the check and get the data back to your Nagios box. If you don’t use NRPE, you can of course use whatever method you like to execute the plugin.

Next, configure NRPE:

sudo vim /etc/nagios/nrpe.cfg

Add the following lines to your nrpe.cfg file:

# Allow the Nagios server to connect:
allowed_hosts=<nagios-server-IP-or-FQDN>

# This command checks the temperature sensor. It will warn at 72F and
# go critical at 74F.
command[check_temp]=/usr/bin/perl /usr/lib/nagios/plugins/check_watchptTemp_linux.pl -w 72 -c 74

Now reload nrpe with

sudo /etc/init.d/nagios-nrpe-server reload

Now you’re ready to copy the plugin over. Copy the Linux version of the plugin to /usr/lib/nagios/plugins/, make sure it is owned by the NRPE user (usually “nagios”), and set its permissions to allow execution.

Now test it out! From the server with the sensor attached, run:

/usr/bin/perl /usr/lib/nagios/plugins/check_watchptTemp_linux.pl -w 72 -c 74

If you don’t get the temperature back, try some troubleshooting. You may need to manually specify the serial port with an argument to the plugin. By default it will poll /dev/ttyUSB0; however, if you add “-p /dev/” you can tell it to check another. Once you have that working, move over to your Nagios server and run:

//check_nrpe -H <server-with-sensor-attached.yourdomain> -c check_temp

If everything worked correctly, you should get something like “OK: Temp is good at 70.8750|temp=70.8750;72;74;0;100” back. The first part is just a status message, the second can be interpreted by NagiosGraph. If you get an error in return, check the Nagios documentation to get NRPE configured and working properly. Try just getting NRPE to work first – if you leave out the “-c check_temp” part of the NRPE command, you should get back a status message telling you whether or not NRPE is working.
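To make that output format a bit more concrete, here is a minimal Python sketch of how a check like this could turn a reading and two thresholds into a status line, perfdata, and an exit code. This is not the plugin’s actual code (the plugin is written in Perl). The function name format_check_result, the wording of the WARNING, CRITICAL, and UNKNOWN messages, and the use of >= for the threshold comparison are all assumptions made for illustration; only the OK line format and the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are taken as given.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_check_result(temp, warn, crit, y_min=0, y_max=100):
    """Return (exit_code, output_line) for a temperature reading.

    Hypothetical helper: the non-OK message wording and the >= comparison
    are assumptions; only the OK line mirrors the sample output above.
    """
    if temp is None:
        # No reading from the sensor: report UNKNOWN without perfdata.
        return UNKNOWN, "UNKNOWN: could not read temperature"

    # perfdata layout: label=value;warn;crit;min;max
    perfdata = f"temp={temp:.4f};{warn};{crit};{y_min};{y_max}"
    if temp >= crit:
        return CRITICAL, f"CRITICAL: Temp is {temp:.4f}|{perfdata}"
    if temp >= warn:
        return WARNING, f"WARNING: Temp is {temp:.4f}|{perfdata}"
    return OK, f"OK: Temp is good at {temp:.4f}|{perfdata}"


code, line = format_check_result(70.875, 72, 74)
print(code, line)
```

Nagios decides the service state from the exit code, not from the text, so a real plugin would finish by exiting with that code after printing the line.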

If you don’t have any Windows machines to set up, skip down now to Configuring Nagios.

Windows:

For Windows, you’ll want to have a Perl interpreter installed, such as ActivePerl or Strawberry Perl. Both provide their own interfaces for installing Perl libraries. You’ll want to search for and install Win32::SerialPort before you move on to configuring NSClient++.

Next, install NSClient++ and open \program files (x86)\nsclient++\NSC.ini in a text editor. Add the following line under the [NRPE Handlers] section:

command[check_temp]= C:\perl\bin\perl.exe C:\nagios\check_watchptTemp_win.pl -w 72 -c 74 --config C:\Nagios\temp.cfg

And this line under the [Settings] section:

allowed_hosts=<nagios-server-IP-or-FQDN>

Then run

net stop nsclientpp && net start nsclientpp

from a command prompt to restart the NSClient++ service. NSClient++ will emulate NRPE, so your Nagios server can call the same check for both Windows and Linux hosts. How convenient! Now copy both the Windows version of the plugin and the temp.cfg file to C:\Nagios. Of course you don’t have to use C:\Nagios; just be sure to update your NSC.ini file and restart the service if you change it. If your serial port is something other than COM3, you’ll have to edit the temp.cfg file. I know the file says “DO NOT EDIT,” and I’ve never had a problem manually editing it, but if you do run into problems you will need to follow the manual for Win32::SerialPort to generate your own config file.

If you would like to test, at this point you should be able to run

C:\perl\bin\perl.exe C:\nagios\check_watchptTemp_win.pl -w 72 -c 74 --config C:\Nagios\temp.cfg

from a command prompt and receive the temperature back. Next, test from the Nagios server as outlined in the Linux configuration instructions. Don’t forget to open TCP port 5666 to allow NRPE traffic in, though I believe the installer will do this for you if you check the box. Assuming everything is working, it’s time to configure Nagios!
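If check_nrpe times out against a Windows host, it can help to rule out the firewall first. Here is a small Python sketch, run from the Nagios server, that just tries to open a TCP connection to port 5666. The function name nrpe_port_open is made up for this example, and a successful connection only shows that the port is reachable; it says nothing about allowed_hosts or whether the check command itself works.

```python
import socket


def nrpe_port_open(host, port=5666, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and connects with a timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, nrpe_port_open("sensor_server_1") would try the default NRPE port on a host with that (placeholder) name.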

Configuring Nagios:

Before starting this step, you should be able to get the temperature back from your remote servers with check_nrpe. Once you have confirmed that your hosts with sensors attached are working properly, add a check command to your commands.cfg file:

define command {
    command_name check_temp
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_temp -t 40
}

Note that you could just as easily skip this step and add a service check as “check_nrpe!check_temp”. For the purposes of this guide I’ll stick with the above command. Next, edit the config file that contains the hosts you’d like to check. Or, if you want to be tidy about it, make a new config file called something along the lines of “environmental.cfg” and make sure your nagios.cfg file knows to load it. In the config file, add:

define service{
    use generic-service
    host_name sensor_server_1,sensor_server_2,sensor_server3
    service_description Temperature
    check_command check_temp
}

You should already have a host entry for the server you are checking in your Nagios config. Where I have “sensor_server_1”, you would put in the host_name value of the server you want to check. Now, reload Nagios and check the host with the temperature sensor. Hopefully you see some clean output. If you don’t need graphing, you’re done! Note that using a single service check is just how I do this in my environment. You could also manage this using a hostgroup, or group multiple environmental checks together with a servicegroup.

Configuring NagiosGraph:

Now that you’re monitoring temperature, why not graph it? Because NagiosGraph can be a little confusing if you have never set it up or seen regular expressions before, that’s why! It shouldn’t give you too much trouble these days, though. With a little persistence, anyone should be able to get it running alongside their Nagios installation in a relatively short amount of time. This guide is not meant to be a NagiosGraph tutorial by any means, but I would like to at least go over it, since editing the map file confused me when I was starting out.

NagiosGraph works by checking the perfdata log configured in your nagios.cfg file. This log is continuously scanned and cleared by NagiosGraph. Take a look at the following output from check_temp:

OK: Temp is good at 70.8750|temp=70.8750;72;74;0;100

Everything after the pipe | is “perfdata” and contains information returned by the service check, but not directly used by Nagios. This is where NagiosGraph comes in. You’ll notice that the perfdata section contains several values delimited by semicolons:

OK: Temp is good at 70.8750|temp=70.8750 (the temperature);72 (the warning level);74 (the critical level);0 (the bottom of our y-axis);100 (the top of our y-axis)
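If it helps to see that split spelled out, here is a rough Python sketch that breaks a plugin output line into the status text and the named perfdata fields. The function name parse_perfdata and the field names (value, warn, crit, min, max) are chosen for this example only. It assumes the simple five-field form shown above and does not handle quoted labels or longer field lists.

```python
def parse_perfdata(plugin_output):
    """Split plugin output into (status_text, {label: fields})."""
    # Everything before the first pipe is the human-readable status.
    status_text, _, perfdata = plugin_output.partition("|")
    names = ("value", "warn", "crit", "min", "max")
    metrics = {}
    for item in perfdata.split():
        label, _, values = item.partition("=")
        fields = values.split(";")
        # Pad missing trailing fields with empty strings.
        fields += [""] * (len(names) - len(fields))
        metrics[label] = dict(zip(names, fields))
    return status_text.strip(), metrics


text, metrics = parse_perfdata(
    "OK: Temp is good at 70.8750|temp=70.8750;72;74;0;100"
)
print(text)
print(metrics["temp"])
```

The values are left as strings on purpose, since real perfdata values can carry unit suffixes that a plain float() call would reject.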

This data is now ready to be parsed by NagiosGraph. When NagiosGraph pulls the above log entry out of the perfdata.log file, it will run it through the map. The map is a list of regular expressions describing ways to parse perfdata. Each line of perfdata will hopefully match an expression in the map file and be split out into small chunks of data, which are entered into small database files, which are then used to build the performance graphs. So, this leads us to editing the map file. Add the following lines to your map file. It doesn’t really matter where, but be sure to add some appropriate comments:

# Service type: Datacenter Temp
#   output:OK: Temp is good at 70.8750
#   perfdata:temp=71.0000;75;78;0;100
/perfdata:.*?([\d]+\.[\d]+)\;([\d]+)\;([\d]+).*/
and push @s, [temp,
		[ degreesF,	GAUGE, $1	]];
	# To see warn and crit in your graph, change the trailing ]];
	# above to ], and uncomment the following lines:
	#	[ warn,		GAUGE, $2	],
	#	[ crit,		GAUGE, $3	]];

# Service type: Datacenter Humidity
#   output:OK: Humidity is good at 45%
#   perfdata:humidity=45;60;70;40;30;0;100
/perfdata:.*humidity=(\d{1,3});(\d{1,3});(\d{1,3});(\d{1,3});(\d{1,3});(\d{1,3});(\d{1,3}).*/
and push @s, [humidity,
		[ humidityPct, 	GAUGE, $1	],
		[ warnU, 	GAUGE, $2 	],
		[ critU, 	GAUGE, $3 	],
		[ warnl, 	GAUGE, $4 	],
		[ critl,		GAUGE, $5	]];

These map entries cover both the temperature and humidity plugins. I am not going to break down what each expression does. If you are unfamiliar with regular expressions, I highly recommend that you do some research and play around with them, because they are extremely powerful! Anyway, save your map file and head back over to your environmental.cfg file. Add the following entry:
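If you would like to sanity-check the two expressions before touching a live map file, you can try them against the sample perfdata lines from the comments. The sketch below uses Python’s re module rather than Perl, on the assumption that these particular patterns behave the same in both engines (they only use basic syntax). The names TEMP_RE and HUMIDITY_RE are just for this example.

```python
import re

# Same patterns as the map entries, without the Perl / delimiters.
TEMP_RE = re.compile(r"perfdata:.*?([\d]+\.[\d]+)\;([\d]+)\;([\d]+).*")
HUMIDITY_RE = re.compile(
    r"perfdata:.*humidity=(\d{1,3});(\d{1,3});(\d{1,3});"
    r"(\d{1,3});(\d{1,3});(\d{1,3});(\d{1,3}).*"
)

temp = TEMP_RE.search("perfdata:temp=71.0000;75;78;0;100")
humidity = HUMIDITY_RE.search("perfdata:humidity=45;60;70;40;30;0;100")

print(temp.groups())      # the values the map entry sees as $1, $2, $3
print(humidity.groups())  # the values the map entry sees as $1 through $7
```

One thing this makes easy to spot: the temperature expression does not look for the temp= label, so it will also match perfdata from any other service whose first value has a decimal point and is followed by two integer thresholds. If that causes trouble, anchoring the pattern on temp= would be one way to narrow it.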

define serviceextinfo{
   host_name sensor_server_1,sensor_server_2,sensor_server3
   service_description Temperature
   notes_url /nagiosgraph/show.cgi?host=$HOSTNAME$&service=$SERVICEDESC$&db=temp,degreesF&geom=1000x300&rrdopts=%2Dl%200%20%2Dt%20%2DTemp
}

This will add a series of graphs to each host defined under host_name once you reload your Nagios server. Give it about 20 minutes to gather data before expecting the graphs to appear. I must admit that I set this part of my configuration up when NagiosGraph was much younger. I know for certain that the configuration directives and best practices for NagiosGraph have improved dramatically since methods like the one above were considered acceptable. Everything will work fine using the NagiosGraph configuration above, but I urge the reader to get familiar with the current Nagios and NagiosGraph documentation for more efficient, readable, and maintainable configuration styles.
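The percent-encoded rrdopts value in that notes_url is hard to read at a glance, so here is a short Python sketch that decodes the query string. The host and service values are placeholders standing in for the expanded Nagios macros, and the example assumes the blank before the last %2D in the URL is meant to be an encoded space (%20).

```python
from urllib.parse import parse_qs, urlsplit

# Example URL after Nagios expands the macros; host and service are made up.
url = (
    "/nagiosgraph/show.cgi?host=sensor_server_1&service=Temperature"
    "&db=temp,degreesF&geom=1000x300&rrdopts=%2Dl%200%20%2Dt%20%2DTemp"
)

# parse_qs returns a list per key; keep the first value of each.
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

for key, value in params.items():
    print(f"{key} = {value}")
```

The decoded rrdopts value is a string of rrdtool graph options: -l sets the lower limit of the y-axis and -t sets the graph title.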

I hope you find this guide and these plugins to be useful. Please leave a comment or send me a message if you have questions or suggestions on how to improve either this documentation or the plugins themselves. Thanks for reading!

4 thoughts on “Cost-Effective Temperature Monitoring with Nagios: How-To”

  3. Ole Magnus H. Waaler (@evotech)

    Not sure if you are still active on this, but I couldn’t get your script to work on Windows (it wouldn’t let me connect to the COM port), so I did a little work on it to make it work on Windows 2008 R2.

    I’m posting this here in case some other sod needs it and finds your blog post.

    It no longer requires a separate config file, and I also modified it to throw warnings and criticals about temperatures that are too low.

    https://github.com/evolite/check_watchptTemp_winmod

    1. Julius Post author

      Thank you so much, Ole! I’m sorry it’s taken me so long to approve this, I haven’t logged onto this site in a very long time. I really appreciate you leaving this here for others and for working on the code!!
