hostchecker.pl - Service Availability Monitor
v.1.04 - 10/19/2003
Overview
Hostchecker.pl is a simple program that automatically queries services
on internet hosts, or pings the hosts, to determine availability. It
can be configured to query nearly any TCP service (http, dns, citrix,
rtsp, etc.) so that you can actively monitor your servers and receive
alerts when a service (or host) becomes unavailable.
Hostchecker.pl is written in Perl and uses the Net::Ping library.
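As a rough illustration of the kind of check described above, here is a minimal sketch of a TCP service probe built on Net::Ping. It is not taken from hostchecker.pl itself: the function names `service_port` and `check_service`, the 5-second default timeout, and the acceptance of numeric ports are all assumptions made for the example.

```perl
use strict;
use warnings;
use Net::Ping;

# Hypothetical helper: turn a service name (as listed in /etc/services)
# or a numeric port into a port number. Returns undef if the name is unknown.
sub service_port {
    my ($service) = @_;
    return $service if $service =~ /^\d+$/;
    my $port = (getservbyname($service, 'tcp'))[2];
    return $port;
}

# Hypothetical helper: returns 1 if the host answers, 0 otherwise.
# 'ping' is treated as an ICMP echo; anything else as a TCP connect.
sub check_service {
    my ($host, $service, $timeout) = @_;
    $timeout = 5 unless defined $timeout;    # assumed default

    if ($service eq 'ping') {
        # ICMP pings need root privileges in Net::Ping.
        my $p  = Net::Ping->new('icmp', $timeout);
        my $ok = $p->ping($host);
        $p->close;
        return $ok ? 1 : 0;
    }

    my $port = service_port($service);
    return 0 unless defined $port;

    my $p = Net::Ping->new('tcp', $timeout);
    $p->port_number($port);
    # Require a completed connection, so "connection refused" counts as down.
    $p->service_check(1);
    my $ok = $p->ping($host);
    $p->close;
    return $ok ? 1 : 0;
}
```

A call such as `check_service('chicago1', 'http')` would then report whether a TCP connection to the http port succeeded within the timeout. Note that this only confirms that the port accepts connections, not that the application behind it is answering correctly.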
How does it work?
Suppose you have two hosts: server1 and server2. Each machine is
configured to run Hostchecker from its /etc/crontab file and scan
the other machine for availability. They could be sitting next to
each other or located on different networks.

If server1 goes down (or even if just one of the monitored services
on server1 becomes unavailable), server2 will notice the next time it
runs and send a brief e-mail to the address you've configured to
receive alerts. This could be a pager, cell phone, etc.
If server1 continues
to be down, server2 will send one further alert (to avoid bombarding
you with alerts) and then keep checking silently until service becomes
available again. Upon the resumption of availability, hostchecker
will take note and send an alert.
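The alerting behaviour described above (an alert on the first failure, one further alert, then silence until recovery) can be sketched as a small state function. This is only an illustration: the name `next_state`, the use of a consecutive-failure counter, and the alert labels 'down' and 'recovered' are assumptions, and the real script may track state differently (for example in files under its temporary directory).

```perl
use strict;
use warnings;

# Hypothetical sketch: given the number of consecutive failures seen so
# far and the result of the current check, return the new failure count
# and the alert to send (or undef for no alert).
sub next_state {
    my ($failures, $is_up) = @_;

    if ($is_up) {
        # Alert on recovery only if the target was previously down.
        return (0, $failures > 0 ? 'recovered' : undef);
    }

    $failures++;
    # Alert on the first and second consecutive failure, then stay silent.
    return ($failures, $failures <= 2 ? 'down' : undef);
}

# Walk through one outage: up, three failed runs, then back up.
my $failures = 0;
for my $is_up (1, 0, 0, 0, 1) {
    my $alert;
    ($failures, $alert) = next_state($failures, $is_up);
    print defined $alert ? "alert: $alert\n" : "no alert\n";
}
```

With this sequence the sketch sends two 'down' alerts, stays quiet on the third failed run, and sends one 'recovered' alert when the target answers again.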
Downloading
The Perl script, sample configuration file, and installation
instructions can be retrieved here:
UNIX Installation
1. Unpack the tar file and place the 'hostchecker.pl' file where you
intend to keep the program (e.g. '/root'). The 'hostchecker.cfg' file
should be placed in /etc.
2. Edit the /etc/hostchecker.cfg file to suit your needs. The file is
divided into Config, BaseLine, and Monitor sections as follows:
<Config>
directory=/tmp
notify=someaddress@pager.com
sendmail=/usr/sbin/sendmail
</Config>
Generally the '/tmp' directory is a good place for the program to put
its temporary files. 'notify' should be set to a valid e-mail address.
(The alerts are size-optimized with pagers in mind.)
<BaseLine>
www.cnn.com,http
www.news.com,http
www.aol.com,http
etc...
</BaseLine>
The BaseLine section contains a list of hosts that should respond to
your query under normal circumstances. You should have at least 10 or
20 for a good sampling. If 90% of these hosts do NOT respond to a
query, it is likely that your own machine has lost its connection,
and the program will terminate. (We don't want to queue up any alerts,
because they would all be delivered in a flood once the host comes
back online.)
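The baseline decision can be sketched as a simple ratio test. The function name `baseline_ok`, the "90% or more" reading of the threshold, and the handling of an empty list are assumptions made for illustration, not details confirmed by the script.

```perl
use strict;
use warnings;

# Hypothetical sketch: $results is an array reference holding one
# true/false value per BaseLine host (true = host responded).
# Returns 1 if monitoring should proceed, 0 if the local machine
# appears to be offline.
sub baseline_ok {
    my ($results, $threshold_pct) = @_;
    $threshold_pct = 90 unless defined $threshold_pct;

    my $total = scalar @$results;
    # Assumption: with no baseline hosts configured, nothing can be judged.
    return 1 if $total == 0;

    my $failed = grep { !$_ } @$results;

    # Integer comparison avoids floating-point rounding at the boundary.
    return ($failed * 100 >= $threshold_pct * $total) ? 0 : 1;
}

# Nine of ten baseline hosts silent: assume we are the ones offline.
print baseline_ok([1, 0, 0, 0, 0, 0, 0, 0, 0, 0]) ? "proceed\n" : "skip run\n";
```

Stopping here, before any Monitor entries are checked, is what keeps a local outage from generating a backlog of alerts.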
<Monitor>
chicago1,smtp
chicago1,http
chicago1,ping
chicago1,domain
chicago1,ftp
chicago1,rtsp
san-anton2,domain
san-anton2,pop3
san-anton2,imap
san-anton2,ftp
citrixserver.someone.com,ica
border-router,ping
gateway-router,ping
firewall.mynetwork.com,ping
</Monitor>
As the sample shows, you can monitor a broad range of services with
the program. (The service name must be defined in the '/etc/services'
file. See that file for more information about service names.)
Note: 'ping' is an exception that is coded within the program
to do a simple ICMP query.
The format is simple: hostname.something.com,service
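To make the file layout concrete, here is a minimal sketch of a parser for the three sections. It is an illustration written for this document, not the parsing code from hostchecker.pl: the function name `parse_config` and the returned data structure are assumptions, and lines that do not match the expected shape (such as the "etc..." placeholder in the sample) are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical sketch: parse the text of a hostchecker.cfg-style file.
# Returns a hash reference with:
#   Config   => { key => value, ... }
#   BaseLine => [ [host, service], ... ]
#   Monitor  => [ [host, service], ... ]
sub parse_config {
    my ($text) = @_;
    my %cfg = (Config => {}, BaseLine => [], Monitor => []);
    my $section;

    for my $line (split /\n/, $text) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next if $line eq '';

        # Opening tag enters a section, closing tag leaves it.
        if ($line =~ m{^<(/?)(Config|BaseLine|Monitor)>$}) {
            $section = $1 ? undef : $2;
            next;
        }
        next unless defined $section;

        if ($section eq 'Config') {
            my ($key, $value) = split /=/, $line, 2;
            $cfg{Config}{$key} = $value if defined $value;
        }
        else {
            my ($host, $service) = split /,/, $line, 2;
            push @{ $cfg{$section} }, [$host, $service] if defined $service;
        }
    }
    return \%cfg;
}
```

Given the sample file above, `parse_config` would yield a 'notify' entry of someaddress@pager.com and one [host, service] pair per Monitor line, for example ['chicago1', 'smtp'].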
Why monitor so many services? Won't a simple 'ping' entry be
sufficient? Not quite. It is quite possible for an application to
fail without the machine going down. For example, Apache could
crash; without an 'http' monitor, you won't be alerted to this.
3. You can now run the program in 'debug' mode to test whether it is
working by executing:
./hostchecker.pl debug
4. As long as
everything looks ok, you're ready to add an '/etc/crontab' entry
to run the program regularly. (The example runs the script every
5 minutes.)
*/5 * * * * root /root/hostchecker.pl
Tips For Host Monitoring
In general practice, it's best to use a machine to monitor only the
connectivity of remote servers and not the availability of their
applications. Otherwise, if the remote link fails, you will get a
notification not only for the failed ping but also for every other
service you're monitoring on that host. So every time something goes
wrong with a link, you'll get a LOT of messages.
The best bet for monitoring a host's services is to do so locally and
let remote machines worry about monitoring the host's connectivity. If
the host loses its connection, it will find out through its baseline
status run and will not run service status checks until the connection
comes back up.
Feedback / Questions
This program was written very quickly to replace a very old script
that had ceased to perform properly. The old script required a
separate instance with command line options for each host to be
queried and relied on an external 'ping' call that took 10 seconds per
query regardless of success. This program uses the Perl Net::Ping
library and will run in a couple of seconds if all is well.
Anyhow, if you
find this program useful, please let us know.
E-Mail: contact@csma.biz.