Scripts for monitoring Linux servers

Tired of regularly logging into our servers to find out if anything is wrong, I wanted to know before something like a spam attack or a disk-eater gets out of hand.

I wrote some scripts to automate the process and email me some stats. Yes, there are tools that do this for you, but then you have to maintain those tools, and installing and configuring them on many servers can be a pain. This system needs only ssh and mutt, which are lightweight and available in almost all Linux distributions I know.

It’s simply bash, installed on one server. Everything is “pushed” fresh to the servers every time it is run, so updates are automatic and easily deployed.

Design

The main script, checkservers.sh, copies (using scp) the checkservices, checkdiskdf, checkmailq and checkmaillog scripts and their config files to the target host, runs them there and emails the output to you.

On your host servers there is nothing to install except openssh-server. Configure your password-less ssh-keys for all the servers in ~/.ssh/config on your monitoring server with the public keys in /root/.ssh/authorized_keys on the hosts. For help with that see our article: master-ssh-key
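As a rough sketch of what an entry in ~/.ssh/config might look like, the helper below prints one stanza per host. The function name ssh_host_entry, the host name and the key path are made up for illustration. Host, User, IdentityFile and BatchMode are standard ssh_config keywords; BatchMode yes makes ssh fail instead of prompting for a password, which is usually what you want under cron.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): print an
# ssh_config stanza for one monitored host.
ssh_host_entry () {
    local host="$1" keyfile="$2"
    printf 'Host %s\n' "$host"
    printf '    User root\n'
    printf '    IdentityFile %s\n' "$keyfile"
    printf '    BatchMode yes\n'
}

# Assumed host name and key path, for illustration only.
ssh_host_entry megatron /root/.ssh/id_ed25519_monitor
```

Appending the output for each host to ~/.ssh/config on the monitoring server would give scp and ssh the key to use without any extra options.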

Scripts

To send notification emails you need a mutt wrapper. Almost all distributions have mutt in their repos.

muttcc.sh

func_muttemail () {

#### establish current dir
APPDIR=$( cd "$( dirname "$0" )" && pwd )
gotdate=$(date +%Y-%m-%d-%H-%M)
DEBUGLOG="$APPDIR/functions-email.log"
echo "ENTERING mutt.sh $gotdate" | tee -a "$DEBUGLOG"

#### ASSIGN VARS
mailentity="$1"
mailsubject="$2"
maildatafile="$3"
mailentitycc="$4"
echo "$mailentity $mailsubject $maildatafile" | tee -a "$DEBUGLOG"
#send email
echo "cc argument: $4" | tee -a "$DEBUGLOG"
if [ -z "$mailentitycc" ]
then
    #the cc is not set so do not include -c
    mutt -F "$APPDIR/muttrc" -s "$mailsubject" "$mailentity" < "$maildatafile"
    EX=$?
    echo "no cc specified" | tee -a "$DEBUGLOG"
else
    mutt -F "$APPDIR/muttrc" -s "$mailsubject" -c "$mailentitycc" "$mailentity" < "$maildatafile"
    EX=$?
    echo "cc specified: $mailentitycc" | tee -a "$DEBUGLOG"
fi

#EX is mutt's exit status, saved before the echo above could overwrite $?
echo "$EX"
echo "##### EXITING mutt.sh #######" | tee -a "$DEBUGLOG"
return $EX
}

You’ll need to install mutt and have access to an SMTP server. Mine is local and needs no authentication. Normally this file would be the user’s “.muttrc” file, but in the wrapper we have specified its location.

muttrc

set from = "me@myserver.net"
set realname = "CHECKSERVER"
set use_envelope_from = yes
set smtp_url=smtp://mylocalmailserverhostname.net/
set ssl_starttls = no

checkservers.sh

#!/bin/bash
APPDIR=$( cd "$( dirname "$0" )" && pwd )
. "$APPDIR/muttcc.sh"
SERVERLIST=$APPDIR/server.list
MAILQTESTS=$APPDIR/checkmailq.txt
MAILLOGTESTS=$APPDIR/checkmaillog.txt
MAILQSH=$APPDIR/checkmailq.sh
MAILLOGSH=$APPDIR/checkmaillog.sh
DISKDF=$APPDIR/checkdiskdf.sh
CHECKLOG=/tmp/servercheck.log

#one line per server: servername:ifmailq:ifmaillog:ifdiskdf:serviceslist
while IFS=":" read -r servername ifmailq ifmaillog ifdiskdf serviceslist
do
    echo "=========== SERVERCHECK STARTS: $servername =============="
    echo "=========== SERVERCHECK STARTS: $servername ==============" > "$CHECKLOG"
    echo "---------------------- disks ------------------ " >> "$CHECKLOG"
    if [ "$ifdiskdf" == "diskdf" ]
    then
        scp "$DISKDF" "root@$servername:/usr/local/sbin/"
        ssh -n "root@$servername" /usr/local/sbin/checkdiskdf.sh >> "$CHECKLOG"
    else
        echo "$servername diskdf not flagged for check" >> "$CHECKLOG"
    fi

    echo "---------------------- services ------------------ " >> "$CHECKLOG"
    echo "SERVICEFILE: $serviceslist"
    case $serviceslist in
    no)
        echo "$servername services not flagged for check" >> "$CHECKLOG"
        ;;
    *)
        scp "$APPDIR/checkservices-$serviceslist.txt" "root@$servername:/usr/local/sbin/checkservices.txt"
        scp "$APPDIR/checkservices.sh" "root@$servername:/usr/local/sbin/"
        ssh -n "root@$servername" /usr/local/sbin/checkservices.sh >> "$CHECKLOG"
        ;;
    esac

    echo "---------------------- mailq ------------------ " >> "$CHECKLOG"
    if [ "$ifmailq" == "mailq" ]
    then
        scp "$MAILQSH" "root@$servername:/usr/local/sbin/"
        scp "$MAILQTESTS" "root@$servername:/usr/local/sbin/"
        ssh -n "root@$servername" /usr/local/sbin/checkmailq.sh >> "$CHECKLOG"
    else
        echo "$servername mailq not flagged for check" >> "$CHECKLOG"
    fi

    echo "----------------- maillog ----------------- " >> "$CHECKLOG"
    if [ "$ifmaillog" == "maillog" ]
    then
        scp "$MAILLOGSH" "root@$servername:/usr/local/sbin/"
        scp "$MAILLOGTESTS" "root@$servername:/usr/local/sbin/"
        ssh -n "root@$servername" /usr/local/sbin/checkmaillog.sh >> "$CHECKLOG"
    else
        echo "$servername maillog not flagged for check" >> "$CHECKLOG"
    fi
    echo "=========== SERVERCHECK ENDS: $servername ==============" >> "$CHECKLOG"
    email_subject="checkserver results: $servername"
    echo "$email_subject"
    tmp2=$(func_muttemail "myemail@address.com" "$email_subject" "$CHECKLOG" "ccemail@address.com")
    echo "=========== SERVERCHECK ENDS: $servername =============="

done < "$SERVERLIST"

checkdiskdf.sh

#!/bin/bash
/bin/df -h
exit

checkmaillog.sh

#!/bin/bash
APPDIR=$( cd "$( dirname "$0" )" && pwd )
#greps today's mail log entries for the patterns listed in checkmaillog.txt
MAILLOGTESTS=$APPDIR/checkmaillog.txt
MAILRESULTREPORT=$APPDIR/checkmailresult.log
MAILRESULTTEMP=$APPDIR/checkmailresulttemp.log
#start with an empty temp file
: > "$MAILRESULTTEMP"
#today's date as written in traditional syslog timestamps, e.g. "Jan  5"
#(%e pads single-digit days with a space; %d would give "05" and never match)
datum=$(date "+%b %e")
while IFS=":" read -r testname before after
do
    grep -A"$after" -B"$before" -- "$testname" /var/log/maillog | grep "$datum" >> "$MAILRESULTTEMP"
done < "$MAILLOGTESTS"
resultlength=$(wc -l < "$MAILRESULTTEMP")
echo "$resultlength"
#any matching line counts as an issue
if [ "$resultlength" -gt 0 ]
then
    echo "issues found"
    echo "mail log errors found. lines=$resultlength." > "$MAILRESULTREPORT"
    cat "$MAILRESULTTEMP" >> "$MAILRESULTREPORT"
    cat "$MAILRESULTREPORT"
else
    echo "nothing to report"
fi

checkmaillog.txt

refused to talk to me:0:0
Connection timed out:0:0
Name service error for name:0:0
sender non-delivery notification:0:0
to=<noreply@:0:0
421 Too many concurrent SMTP connections:0:0
testing something:0:0
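Each line in this file is a rule of the form pattern:before:after, where the two numbers become grep's -B and -A context counts. Here is a minimal sketch of that mapping, run against a throwaway file. The function name apply_rule and the sample log lines are invented for illustration and are not part of the scripts. Since ":" is the field separator, a pattern cannot itself contain a colon.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): apply one
# "pattern:before:after" rule to a log file, the way checkmaillog.sh does.
apply_rule () {
    local rule="$1" logfile="$2" testname before after
    IFS=":" read -r testname before after <<EOF
$rule
EOF
    grep -A"$after" -B"$before" -- "$testname" "$logfile"
}

# Made-up log lines, for illustration only.
sample=$(mktemp)
printf '%s\n' \
    'Jan  5 08:00:01 mx postfix/smtp[101]: connect to mail.example.org: Connection timed out' \
    'Jan  5 08:00:02 mx postfix/qmgr[102]: A1B2C3: removed' > "$sample"

# With 0 lines of context, only the matching line is printed.
apply_rule "Connection timed out:0:0" "$sample"
rm -f "$sample"
```

Raising the two numbers, as checkmailq.txt does with 1:1, pulls in the neighbouring lines as well, which helps when the interesting detail sits on the line before or after the match.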

checkmailq.sh

#!/bin/bash
APPDIR=$( cd "$( dirname "$0" )" && pwd )
#greps the mail queue for the patterns listed in checkmailq.txt

MAILQTESTS=$APPDIR/checkmailq.txt
MAILRESULTREPORT=$APPDIR/checkmailresult.log
MAILRESULTTEMP=$APPDIR/checkmailresulttemp.log
#start with an empty temp file
: > "$MAILRESULTTEMP"

while IFS=":" read -r testname before after
do
    /usr/bin/mailq | grep -A"$after" -B"$before" -- "$testname" >> "$MAILRESULTTEMP"
done < "$MAILQTESTS"
resultlength=$(wc -l < "$MAILRESULTTEMP")
echo "$resultlength"
#any matching line counts as an issue
if [ "$resultlength" -gt 0 ]
then
    echo "issues found"
    echo "there is queued mail. lines=$resultlength." > "$MAILRESULTREPORT"
    cat "$MAILRESULTTEMP" >> "$MAILRESULTREPORT"
    cat "$MAILRESULTREPORT"
else
    echo "nothing to report"
fi

checkmailq.txt

Connection timed out:1:1
Connection refused:1:1

checkservices.sh

#!/bin/bash
APPDIR=$( cd "$( dirname "$0" )" && pwd )
#checks that something is using each port listed in checkservices.txt
SERVICETESTS=$APPDIR/checkservices.txt
RESULTREPORT=$APPDIR/checkservicesresultreport.log
RESULTTEMP=$APPDIR/checkservicesresulttemp.log
rm -f "$RESULTREPORT" "$RESULTTEMP"

while IFS=":" read -r portnumber service
do
    #lsof prints a header line plus one line per socket on this port
    lsof -i:"$portnumber" | head -n2 > "$RESULTTEMP"
    resultlength=$(wc -l < "$RESULTTEMP")
    if [ "$resultlength" -gt 1 ]
    then
        #service is running there
        echo "OK: $service on $portnumber" >> "$RESULTREPORT"
    else
        echo "++++++ NOK: $service on $portnumber +++++++++" >> "$RESULTREPORT"
    fi
done < "$SERVICETESTS"

cat "$RESULTREPORT"
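The OK/NOK decision in checkservices.sh rests on counting lines: lsof prints a header followed by one line per socket, so two or more lines mean something is using the port. A small sketch of just that decision, fed with canned text instead of a live lsof call, looks like this. The function name port_status and the sample output are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): classify a service
# from the text that `lsof -i:PORT` printed.
port_status () {
    local portnumber="$1" service="$2" lsof_output="$3" lines
    if [ -z "$lsof_output" ]; then
        lines=0
    else
        lines=$(printf '%s\n' "$lsof_output" | wc -l)
    fi
    # A header alone (or nothing at all) means no socket on that port.
    if [ "$lines" -gt 1 ]; then
        echo "OK: $service on $portnumber"
    else
        echo "++++++ NOK: $service on $portnumber +++++++++"
    fi
}

# Made-up lsof output, for illustration only.
sample='COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sshd 812 root 3u IPv4 1234 0t0 TCP *:ssh (LISTEN)'
port_status 22 ssh "$sample"
port_status 389 ldap ""
```

One thing to keep in mind: lsof -i:PORT also lists established connections that involve that port, not only listening sockets, so an OK here means "the port is in use" rather than strictly "the daemon is listening".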

checkservices-ALL.txt

22:ssh
443:https
993:imaps
3306:mysql
389:ldap

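server.list

This is your list of servers to process. Each line has five colon-separated fields: the server name, the mailq flag, the maillog flag, the diskdf flag, and the services list to use. A check runs only when its flag is exactly mailq, maillog or diskdf. ALL selects checkservices-ALL.txt, and no skips the services check.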

megatron:nomailq:nomaillog:diskdf:ALL
magdelena:mailq:maillog:diskdf:ALL
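To see how one of these lines is split into fields, here is a minimal sketch using the same IFS=":" read technique as the main loop. The function name parse_server_line and the labels in its output are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): split one
# server list line into its five colon-separated fields.
parse_server_line () {
    local servername ifmailq ifmaillog ifdiskdf serviceslist
    IFS=":" read -r servername ifmailq ifmaillog ifdiskdf serviceslist <<EOF
$1
EOF
    echo "host=$servername mailq=$ifmailq maillog=$ifmaillog diskdf=$ifdiskdf services=$serviceslist"
}

parse_server_line "magdelena:mailq:maillog:diskdf:ALL"
# prints: host=magdelena mailq=mailq maillog=maillog diskdf=diskdf services=ALL
```

Because the main script compares each flag against a fixed word, any other value (such as nomailq) simply switches that check off.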

This is the entry in my crontab. I don’t want email noise from cron, so I redirect the script’s output with > /dev/null 2>&1.

30 08 * * * /usr/local/sbin/backup_app/checkservers.sh > /dev/null 2>&1

Debugging

Debug email issues with functions-email.log

For output issues log into your remote servers and run the check*.sh files to see what comes out.

Room for improvement

The main loop reuses a single temp file, /tmp/servercheck.log, for every server, so take care not to start the script again until the current run has finished or the reports will get mixed up.
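One possible way to tackle this, sketched below, is to give each run its own log file with mktemp and to refuse to start while another run holds a lock. This is not what the scripts currently do. The names LOCKDIR, acquire_lock, release_lock and run_checks are assumptions made up for the example, and the lock relies on mkdir failing when the directory already exists.

```shell
#!/bin/bash
# Hypothetical sketch (not part of the original scripts).
# LOCKDIR is an assumed path; override it if /tmp is not suitable.
LOCKDIR="${LOCKDIR:-${TMPDIR:-/tmp}/checkservers.lock.d}"

# mkdir either creates the directory or fails, so only one caller wins.
acquire_lock () { mkdir "$LOCKDIR" 2>/dev/null; }
release_lock () { rmdir "$LOCKDIR" 2>/dev/null; }

run_checks () {
    if ! acquire_lock; then
        echo "another run is in progress, not starting" >&2
        return 1
    fi
    # A unique log file per run instead of the fixed /tmp/servercheck.log.
    CHECKLOG=$(mktemp "${TMPDIR:-/tmp}/servercheck.XXXXXX")
    echo "logging to $CHECKLOG"
    # ... the per-server loop from checkservers.sh would go here ...
    rm -f "$CHECKLOG"
    release_lock
}

run_checks || echo "skipped this run"
```

If the script is killed half way, the lock directory stays behind and has to be removed by hand before the next run will start.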