Wednesday, September 25, 2019

Check your servers are UP - Linux BASH

Recently I needed to check whether one of my servers was glitching, so after some research and coding I came up with this solution for Linux (in my case, Raspbian Jessie):

1) Get the following code and save it to your target server.

#!/bin/bash
#
# Test connectivity with ping

# filename: test_alive.sh
#
# braselectron.com - September 25, 2019
#
# get IP from command line argument
#
ip=$1
#
# check IP address format code
# Mitch Frazier - Linux Journal - June 26, 2008
#
valid_ip () {
    local  ip=$1
    local  stat=1
    if [[ $ip =~ ^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$ ]]; then
        OIFS=$IFS
        IFS='.'
        ip=($ip)
        IFS=$OIFS
        [[ ${ip[0]} -le 255 && ${ip[1]} -le 255 && ${ip[2]} -le 255 && ${ip[3]} -le 255 ]]
        stat=$?
    fi
    return $stat
}
#
# check syntax
#
if [ $# -eq 0 ]
   then echo "syntax: test_alive.sh <ip address>"
   exit 1
fi
#
if ! valid_ip $ip
   then echo "IP is invalid"
   exit 1
fi
#
# ping but don't wait
# log if ping fails
#
while true
   do ping -w 1 -c 1 "$ip" |\
      grep received |\
      cut -d" " -f 4 |\
      if [ "$(cat -)" != "1" ]
         then echo "Ping failed $(date)"
      fi
      # pause between checks so the loop does not spin at full speed
      sleep 1
   done
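
The loop decides pass or fail by pulling the fourth space-separated field out of ping's summary line. Here is a minimal sketch of that step on its own, assuming the usual iputils summary format ("1 packets transmitted, 1 received, ..."). The function name received_count is my own invention for illustration; it is not part of the script above.

```shell
#!/bin/bash
# received_count: hypothetical helper (not in the original script).
# Reads ping output on stdin and prints the number of replies received,
# i.e. field 4 of the summary line. Assumes the iputils-ping format.
received_count () {
    grep received | cut -d" " -f 4
}

# Try it with a canned summary line instead of a live ping:
sample="1 packets transmitted, 0 received, 100% packet loss, time 0ms"
if [ "$(echo "$sample" | received_count)" != "1" ]
   then echo "Ping failed $(date)"
fi
```

If your ping prints a different summary line, the field number will probably need adjusting.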
 

2) Choose another host on your network to ping, one that you know is up and that you trust. For example: 192.168.0.1 (usually the router on your network, as a common factory default).

3) Now use "nohup" to start the test on the target host. At the command terminal, run:

nohup  /bin/bash test_alive.sh 192.168.0.1 &
# remember to replace 192.168.0.1 with the IP of the host you chose in step 2

4) Now you can log off and, from time to time, check the nohup.out file to see whether any ping failed.

5) When you are satisfied with your tests, kill the process on the target host like this:

5.1) first find your process with:  ps -ef | grep test_alive

This will give you output similar to this:

pi        1257  4688  1 17:59 pts/1    00:00:00 grep --color=auto test_alive
pi       22213     1  9 15:08 ?        00:16:24 /bin/bash test_alive.sh 192.168.0.1


5.2) get the process id and kill it with:  kill 22213  # caution! use your id number

5.3) and remove the  nohup.out file with: rm nohup.out

NOTE: if the nohup.out file is empty, it means there were no errors: your server is up and connectivity is working. But if the target host froze, or some other problem stopped the script, the file will also be empty, and that can mislead your conclusion.
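
One way to tell "no failures" apart from "the script was not running" is to also log an occasional heartbeat line. The sketch below is my own assumption, not part of the script above: log_result is a hypothetical helper that always logs a failure, and additionally logs a "Still alive" line once every so many checks.

```shell
#!/bin/bash
# log_result: hypothetical helper (not in the original script).
#   $1 = replies received for this check ("1" means the ping succeeded)
#   $2 = running count of checks done so far
#   $3 = write a heartbeat line once every this many checks
log_result () {
    local ok=$1 count=$2 every=$3
    if [ "$ok" != "1" ]; then
        echo "Ping failed $(date)"
    elif [ $((count % every)) -eq 0 ]; then
        echo "Still alive $(date)"
    fi
}

# Example: check number 120 succeeded, heartbeat every 60 checks
log_result 1 120 60
```

With this in the loop, a log whose last "Still alive" line is old tells you the script itself stopped, rather than that every ping succeeded.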

Cheers!
