Server Monitoring
Server monitoring is done using a variety of tools:
- PRTG
- Nodeping
- Crons - for RAID status (disk space monitoring)
PRTG: Use PRTG when you need to monitor fine-grained server metrics such as RAM, disk space, and SNMP network speed. We have two PRTG servers, one in the Peer1 DC and the other in 3Z (check the server list for details), so that if one goes down we can rely on the other. Add each device to be monitored to the Cluster Probe so that it is available on both PRTG servers. At present, we mainly use PRTG for network and disk space monitoring, with a threshold set for each. If the network threshold is reached and stays exceeded for a long time, check what the problem is. If the disk space threshold is reached, investigate proactively.
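When a disk space alert fires, a quick manual check on the server itself can confirm which mount point is affected. The following is a minimal sketch, not part of PRTG: the function name `over_threshold` and the default limit of 90 are assumptions for illustration, so substitute the threshold actually configured in PRTG.

```shell
#!/bin/sh
# Hypothetical default; use the limit that is configured in PRTG.
THRESHOLD="${THRESHOLD:-90}"

# Reads `df -P` output on stdin and prints every mount point whose
# usage is at or above the given percentage (first argument).
# Note: mount points containing spaces are not handled by this sketch.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        use = $5                 # Capacity column, e.g. "95%"
        sub(/%/, "", use)
        if (use + 0 >= limit + 0) print $6 " " use "%"
    }'
}

# Usage: df -P | over_threshold "$THRESHOLD"
```

An empty result means no file system has reached the limit.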
Nodeping: We mainly use Nodeping for ping and HTTP monitoring of shared servers, L3 customers, KVM, etc.
Crons: We use the MegaCli utility to check for issues with the disks in the RAID array. The results are mailed to admin@4goodhosting.com every Wednesday and Saturday. This is the cron command we use:
/opt/MegaRAID/MegaCli/MegaCli64 -PDList -a0 | grep Firmware | mail -s 'Drive Status-esr1' admin@4goodhosting.com
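The mail produced by the command above still has to be read by a person. As a possible refinement, the sketch below filters the same output down to drives that are not Online. It assumes the `-PDList` output contains lines of the form `Firmware state: Online, Spun Up`, which should be verified against the output of your controller; the function name `failed_drives` is hypothetical. Hot spares and unconfigured drives would also be listed, since they are not reported as Online.

```shell
#!/bin/sh
# Reads `MegaCli64 -PDList -a0` output on stdin and prints the
# "Firmware state" line of every drive that is not reported as Online.
# Exit status is 0 when all drives are Online, 1 otherwise.
# Assumption: state lines start with "Firmware state:".
failed_drives() {
    awk '/^Firmware state:/ && !/Online/ { print; bad = 1 }
         END { exit (bad + 0) }'
}

# Usage:
#   /opt/MegaRAID/MegaCli/MegaCli64 -PDList -a0 | failed_drives
```

Because the function sets its exit status, a cron wrapper could send mail only when a drive needs attention instead of twice a week regardless of state.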
There is also a RAID status cron that checks whether the RAID card is functioning normally:
/usr/bin/bash /usr/local/src/raidstatus.sh
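The contents of raidstatus.sh are not reproduced here, so the following is not that script. It is only a hedged sketch of one way a RAID state check can be done with MegaCli, assuming that `MegaCli64 -LDInfo -Lall -aALL` prints a line such as `State               : Optimal` for each virtual drive. The function name `raid_degraded` is hypothetical.

```shell
#!/bin/sh
# Reads `MegaCli64 -LDInfo -Lall -aALL` output on stdin and prints the
# state of every virtual drive that is not "Optimal".
# Exit status is 0 when all virtual drives are Optimal, 1 otherwise.
# Assumption: each virtual drive has a line starting with "State" and a colon.
raid_degraded() {
    awk -F: '/^State[ \t]*:/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim spaces around the value
        if ($2 != "Optimal") { print "RAID state: " $2; bad = 1 }
    }
    END { exit (bad + 0) }'
}

# Usage:
#   /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL | raid_degraded
```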