Recently I got pretty annoyed with my Nagios notifications. Nagios was set to alert me when a service or host went offline and again when it came back online. But because I hurried through the host configuration (time crunch), I kept the default notification intervals. Once a service or host went offline the notifications would start, and for every further 60 minutes it stayed offline I would be alerted again, so anything that was down for more than an hour generated tons of alerts. This became really bad when a hard drive on my backup server filled up beyond 80%, the alert never cleared, and I was out of the office for a week. For 7 days I got an email every hour saying that my drive was 80% full.
After sitting through a presentation it hit me that I could change the notification intervals (fairly obvious, I know). I figured out where the configuration files were for both the hosts and the services. Once I had SSH’d into the Nagios XI box, I moved to the
/usr/local/nagios/etc/hosts and /usr/local/nagios/etc/services directories.
All of the configuration files were saved in there. I opened my favorite editor and started making changes, editing the notification_interval directive from 60 (re-notify every 60 minutes) to 0 (notify only once, when it goes offline).
After changing every host (about 100) manually, I restarted my Nagios server and noticed that all of the changes had been reverted. I then went back into the files and noticed this at the top:
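Before changing anything, it can help to see which files carry the setting and what value each one uses. The following is a minimal sketch; the helper name show_intervals is made up for illustration, and the example path is the hosts directory mentioned above.

```shell
# Hypothetical helper: list every notification_interval line under a directory.
show_intervals() {
    # -r: recurse into the directory, -H: print the file name, -n: print the line number
    grep -rHn 'notification_interval' "$1"
}

# Example (path from this article):
# show_intervals /usr/local/nagios/etc/hosts
```

Each output line has the form file:line:setting, which makes it easy to spot hosts that still use the default.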
# –DO NOT EDIT THIS FILE BY HAND–
# Nagios QL will overwrite all manual settings during the next update
That wasn’t very helpful. I went to the Nagios forums and found some good information. I learned that in order to make the configuration changes I needed to create a new folder for both the hosts and the services. I made the folders hostsNew and servicesNew in the /usr/local/nagios directory and copied all of the configuration files into them. I then ran a Perl one-liner that changed every “60” to “0”:
perl -pi -e 's/60/0/g;' *.cfg
This went through and did a search/replace on every 60 in the files.
**** Be very careful that none of your hosts or services have other 60s in them; this command will replace those too ****
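One way to check for stray 60s before running a blanket replace is to list every line that contains "60" but is not a notification_interval line. This is only a sketch; find_stray_60s is a hypothetical helper name, and it assumes the directive appears by name on the same line as its value.

```shell
# Hypothetical helper: show lines containing "60" that are NOT
# notification_interval lines, i.e. lines a blanket s/60/0/g would also change.
find_stray_60s() {
    # First grep: every line with "60", prefixed by file name and line number.
    # Second grep: drop the lines we actually intend to change.
    grep -Hn '60' "$@" | grep -v 'notification_interval'
}

# Example, run from the directory holding the copied files:
# find_stray_60s *.cfg
```

If this prints anything (an IP address, a timeout, a port), the blanket replace will change those values as well.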
The blanket replace posed a problem for me because all of my IP addresses were 10.160.16.*, meaning they all changed to 10.10.16.*. So I ran another one-liner:
perl -pi -e 's/10\.10\.16/10.160.16/g;' *.cfg
That changed all of the second octets I had accidentally turned into 10 back to 160.
Once that was taken care of, I had to re-import the edited configuration files into the correct place. In the Nagios XI web interface, go to the “Configure” tab, open “Core Configuration Manager”, choose “Tools” on the left side, then “Import Config Files”, and select the files that were just edited. This imports the new files into the correct places with the right changes.
configure -> CCM -> Tools -> Import config files
Now, if you’re anything like me, you got to the end of the article and got very confused: restarting the server at this point doesn’t pick up the changes. After some research online, I found that I needed to write out the configuration files after importing them.
Since you’re already in the CCM, select “Write Config Files” under the Tools section.
Once in that section, select “Write”, then “Verify”, then “Restart” for the changes to be applied. Once Nagios has restarted, you should have updated config files.
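For a second opinion from the shell, Nagios Core can also check its configuration from the command line with the -v option. The sketch below wraps that call; the helper name verify_nagios_config is made up, and the default binary and nagios.cfg paths are assumptions based on the /usr/local/nagios layout used in this article, so adjust them if your install differs.

```shell
# Hypothetical helper: run Nagios Core's configuration check.
# Both default paths are assumptions, not something confirmed for every install.
verify_nagios_config() {
    nagios_bin="${1:-/usr/local/nagios/bin/nagios}"
    nagios_cfg="${2:-/usr/local/nagios/etc/nagios.cfg}"
    if [ ! -x "$nagios_bin" ]; then
        echo "nagios binary not found at $nagios_bin" >&2
        return 2
    fi
    # -v parses the configuration and reports errors without starting the daemon
    "$nagios_bin" -v "$nagios_cfg"
}

# Example:
# verify_nagios_config
```

This does not replace the “Write” step in the CCM; it only confirms that the files on disk parse cleanly.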