| (8 intermediate revisions by 4 users not shown) |
| Line 1: |
Line 1: |
| − | = Nagios: Standard Operating Procedure =
| + | {{header|infra}} |
| | + | {{shortcut|ISOP:NAGIOS}} |
| | | | |
| | | | |
| | + | This SOP has moved to the Fedora Infrastructure SOP git repo. Please see the current document at: http://infrastructure.fedoraproject.org/infra/docs/nagios.txt |
| | | | |
| − | == Contact Information ==
| + | For changes, questions or comments, please contact anyone in the Fedora Infrastructure team. |
| − | Owner: Fedora Infrastructure Team
| + | |
| | | | |
| − | Contact: #fedora-admin, sysadmin-main & sysadmin-noc groups
| |
| | | | |
| − | Location: Anywhere
| + | [[Category:Infrastructure SOPs]] |
| − | | + | |
| − | Servers: noc1, noc2, puppet1
| + | |
| − | | + | |
| − | Purpose: This SOP describes Nagios configurations
| + | |
| − | | + | |
| − | = Initial Configuration =
| + | |
| − | == CGI Access ==
| + | |
| − | To view information in Nagios (anything with cgi-bin in the path), you first need to grant yourself access. After checking out the Puppet CVS tree as described in the [[Infrastructure/SOP/Puppet |Puppet SOP]], edit configs/system/nagios/cgi.cfg and append your FAS username to 'authorized_for_system_commands'.
| + | |
| − | == Contact Information ==
| + | |
| − | {{Message/warning2 | You must configure a contacts file to be able to acknowledge [[Infrastructure/SOP/Outage |outages]]}}
| + | |
| − | Create a new file named 'fasname.cfg' in configs/system/nagios/contacts/ with the following details:
| + | |
| − | <pre>
| + | |
| − | define contact{
| + | |
| − | contact_name fasname
| + | |
| − | alias Real Name
| + | |
| − | service_notification_period 24x7
| + | |
| − | host_notification_period 24x7
| + | |
| − | service_notification_options w,u,c,r
| + | |
| − | host_notification_options d,u,r
| + | |
| − | service_notification_commands notify-by-email
| + | |
| − | host_notification_commands host-notify-by-email
| + | |
| − | email Email address (any)
| + | |
| − | }
| + | |
| − | </pre>
| + | |
| − | {{Message/warning3 | Using the 24x7 notification period may cause duplicate messages if you are a member of sysadmin-main, in which case you can specify 'never' instead}}
| + | |
| − | | + | |
| − | Next, append your FAS username to the 'members' section of configs/system/nagios/contactgroups/fedora-sysadmin-email.cfg
| + | |
| − | | + | |
| − | == nagios-external ==
| + | |
| − | The same changes need to be applied to the nagios-external configuration (configs/system/nagios-external).
| + | |
| − | | + | |
| − | == Commit Changes ==
| + | |
| − | {{Message/warning2 | Remember to "cvs add" the contacts/fasname.cfg files}}
| + | |
| − | | + | |
| − | Commit the changes by running cvs commit -m "Adding fasname to Nagios", then mark them for distribution by running make install.
| + | |
| − | | + | |
| − | = Understanding the Messages =
| + | |
| − | == General ==
| + | |
| − | Nagios notifications are generally easy to read, and follow this consistent format:
| + | |
| − | <pre>
| + | |
| − | ** PROBLEM/ACKNOWLEDGEMENT/RECOVERY alert - hostname/Check is WARNING/CRITICAL/OK **
| + | |
| − | ** HOST DOWN/UP alert - hostname **
| + | |
| − | </pre>
| + | |
| − | Reading the message will provide extra information on what is wrong.
| + | |
| − | | + | |
| − | == Disk Space Warning/Critical ==
| + | |
| − | Disk space warnings normally include the following information:
| + | |
| − | <pre>
| + | |
| − | DISK WARNING/CRITICAL/OK - free space: mountpoint freespace(MB) (freespace(%) inode=freeinodes(%)):
| + | |
| − | </pre>
| + | |
| − | | + | |
| − | A message stating "(1% inode=99%)" means that only 1% of disk space is free while 99% of inodes are free: disk space, '''not''' inode usage, is critical, and more disk space is required.
| + | |
| − | | + | |
| − | = Further Reading =
| + | |
| − | * [[Infrastructure/SOP/Puppet |Puppet SOP]]
| + | |
| − | * [[Infrastructure/SOP/Outage |Outages SOP]]
| + | |