Nagios Infrastructure SOP

From FedoraProject

(Difference between revisions)
(Imported from MoinMoin)
 
(redirect page to new infra-docs)
 
(10 intermediate revisions by 5 users not shown)
= Nagios: Standard Operating Procedure =
{{header|infra}}
{{shortcut|ISOP:NAGIOS}}

This SOP has moved to the Fedora Infrastructure SOP git repo. Please see the current document at: http://infrastructure.fedoraproject.org/infra/docs/nagios.txt
  
== Contact Information ==
For changes, questions or comments, please contact anyone in the Fedora Infrastructure team.

Owner: Fedora Infrastructure Team

Contact: #fedora-admin, sysadmin-main & sysadmin-noc groups

Location: Anywhere

Servers: noc1, noc2, puppet1

Purpose: This SOP describes Nagios configurations.

[[Category:Infrastructure SOPs]]

= Initial Configuration =
== CGI Access ==
To view information in Nagios (anything with cgi-bin in the path), you must first grant yourself access. After checking out the Puppet CVS tree as described in the [http://fedoraproject.org/wiki/Infrastructure/SOP/Puppet Puppet SOP], edit configs/system/nagios/cgi.cfg and append your FAS username to 'authorized_for_system_commands'.
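
The edit to cgi.cfg can be sketched as follows. This is only an illustration, assuming the directive uses the usual Nagios <code>name=user1,user2</code> form; the helper name <code>add_authorized_user</code> and the sample file contents (including the <code>nagiosadmin</code> user) are hypothetical and not taken from the Fedora configuration.

```python
import re


def add_authorized_user(cfg_text, username,
                        directive="authorized_for_system_commands"):
    """Return cfg_text with username appended to the directive's user list.

    Assumes the directive is written as 'name=user1,user2' on a single line.
    """
    pattern = re.compile(rf"^({re.escape(directive)}=)(.*)$", re.MULTILINE)

    def _append(match):
        # Split the existing comma-separated list and drop empty entries.
        users = [u.strip() for u in match.group(2).split(",") if u.strip()]
        if username not in users:
            users.append(username)
        return match.group(1) + ",".join(users)

    new_text, count = pattern.subn(_append, cfg_text)
    if count == 0:
        raise ValueError(f"{directive} not found in the given text")
    return new_text


# Hypothetical sample contents, for illustration only.
sample = "use_authentication=1\nauthorized_for_system_commands=nagiosadmin\n"
print(add_authorized_user(sample, "fasname"))
```

The function leaves the text unchanged if the username is already listed, so it is safe to run more than once.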

== Contact Information ==
{{Template:Warning}} You must configure a contacts file to be able to acknowledge [http://fedoraproject.org/wiki/Infrastructure/SOP/Outage outages].

Create a new file named 'fasname.cfg' in configs/system/nagios/contacts/ with the following details:
<pre>
define contact{
contact_name                   fasname
alias                          Real Name
service_notification_period    24x7
host_notification_period       24x7
service_notification_options   w,u,c,r
host_notification_options      d,u,r
service_notification_commands  notify-by-email
host_notification_commands     host-notify-by-email
email                          Email address (any)
}
</pre>
{{Template:Caution}} Using the 24x7 notification period may cause duplicate messages if you are a member of sysadmin-main; in that case, specify 'never' instead.
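
As a rough sketch, the contact file above could be generated from a template, with the notification period as a parameter so that sysadmin-main members can pass 'never'. The function names and the example values (<code>jdoe</code>, <code>Jane Doe</code>, <code>jdoe@example.com</code>) are hypothetical; only the directives themselves come from the definition above.

```python
# Template mirroring the contact definition above; doubled braces are
# literal braces in str.format().
CONTACT_TEMPLATE = """define contact{{
contact_name                   {fasname}
alias                          {real_name}
service_notification_period    {period}
host_notification_period       {period}
service_notification_options   w,u,c,r
host_notification_options      d,u,r
service_notification_commands  notify-by-email
host_notification_commands     host-notify-by-email
email                          {email}
}}
"""


def contact_filename(fasname):
    """File name to create under configs/system/nagios/contacts/."""
    return f"{fasname}.cfg"


def render_contact(fasname, real_name, email, period="24x7"):
    """Return the text of a contact definition for one FAS user."""
    return CONTACT_TEMPLATE.format(
        fasname=fasname, real_name=real_name, email=email, period=period
    )


# Hypothetical user who is in sysadmin-main and so uses 'never'.
print(contact_filename("jdoe"))
print(render_contact("jdoe", "Jane Doe", "jdoe@example.com", period="never"))
```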

Next, append your name to the 'members' section of configs/system/nagios/contactgroups/fedora-sysadmin-email.cfg.
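
The 'members' edit can be sketched like this, assuming the contact group file holds a single <code>members</code> line with a comma-separated list. The helper name, the sample group definition and its existing members are hypothetical; the real file may be laid out differently.

```python
def add_group_member(group_cfg, member):
    """Return group_cfg with member appended to the 'members' directive."""
    out = []
    found = False
    for line in group_cfg.splitlines():
        parts = line.split(None, 1)
        if parts and parts[0] == "members":
            found = True
            raw = parts[1] if len(parts) > 1 else ""
            current = [m.strip() for m in raw.split(",") if m.strip()]
            if member not in current:
                current.append(member)
            # Keep the original indentation; spacing after 'members' is
            # normalised to a single space.
            indent = line[: len(line) - len(line.lstrip())]
            line = indent + "members " + ",".join(current)
        out.append(line)
    if not found:
        raise ValueError("no 'members' directive found")
    return "\n".join(out) + "\n"


# Hypothetical sample contents, for illustration only.
sample = (
    "define contactgroup{\n"
    "    contactgroup_name fedora-sysadmin-email\n"
    "    members admin1,admin2\n"
    "}\n"
)
print(add_group_member(sample, "fasname"))
```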

== nagios-external ==
Apply the same changes to the nagios-external configuration (configs/system/nagios-external).

== Commit Changes ==
{{Template:Warning}} Remember to "cvs add" the contacts/fasname.cfg files.

Commit the changes by running cvs commit -m "Adding fasname to Nagios", then mark them for distribution by running make install.

= Understanding the Messages =
== General ==
Nagios notifications are generally easy to read and follow this consistent format:
<pre>
** PROBLEM/ACKNOWLEDGEMENT/RECOVERY alert - hostname/Check is WARNING/CRITICAL/OK **
** HOST DOWN/UP alert - hostname **
</pre>
The full message provides extra information about what is wrong.
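
For mail filtering or quick triage, the two subject formats above can be matched with regular expressions. This is a minimal sketch that assumes subjects follow exactly the patterns shown; the function name <code>parse_subject</code> and the check name "Disk Space" in the example are hypothetical.

```python
import re

# Service alerts: "** TYPE alert - hostname/Check is STATE **"
SERVICE_RE = re.compile(
    r"^\*\* (?P<type>PROBLEM|ACKNOWLEDGEMENT|RECOVERY) alert - "
    r"(?P<host>[^/\s]+)/(?P<check>.+) is (?P<state>WARNING|CRITICAL|OK) \*\*$"
)
# Host alerts: "** HOST DOWN alert - hostname **"
HOST_RE = re.compile(r"^\*\* HOST (?P<state>DOWN|UP) alert - (?P<host>\S+) \*\*$")


def parse_subject(subject):
    """Return a dict describing the alert, or None if the subject is unrecognised."""
    subject = subject.strip()
    match = SERVICE_RE.match(subject)
    if match:
        return {"kind": "service", **match.groupdict()}
    match = HOST_RE.match(subject)
    if match:
        return {"kind": "host", **match.groupdict()}
    return None


print(parse_subject("** PROBLEM alert - noc1/Disk Space is CRITICAL **"))
print(parse_subject("** HOST DOWN alert - noc2 **"))
```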

== Disk Space Warning/Critical ==
Disk space warnings normally include the following information:
<pre>
DISK WARNING/CRITICAL/OK - free space: mountpoint freespace(MB) (freespace(%) inode=freeinodes(%)):
</pre>

A message stating "(1% inode=99%)" means that free disk space, '''not''' inode usage, is critical, and is a sign that more disk space is required.
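
To tell the two cases apart programmatically, the message can be parsed and each free percentage compared against a threshold. The sketch below assumes the message follows the format shown above; the 10% threshold, the function name <code>diagnose</code> and the sample values are hypothetical and are not the thresholds Nagios itself is configured with.

```python
import re

DISK_RE = re.compile(
    r"DISK (?P<state>WARNING|CRITICAL|OK) - free space: "
    r"(?P<mount>\S+) (?P<free_mb>\d+) ?MB "
    r"\((?P<free_pct>\d+)% inode=(?P<inode_pct>\d+)%\)"
)


def diagnose(message, low_pct=10):
    """Report which resource is low: free disk space, free inodes, or both.

    low_pct is an illustrative threshold, not a Nagios setting.
    """
    match = DISK_RE.search(message)
    if match is None:
        return None
    free_pct = int(match["free_pct"])
    inode_pct = int(match["inode_pct"])
    low = []
    if free_pct < low_pct:
        low.append("disk space")
    if inode_pct < low_pct:
        low.append("inodes")
    return {
        "state": match["state"],
        "mount": match["mount"],
        "free_mb": int(match["free_mb"]),
        "free_pct": free_pct,
        "inode_free_pct": inode_pct,
        "low": low,
    }


# 1% of the disk is free but 99% of inodes are free: disk space is the problem.
print(diagnose("DISK CRITICAL - free space: / 120 MB (1% inode=99%):"))
```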

= Further Reading =
* [http://fedoraproject.org/wiki/Infrastructure/SOP/Puppet Puppet SOP]
* [http://fedoraproject.org/wiki/Infrastructure/SOP/Outage Outages SOP]

Latest revision as of 18:38, 19 December 2011
