From Fedora Project Wiki

{{admon/important|All SOPs have been moved to the Fedora Infrastructure SOP git repository. Please consult the online documentation for the current version of this document.}}
== Contact Information ==
Owner: Fedora Infrastructure Team
Contact: #fedora-admin, sysadmin-main & sysadmin-noc groups
Location: Anywhere
Servers: noc1, noc2, puppet1
Purpose: This SOP describes the Nagios configurations
== Initial Configuration ==
=== CGI Access ===
To view information in Nagios (anything with cgi-bin in the path), you first need to grant yourself access. After checking out the Puppet CVS tree as described in the [[Infrastructure/SOP/Puppet |Puppet SOP]], edit <code>configs/system/nagios/cgi.cfg</code> and append your FAS username to <code>authorized_for_system_commands</code>.
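As a minimal sketch, the edited directive might look like the excerpt below. It assumes the directive already lists at least one user; <code>existinguser</code> is a hypothetical placeholder, and <code>fasname</code> stands for your own FAS username.

```cfg
# configs/system/nagios/cgi.cfg (excerpt)
# Comma-separated list of users allowed to issue system commands via the CGIs.
# "existinguser" is a hypothetical entry; append your FAS username at the end.
authorized_for_system_commands=existinguser,fasname
```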
=== Contact Information ===
{{Admon/caution | You must configure a contacts file to be able to acknowledge [[Infrastructure/SOP/Outage |outages]]}}
Create a new file named 'fasname.cfg' in configs/system/nagios/contacts/ with the following details:
 define contact{
         contact_name                    fasname
         alias                           Real Name
         service_notification_period     24x7
         host_notification_period        24x7
         service_notification_options    w,u,c,r
         host_notification_options       d,u,r
         service_notification_commands   notify-by-email
         host_notification_commands      host-notify-by-email
         email                           Email address (any)
 }
{{Admon/warning | Using the 24x7 notification period may cause duplicate messages if you are a member of sysadmin-main, in which case you can specify 'never' instead}}
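For members of sysadmin-main, a contact definition using the 'never' period could look like the sketch below. It assumes a timeperiod named <code>never</code> is already defined elsewhere in the Nagios configuration, which this SOP implies but does not show.

```cfg
# configs/system/nagios/contacts/fasname.cfg
# Variant for sysadmin-main members: notifications are suppressed for this
# contact so messages are not duplicated.
define contact{
        contact_name                    fasname
        alias                           Real Name
        service_notification_period     never   ; assumes a "never" timeperiod exists
        host_notification_period        never   ; assumes a "never" timeperiod exists
        service_notification_options    w,u,c,r
        host_notification_options       d,u,r
        service_notification_commands   notify-by-email
        host_notification_commands      host-notify-by-email
        email                           Email address (any)
}
```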
Next, append your FAS username (the <code>contact_name</code> defined above) to the 'members' section of configs/system/nagios/contactgroups/fedora-sysadmin-email.cfg
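A hedged sketch of the resulting contact group follows. The group name is assumed to match the file name, and the alias and the existing members (<code>existinguser1</code>, <code>existinguser2</code>) are hypothetical placeholders; only the appended <code>fasname</code> reflects the step described above.

```cfg
# configs/system/nagios/contactgroups/fedora-sysadmin-email.cfg
define contactgroup{
        contactgroup_name       fedora-sysadmin-email   ; assumed to match the file name
        alias                   Fedora Sysadmin Email   ; hypothetical alias
        ; Comma-separated contact_name values; append your own at the end.
        members                 existinguser1,existinguser2,fasname
}
```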
=== nagios-external ===
Apply the same changes to the nagios-external configuration (configs/system/nagios-external)
=== Commit Changes ===
{{Admon/caution | Remember to "cvs add" the contacts/fasname.cfg files}}
Commit the changes by running <code>cvs commit -m "Adding fasname to Nagios"</code>, and then mark them for distribution by running <code>make install</code>
== Configuration ==
=== Instances ===
The Fedora Project runs two Nagios instances: nagios (noc1) and nagios-external (noc2). You must be in the 'sysadmin' group to access them.
=== nagios (noc1) ===
The nagios configuration on noc1 should only monitor general host statistics - puppet status, uptime, apache status (up/down), SSH etc.
The configurations are found at <code>configs/system/nagios/</code> in the puppet tree.
=== nagios-external (noc2) ===
noc2 is located outside of our main datacenter, and its Nagios configuration should monitor our user-facing websites and applications (such as FAS, PackageDB, and Bodhi/Updates).
The configurations are found at <code>configs/system/nagios-external/</code> in the puppet tree.
== Understanding the Messages ==
=== General ===
Nagios notifications are generally easy to read, and follow this consistent format:
 ** HOST DOWN/UP alert - hostname **
Reading the message will provide extra information on what is wrong.
=== Disk Space Warning/Critical ===
Disk space warnings normally include the following information:
 DISK WARNING/CRITICAL/OK - free space: mountpoint freespace(MB) (freespace(%) inode=freeinodes(%)):
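Both percentages report what is still ''free''. A message containing "(1% inode=99%)" therefore means that disk space, '''not''' inode usage, is critical: only 1% of the space is free while 99% of the inodes are still available, so more disk space is required.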
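To illustrate where these percentages come from, here is a sketch of a disk check modelled on the stock Nagios sample configuration, not on the actual Fedora setup. The host name, service template, and thresholds are hypothetical. The <code>-w</code> and <code>-c</code> options of <code>check_disk</code> are free-space thresholds, which is why the message reports free rather than used space.

```cfg
# Hypothetical example; the real Fedora checks may be defined differently.
define command{
        command_name    check_local_disk
        ; -w / -c: warn / go critical when free space drops below the threshold
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}

define service{
        use                     generic-service         ; hypothetical template
        host_name               examplehost             ; hypothetical host
        service_description     Disk Space /
        ; WARNING below 20% free, CRITICAL below 10% free, on mountpoint /
        check_command           check_local_disk!20%!10%!/
}
```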
== Further Reading ==
* [[Infrastructure/SOP/Puppet |Puppet SOP]]
* [[Infrastructure/SOP/Outage |Outages SOP]]
[[Category:Infrastructure SOPs]]

Latest revision as of 12:01, 16 February 2017
