Features/pacemaker-cloud

From FedoraProject

Latest revision as of 10:19, 15 November 2011


pacemaker-cloud

Summary

The pacemaker-cloud project demonstrates the current community work in providing application service high availability in a cloud environment.


Owner

  • Name: Steven Dake
  • Email: <sdake@redhat.com>

Current status

  • Targeted release: Fedora 16
  • Last updated: July 20th, 2011
  • Percentage of completion: 100%


Detailed Description

The software provides a user interface shell called pcloudsh, which can:

  • Create deployables, which includes:
    • Creating a JEOS image of F14, F15, F16, or RHEL6
    • Creating an assembly of F14, F15, F16, or RHEL6
    • Adding assemblies to a deployable
    • Adding managed resources to an assembly
  • Launch a deployable, including all of its assembly images
  • Provide user interface feedback when an application or assembly fails and describe which corrective actions are taken

The software also includes daemons and init scripts that provide high availability for the deployables configured in the system. They:

  • Kill and restart applications if an application failure is detected.
  • Kill and restart assemblies if an assembly failure is detected.

Nomenclature:

  • JEOS (just enough operating system) - the bare minimum operating system required to boot a virtual machine image
  • Assembly - a composition of a JEOS image and managed resources
  • Deployable - a collection of assemblies that represents all virtual machines required to provide a specific service
  • Resource - a daemon application, such as Apache's httpd service, that is managed for high availability
  • High availability - applying the techniques of:
    • monitoring a component for failure
    • forcibly terminating a component when a failure has been detected
    • restarting the failed component
    • notifying system administrators so they may repair the underlying fault

See the Pacemaker Cloud Project Slides (http://www.redhat.com/summit/2011/presentations/summit/whats_new/thursday/dake_th_1130_high_availability_in_the_cloud.pdf) for more details.
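
The four high availability techniques listed above can be sketched as a single recovery pass. This is only an illustration of the monitor / terminate / restart / notify cycle, not the pacemaker-cloud implementation; the function name recover_if_failed and the four callback commands are hypothetical and must be supplied by the caller.

```shell
# One pass of a monitor / terminate / restart / notify cycle (illustrative sketch).
# Usage: recover_if_failed NAME CHECK STOP START NOTIFY
# CHECK, STOP, START and NOTIFY are names of commands or shell functions.
# Each is called with NAME as its first argument; NOTIFY also gets the
# exit status of the restart as its second argument.
recover_if_failed() {
    local name="$1" check="$2" stop="$3" start="$4" notify="$5"
    local rc=0

    # 1. Monitor: a zero exit status from the check means the component is healthy.
    if "$check" "$name"; then
        return 0
    fi

    # 2. Forcibly terminate whatever is left of the failed component.
    "$stop" "$name" || true

    # 3. Restart the failed component, remembering whether that worked.
    "$start" "$name" || rc=$?

    # 4. Notify the administrator so the underlying fault can be repaired.
    "$notify" "$name" "$rc" || true

    return "$rc"
}
```

A caller would typically run this in a loop with a short sleep between passes, passing functions that wrap whatever check, kill, and start commands suit the component being managed.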

Benefit to Fedora

This feature provides a preview of high availability for cloud environments using a building block that is reusable in other cloud management systems. It currently provides high availability only for deployables on a single node, but for F17 we plan to integrate with other distributed cloud management tools such as Aeolus.

Scope

This is a standalone package, but it has several dependencies on other parts of Fedora 16. We are in good shape relating to dependencies. However, systemd is not currently LSB compliant, so our software cannot provide high availability for F15 or Rawhide guests.

We are nearing code completion for the single node case and have some basic packaging done.

How To Test

# Put SELinux into permissive mode (we are working on removing this requirement)
setenforce 0
# Configure the firewall to allow communication via TCP port 49000, or disable the firewall
yum install pacemaker-cloud
systemctl start pcloud-cped.service
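
The preparation steps above can be collected into a small script. The following is a minimal sketch, assuming the host firewall is managed with iptables (the steps above do not say which firewall tool is in use); the helper names run and prepare_host and the APPLY variable are illustrative. By default it only prints the commands, so it can be reviewed before anything is changed.

```shell
# Sketch of the host preparation steps.
# Commands are only printed unless APPLY=1 is set in the environment.
PCLOUD_PORT=49000   # TCP port named in the steps above

run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

prepare_host() {
    # SELinux must be permissive for now.
    run setenforce 0
    # Assumption: the firewall is managed with iptables; adapt this rule
    # to whatever firewall tool the host actually uses.
    run iptables -I INPUT -p tcp --dport "$PCLOUD_PORT" -j ACCEPT
    run yum install -y pacemaker-cloud
    run systemctl start pcloud-cped.service
}
```

Calling prepare_host prints the four commands; running it as root with APPLY=1 set executes them.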

We have a test suite that can be run to automatically validate that the software functions properly.

The following operations can also be done manually:

root# pcloudsh
pcloudsh# jeos_create F14 x86_64
pcloudsh# assembly_create assy1 F14 x86_64
pcloudsh# assembly_clone assy1 assy2
pcloudsh# assembly_clone assy1 assy3
pcloudsh# assembly_resource_add httpdone httpd assy1
pcloudsh# assembly_resource_add httpdtwo httpd assy2
pcloudsh# assembly_resource_add httpdthree httpd assy3
pcloudsh# deployable_create dep1
pcloudsh# deployable_assembly_add dep1 assy1
pcloudsh# deployable_assembly_add dep1 assy2
pcloudsh# deployable_assembly_add dep1 assy3
pcloudsh# deployable_start dep1
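
The session above follows a regular pattern, so the command list for a different number of assemblies can be generated rather than typed. This is a sketch only: the function name gen_pcloudsh_commands and the resource names (httpd1, httpd2, ...) are illustrative, and the output is meant to be pasted into pcloudsh, since nothing here confirms that pcloudsh reads commands from standard input.

```shell
# Print the pcloudsh commands for a deployable built from COUNT cloned assemblies.
# Usage: gen_pcloudsh_commands COUNT OS ARCH [DEPLOYABLE]
gen_pcloudsh_commands() {
    local count="$1" os="$2" arch="$3" dep="${4:-dep1}"
    local i

    echo "jeos_create $os $arch"
    echo "assembly_create assy1 $os $arch"

    # Clone the first assembly to obtain the remaining ones.
    i=2
    while [ "$i" -le "$count" ]; do
        echo "assembly_clone assy1 assy$i"
        i=$((i + 1))
    done

    # Add one httpd resource per assembly (resource names are illustrative).
    i=1
    while [ "$i" -le "$count" ]; do
        echo "assembly_resource_add httpd$i httpd assy$i"
        i=$((i + 1))
    done

    # Group all assemblies into one deployable and start it.
    echo "deployable_create $dep"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "deployable_assembly_add $dep assy$i"
        i=$((i + 1))
    done
    echo "deployable_start $dep"
}
```

For example, gen_pcloudsh_commands 3 F14 x86_64 prints the same sequence of operations as the session above, differing only in the resource names.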

Keep pcloudsh running and, in another shell:

  1. Verify that application restart works properly:
    1. Log in to one of the assemblies and run killall -9 httpd
    2. Verify that httpd is restarted by pacemaker-cloud
  2. Verify that deployable restart works properly:
    1. Open the virtual machine manager GUI
    2. Use the force off functionality on an assembly
    3. The virtual machine manager should display that the assembly is restarted
    4. Log in to the restarted virtual machine and verify that httpd was restarted properly
  3. Verify that pcloudsh displays feedback:
    1. Verify that failed applications are reported as failed and restarted
    2. Verify that failed assemblies are reported as failed and restarted
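
The "verify that httpd is restarted" checks above amount to waiting for a condition to become true again. A small polling helper can make that check repeatable. This is a sketch under stated assumptions: the function name wait_until is illustrative, and the timeout is approximate because it counts one-second sleeps rather than measuring elapsed time.

```shell
# Poll a check command about once per second until it succeeds or the
# timeout (in seconds, approximate) is reached.
# Usage: wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until() {
    local timeout="$1" elapsed=0
    shift
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

Inside an assembly, one possible use is to run killall -9 httpd and then wait_until 60 pgrep -x httpd, which succeeds once an httpd process exists again. The 60-second limit is an arbitrary choice, not a documented recovery time.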

User Experience

Users will notice a shell with commands that can be used to create, launch, and monitor deployables on a single node.

Dependencies

Previously packaged in Fedora rawhide:

glib2
dbus-glib
libxml2
libuuid
libqb
pacemaker-libs
qmf
libxslt
qpid-cpp-server
qpid-cpp-client
python-qmf
matahari-service
matahari-host
oz
systemd

Dependency with broken functionality:

systemd - Guests that use systemd do not work properly because systemd is not LSB compliant. A patch to resolve this issue has been merged upstream and tested with the current pacemaker-cloud code in an F15 JEOS, with the upstream patch applied on top of the latest rawhide systemd RPM.

Contingency Plan

If this feature is not ready by July 26, it can be moved to a later Fedora version. If systemd is not LSB compliant by July 26, appropriate release notes should indicate that systemd is in the process of updating its build with upstream packages. More than likely, this will simply be fixed as part of the F16 release of systemd.

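Documentation

Project Page (http://www.pacemaker-cloud.org)

Developer Resources (https://github.com/pacemaker-cloud/pacemaker-cloud)

Pacemaker Cloud Project Slides (http://www.redhat.com/summit/2011/presentations/summit/whats_new/thursday/dake_th_1130_high_availability_in_the_cloud.pdf)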

Release Notes

Pacemaker-Cloud provides high availability for application services inside virtual machines on a single node. This feature provides a shell for creating virtual machine images, associating resources with the virtual machines, and combining these images into a deployable. A deployable can then be launched and monitored for high availability. If virtual machines or applications fail, these components are restarted, which reduces MTTR (mean time to repair) and improves availability over manual operator restart.

Fedora guest virtual machines using systemd are currently non-functional until the fix for the following bug is merged into rawhide: see the systemd defect 702621 discussion (https://bugzilla.redhat.com/show_bug.cgi?id=702621).

Comments and Discussion

  • See Talk:Features/pacemaker-cloud