Fedora Cloud Infrastructure SOP


Shortcuts: ISOP:FEDORACLOUD ISOP:OVIRT

The Fedora cloud setup is still a work in progress. This page will grow as we build and troubleshoot the services.

Fedora Cloud computing

Contact Information

Owner: Fedora Infrastructure Team

Contact: #fedora-admin, sysadmin-cloud group

Persons: mmcgrath, SmootherFrOgZ, G

Location: Phoenix ?

Servers: capp1.fedoraproject.org, cnode[1-5].fedoraproject.org, store[1-4]

Purpose: Provide virtual machines for Fedora contributors.

Description

blabalbalablab
blablablabal

Rebuild capp1 (ovirt-server)

Log into cnode1
Check that no capp1 domain is running

sudo virsh list

If a capp1 domain is running, proceed as follows

sudo virsh destroy capp1
sudo virsh undefine capp1
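
The check-then-remove steps above can be folded into one helper. This is only a minimal sketch, assuming the default tabular output of virsh list (Id, Name, State columns). The function names domain_listed and remove_domain are hypothetical and not part of any existing tooling.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named domain appears in the
# "Name" column of `virsh list` output read from stdin.
domain_listed() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Hypothetical wrapper: stop and undefine the domain only if it is listed.
remove_domain() {
    if sudo virsh list --all | domain_listed "$1"; then
        # destroy reports an error when the domain is already shut off;
        # that is harmless here, so carry on to undefine.
        sudo virsh destroy "$1" || true
        sudo virsh undefine "$1"
    else
        echo "no $1 domain defined, nothing to remove"
    fi
}
```

On cnode1 this would be called as remove_domain capp1.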

Reformat the capp1 disk so that the new virtual install starts from a clean filesystem

sudo /sbin/mkfs.ext3 -j /dev/VolGroup00/appliance1

You can now install a fresh capp1 virtual system

sudo virt-install -n capp1 -r 1024 --vcpus=2 --os-variant fedora11 --os-type linux \
-l http://mirrors.kernel.org/fedora/releases/11/Fedora/x86_64/os/ \
--disk="path=/dev/VolGroup00/appliance1" --nographics --noacpi --hvm --network=bridge:br2 \
--accelerate -x "console=ttyS0 ks=http://infrastructure.fedoraproject.org/rhel/ks/fedora ip=209.132.178.19 netmask=255.255.254.0 gateway=209.132.179.254 dns=4.2.2.2"

Note: If the network fails to come up during the interactive install, configure it manually. NetworkManager (NM) will take care of it afterwards.

Note2: The above ks file seems to use a graphical install as its install method. Build a new kickstart file or do a manual install to continue.
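
If the network has to be configured by hand, it is worth checking that the address and the gateway passed to virt-install sit in the same subnet for the given netmask. The sketch below does that check with plain shell arithmetic. The helper names ip_to_int and same_subnet are hypothetical, and a shell with 64-bit arithmetic is assumed.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    # Split the argument on dots, then restore the caller's IFS.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if the two addresses share a network under the given netmask.
# Arguments: address, other address, netmask.
same_subnet() {
    addr_a=$(ip_to_int "$1")
    addr_b=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    [ $(( addr_a & mask )) -eq $(( addr_b & mask )) ]
}
```

With the values from the virt-install line, same_subnet 209.132.178.19 209.132.179.254 255.255.254.0 succeeds, because the /23 netmask covers both 209.132.178.x and 209.132.179.x.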

Network configuration

capp1's network interfaces need to be set up manually so that they work against the corresponding physical bridges.
To do so, create the interface configuration file

sudo vi /etc/sysconfig/network-scripts/ifcfg-eth1

Then add the following configuration to the file

DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
PEERNTP=yes
IPADDR=$physical_br_IP
NETMASK=$physical_br_NETMASK
HWADDR=$random_mac_addr

Repeat the above for eth3 against br3.
You can get the IP and netmask of each br? interface on cnode1 with the ifconfig command.
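
Since the same file has to be written once per interface, the template above can be generated rather than typed. This is a minimal sketch. render_ifcfg is a hypothetical helper, and the address, netmask and MAC used in the usage note are placeholders, not real values for capp1.

```shell
#!/bin/sh
# Hypothetical helper: print an ifcfg file body for a statically
# configured interface. Arguments: device, IP address, netmask, MAC.
render_ifcfg() {
    cat <<EOF
DEVICE=$1
BOOTPROTO=static
ONBOOT=yes
PEERNTP=yes
IPADDR=$2
NETMASK=$3
HWADDR=$4
EOF
}
```

For example, render_ifcfg eth3 192.0.2.10 255.255.255.0 52:54:00:12:34:56 prints the file body, which can then be piped to sudo tee /etc/sysconfig/network-scripts/ifcfg-eth3 once the real br3 values have been substituted.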

Troubleshooting

VMs can't receive tasks anymore

If a VM no longer appears to receive tasks, it is because it is no longer reachable.
Reloading your browser will show the VM in state <unreachable> in the ovirt UI.
First, check that the hosts are still available. If they are, you will have to manually restart ovirt taskomatic to restore its connection to the broker.

Log in to capp1 and restart the services in this order (wait a few seconds between each)

sudo service ovirt-taskomatic restart

If you get an error like the one below in taskomatic.log:
--
ERROR Wed Nov 11 18:46:11 +0000 2009 (3382) Task action processing failed: RuntimeError: No agent responded within timeout period
--

Restart the qpid service first (the taskomatic process will die, which is normal)

sudo service qpidd restart
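
The decision above (restart qpidd first only when the timeout error has been logged) can be sketched as follows. The helper names are hypothetical. The log path is taken as an argument because this page does not say where taskomatic.log lives, and the 5 second pause is an assumption standing in for "a few seconds".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given log file contains the
# "No agent responded" timeout error quoted above.
agent_timeout_logged() {
    grep -q 'RuntimeError: No agent responded within timeout period' "$1"
}

# Hypothetical wrapper: restart qpidd before taskomatic only when the
# timeout error was logged. Argument: path to taskomatic.log.
restart_taskomatic() {
    if agent_timeout_logged "$1"; then
        sudo service qpidd restart
        sleep 5   # assumed pause length
    fi
    sudo service ovirt-taskomatic restart
}
```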

Cnodes or VMs are unreachable

If cnode(s) or VMs become unreachable, there are a couple of ways to figure out what is going on.

0. First off, logs are always useful (specifically db-omatic.log and taskomatic.log).
1. Check that the cnode(s) or VMs are still "physically" reachable.
2. If the cnode(s) are reachable but the VMs are not, check whether libvirt-qpid and libvirtd are still running on the related cnode(s).

If they are not:

sudo service libvirt-qpid start
sudo service libvirtd start

3. If both the cnode and the VM are still up and running, the cause could be a timeout on the qmf connection, or db-omatic may have died for no apparent reason. The best way to fix this is to restart the ovirt qmf/qpid processes as follows:

sudo service qpidd restart
sudo service ovirt-db-omatic restart
sudo service ovirt-taskomatic restart
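
The restart order matters (broker first, then db-omatic, then taskomatic), so it can help to keep it in one place. This is a minimal sketch. restart_in_order and the DRY_RUN and PAUSE variables are hypothetical, and the default 5 second pause is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: restart the given services in the order listed,
# pausing between them. Set DRY_RUN=1 to print the commands instead of
# running them; set PAUSE to change the pause length in seconds.
restart_in_order() {
    for svc in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo service $svc restart"
        else
            # Stop at the first failure so later services are not
            # restarted against a broken broker.
            sudo service "$svc" restart || return 1
            sleep "${PAUSE:-5}"
        fi
    done
}
```

On capp1 the sequence above would be restart_in_order qpidd ovirt-db-omatic ovirt-taskomatic.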
