From Fedora Project Wiki
Revision as of 18:04, 7 September 2016


Overview

  • log in to bodhi-backend01.phx2.fedoraproject.org
  • sign testing/stable update packages
  • things to consider before pushing updates
  • push testing/stable update packages
  • monitor the updates push
  • troubleshooting
  • pushing very important update packages to stable

Bodhi2 push monitoring

Troubleshooting

# get the errors
sudo journalctl --since=yesterday -o short --full --no-pager -u fedmsg-hub > ~/error.out
awk '/E[Rr][Rr]/' ~/error.out

# or with color
grep -E --color=auto 'ERROR|Errno' ~/error.out
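
When the saved journal is long, it can help to see which error classes dominate before reading individual lines. The following is a minimal sketch, not part of the original procedure: the function name `tally_errors` is hypothetical, and it assumes the dump contains Python-style `[Errno N]` markers like the examples on this page.

```shell
# Tally error classes in a saved journal dump (hypothetical helper).
# Usage: tally_errors FILE
# Prints "COUNT Errno N" for each distinct errno, most frequent first.
tally_errors() {
    grep -Eo 'Errno [0-9]+' "$1" \
        | sort \
        | uniq -c \
        | sort -rn \
        | awk '{print $1, $2, $3}'   # normalize the padding added by uniq -c
}
```

Typical use would be `tally_errors ~/error.out`, after which the sections below can be consulted for the errno values that show up.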


OSError: [Errno 39] Directory not empty: '/mnt/koji/mash/updates/dist-6E-epel-testing-151201.1956/../dist-6E-epel-testing.repocache/repodata/'

Reset the fedmsg hub if that happens

# reset the fedmsg-hub service
sudo systemctl restart fedmsg-hub

NOTE: This is apparently due to stale NFSv3 locks. Bodhi2 attempts to remove an old repocache, but stale NFS locks prevent the removal.


Optionally, inspect the repo cache areas to verify that they are empty.

ls \
/mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
/mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
/mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/  \
/mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/


OSError: [Errno 16] Device or resource busy: '/var/lib/mock/fedora-23-updates-x86_64/root/var/tmp/rpm-ostree.hjvMfC'

A bind mount needs to be removed. Look for tmpfs relics from rpm-ostree.

findmnt -t tmpfs -o TARGET | grep rpm-ostree

Unmount ALL the relic tmpfs mounts. Warning: it is unwise to do this unless you know bodhi2 is completely idle.

sudo umount $(findmnt -t tmpfs -o TARGET | grep rpm-ostree)
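
Because unmounting is only safe when bodhi2 is idle, a dry run that prints the commands first may be useful. This is a sketch under assumptions: `plan_umounts` is a hypothetical helper name, and it expects one mount target per line on standard input, which is what the `findmnt` invocation above produces.

```shell
# Print the umount commands that would be run, without executing them.
# Reads mount targets (one per line) from stdin and keeps only the
# rpm-ostree relics, mirroring the grep used above.
plan_umounts() {
    grep 'rpm-ostree' | while IFS= read -r target; do
        printf 'sudo umount -v %s\n' "$target"
    done
}
```

For example, `findmnt -t tmpfs -o TARGET | plan_umounts` lists the commands for review; nothing is unmounted until they are run by hand.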


IOError: Cannot open /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache/repodata/repomd.xml: File /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache/repodata/repomd.xml doesn't exists or not a regular file

rm -rf /mnt/koji/mash/updates/dist-6E-epel-151105.1606/../dist-6E-epel.repocache


ERROR: can't download kf5-kdesu-None:5.13.0-1.fc24.i686 from signed path /mnt/koji/packages/kf5-kdesu/5.13.0/1.fc24/data/signed/8e1431d5/i686/kf5-kdesu-5.13.0-1.fc24.i686.rpm

Sometimes an update has lingered so long that the signed RPMs have been garbage collected, so we simply restore them from the signature cache.

NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-24 -v --write-all --sigul-batch-size=25 kf5-kdesu-5.13.0-1.fc24
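
The build name passed to `sigulsign_unsigned.py` can be read off the signed path in the error message, since the path has the form `.../packages/NAME/VERSION/RELEASE/data/signed/...`. The sketch below does that extraction; `nvr_from_signed_path` is a hypothetical helper name, and the path layout is assumed from the example error above.

```shell
# Derive NAME-VERSION-RELEASE from a Koji signed-RPM path or from the
# whole error line that contains it. Prints nothing if no path matches.
nvr_from_signed_path() {
    printf '%s\n' "$1" \
        | sed -n 's|.*/packages/\([^/]*\)/\([^/]*\)/\([^/]*\)/data/signed/.*|\1-\2-\3|p'
}
```

Given the error line shown above, this prints `kf5-kdesu-5.13.0-1.fc24`, which is the argument used in the re-signing command.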

Signing

Start the bodhi push for signing (responding 'no' when prompted)

cd /var/cache/sigul
sudo true ; yes 'no' | sudo -u apache -S bodhi-push --releases '25 24 23 5 6 7' --username parasense


Sign the builds

for i in 25 24 23 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py fedora-$i -v --write-all --sigul-batch-size=25 $(cat /var/cache/sigul/{Stable,Testing}-F${i})    ; done
for i in  7  6  5 ; do NSS_HASH_ALG_SUPPORT=+MD5 ~/releng/scripts/sigulsign_unsigned.py epel-$i   -v --write-all --sigul-batch-size=25 $(cat /var/cache/sigul/{Stable,Testing}-*EL-${i}) ; done

Another way to do this would be to use screen or tmux: start the push but delay the y/n answer. The files in /var/cache/sigul are created at that point, so go ahead and sign those builds, then send "y" to the bodhi-push prompt.
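
Before signing, it can be reassuring to review exactly which builds the list files contain. The following sketch assumes the Fedora list files are named `Stable-F<N>` and `Testing-F<N>`, as the glob in the signing loop above implies; `builds_for_release` is a hypothetical helper name.

```shell
# List the unique builds queued for one release.
# $1: directory holding the lists (e.g. /var/cache/sigul)
# $2: release suffix (e.g. F25)
builds_for_release() {
    for list in "$1/Stable-$2" "$1/Testing-$2"; do
        if [ -r "$list" ]; then
            cat "$list"
        fi
    done | tr -s ' \t' '\n\n' | sed '/^$/d' | sort -u
}
```

For example, `builds_for_release /var/cache/sigul F25 | wc -l` gives the number of distinct builds that the signing loop would process for Fedora 25.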

Test if bodhi masher is running

These checks are not scientific, but may help inform.

Check for existing masher locks from a currently running push or from a failed previous push

ls -l /mnt/koji/mash/updates/MASHING-*

Check for running bodhi2 push (via masher)

pgrep -af /usr/bin/mash

Also check for rsync

pgrep -af rsync
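
The lock check can be wrapped so that it stays quiet when no lock exists, instead of printing an `ls` error. This is a minimal sketch; `mash_locks` is a hypothetical helper name and the only assumption is the `MASHING-*` naming shown above.

```shell
# Print any masher lock files found in the given directory.
# Prints nothing (and does not fail) when there are none.
mash_locks() {
    for lock in "$1"/MASHING-*; do
        if [ -e "$lock" ]; then
            printf '%s\n' "$lock"
        fi
    done
}
```

Running `mash_locks /mnt/koji/mash/updates` with no output suggests no push holds a lock. As with the checks above, this is a hint rather than proof that the masher is idle.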


Resuming failed push

# resume the push interactively
sudo -u apache bodhi-push --resume --username parasense

# resume the push responding yes to everything
sudo true; yes | sudo -u apache -S bodhi-push --resume --username parasense
# Follow the output of fedmsg-hub
sudo journalctl -o short -u fedmsg-hub -l -f

Sign Bridge Tasks

# monitor the signing on bridge for potential stalls from bodhi-backend
ssh -v -o'ControlPath=none' sign-bridge01 'tail -f /var/log/sigul_bridge.log'
# Check whether the bridge is running
pgrep -af bridge.py

# Restart the bridge as necessary 
sudo pkill -f -9 bridge.py
sudo NSS_HASH_ALG_SUPPORT=+MD5 sigul_bridge -d -v -v

# Review the bridge.py output

# Find out the location of the log file
lsof | awk '/sigul/ && /log/ {print $NF;exit}'


tail -f /var/log/sigul_bridge.log
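
One rough way to spot a stall without watching `tail -f` is to check how long ago the bridge log was last written. The sketch below assumes GNU `stat` (as shipped on Fedora and RHEL) and is meant to run on the bridge host; `log_age_seconds` is a hypothetical helper name, and the 300-second threshold in the usage note is an arbitrary illustration.

```shell
# Print the number of seconds since a file was last modified.
# Relies on GNU stat's -c %Y (modification time as epoch seconds).
log_age_seconds() {
    mtime=$(stat -c %Y "$1") || return 1
    now=$(date +%s)
    echo $((now - mtime))
}
```

For example, `[ "$(log_age_seconds /var/log/sigul_bridge.log)" -gt 300 ] && echo 'bridge log quiet; possible stall'` flags a log that has not changed for five minutes while signing is supposed to be in progress.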

Stable push requests

Sometimes there is an urgent request to have something pushed to stable.

Here we had two lorax builds in the testing queue, and a QA engineer requested that they go to stable ASAP. To do this, head over to the bodhi2 web front end, revoke the testing push request, then choose to push the build to stable. Once that is done, the bodhi-push command will permit the stable push.

sudo -u apache bodhi-push --releases '21 22 23' --request=stable  --builds 'lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23' --username parasense
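
The `--releases` value has to cover every release that the listed builds belong to, and it can be derived from the dist tags rather than typed by hand. This is a sketch that only understands Fedora-style `.fcNN` suffixes (EPEL builds such as `.el7` are ignored); `releases_from_builds` is a hypothetical helper name.

```shell
# Derive a space-separated list of Fedora release numbers from the
# dist tags of a space-separated list of builds.
releases_from_builds() {
    # $1 is intentionally unquoted so that it splits into one build per line
    printf '%s\n' $1 \
        | sed -n 's/.*\.fc\([0-9][0-9]*\)$/\1/p' \
        | sort -un \
        | paste -sd ' ' -
}
```

With the builds from the command above, `releases_from_builds 'lorax-21.34-1.fc21 lorax-22.13-1.fc22 freeipa-4.2.3-1.fc23'` prints `21 22 23`, matching the `--releases` argument used there.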

Testing push request

Here a stable push was in progress while people were requesting a testing push. The testing push uses a different lock file, so it can run in parallel with the stable push in progress.

sudo -u masher bodhi-push --releases '21 22 23' --request=testing  --username parasense


Script to avoid common problems

Here is a script that automates the most common bodhi2 troubleshooting steps.

  • Invalid repocache
  • Restarting fedmsg hub to clear persisting sqlite file locks left behind by createrepo
  • Unmounting persisting tmpfs mounts from Atomic compose
#!/bin/bash
#
# Copyright (C) 2016 Red Hat, Inc.
# SPDX-License-Identifier:      MIT
# 
# Authors:
#     Jon Disnard <jdisnard@redhat.com>
#
# Attempt to fix common bodhi2 issues prior to running


## Stale NFS locks
## "OSError: [Errno 39] Directory not empty"
##
## REMARKS:
## These are caused by createrepo sqlite database locking.
## This issue could be solved by using createrepo_c, and/or by generating the metadata off to the side in /tmp and then moving it to NFS.
## Also, koji signed-repos would solve this
##
printf '* Restarting fedmsg-hub\n'
sudo systemctl restart fedmsg-hub


## Stale tmpfs mounts
## "OSError: [Errno 16] Device or resource busy"
##
## REMARKS:
## A recursive 'umount -R' would solve this.
## Atomic needs to clean up after itself, or the atomic compose parts in bodhi2 do.
##
while IFS= read -r tmpfs
do  if test -z "$tmpfs"
    then  continue
    else  
          printf '* tmpfs found; umounting %s\n' "$tmpfs"
          sudo umount -v "$tmpfs"
    fi
done < <(findmnt -t tmpfs -o TARGET | grep rpm-ostree)


## Missing repodata/repomd.xml
## "IOError: Cannot open"
##
## REMARKS:
## The repocache improves speed and saves effort,
## but it is ridiculous to fail here.
## Easy Bodhi2 fix.
##
for I in /mnt/fedora_koji/koji/mash/updates/dist-5E-epel{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/dist-6E-epel{,-testing}.repocache/repodata/ \
         /mnt/fedora_koji/koji/mash/updates/epel7{,-testing}.repocache/repodata/        \
         /mnt/fedora_koji/koji/mash/updates/f23-updates{,-testing}.repocache/repodata/  \
         /mnt/fedora_koji/koji/mash/updates/f24-updates{,-testing}.repocache/repodata/  \
         /mnt/fedora_koji/koji/mash/updates/f25-updates{,-testing}.repocache/repodata/  ;
do  if test -d "$I"
    then  if test -f "${I}/repomd.xml"
          then  continue
          else  printf '* No repomd.xml found; Removing %s\n' "$I"
                sudo rmdir "$I"
          fi
    fi
done


#THE END