User talk:Wwoods/Building images with treebuilder

From FedoraProject

Revision as of 18:55, 9 August 2011

Boot problem (jreiser)

[2011-08-06 0000 UTC] Upon booting the DVD, dracut drops to a shell with the message:

 Warning: No root device "live:/dev/disk/by-label/Fedora" found.

The file is present as

 /dev/disk/by-label/Fedora 16 x86_64 DVD

with those three spaces. Rebooting with quoting (single, double, or backslash) also fails. Altering the boot parameter to

 root=ID=ata_SONY_DVD_RW_AW_G-170A

also fails [that name is in my /dev/disk/by-id].

[2011-08-06 1700 UTC] I modified pungi so that the label of the DVD has dashes ("Fedora-16-x86_64-DVD") and changed the boot command line to match ("root=live:CDLABEL=Fedora-16-x86_64-DVD"). dracut still complains "No root device \"live:/dev/disk/by-label/Fedora-16-x86_64-DVD\" found". The device is there, symlinked to ../../sr0 ==> /dev/sr0, and that is the correct name of the hardware drive.

[2011-08-07 0400 UTC] In the emergency shell after dracut cannot find the root, I can find it by:

 mkdir /mnt1 /mnt2 /mnt3
 mount -o ro,loop /dev/sr0 /mnt1
 mount -o ro,loop /mnt1/images/install.img /mnt2
 mount -o ro,loop /mnt2/LiveOS/rootfs.img /mnt3

but I cannot find how to specify that in a "root=live..." that dracut understands. It's also a mystery why so many mounts are necessary.

notes/solutions (wwoods)

  1. The DVD isn't built correctly - pungi changes the label of the image without changing the boot config file. Pungi will need to be fixed to handle that. (boot.iso should work fine.)
    • I believe dracut expects spaces to be escaped, so it would need to be: root=live:CDLABEL=Fedora%x2016%x20x86_64%x20DVD
    • As a workaround, you could just boot with root=live:/dev/sr0
  2. The mounts are necessary because treebuilder images are live images, and all live images are ext4 images inside squashfs images inside an iso (CD/DVD) or vfat (USB stick) filesystem.
    • Yes, that does seem needlessly complicated
    • No, there isn't a simpler way to do it right now
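
The escaped label suggested in the first note can be generated mechanically. This is a minimal sketch, assuming the %x20 escape quoted above really is what dracut expects (the note itself only says "I believe"). escape_label is a hypothetical helper name, not part of dracut or pungi.

```shell
#!/bin/sh
# Hypothetical helper: replace each space in a volume label with the
# %x20 escape quoted in the note above. Whether dracut accepts this
# form is an assumption taken from that note, not verified here.
escape_label() {
    printf '%s\n' "$1" | sed 's/ /%x20/g'
}

label='Fedora 16 x86_64 DVD'
echo "root=live:CDLABEL=$(escape_label "$label")"
```

If the escaped form does not work either, the root=live:/dev/sr0 workaround from the note avoids the label entirely.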


mount traps when systemd mounts /tmp (hamzy)

Seen during booting:

 ...
 [ 11.992445] systemd[1]: tmp.mount mount process exited, code=killed status=11
 [ 12.029660] systemd[1]: Job loader.service/start failed with result 'dependency'.
 [ 12.029695] systemd[1]: Unit tmp.mount entered failed state.
 ...

 bash-4.2# systemctl status tmp.mount
 [ 71.988204] systemd[1]: Accepted connection on private bus.
 [ 71.988860] systemd[1]: Got D-Bus request: org.freedesktop.systemd1.Manager.LoadUnit() on /org/freedesktop/systemd1
 [ 71.989077] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/tmp_2emount
 tmp.mount - Runtime Directory

         Loaded: loaded (/lib/systemd/system/tmp.mount)
         Active: inactive (dead)
          Where: /tmp
           What: tmpfs
         CGroup: name=systemd:/system/tmp.mount

[ 71.990451] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local

FIX:

Boot with the following options:

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

Run the following command in the shell that rd.break drops you into:

  sed -i -e 's/tmp\.mount //' /sysroot/lib/systemd/system/loader.service

notes (wwoods)

  • Only seems to happen on PPC64 systems
  • This workaround has been added to the treebuilder branch; should be fixed for images built on or after Aug. 9
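
To see what the sed expression in the fix does before running it against a real image, it can be tried on a throwaway file. This is a sketch only: the unit contents below are made up for illustration (the real loader.service is not shown on this page), and strip_tmp_mount is a hypothetical wrapper name.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed expression from the fix:
# delete the first "tmp.mount " (with trailing space) on each line.
strip_tmp_mount() {
    sed -i -e 's/tmp\.mount //' "$1"
}

# Throwaway unit file with invented contents; the real loader.service
# may list its dependencies differently.
unit=$(mktemp)
cat > "$unit" <<'EOF'
[Unit]
Description=Example loader unit
After=tmp.mount syslog.target
EOF

strip_tmp_mount "$unit"
cat "$unit"
```

Note that the pattern includes the trailing space, so it would not match a tmp.mount entry that is last on its line.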

Slow (jreiser)

Treebuilder is slower. The first culprit looks like dracut.

number execve
7606 /bin/egrep
2578 /bin/cp
1633 /lib64/ld-linux-x86-64.so.2
1434 /usr/bin/ldd
1413 /bin/ln
1303 /usr/local/bin/ln
1303 /usr/bin/ln
1087 /sbin/modinfo
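
How the tally above was gathered is not stated. One way to produce a similar table, assuming the build was traced with something like strace -f -e trace=execve -o build.strace, is sketched below. count_execve and the log file name are hypothetical.

```shell
#!/bin/sh
# Tally execve() targets in a strace log, most frequent first.
# Assumes a log written by, for example:
#   strace -f -e trace=execve -o build.strace <build command>
count_execve() {
    grep -o 'execve("[^"]*"' "$1" |
        sed -e 's/^execve("//' -e 's/"$//' |
        sort | uniq -c | sort -rn
}

# Usage (hypothetical log name):
#   count_execve build.strace | head
```

This counts failed execve attempts as well as successful ones, which may be why the same program can appear under several directories when it is looked up along $PATH.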

The second culprit is not honoring $TMPDIR, and not putting yumroot-$PID and installroot-$PID inside it.

count filename
1693845 *
253387 $DESTDIR/yumroot/...
79977 $DESTDIR/installroot/...
44486 /etc/xattr.conf
41713 /proc/self/task/31818/attr/fscreate
25919 /proc/self/task/1070/attr/fscreate
20220 /usr
19487 /etc/localtime
18485 /lib64/libc.so.6
18294 /usr/share
18184 /etc/ld.so.cache
18074 /etc/ld.so.preload
17213 /usr/bin/ldd
14471 /bin/egrep
9861 /usr/share/locale
9090 /var/log/dracut.log
9013 /dev/null
7529 /sbin/modprobe
6110 .

commentary (wwoods)

  • This is pretty useless without some hard numbers. On my years-old test systems, I can complete a run in the following times:
      Arch                        Time
      x86_64 (treebuilder, bcj)   19m15s (20m35s user, 2m56s system)
      ppc64 (treebuilder, bcj)    40m10s (28m46s user, 16m37s system)
  • If time is your primary concern, here are some speedup suggestions:
    • Add this to /etc/lorax/lorax.conf:
      [compression]
      bcj=off
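
The setting can also be added from a script. This is a minimal sketch, assuming the file has no [compression] section yet. ensure_bcj_off is a hypothetical helper, and it refuses to touch a file that already has such a section rather than guess how to merge it.

```shell
#!/bin/sh
# Hypothetical helper: append the [compression] section with bcj=off
# to the given config file, unless a [compression] section exists.
ensure_bcj_off() {
    conf="$1"
    if grep -q '^\[compression\]' "$conf" 2>/dev/null; then
        echo "$conf already has a [compression] section; edit it by hand" >&2
        return 1
    fi
    printf '[compression]\nbcj=off\n' >> "$conf"
}

# Usage (as root):
#   ensure_bcj_off /etc/lorax/lorax.conf
```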

The anaconda installer starts but then hangs (hamzy)

Boot the 08/04/2011 ISO image with the following options:

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

Run the following commands in the shell:

  sed -i -e 's/tmp\.mount //' /sysroot/lib/systemd/system/loader.service
  exit

You will see the following line, and then nothing else:

  Starting Anaconda version 16.14.

Change the file /lib/systemd/system/loader.service to start the loader like this:

  ExecStart=/usr/bin/strace -f -e write=3,4,5 /sbin/loader

This runs strace on the loader and produces the following output:

[pid   606] <... read resumed> "", 4096) = 0
[pid   606] --- {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=910, si_status=0, si_utime=0, si_stime=1} (Child exited) ---
[pid   606] read(11, "", 4096)          = 0
[pid   606] waitpid(910, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 910
[pid   606] close(9)                    = 0
[pid   606] close(11)                   = 0
[pid   606] access("/mnt/install/source/.treeinfo", R_OK) = -1 ENOENT (No such file or directory)
[pid   606] umount("/mnt/install/source", 0) = 0
[pid   606] time(NULL)                  = 1312818684
[pid   606] send(6, "<143>Aug  8 15:51:24 loader: in "..., 62, MSG_NOSIGNAL) = 62
[pid   606] write(4, "15:51:24,021 DEBUG loader: in do"..., 61) = 61
 | 00000  31 35 3a 35 31 3a 32 34  2c 30 32 31 20 44 45 42  15:51:24 ,021 DEB |
 | 00010  55 47 20 6c 6f 61 64 65  72 3a 20 69 6e 20 64 6f  UG loade r: in do |
 | 00020  4c 6f 61 64 65 72 4d 61  69 6e 2c 20 73 74 65 70  LoaderMa in, step |
 | 00030  20 3d 20 53 54 45 50 5f  4c 41 4e 47 0a            = STEP_ LANG.    |
[pid   606] select(3, [2], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid   606] write(1, "\33[2;21H\33[m\17\33[30m\33[47m\342\224\214\342\224\200\342\224\200\342\224"..., 1926) = 1926
[pid   606] select(1, [0], [], [], NULL

The loader gets as far as logging the following statement:

  15:06:02,581 DEBUG loader: in doLoaderMain, step = STEP_LANG.

However, I think that some strace output was still buffered and never made it to the screen.