User talk:Wwoods/Building images with treebuilder: Difference between revisions

Latest revision as of 19:59, 12 August 2011

Boot problem (jreiser)

[2011-08-06 0000 UTC] Upon boot of DVD, then dracut drops to shell with the message:

 Warning: No root device "live:/dev/disk/by-label/Fedora" found.

The file is present as

 /dev/disk/by-label/Fedora 16 x86_64 DVD

with those three spaces. Reboot with quoting (single, or double, or backslash) also fails. Altering the boot parameter to

 root=ID=ata_SONY_DVD_RW_AW_G-170A

also fails [that name is in my /dev/disk/by-id].

[2011-08-06 1700 UTC] After modifying pungi so that the label of the DVD has dashes "Fedora-16-x86_64-DVD" and changing the boot command line to include the dashes "root=live:CDLABEL=Fedora-16-x86_64-DVD", then dracut still complains "No root device \"live:/dev/disk/by-label/Fedora-16-x86_64-DVD\" found". The device is there, symlinked to ../../sr0 ==> /dev/sr0, and that is the correct name of the hardware drive.

[2011-08-07 0400 UTC] In the emergency shell after dracut cannot find the root, I can find it by:

 mkdir /mnt1 /mnt2 /mnt3
 mount -o ro,loop /dev/sr0 /mnt1
 mount -o ro,loop /mnt1/images/install.img /mnt2
 mount -o ro,loop /mnt2/LiveOS/rootfs.img /mnt3

but I cannot find how to specify that in a "root=live..." that dracut understands. It's also a mystery why so many mounts are necessary.

notes/solutions (wwoods)

The DVD isn't built correctly - pungi changes the label of the image without changing the boot config file. Pungi will need to be fixed to handle that. (boot.iso should work fine.)
- I believe dracut expects spaces to be escaped, so it would need to be: root=live:CDLABEL=Fedora%x2016%x20x86_64%x20DVD
- As a workaround, you could just boot with root=live:/dev/sr0
The mounts are necessary because treebuilder images are live images, and all live images are ext4 images inside squashfs images inside an iso (CD/DVD) or vfat (USB stick) filesystem.
- Yes, that does seem needlessly complicated
- No, there isn't a simpler way to do it right now

boot problem on a PowerMac G5

I've tried to boot with 'root=live:/dev/hda', but dracut bailed out with 'Warning: No root device "live:/dev/hda" found'

I've also tried to boot with 'root=live:http://some-server/.../install.img', but dracut aborted with a similar error.

mount traps when systemd mounts /tmp (hamzy)

boot the 08/04/2011 iso image with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

Seen during booting:

...
[   11.992445] systemd[1]: tmp.mount mount process exited, code=killed status=11
[   12.029660] systemd[1]: Job loader.service/start failed with result 'dependency'.
[   12.029695] systemd[1]: Unit tmp.mount entered failed state.
...

bash-4.2# systemctl status tmp.mount
[   71.988204] systemd[1]: Accepted connection on private bus.
[   71.988860] systemd[1]: Got D-Bus request: org.freedesktop.systemd1.Manager.LoadUnit() on /org/freedesktop/systemd1
[   71.989077] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/tmp_2emount
tmp.mount - Runtime Directory
          Loaded: loaded (/lib/systemd/system/tmp.mount)
          Active: inactive (dead)
           Where: /tmp
            What: tmpfs
          CGroup: name=systemd:/system/tmp.mount
[   71.990451] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local

FIX:

boot with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

run the following command in the shell

  sed -i -e 's/tmp\.mount //' /sysroot/lib/systemd/system/loader.service

notes (wwoods)

Only seems to happen on PPC64 systems
This workaround has been added to the treebuilder branch; should be fixed for images built on or after Aug. 9

Slow (jreiser)

Treebuilder is slower. The first culprit looks like dracut.

number	execve
7606	/bin/egrep
2578	/bin/cp
1633	/lib64/ld-linux-x86-64.so.2
1434	/usr/bin/ldd
1413	/bin/ln
1303	/usr/local/bin/ln
1303	/usr/bin/ln
1087	/sbin/modinfo

The second culprit is not honoring $TMPDIR, and not putting yumroot-$PID and installroot-$PID inside it.

count	filename
1693845	*
253387	$DESTDIR/yumroot/...
79977	$DESTDIR/installroot/...
44486	/etc/xattr.conf
41713	/proc/self/task/31818/attr/fscreate
25919	/proc/self/task/1070/attr/fscreate
20220	/usr
19487	/etc/localtime
18485	/lib64/libc.so.6
18294	/usr/share
18184	/etc/ld.so.cache
18074	/etc/ld.so.preload
17213	/usr/bin/ldd
14471	/bin/egrep
9861	/usr/share/locale
9090	/var/log/dracut.log
9013	/dev/null
7529	/sbin/modprobe
6110	.

commentary (wwoods)

This is pretty useless without some hard numbers. On my years-old test systems, I can complete a run in the following times:

Build type	Arch	Time	Notes
treebuilder	x86_64	19:15	install.img: 125M
treebuilder (bcj=off)	x86_64	15:07	install.img: 127M
treebuilder	ppc64	40:10	install.img: 134M
treebuilder (bcj=off)	ppc64	37:43	install.img: 134M

(clearly I should turn bcj off by default on ppc64)

comments (jreiser)

I find the data useful. The execve table points to dracut as a slow-poke. Verified: dracut is using a derivative of buildinstall to fondle each file as slowly as possible, negating much of the speedup provided by lorax; estimated loss: a minute (5%). The filename counts show that TMPDIR is not being honored. Using a RAM-backed tmpfs such as /dev/shm saves two minutes (10%), but pungi refuses: https://bugzilla.redhat.com/show_bug.cgi?id=723251. Also, dracut forces TMPDIR=/var/tmp which is another slowdown.

My 3GHZ box with 8GB RAM builds -G -B -C (x86_64 including bcj) in 14m13s; adding -I brings the total to 16m17s.

speedups

bcj=off: Add this to /etc/lorax/lorax.conf

[compression]
bcj=off

The anaconda installer starts but then hangs (hamzy)

boot the 08/04/2011 iso image with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

run the following command in the shell

  sed -i -e 's/tmp\.mount //' /sysroot/lib/systemd/system/loader.service
  exit

You will see and then nothing else

  Starting Anaconda version 16.14.

Change the file /lib/systemd/system/loader.service to start the loader like this

  ExecStart=/usr/bin/strace -f -e write=3,4,5 /sbin/loader

This will run strace on the loader and produce the following output

[pid   606] <... read resumed> "", 4096) = 0
[pid   606] --- {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=910, si_status=0, si_utime=0, si_stime=1} (Child exited) ---
[pid   606] read(11, "", 4096)          = 0
[pid   606] waitpid(910, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 910
[pid   606] close(9)                    = 0
[pid   606] close(11)                   = 0
[pid   606] access("/mnt/install/source/.treeinfo", R_OK) = -1 ENOENT (No such file or directory)
[pid   606] umount("/mnt/install/source", 0) = 0
[pid   606] time(NULL)                  = 1312818684
[pid   606] send(6, "<143>Aug  8 15:51:24 loader: in "..., 62, MSG_NOSIGNAL) = 62
[pid   606] write(4, "15:51:24,021 DEBUG loader: in do"..., 61) = 61
 | 00000  31 35 3a 35 31 3a 32 34  2c 30 32 31 20 44 45 42  15:51:24 ,021 DEB |
 | 00010  55 47 20 6c 6f 61 64 65  72 3a 20 69 6e 20 64 6f  UG loade r: in do |
 | 00020  4c 6f 61 64 65 72 4d 61  69 6e 2c 20 73 74 65 70  LoaderMa in, step |
 | 00030  20 3d 20 53 54 45 50 5f  4c 41 4e 47 0a            = STEP_ LANG.    |
[pid   606] select(3, [2], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid   606] write(1, "\33[2;21H\33[m\17\33[30m\33[47m\342\224\214\342\224\200\342\224\200\342\224"..., 1926) = 1926
[pid   606] select(1, [0], [], [], NULL

The loader gets as far as outputting the following statement

  15:06:02,581 DEBUG loader: in doLoaderMain, step = STEP_LANG.

However, I think that there is still unbuffered strace output that did not make it to the screen.

TODO:

Try to set up debugging with gdbserver

  gdbserver :9999 /sbin/loader

$HOME? (jreiser)

During install from boot.iso DVD, then /mnt/sysimage has a "home/<<me>>" directory which contains a copy of my user $HOME at the time I ran pungi, even though I ran pungi within "su; cd" which set $CWD to /root.

SQUASHFS errors (hamzy)

boot the 08/04/2011 iso image with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

When you are in the shell, you will see the following errors:

switch_root:/# grep -r 'Starting Anaconda' / 2>/dev/null
[  112.508993] SQUASHFS error: xz_dec_run error, data probably corrupt
[  112.509009] SQUASHFS error: squashfs_read_data failed to read block 0x6e32d5a
[  112.509021] SQUASHFS error: Unable to read data cache entry [6e32d5a]
[  112.509028] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
[  112.509146] SQUASHFS error: Unable to read data cache entry [6e32d5a]
[  112.509152] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
[  112.541370] SQUASHFS error: Unable to read data cache entry [6e32d5a]
[  112.541379] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
[  112.541506] SQUASHFS error: Unable to read data cache entry [6e32d5a]
[  112.541512] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4

or

bash-4.2# nslookup hamzy2.austin.ibm.com
[ 3082.076565] SQUASHFS error: xz_dec_run error, data probably corrupt
[ 3082.076587] SQUASHFS error: squashfs_read_data failed to read block 0x6e32d5a
[ 3082.076597] SQUASHFS error: Unable to read data cache entry [6e32d5a]
[ 3082.076604] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
Bus error

It seems that one of the blocks is corrupt?

missing files for networking (hamzy)

boot the 08/04/2011 iso image with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

When trying to bring up the networking, I noticed the following missing files:

bash-4.2# ifup eth0
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/sysconfig/network-scripts/ifup-ipv6: line 91: /sbin/sysctl: No such file or directory
/etc/sysconfig/network-scripts/ifup-ipv6: line 115: [: -ne: unary operator expected
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory

/bin/mountpoint and /sbin/sysctl are missing from the image.

networking does not work (hamzy)

boot the 08/04/2011 iso image with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

I am in the shell, I set up the networking by the following commands:

bash-4.2# cat << __EOF__ > /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="static"
DNS1="9.10.244.100"
GATEWAY="9.5.250.1"
HWADDR="2A:0F:4A:82:98:04"
ONBOOT="yes"
IPADDR="9.5.250.146"
IPV6INIT="yes"
NETMASK="255.255.255.0"
NM_CONTROLLED="yes"
TYPE="Ethernet"
__EOF__
bash-4.2# cat << __EOF__ > /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=bluebill
__EOF__
bash-4.2# ifup eth0

I then saw the following errors:

bash-4.2# ifup eth0
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/sysconfig/network-scripts/ifup-ipv6: line 91: /sbin/sysctl: No such file or directory
/etc/sysconfig/network-scripts/ifup-ipv6: line 115: [: -ne: unary operator expected
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory

However, I can bypass the mountpoint errors by setting the SYSTEMCTL_SKIP_REDIRECT environment variable

bash-4.2# SYSTEMCTL_SKIP_REDIRECT=yes ifup eth0
/etc/sysconfig/network-scripts/ifup-ipv6: line 91: /sbin/sysctl: No such file or directory
/etc/sysconfig/network-scripts/ifup-ipv6: line 115: [: -ne: unary operator expected

Unfortunately, while networking looks like it is up, the machine is unresponsive to the network (not pingable, not sshable).

bash-4.2# ifconfig
eth0      Link encap:Ethernet  HWaddr 2A:0F:4A:82:98:04
          inet addr:9.5.250.146  Bcast:9.5.250.255  Mask:255.255.255.0
          inet6 addr: fe80::280f:4aff:fe82:9804/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:2460 (2.4 KiB)
          Interrupt:20

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

bash-4.2# nslookup hamzy2.austin.ibm.com
[ 3082.076565] SQUASHFS error: xz_dec_run error, data probably corrupt
[ 3082.076587] SQUASHFS error: squashfs_read_data failed to read block 0x6e32d5a
[ 3082.076597] SQUASHFS error: Unable to read data cache entry [6e32d5a]
[ 3082.076604] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
Bus error

possible recursive locking detected (hamzy)

boot the 08/04/2011 iso image with the following options

  linux systemd.log_target=kmsg systemd.log_level=debug rd.break

Sometimes, you will see the following:

[    7.860083] =============================================
[    7.860110] [ INFO: possible recursive locking detected ]
[    7.860121] 3.0.0-1.fc16.ppc64 #1
[    7.860130] ---------------------------------------------
[    7.860140] udevd/288 is trying to acquire lock:
[    7.860152]  (&mm->mmap_sem){++++++}, at: [<c000000000044560>] .drop_cop+0x70/0x13c
[    7.860185]
[    7.860186] but task is already holding lock:
[    7.860199]  (&mm->mmap_sem){++++++}, at: [<c00000000069c3c8>] .do_page_fault+0x244/0x72c
[    7.860227]
[    7.860228] other info that might help us debug this:
[    7.860241]  Possible unsafe locking scenario:
[    7.860242]
[    7.860253]        CPU0
[    7.860261]        ----
[    7.860267]   lock(&mm->mmap_sem);
[    7.860283]   lock(&mm->mmap_sem);
[    7.860302]
[    7.860303]  *** DEADLOCK ***
[    7.860304]
[    7.860323]  May be due to missing lock nesting notation
[    7.860324]
[    7.860341] 1 lock held by udevd/288:
[    7.860351]  #0:  (&mm->mmap_sem){++++++}, at: [<c00000000069c3c8>] .do_page_fault+0x244/0x72c
[    7.860390]
[    7.860391] stack backtrace:
[    7.860401] Call Trace:
[    7.860412] [c00000027250aab0] [c000000000015764] .show_stack+0x94/0x144 (unreliable)
[    7.860435] [c00000027250ab70] [c0000000006a3098] .dump_stack+0x24/0x2c
[    7.860455] [c00000027250abf0] [c0000000000ee25c] .__lock_acquire+0x176c/0x17f4
[    7.860475] [c00000027250ad40] [c0000000000ee858] .lock_acquire+0x10c/0x128
[    7.860498] [c00000027250ae30] [c000000000698f34] .down_read+0x78/0x12c
[    7.860519] [c00000027250aed0] [c000000000044560] .drop_cop+0x70/0x13c
[    7.860539] [c00000027250af70] [c0000000000448b4] .destroy_context+0x2c/0x50
[    7.860558] [c00000027250b000] [c0000000000a88d4] .__mmdrop+0x50/0x70
[    7.860577] [c00000027250b090] [c000000000096c2c] .finish_task_switch+0x10c/0x15c
[    7.860600] [c00000027250b130] [c000000000696284] .schedule+0x9f8/0xaf0
[    7.860620] [c00000027250b3b0] [c000000000696428] .io_schedule+0xac/0x118
[    7.860646] [c00000027250b440] [c00000000016f550] .sleep_on_page+0x1c/0x2c
[    7.860668] [c00000027250b4c0] [c00000000069717c] .__wait_on_bit_lock+0x88/0x108
[    7.860688] [c00000027250b580] [c00000000016f528] .__lock_page+0x6c/0x78
[    7.860712] [c00000027250b650] [c00000000016f69c] .find_lock_page+0x7c/0xc4
[    7.860735] [c00000027250b6f0] [c0000000001881b0] .shmem_getpage+0x9c/0xad8
[    7.860755] [c00000027250b800] [c000000000188c44] .shmem_fault+0x58/0xb8
[    7.860775] [c00000027250b8b0] [c000000000192e10] .__do_fault+0x78/0x650
[    7.860795] [c00000027250b9c0] [c000000000196400] .handle_pte_fault+0x3cc/0x9b0
[    7.860815] [c00000027250bac0] [c000000000197a4c] .handle_mm_fault+0x15c/0x16c
[    7.860840] [c00000027250bb80] [c00000000069c628] .do_page_fault+0x4a4/0x72c
[    7.860860] [c00000027250be30] [c00000000000676c] handle_page_fault+0x20/0x74

ISO image unbootable (hamzy)

booting the 08/09/2011 iso image ( http://ppc.koji.fedoraproject.org/scratch/karsten/iso/Fedora-20110809-ppc64-DVD.iso ) receives the following error:

Welcome to yaboot version 1.3.16 (Red Hat 1.3.16-3.fc16)
Enter "help" to get some basic usage information
boot: linux
Please wait, loading kernel...
linux:0,/vmlinux: Unable to open file, Invalid device

notes/solutions (Kick_)

Try again with the 08/10/2011 images, please.

There was a bug in pungi that didn't add the PPC specific parameters to the mkisofs commandline. See bugzilla #729798 for the details

squashed 500MB rootfs saves 535KB (jreiser)

Squashing a 500MB ext4 rootfs saves 535KB and build time in contrast to a 2GB ext4.

 +++ b/src/pylorax/treebuilder.py
 -        fssize = 2 * (1024*1024*1024) # 2GB sparse file compresses down to nothin'
 +        fssize = 500 * (1024*1024)  # typ:380MB used

 -Filesystem size 121315.13 Kbytes (118.47 Mbytes)
 +Filesystem size 120779.49 Kbytes (117.95 Mbytes)

Search

User talk:Wwoods/Building images with treebuilder: Difference between revisions

Latest revision as of 19:59, 12 August 2011

Contents

Boot problem (jreiser)

notes/solutions (wwoods)

boot problem on a PowerMac G5

mount traps when systemd mounts /tmp (hamzy)

notes (wwoods)

Slow (jreiser)

commentary (wwoods)

comments (jreiser)

speedups

The anaconda installer starts but then hangs (hamzy)

$HOME? (jreiser)

SQUASHFS errors (hamzy)

missing files for networking (hamzy)

networking does not work (hamzy)

possible recursive locking detected (hamzy)

ISO image unbootable (hamzy)

notes/solutions (Kick_)

squashed 500MB rootfs saves 535KB (jreiser)

@@ Line 1: / Line 1: @@
-=== Boot problem (jreiser) ===
+== Boot problem (jreiser) ==
 [2011-08-06 0000 UTC] Upon boot of DVD, then dracut drops to shell with the message:
    Warning: No root device "live:/dev/disk/by-label/Fedora" found.
@@ Line 17: / Line 17: @@
 but I cannot find how to specify that in a "root=live..." that dracut understands.  It's also a mystery why so many mounts are necessary.
-=== Boot problem solutions ===
+=== notes/solutions (wwoods) ===
 # The ''DVD'' isn't built correctly - pungi changes the label of the image without changing the boot config file. Pungi will need to be fixed to handle that. (boot.iso should work fine.)
 #* I believe dracut expects spaces to be escaped, so it would need to be: <code>root=live:CDLABEL=Fedora%x2016%x20x86_64%x20DVD</code>
@@ Line 24: / Line 24: @@
 #* Yes, that does seem needlessly complicated
 #* No, there isn't a simpler way to do it right now
+== boot problem on a PowerMac G5 ==
+I've tried to boot with 'root=live:/dev/hda', but dracut bailed out with 'Warning: No root device "live:/dev/hda" found'
+I've also tried to boot with 'root=live:http://some-server/.../install.img', but dracut aborted with a similar error.
+== mount traps when systemd mounts /tmp (hamzy) ==
+boot the 08/04/2011 iso image with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+Seen during booting:
+<pre>
+...
+[   11.992445] systemd[1]: tmp.mount mount process exited, code=killed status=11
+[   12.029660] systemd[1]: Job loader.service/start failed with result 'dependency'.
+[   12.029695] systemd[1]: Unit tmp.mount entered failed state.
+...
+</pre>
+<pre>
+bash-4.2# systemctl status tmp.mount
+[   71.988204] systemd[1]: Accepted connection on private bus.
+[   71.988860] systemd[1]: Got D-Bus request: org.freedesktop.systemd1.Manager.LoadUnit() on /org/freedesktop/systemd1
+[   71.989077] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Properties.GetAll() on /org/freedesktop/systemd1/unit/tmp_2emount
+tmp.mount - Runtime Directory
+          Loaded: loaded (/lib/systemd/system/tmp.mount)
+          Active: inactive (dead)
+           Where: /tmp
+            What: tmpfs
+          CGroup: name=systemd:/system/tmp.mount
+[   71.990451] systemd[1]: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
+</pre>
+FIX:
+boot with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+run the following command in the shell
+   sed -i -e 's/tmp\.mount //' /sysroot/lib/systemd/system/loader.service
+=== notes (wwoods) ===
+* Only seems to happen on PPC64 systems
+* This workaround has been added to the treebuilder branch; should be fixed for images built on or after Aug. 9
+== Slow (jreiser) ==
+Treebuilder is slower.  The first culprit looks like dracut.
+{|
+!number!!execve
+|-
+|   7606||/bin/egrep
+|-
+|   2578||/bin/cp
+|-
+|   1633||/lib64/ld-linux-x86-64.so.2
+|-
+|   1434||/usr/bin/ldd
+|-
+|   1413||/bin/ln
+|-
+|   1303||/usr/local/bin/ln
+|-
+|   1303||/usr/bin/ln
+|-
+|   1087||/sbin/modinfo
+|}
+The second culprit is not honoring $TMPDIR, and not putting yumroot-$PID and installroot-$PID inside it.
+{|
+!count!!filename
+|-
+|1693845|| *
+|-
+| 253387||$DESTDIR/yumroot/...
+|-
+|  79977||$DESTDIR/installroot/...
+|-
+|  44486||/etc/xattr.conf
+|-
+|  41713||/proc/self/task/31818/attr/fscreate
+|-
+|  25919||/proc/self/task/1070/attr/fscreate
+|-
+|  20220||/usr
+|-
+|  19487||/etc/localtime
+|-
+|  18485||/lib64/libc.so.6
+|-
+|  18294||/usr/share
+|-
+|  18184||/etc/ld.so.cache
+|-
+|  18074||/etc/ld.so.preload
+|-
+|  17213||/usr/bin/ldd
+|-
+|  14471||/bin/egrep
+|-
+|   9861||/usr/share/locale
+|-
+|   9090||/var/log/dracut.log
+|-
+|   9013||/dev/null
+|-
+|   7529||/sbin/modprobe
+|-
+|   6110||.
+|}
+=== commentary (wwoods) ===
+* This is pretty useless without some hard numbers. On my years-old test systems, I can complete a run in the following times:
+{|
+! Build type !! Arch !! Time !! Notes
+|-
+| treebuilder || x86_64 || 19:15 || install.img: 125M
+|-
+| treebuilder (bcj=off) || x86_64 || 15:07 || install.img: 127M
+|-
+| treebuilder || ppc64 || 40:10 || install.img: 134M
+|-
+| treebuilder (bcj=off) || ppc64 || 37:43 || install.img: 134M
+|}
+(clearly I should turn bcj off by default on ppc64)
+=== comments (jreiser) ===
+I find the data useful.  The execve table points to dracut as a slow-poke.  Verified: dracut is using a derivative of buildinstall to fondle each file as slowly as possible, negating much of the speedup provided by lorax; estimated loss: a minute (5%).  The filename counts show that TMPDIR is not being honored.  Using a RAM-backed tmpfs such as /dev/shm saves two minutes (10%), but pungi refuses: https://bugzilla.redhat.com/show_bug.cgi?id=723251.  Also, dracut forces TMPDIR=/var/tmp which is another slowdown.
+My 3GHZ box with 8GB RAM builds -G -B -C (x86_64 including bcj) in 14m13s; adding -I brings the total to 16m17s.
+=== speedups ===
+* <code>bcj=off</code>: Add this to /etc/lorax/lorax.conf
+ [compression]
+ bcj=off
+== The anaconda installer starts but then hangs (hamzy) ==
+boot the 08/04/2011 iso image with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+run the following command in the shell
+   sed -i -e 's/tmp\.mount //' /sysroot/lib/systemd/system/loader.service
+   exit
+You will see and then nothing else
+   Starting Anaconda version 16.14.
+Change the file /lib/systemd/system/loader.service to start the loader like this
+   ExecStart=/usr/bin/strace -f -e write=3,4,5 /sbin/loader
+This will run strace on the loader and produce the following output
+<pre>
+[pid   606] <... read resumed> "", 4096) = 0
+[pid   606] --- {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=910, si_status=0, si_utime=0, si_stime=1} (Child exited) ---
+[pid   606] read(11, "", 4096)          = 0
+[pid   606] waitpid(910, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 910
+[pid   606] close(9)                    = 0
+[pid   606] close(11)                   = 0
+[pid   606] access("/mnt/install/source/.treeinfo", R_OK) = -1 ENOENT (No such file or directory)
+[pid   606] umount("/mnt/install/source", 0) = 0
+[pid   606] time(NULL)                  = 1312818684
+[pid   606] send(6, "<143>Aug  8 15:51:24 loader: in "..., 62, MSG_NOSIGNAL) = 62
+[pid   606] write(4, "15:51:24,021 DEBUG loader: in do"..., 61) = 61
+ | 00000  31 35 3a 35 31 3a 32 34  2c 30 32 31 20 44 45 42  15:51:24 ,021 DEB |
+ | 00010  55 47 20 6c 6f 61 64 65  72 3a 20 69 6e 20 64 6f  UG loade r: in do |
+ | 00020  4c 6f 61 64 65 72 4d 61  69 6e 2c 20 73 74 65 70  LoaderMa in, step |
+ | 00030  20 3d 20 53 54 45 50 5f  4c 41 4e 47 0a            = STEP_ LANG.    |
+[pid   606] select(3, [2], NULL, NULL, {0, 0}) = 0 (Timeout)
+[pid   606] write(1, "\33[2;21H\33[m\17\33[30m\33[47m\342\224\214\342\224\200\342\224\200\342\224"..., 1926) = 1926
+[pid   606] select(1, [0], [], [], NULL
+</pre>
+The loader gets as far as outputting the following statement
+:06:02,581 DEBUG loader: in doLoaderMain, step = STEP_LANG.
+However, I think that there is still unbuffered strace output that did not make it to the screen.
+TODO:
+Try to set up debugging with gdbserver
+   gdbserver :9999 /sbin/loader
+== $HOME? (jreiser) ==
+During install from boot.iso DVD, then /mnt/sysimage has a "home/<<me>>" directory which contains a copy of my user $HOME at the time I ran pungi, even though I ran pungi within "su; cd" which set $CWD to /root.
+== SQUASHFS errors (hamzy) ==
+boot the 08/04/2011 iso image with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+When you are in the shell, you will see the following errors:
+<pre>
+switch_root:/# grep -r 'Starting Anaconda' / 2>/dev/null
+[  112.508993] SQUASHFS error: xz_dec_run error, data probably corrupt
+[  112.509009] SQUASHFS error: squashfs_read_data failed to read block 0x6e32d5a
+[  112.509021] SQUASHFS error: Unable to read data cache entry [6e32d5a]
+[  112.509028] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
+[  112.509146] SQUASHFS error: Unable to read data cache entry [6e32d5a]
+[  112.509152] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
+[  112.541370] SQUASHFS error: Unable to read data cache entry [6e32d5a]
+[  112.541379] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
+[  112.541506] SQUASHFS error: Unable to read data cache entry [6e32d5a]
+[  112.541512] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
+</pre>
+or
+<pre>
+bash-4.2# nslookup hamzy2.austin.ibm.com
+[ 3082.076565] SQUASHFS error: xz_dec_run error, data probably corrupt
+[ 3082.076587] SQUASHFS error: squashfs_read_data failed to read block 0x6e32d5a
+[ 3082.076597] SQUASHFS error: Unable to read data cache entry [6e32d5a]
+[ 3082.076604] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
+Bus error
+</pre>
+It seems that one of the blocks is corrupt?
+== missing files for networking (hamzy) ==
+boot the 08/04/2011 iso image with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+When trying to bring up the networking, I noticed the following missing files:
+<pre>
+bash-4.2# ifup eth0
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/sysconfig/network-scripts/ifup-ipv6: line 91: /sbin/sysctl: No such file or directory
+/etc/sysconfig/network-scripts/ifup-ipv6: line 115: [: -ne: unary operator expected
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+</pre>
+/bin/mountpoint and /sbin/sysctl are missing from the image.
+== networking does not work (hamzy) ==
+boot the 08/04/2011 iso image with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+I am in the shell, I set up the networking by the following commands:
+<pre>
+bash-4.2# cat << __EOF__ > /etc/sysconfig/network-scripts/ifcfg-eth0
+DEVICE="eth0"
+BOOTPROTO="static"
+DNS1="9.10.244.100"
+GATEWAY="9.5.250.1"
+HWADDR="2A:0F:4A:82:98:04"
+ONBOOT="yes"
+IPADDR="9.5.250.146"
+IPV6INIT="yes"
+NETMASK="255.255.255.0"
+NM_CONTROLLED="yes"
+TYPE="Ethernet"
+__EOF__
+bash-4.2# cat << __EOF__ > /etc/sysconfig/network
+NETWORKING=yes
+HOSTNAME=bluebill
+__EOF__
+bash-4.2# ifup eth0
+</pre>
+I then saw the following errors:
+<pre>
+bash-4.2# ifup eth0
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/sysconfig/network-scripts/ifup-ipv6: line 91: /sbin/sysctl: No such file or directory
+/etc/sysconfig/network-scripts/ifup-ipv6: line 115: [: -ne: unary operator expected
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+/etc/init.d/functions: line 17: /bin/mountpoint: No such file or directory
+</pre>
+However, I can bypass the mountpoint errors by setting the SYSTEMCTL_SKIP_REDIRECT environment variable
+<pre>
+bash-4.2# SYSTEMCTL_SKIP_REDIRECT=yes ifup eth0
+/etc/sysconfig/network-scripts/ifup-ipv6: line 91: /sbin/sysctl: No such file or directory
+/etc/sysconfig/network-scripts/ifup-ipv6: line 115: [: -ne: unary operator expected
+</pre>
+Unfortunately, while networking looks like it is up, the machine is unresponsive to the network (not pingable, not sshable).
+<pre>
+bash-4.2# ifconfig
+eth0      Link encap:Ethernet  HWaddr 2A:0F:4A:82:98:04
+          inet addr:9.5.250.146  Bcast:9.5.250.255  Mask:255.255.255.0
+          inet6 addr: fe80::280f:4aff:fe82:9804/64 Scope:Link
+          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
+          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
+          collisions:0 txqueuelen:1000
+          RX bytes:0 (0.0 b)  TX bytes:2460 (2.4 KiB)
+          Interrupt:20
+lo        Link encap:Local Loopback
+          inet addr:127.0.0.1  Mask:255.0.0.0
+          inet6 addr: ::1/128 Scope:Host
+          UP LOOPBACK RUNNING  MTU:16436  Metric:1
+          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
+          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
+          collisions:0 txqueuelen:0
+          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
+bash-4.2# nslookup hamzy2.austin.ibm.com
+[ 3082.076565] SQUASHFS error: xz_dec_run error, data probably corrupt
+[ 3082.076587] SQUASHFS error: squashfs_read_data failed to read block 0x6e32d5a
+[ 3082.076597] SQUASHFS error: Unable to read data cache entry [6e32d5a]
+[ 3082.076604] SQUASHFS error: Unable to read page, block 6e32d5a, size 8ec4
+Bus error
+</pre>
+== possible recursive locking detected (hamzy) ==
+boot the 08/04/2011 iso image with the following options
+   linux systemd.log_target=kmsg systemd.log_level=debug rd.break
+Sometimes, you will see the following:
+<pre>
+[    7.860083] =============================================
+[    7.860110] [ INFO: possible recursive locking detected ]
+[    7.860121] 3.0.0-1.fc16.ppc64 #1
+[    7.860130] ---------------------------------------------
+[    7.860140] udevd/288 is trying to acquire lock:
+[    7.860152]  (&mm->mmap_sem){++++++}, at: [<c000000000044560>] .drop_cop+0x70/0x13c
+[    7.860185]
+[    7.860186] but task is already holding lock:
+[    7.860199]  (&mm->mmap_sem){++++++}, at: [<c00000000069c3c8>] .do_page_fault+0x244/0x72c
+[    7.860227]
+[    7.860228] other info that might help us debug this:
+[    7.860241]  Possible unsafe locking scenario:
+[    7.860242]
+[    7.860253]        CPU0
+[    7.860261]        ----
+[    7.860267]   lock(&mm->mmap_sem);
+[    7.860283]   lock(&mm->mmap_sem);
+[    7.860302]
+[    7.860303]  *** DEADLOCK ***
+[    7.860304]
+[    7.860323]  May be due to missing lock nesting notation
+[    7.860324]
+[    7.860341] 1 lock held by udevd/288:
+[    7.860351]  #0:  (&mm->mmap_sem){++++++}, at: [<c00000000069c3c8>] .do_page_fault+0x244/0x72c
+[    7.860390]
+[    7.860391] stack backtrace:
+[    7.860401] Call Trace:
+[    7.860412] [c00000027250aab0] [c000000000015764] .show_stack+0x94/0x144 (unreliable)
+[    7.860435] [c00000027250ab70] [c0000000006a3098] .dump_stack+0x24/0x2c
+[    7.860455] [c00000027250abf0] [c0000000000ee25c] .__lock_acquire+0x176c/0x17f4
+[    7.860475] [c00000027250ad40] [c0000000000ee858] .lock_acquire+0x10c/0x128
+[    7.860498] [c00000027250ae30] [c000000000698f34] .down_read+0x78/0x12c
+[    7.860519] [c00000027250aed0] [c000000000044560] .drop_cop+0x70/0x13c
+[    7.860539] [c00000027250af70] [c0000000000448b4] .destroy_context+0x2c/0x50
+[    7.860558] [c00000027250b000] [c0000000000a88d4] .__mmdrop+0x50/0x70
+[    7.860577] [c00000027250b090] [c000000000096c2c] .finish_task_switch+0x10c/0x15c
+[    7.860600] [c00000027250b130] [c000000000696284] .schedule+0x9f8/0xaf0
+[    7.860620] [c00000027250b3b0] [c000000000696428] .io_schedule+0xac/0x118
+[    7.860646] [c00000027250b440] [c00000000016f550] .sleep_on_page+0x1c/0x2c
+[    7.860668] [c00000027250b4c0] [c00000000069717c] .__wait_on_bit_lock+0x88/0x108
+[    7.860688] [c00000027250b580] [c00000000016f528] .__lock_page+0x6c/0x78
+[    7.860712] [c00000027250b650] [c00000000016f69c] .find_lock_page+0x7c/0xc4
+[    7.860735] [c00000027250b6f0] [c0000000001881b0] .shmem_getpage+0x9c/0xad8
+[    7.860755] [c00000027250b800] [c000000000188c44] .shmem_fault+0x58/0xb8
+[    7.860775] [c00000027250b8b0] [c000000000192e10] .__do_fault+0x78/0x650
+[    7.860795] [c00000027250b9c0] [c000000000196400] .handle_pte_fault+0x3cc/0x9b0
+[    7.860815] [c00000027250bac0] [c000000000197a4c] .handle_mm_fault+0x15c/0x16c
+[    7.860840] [c00000027250bb80] [c00000000069c628] .do_page_fault+0x4a4/0x72c
+[    7.860860] [c00000027250be30] [c00000000000676c] handle_page_fault+0x20/0x74
+</pre>
+== ISO image unbootable (hamzy) ==
+booting the 08/09/2011 iso image ( http://ppc.koji.fedoraproject.org/scratch/karsten/iso/Fedora-20110809-ppc64-DVD.iso ) receives the following error:
+<pre>
+Welcome to yaboot version 1.3.16 (Red Hat 1.3.16-3.fc16)
+Enter "help" to get some basic usage information
+boot: linux
+Please wait, loading kernel...
+linux:0,/vmlinux: Unable to open file, Invalid device
+</pre>
+=== notes/solutions (Kick_) ===
+Try again with the 08/10/2011 images, please.
+There was a bug in pungi that didn't add the PPC specific parameters to the mkisofs commandline. See bugzilla #729798 for the details
+== squashed 500MB rootfs saves 535KB (jreiser) ==
+Squashing a 500MB ext4 rootfs saves 535KB and build time in contrast to a 2GB ext4.
+  +++ b/src/pylorax/treebuilder.py
+  -        fssize = 2 * (1024*1024*1024) # 2GB sparse file compresses down to nothin'
+  +        fssize = 500 * (1024*1024)  # typ:380MB used
+  -Filesystem size 121315.13 Kbytes (118.47 Mbytes)
+  +Filesystem size 120779.49 Kbytes (117.95 Mbytes)