How to debug installation problems
Revision as of 16:06, 23 October 2009
Narrowing it Down
Before doing anything else, please check whether you're testing the latest version of anaconda - this goes for all software products. If you're running into a problem in F10 while we're developing F12, a bug report will not be very helpful and the problem may even be fixed already. If you encounter a bug in a version that is not the latest anaconda, please locate the newest one and try to reproduce the bug there.
To check which version of anaconda you're running, you need only watch the screen when anaconda starts up. It will print something like
Greetings. anaconda installer init version 220.127.116.11 starting
And that's your version of anaconda!
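If you are capturing the installer output (for example over a serial console), the version can be pulled out of that greeting line automatically. This is only a minimal sketch: the function name is made up for illustration, and it assumes the greeting has exactly the wording shown above.

```python
import re

# Assumption: the greeting always reads
# "anaconda installer init version <version> starting".
_GREETING = re.compile(r"anaconda installer init version (\S+) starting")


def anaconda_version(line):
    """Return the version string from a greeting line, or None if absent."""
    match = _GREETING.search(line)
    return match.group(1) if match else None
```

Lines that do not contain the greeting yield None, so the function can be run over a whole captured log one line at a time.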
The other useful number is the tree. If you're using an official release, all we need is which Fedora you're installing. If, on the other hand, you're using the development tree (Rawhide), there are two ways to discover which tree you're using. Often the installation source will have a date in the path, which will tell us which build to look at. If your install source does not have a date, its .treeinfo file should contain a timestamp and a version number.
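If you have a copy of the .treeinfo file, a few lines of Python can read the two values out of it. This is a sketch under assumptions: it treats the file as INI-style text with a [general] section holding "timestamp" and "version" keys, and the sample contents below are invented for illustration, not taken from a real tree.

```python
import configparser


def read_tree_info(text):
    """Return (timestamp, version) from the text of a .treeinfo file."""
    # interpolation=None keeps any literal "%" characters from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Assumption: both values live in a [general] section.
    general = parser["general"]
    return general.get("timestamp"), general.get("version")


# Hypothetical file contents, for illustration only.
sample = """\
[general]
family = Fedora
timestamp = 1256313600.0
version = 12
arch = x86_64
"""

timestamp, version = read_tree_info(sample)
```

Both values come back as strings exactly as written in the file, which is the form to paste into the bug report.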
If you cannot remember what you were using when you encountered the bug, please try again with the newest one. If you don't, we will just have to ask you to do so, and skipping that round trip saves everyone time.
If you are unwilling to try the newest rawhide to see if the bug is still present, especially if the rawhide you were using is an old one, your bug will probably be closed as INSUFFICIENT_DATA.
So, once you've seen a bug and determined it's still present in the latest version, it's time to open a bugzilla.
Types of Bugs
There are several types of bugs you might encounter. The four listed here are the most common, and are generally easy to identify.
If your bug looks like this, you've encountered a problem in the first stage of anaconda, known as the loader. It is written in C, and what you are seeing is a dump of the hex addresses of functions. The numbers don't mean anything to you, but they tell us exactly where the bug is occurring and how we got there.
These numbers change every time anaconda or one of the components it depends upon is rebuilt, as the locations of functions in the compiled programs change. In order to diagnose the problem, we will need accurate version numbers as described in the first section, as well as the exact numbers shown on the screen. A common way of getting the numbers without risking a typo is to take a picture of the screen with a camera, and then attach the photo to a bug report.
Without all of those numbers, we will be completely unable to fix the bug, so please make sure you get all of them!
If your install just stops doing anything, you may have encountered a kernel bug. Try pressing Ctrl+Alt+F4. This should take you over to the fourth virtual console (tty4) unless things are seriously screwed up. If the switch does not work, you have very likely encountered a kernel bug. This report should be filed against the kernel, not against anaconda. The kernel developers will likely want a picture of what's on screen, as well as a list of the steps you took before the problem occurred. However, filing kernel bugs is outside the scope of this document.
When anaconda transitions from stage 1 to stage 2, X will sometimes fail to start. Sometimes this means you are given the text mode interface instead of the graphical one, and sometimes all you get is a black screen. These bugs should be filed against xorg-x11, with /var/X.log attached.
Bugs that manifest like the above image are bugs in the second stage of anaconda, or Stage 2. These are easier to file bugs against, thanks to python-meh. What you're seeing is similar to the loader stack trace, but much more informative. You're seeing a Python exception report.
Whenever a traceback happens, a file named anacdump.txt will be created containing the tracebacks as well as the other log files. If you choose to automatically file a bug report, this file will be attached to your bug. This is often the key piece of information we need to fix the bug, but don't go away yet! Some bugs are weird, and it's always best to get some other information from you in a comment.
Hardware Errors are a bit of a mixed bag. Sometimes they can manifest like kernel bugs, and sometimes they can look like tracebacks.
If you see any lines in your traceback that say "I/O Error", you probably have bad media somewhere (hard drive, CD/DVD, and so on). It would be helpful if you could verify that all of your media is fine before filing the bug; if you don't, that will be the first thing we point out and ask about.
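A quick way to check a saved traceback for such lines is a small scan like the one below. It is only a sketch: the function name is hypothetical, and matching "I/O error" case-insensitively is an assumption about how the message is spelled in your log.

```python
def find_io_errors(lines):
    """Return (line_number, text) pairs for lines mentioning an I/O error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        # Assumption: the message contains "I/O error" in some capitalization.
        if "i/o error" in line.lower():
            hits.append((number, line.rstrip("\n")))
    return hits
```

Because an open file yields its lines one by one, a saved dump can be passed in directly, for example `find_io_errors(open("anacdump.txt"))`. An empty result does not prove the media is good; it only means no line matched.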
Other Potentially Useful Information
If you ran into a traceback (and you have a bugzilla account and networking was working...), you're rather lucky. A bug report with your anacdump.txt has been created, and sometimes, that's all we need. But adding more information in a comment can help get your bug fixed faster, or help us reproduce it on-site. If you don't provide us with that information, we're going to have to rely on you to retest the bug for us.
If you got a traceback but cannot use automated bug filing, you can collect the file /tmp/anacdump.txt from tty2 (to switch to tty2, press Ctrl+Alt+F2) and attach it to the bug report.
If you didn't get a traceback, this section lists information you might want to include when you file the report.
Anaconda writes several log files which contain valuable information about what the installer is/was doing.
If the file /tmp/anacdump.txt exists, attach it to your bug report. You may ignore the remainder of the "Log Files" section.
There is a quick way to obtain a single file that includes the content of all of the most important log files (
- switch to
- run the command
killall -USR2 anaconda
The following log files, all of which are included in anacdump.txt, are useful in the event that a failure has occurred:
If the failure occurred during package selection or package installation, the following additional files can be very useful (if present):
Letting us know what hardware you're on can be informative. Don't worry, though, we don't need serial numbers of every piece of metal in your machine. If your bug appears to be with networking, the network card you're using is probably the only thing we need. If you're having X troubles, let us know what video card you're using. If the bug appears in partitioning, tell us what disks and how many.
If you're installing with kickstart, please include the kickstart file. Make sure to scrub it of any passwords first, though!
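Scrubbing by hand is easy to get wrong, so a small script can help. The sketch below is not a complete kickstart parser: it assumes secrets appear either as the argument of a rootpw line or as the value of a --password or --passphrase option, and the function name is made up for illustration. Quoted passwords containing spaces are not handled, so read the result over before attaching it.

```python
import re

# Assumption: secrets follow "--password" or "--passphrase", separated by "="
# or a single space, and contain no whitespace.
_SECRET_OPTION = re.compile(r"(--(?:password|passphrase)[= ])(\S+)")


def scrub_kickstart(text):
    """Return the kickstart text with likely passwords replaced by REDACTED."""
    cleaned = []
    for line in text.splitlines():
        words = line.split()
        if words and words[0] == "rootpw":
            # Keep flags such as --iscrypted, drop the password itself.
            options = [w for w in words[1:] if w.startswith("--")]
            line = " ".join(["rootpw"] + options + ["REDACTED"])
        else:
            line = _SECRET_OPTION.sub(r"\1REDACTED", line)
        cleaned.append(line)
    return "\n".join(cleaned)
```

Lines without a recognized secret pass through unchanged, so the rest of the kickstart file stays useful for reproducing the bug.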
Always, always useful are the steps you took to cause the problem. It's most useful because it allows us to reproduce the bug and test fixes in our own lab, without leaning on you to test for us. It can also help us connect bugs which might not otherwise appear related.
It is rare that language and keyboard are the defining factor for whether your install crashes, but it has been known to happen so please don't leave these out!
If you have any USB or Firewire devices attached to the computer during the install, please include that information. Also include whether you plugged or unplugged any of them at any point during the install. (If you're wondering, it's a bad idea! Don't do it!)
If your bug is not a crypto-luks error and you were using encryption, you might want to include that. If it is a crypto-luks error and you were not using encryption, definitely let us know!
If you got as far as partitioning, please let us know what your layout was if it was a custom one. You should also include whether the drive or drives contained previous information that was being wiped, what was on it, and whether there are any other operating systems installed on the drive(s).
Writing the Report
So, now that you've thought about what information you might want to include, it is time to actually write your report (or for tracebackers, to add a comment). You should go to  to file a new bug.
Work through the filing pages for the correct product (RHEL versus Fedora) you encountered the bug in until you arrive at the information screen. There are a couple of crucial fields to fill out, and some that should be left alone.
The component must be set to anaconda, or we won't see the bug unless someone else happens to notice and sends it to us. The version should be the release you used; if you installed with Rawhide, just select that one. You can narrow down the platform to the arch you tried, if you wish, but it's not crucial.
Severity and Priority should not be changed. They are used by developers and people in QA, and should not be touched by reporters. Yes, even if you think your bug is the worst bug ever.
After the top section, scroll down until you see the "Summary" field. This will be the title of your bug, so please try to make it useful. "anaconda crash" and "Bad: Install Failed" are completely unhelpful as titles, because they tell us nothing about what's going on. Please try to include keywords important to your bug, such as "lvm" or "grub". You can also put in the error message you received.
Note: If your bug is a UI bug, feel free to use a descriptive title. "Use of F12 is inconsistent in the installer" is a great title.
The last thing you need to enter is the description, and there are different ways to do this.
To Form or Not to Form?
By default, new bug reports include in the description section a standard format for your report. You can use this, if you wish, but it is not required. If you choose not to use the format, please delete it!
If you choose to follow the form, your information should be divided up as it requests. The first part asks for a description of the problem, followed by the version number and then how reproducible the bug is. If you haven't tried to reproduce the bug, please say so instead of guessing.
Then the steps to reproduce - as was said above, we really like getting these. The more specific and detailed you can be about what you did, the better. For example, if you're getting a bug after making a custom partition layout, "Made custom layout" is not helpful at all, "Made custom layout that looked like this" is pretty helpful, and "created /boot on ext3 with 200M space, created a 200M RAID and deleted it, created 4 RAIDs with remaining space, combined them all into a RAID10 and mounted encrypted /" is fantastic. Sometimes we get bugs that only appear when you create then delete a partition, and there would be no way to diagnose that problem from the first two examples.
Actual results are obviously what happened, and expected results are what was expected to happen - generally, for anaconda not to crash.
In additional info, feel free to include any of the information described under "Other Potentially Useful Information".
It is really important that you be as specific as possible in your terminology. An upgrade is not the same thing as an install, and using the two terms interchangeably will only cause us to tear our hair out in frustration.
Things you should not confuse:
- upgrade vs install
- LV vs PV vs VG
- dmraid vs mdraid
- The order of your steps
We know. You just wanted to install, and something went wrong. Maybe you didn't get the new system, or maybe your old system got wiped.
But no matter what, being short or snarky with us will not help. We don't enjoy dealing with people who are going to be rude, and when we have so many bugs we have to deal with, we're likely to prioritize ones where the reporters remember that we're human too... and only human.
We do not make a point of wasting your time, so please don't waste ours. If we ask for a piece of information, it's because we need it to help figure out or fix the bug you've seen. Please don't try to tell us that you don't have it anymore and aren't willing to try another install to get it, but that we should be able to figure out your problem from the information you provided. If we could, we wouldn't have asked.
At the same time, we are swamped with bugs, and some do get overlooked. If it has been a while since you heard anything about your bug, feel free to comment and ask what the status is. However, please don't say things like "I guess I now know how much the anaconda developers care about fixing bugs!" It's just petty.
What Happens Next?
Once you've filed your bug report, you get to wait. How exciting! There's no way to tell how long your bug will be open - some are fixed in hours and are in rawhide the following day, and some (well, one) are waiting 10 years because somebody else needs to support something before we can.
When we get to your bug, we may ask you for more information, to run a command and give us the output, or to try an update image. It's most helpful if your computer is still available for testing, since even with steps to reproduce there's no guarantee our hardware will show the bug. Reporting a bug and then immediately installing something else means your bug will most likely not get fixed.
It's worth noting, however, that up until partitioning nothing that you do is permanent. You can run anaconda all you want and test everything up through the partitioning screen, so if your bug happened before then you won't be risking your system by trying another install.
Please, don't assume that somebody else will file a bug if you don't. Rawhide doesn't have a very large testing base, so while bugs that affect all installs will be filed, ones that happen semi-rarely might not be. If that happens, the bugs will make it into an official release and will be unleashed upon a large group of users. It's so much better for everyone if bugs can be reported before, rather than after, they make it into a release.