Xorg/Input Triage Algorithm

From FedoraProject


Latest revision as of 23:24, 17 March 2010

General overview

For Fedora 13 (with xorg-x11-server-1.7.99.901-1) we disabled HAL configuration, so a fair number of configuration problems may come from that. The fdi files are now ignored, so their settings need to be moved. We recommend putting them into the xorg.conf as InputClass sections (so be careful when recommending that users delete the xorg.conf for other bugs). This has to be done manually; there is no automatic conversion tool.
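
As a rough illustration of what a ported setting might look like, the sketch below writes an example InputClass section to a scratch file so it can be reviewed before being copied into the xorg.conf by hand. The function name, the identifier, the match rule and the layout value are all made up for this example; a real section has to carry whatever the user's fdi file configured.

```shell
#!/bin/sh
# Sketch only: write an example InputClass section to a scratch file.
# write_inputclass_example is a hypothetical helper; the identifier, match
# rule and layout below are illustrative assumptions, not a recommended default.
write_inputclass_example() {
    # $1: output file to review before copying its contents into xorg.conf
    cat > "$1" <<'EOF'
Section "InputClass"
    Identifier "example keyboard layout"
    MatchIsKeyboard "on"
    Option "XkbLayout" "de"
EndSection
EOF
}
```

Calling `write_inputclass_example /tmp/inputclass-example.conf` leaves the section in that file; nothing is installed automatically, which matches the manual conversion described above.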

Most bugs filed against evdev aren’t actually evdev bugs; real evdev bugs tend to get caught before the releases. The reports are either configuration issues (HAL, usually) or server bugs. Especially when it comes to keyboard layouts etc., it’s nearly always the server.

Also, there are almost no xorg-x11-drv-{mouse,keyboard} bugs. A user has to have AutoAddDevices "off" or AllowEmptyInput "off" in the config file to get these drivers, so nobody without an xorg.conf will use them.

Real evdev bugs are usually cases where something degenerates over time, where the pointer jumps after pressing a button, things like that, or where the server says “don’t know how to use device” (in Xorg.*.log). In some cases the hardware is partially broken (usually a kernel issue). For those, usually bugs with mice and touchpads, ask for the evtest output of the device. If the bug crashes the server, ask for the evtest-capture output for the device so the crash can be reproduced.

Therefore most input device bugs belong in the xorg-x11-server component, assigned to Peter.

Log files

What log files are needed for bug triaging:

  • /var/log/Xorg.0.log (or Xorg.0.log.old after a crash, since it contains the backtrace) is almost always needed. Please attach it whole and uncompressed. Note that the number depends on which $DISPLAY the server starts on and may change after a restart (/var/log/Xorg.1.log, etc.). If you are not sure which file is the right one, check the file timestamps.
  • xorg.conf if there is one
  • Fedora 13 or later: custom xorg.conf.d snippets if there are any
  • Fedora 12 or earlier: output from lshal
  • For keyboard layout issues: the output of "xkbcomp -xkb $DISPLAY -"
  • For any device issues: evtest against the device file
  • For any crashes or misbehaviour after a certain series of (hardware) actions: evtest-capture against the device file
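
The list above could be gathered with a small helper along the following lines. This is a sketch under assumptions, not an official tool: the function names and the output directory are hypothetical, and only the log, xorg.conf and xkbcomp items are covered. The evtest items still have to be run by hand against the right device file.

```shell
#!/bin/sh
# newest_xorg_log DIR: print the most recently modified Xorg.N.log in DIR.
# (The glob does not match Xorg.N.log.old; that file is handled separately.)
newest_xorg_log() {
    # ls -t sorts by modification time, newest first
    ls -t "$1"/Xorg.*.log 2>/dev/null | head -n 1
}

# collect_input_logs OUTDIR: copy the usual attachments into OUTDIR.
# Hypothetical helper; adjust the paths if the system differs.
collect_input_logs() {
    out=$1
    mkdir -p "$out" || return 1
    log=$(newest_xorg_log /var/log)
    if [ -n "$log" ]; then cp "$log" "$out"/; fi
    # The .old file holds the backtrace after a crash
    if [ -n "$log" ] && [ -f "$log.old" ]; then cp "$log.old" "$out"/; fi
    if [ -f /etc/X11/xorg.conf ]; then cp /etc/X11/xorg.conf "$out"/; fi
    # Keyboard layout dump; only meaningful from inside a running X session
    if [ -n "$DISPLAY" ]; then
        xkbcomp -xkb "$DISPLAY" - > "$out/xkb.txt" 2>/dev/null
    fi
    return 0
}
```

For example, `collect_input_logs ~/xorg-bug` leaves the files in one directory, ready to be attached uncompressed.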

Keyboard layouts

  • Some modifier keys stop working, some keys don't repeat, etc. => xorg-x11-server. This must have a simple, reproducible test case; it's near-impossible to debug otherwise. Reports like "do something for 20 minutes and then it stops working" aren't enough.
  • A keyboard layout works, but one or two keys produce the wrong symbols => xkeyboard-config. These should usually be reported upstream; Peter hands decisions on layout changes to the upstream maintainer.
  • The user-selected layout does not apply => configuration issue, gnome-settings-daemon, gdm or xorg-x11-server.
    • Does the layout from /etc/sysconfig/keyboard show up in the Xorg.log file (only for keyboard devices; devices like the Power Button or Video Bus stay on 'us')? no? => system-setup-keyboard
    • Do user-configured options (in custom .fdi files) show up in the Xorg.log file? no? => HAL configuration needs porting to xorg.conf format.
    • Do user-configured options (in xorg.conf/xorg.conf.d) show up in the Xorg.log file? no? => xorg-x11-server
    • Does the layout work in a plain X session (yum install xterm && sudo init 3 && xinit --)? no? => xorg-x11-server
    • Does the layout work in a plain X session but stop working after login? => gnome-settings-daemon or possibly gdm
  • Arrow keys broken (up == printscreen): this is caused by an xfree86 (or "base") ruleset being applied to an evdev device. The cause is almost always a stray AllowEmptyInput "off" in the config (which is nearly always a misconfiguration). Removing the option fixes it.
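
Two of the checks above lend themselves to a quick grep. The sketch below assumes the server logs the applied layout on a line containing "xkb_layout" with the layout name in double quotes; the exact log format varies between server versions, so treat a miss as a hint to read the log by hand, not as proof. Both function names are hypothetical.

```shell
#!/bin/sh
# layout_in_log LAYOUT LOGFILE: succeed if LOGFILE has an xkb_layout line
# that names LAYOUT in double quotes (assumed log format, see above).
layout_in_log() {
    grep -i 'xkb_layout' "$2" | grep -q "\"$1\""
}

# has_allow_empty_input CONF: succeed if CONF mentions AllowEmptyInput
# on a line that is not commented out.
has_allow_empty_input() {
    grep -v '^[[:space:]]*#' "$1" | grep -qi 'AllowEmptyInput'
}
```

A typical use would be `layout_in_log de /var/log/Xorg.0.log || echo "layout not applied"`, followed by `has_allow_empty_input /etc/X11/xorg.conf` when the arrow keys misbehave.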

How to spread bugs between evdev, -mouse or -keyboard and kernel

  1. Is the device listed in /proc/bus/input/devices? no? -> kernel
  2. Does the device match any InputClass sections in /etc/xorg.conf.d/? no? -> configuration issue
  3. Are there user-configured options in the xorg.conf or /etc/xorg.conf.d that are not being merged? -> configuration issue or xorg-x11-server
  4. If the Xorg.log lists the device when it appears and says “don’t know how to use device”, the device is not detected by evdev. This shouldn’t happen with F11 or later (including rawhide), but can still happen with F10. If it happens with F11 or later, it’s a bug that needs to be fixed upstream, and in that case Peter always needs the output from http://people.freedesktop.org/~whot/evtest. He uses the evtest output to write software test devices that simulate the hardware. The repository for those is at git://people.freedesktop.org/~whot/testdevices.git; if the user can program, it’s quite simple to add new devices (but of course we should ask them first and not push ordinary users into programming).

If the device doesn’t send events or doesn’t behave properly, always check the evtest output too. If that output is broken, it’s a kernel or hardware issue. If evtest looks normal but the pointer jumps under X, it’s a server issue, often pointer acceleration or scaling.
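
Steps 1 and 4 can be roughed out as a script. This is a minimal sketch, assuming the reporter knows (part of) the device name as the kernel reports it; the function names are invented here, and the configuration steps 2 and 3 still need a human to read the log.

```shell
#!/bin/sh
# kernel_sees_device NAME [FILE]: succeed if NAME appears in an "N: Name="
# line of FILE (default: /proc/bus/input/devices).
kernel_sees_device() {
    grep -F 'N: Name=' "${2:-/proc/bus/input/devices}" | grep -qiF "$1"
}

# evdev_rejected_device LOGFILE: succeed if the log carries the
# "don't know how to use device" message (matched case-insensitively).
evdev_rejected_device() {
    grep -qi "know how to use device" "$1"
}

# triage_device NAME LOGFILE [DEVICES_FILE]: print a suggested next stop.
triage_device() {
    if ! kernel_sees_device "$1" "${3:-/proc/bus/input/devices}"; then
        echo "kernel"
    elif evdev_rejected_device "$2"; then
        echo "evdev"
    else
        echo "check configuration, then xorg-x11-server"
    fi
}
```

Running `triage_device "Example Mouse" /var/log/Xorg.0.log` prints one of the three suggestions; the device name here is a placeholder.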

Fedora 12 or earlier

  1. Is the device listed with an input.x11_driver in lshal? no? -> hal configuration issue
  2. Are there any input.x11_options set by the user? no? -> hal configuration issue. These are hard because they’re usually typos that are hard to spot, but you can probably point the user to the input device configuration wiki page (https://fedoraproject.org/wiki/Input_device_configuration) and let them sort it out themselves.
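
Both questions can be answered by filtering the lshal output that the reporter attaches. The sketch below assumes lshal prints properties as `key = 'value'  (type)`, one per line; the function names are hypothetical.

```shell
#!/bin/sh
# x11_driver_from_lshal: read lshal output on stdin and print the value of
# every input.x11_driver property found.
x11_driver_from_lshal() {
    sed -n "s/^[[:space:]]*input\.x11_driver = '\([^']*\)'.*/\1/p"
}

# x11_options_from_lshal: read lshal output on stdin and print every
# input.x11_options.* line, so typos in option names are easier to spot.
x11_options_from_lshal() {
    grep 'input\.x11_options\.'
}
```

On an affected system, `lshal | x11_driver_from_lshal` printing nothing suggests that no device carries a driver assignment.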

Synaptics bugs

One more thing about evtest: synaptics won’t pass any events to evtest while X is running, because the device file is grabbed. You need to VT switch away to get the events from the hardware. For evdev devices that’s not a problem now (evdev used to grab too, until mid-F10).

  1. Anything that says input.capabilities input.touchpad in lshal uses synaptics.
  2. Anything else uses evdev.
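
The two rules above amount to a single test on a device's lshal entry. A minimal sketch, assuming the entry for one device is fed on stdin and that the touchpad capability shows up as the string `input.touchpad` somewhere in it; the function name is hypothetical.

```shell
#!/bin/sh
# driver_for_lshal_device: read one device's lshal entry on stdin and print
# the driver expected to handle it.
driver_for_lshal_device() {
    if grep -q 'input\.touchpad'; then
        echo "synaptics"
    else
        echo "evdev"
    fi
}
```

If this prints synaptics, remember to switch to a VT before asking the user to run evtest.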