Saturday, March 17, 2007

Debugging A Kernel Oops

I used to put up howtos on my old blog, but now that it's gone I want to resurrect some of the most useful ones and post them again. This blog entry deals with how to debug a kernel panic (which happens a lot when I touch kernel code).

So how does one debug a kernel panic? The easiest way is to compile your kernel with CONFIG_DEBUG_INFO (if not available, just add a -g to CFLAGS) and run the vmlinux through gdb. When you're in gdb you can do funky stuff like disassemble functions, show source listings etc., just from hex numbers in the oops.
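If you'd rather not type the same lookups at the (gdb) prompt every time, gdb can also be run non-interactively. The sketch below only builds the command line; the helper name gdb_batch_argv, the ./vmlinux path and the address are placeholders I made up for illustration, so substitute whatever your own build and oops give you.

```python
import shlex


def gdb_batch_argv(vmlinux, commands):
    """Build an argv list that runs each gdb command against vmlinux, then exits."""
    argv = ["gdb", "-batch"]
    for command in commands:
        # -ex runs one command as if it had been typed at the (gdb) prompt
        argv += ["-ex", command]
    argv.append(vmlinux)
    return argv


# Hypothetical inputs: the path and the address depend on your build and your oops.
argv = gdb_batch_argv("./vmlinux", ["list *0xc012f8da", "disassemble 0xc012f8da"])

# Print a copy-and-paste friendly shell command.
print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run(argv) if you want the script to call gdb itself. shlex.join needs Python 3.8 or newer.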

Consider an oops where a nasty error occurred at EIP:0010:[<c012f8da>]. This happened to me back when I was writing code for my MSc. Running the oops through ksymoops gave me the following output, which showed the name of the offending function. (This step applies only to kernel 2.4; it is no longer required for kernel 2.6 and above, because the newer kernels show you function names in the panic message.)

Code; c012f8da <do_mmap_pgoff+2a/550> <=====
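The symbol+offset/size part of a line like this can be picked apart mechanically. Here is a minimal sketch, assuming the line looks exactly like the one above and that both numbers after the symbol are hexadecimal; parse_code_line and the regular expression are my own names, not part of ksymoops.

```python
import re

# Address, then <symbol+offset/size>; offset and size are hex without a 0x prefix.
CODE_LINE = re.compile(
    r"Code;\s+([0-9a-fA-F]+)\s+<([^+>]+)\+([0-9a-fA-F]+)/([0-9a-fA-F]+)>"
)


def parse_code_line(line):
    """Split a ksymoops 'Code;' line into address, symbol, offset, size and start."""
    match = CODE_LINE.search(line)
    if match is None:
        raise ValueError("not a ksymoops Code; line: %r" % line)
    address = int(match.group(1), 16)
    offset = int(match.group(3), 16)
    return {
        "address": address,
        "symbol": match.group(2),
        "offset": offset,
        "size": int(match.group(4), 16),
        # The function begins 'offset' bytes before the faulting address.
        "start": address - offset,
    }


info = parse_code_line("Code; c012f8da <do_mmap_pgoff+2a/550> <=====")
print("gdb expression: *%s+0x%x" % (info["symbol"], info["offset"]))
print("function starts at 0x%08x" % info["start"])
```

The first printed line is the expression you can hand to gdb's list command.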

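The first part of the EIP value, 0x0010, is the value of the code segment register (CS), which you can safely ignore (unless you're messing with the GDT). In the ksymoops line, the function name is followed by the offset of the faulting instruction within that function (0x2a) and then by the size of the function (0x550). To find out which line of the offending function that offset corresponds to, you just do this: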

(gdb) list *do_mmap_pgoff+0x2a
0xc012f8da is in do_mmap_pgoff (mmap.c:404).
399 unsigned int vm_flags;
400 int correct_wcount = 0;
401 int error;
402 rb_node_t ** rb_link, * rb_parent;
403
404 if (file && (!file->f_op || !file->f_op->mmap))
405 return -ENODEV;
406
407 if (!len)
408 return addr;

Ta da! Line 404 is the culprit. Inspecting the disassembled code shows the register being used for the dereference (in this syntax, the operand on the left is the source and the one on the right is the destination):

mov 0x10(%edx),%eax

If you look at the register values in the oops:

eax: c171e3e4 ebx: 00000000 ecx: fffffffe edx: fffffffe
esi: 00001812 edi: cf1dbf10 ebp: cf2cbc14 esp: cf2cbbb4
ds: 0018 es: 0018 ss: 0018

It's obvious that edx holds a bogus value (0xfffffffe, which is -2 as a signed integer). A bogus dereference means something naughty happened with a pointer, and the only pointer you can see in the line is file. The value of file was set incorrectly to -2, and (if you look at the original source) file is actually a parameter passed to the function. Therefore, you'll need to trace the function that called it by looking at the call trace. You can find out which functions the hex numbers in the call trace represent by typing disassemble value in gdb, where value is a hex number starting with 0x.
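If reading fffffffe as -2 isn't second nature, the arithmetic is easy to check. This is just a sketch of the 32-bit maths with made-up helper names (to_signed32, effective_address); it doesn't read anything from a real oops.

```python
def to_signed32(value):
    """Interpret a 32-bit register value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def effective_address(base, displacement):
    """Address touched by disp(%base) on a 32-bit machine (wraps at 2**32)."""
    return (base + displacement) & 0xFFFFFFFF


edx = 0xFFFFFFFE  # value of edx taken from the register dump above

print(to_signed32(edx))                         # edx read as a signed number
print("0x%08x" % effective_address(edx, 0x10))  # address read by mov 0x10(%edx),%eax
```

The first line prints -2 and the second prints 0x0000000e, an address way down at the bottom of memory where nothing sensible lives. As a guess only: a small negative "pointer" like this is often an error code that got returned where a pointer was expected (ENOENT is 2), but nothing in this oops confirms that.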

Thanks to Zwane, Jeff and Alex for helping me learn how to do this many moons ago.

2 comments:

サンテラ said...

This will come handy if I ever make kernel-hacking as a hobby. Nice, obi.

Panos said...

Very very useful, thanks!