You are viewing chrismiles

Introspecting Python objects within gdb

I had to debug a Python C extension recently. Using gdb, it was easier than I thought to walk through the source and introspect Python objects. Here's how to do it.

The first step is to make sure you've got a Python build that contains debugging symbols. Build Python manually using "make OPT=-g".

The nice Python guys have even supplied some handy gdb macros. Grab the Misc/gdbinit file from the Python source tree and make it your ~/.gdbinit file (or append it to your existing one, if you already have a ~/.gdbinit you want to keep).

$ cd Python-2.5/Misc
$ cp gdbinit ~/.gdbinit


Now let's play with gdb. Fire it up and point it at the interpreter.

$ gdb
(gdb) file /opt/python-2.4.4-debug/bin/python
Reading symbols for shared libraries .... done
Reading symbols from /opt/python-2.4.4-debug/bin/python...done.


A very useful feature of gdb is the ability to set breakpoints in source files that haven't been loaded yet, such as those belonging to shared libraries. Let's set one in the source of a module I've been playing with. The shared library won't be loaded until Python processes the import statement, but gdb will still let us set the breakpoint.

(gdb) b processtable.c:654
No source file named processtable.c.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (processtable.c:654) pending.


Now let's fire up the unit tests, to get something happening. You can see the pending breakpoint is automatically resolved when the relevant library is loaded.

(gdb) run setup.py test
Starting program: /opt/python-2.4.4-debug/bin/python setup.py test
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
running test
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Breakpoint 1 at 0x627338: file processtable.c, line 654.
Pending breakpoint 1 - "processtable.c:654" resolved
test_args (tests.process_test.ProcessCommandTest) ... ok
test_command (tests.process_test.ProcessCommandTest) ... ok
test_command_path (tests.process_test.ProcessCommandTest) ... ok
test_env (tests.process_test.ProcessCommandTest) ... ok
test_nice (tests.process_test.ProcessPriorityTest) ... ok
test_priority (tests.process_test.ProcessPriorityTest) ... ok
test_resident_size (tests.process_test.ProcessSizeTest) ... ok
test_virtual_size (tests.process_test.ProcessSizeTest) ... ok
test_flags (tests.process_test.ProcessTimeTest) ... ok
test_parent_pid (tests.process_test.ProcessTimeTest) ... ok
test_status (tests.process_test.ProcessTimeTest) ... ok
test_terminal (tests.process_test.ProcessTimeTest) ... ok
test_threads (tests.process_test.ProcessTimeTest) ... ok
test_current_gid (tests.process_test.ProcessUserTest) ... ok
test_current_group (tests.process_test.ProcessUserTest) ... ok
test_current_uid (tests.process_test.ProcessUserTest) ... ok
test_current_user (tests.process_test.ProcessUserTest) ... ok
test_real_gid (tests.process_test.ProcessUserTest) ... ok
test_real_group (tests.process_test.ProcessUserTest) ... ok
test_real_uid (tests.process_test.ProcessUserTest) ... ok
test_real_user (tests.process_test.ProcessUserTest) ... ok
test_bad_arg (tests.process_test.SimplestProcessTest) ... ok
test_pid (tests.process_test.SimplestProcessTest) ... ok
test_type (tests.process_test.SimplestProcessTest) ... ok
test_args (tests.processtable_test.ProcessTableProcessTests) ...
Breakpoint 1, ProcessTable_init (self=0x4410e0, args=0x405030, kwds=0x0) at processtable.c:654
654 if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {


Python ran some tests until it hit our breakpoint, inside the C extension module. We can view the source, of course.

(gdb) list
649
650
651 /* Add processes to list in reverse order, which ends up ordering
652 * them by ascending PID value.
653 */
654 if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
655 return -1; /* failure */
656 }
657 Py_DECREF(proc_obj);
658 }


We are inside the __init__ function of a class, so there's the usual Python self object. In a C extension module, self is a pointer to the C struct that holds the instance's internal state. Let's take a look at self->processes.

(gdb) p self
$1 = (ProcessTableObject *) 0x4410e0
(gdb) p self->processes
$2 = (PyObject *) 0x4b5940


In this case, self is a pointer to an instance of our custom type (a ProcessTableObject struct). self->processes is a pointer to a PyObject, which could be any Python object type. The .gdbinit we borrowed from the Python source defines a very useful macro, pyo, for inspecting the target of PyObject pointers.

(gdb) pyo self->processes
object : []
type : list
refcount: 1
address : 0x4b5940
$3 = void
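
For comparison, the four fields that pyo prints have rough counterparts you can ask for from Python itself. The following is only a sketch: describe() is a hypothetical helper of mine, not part of gdbinit or of the module being debugged, and the numbers will not match gdb exactly (the type name may lack a module prefix, and sys.getrefcount() counts its own argument as an extra reference).

```python
import sys

def describe(obj):
    """Collect roughly the same four facts that the pyo macro prints."""
    return {
        "object": repr(obj),
        # pyo may show a module-qualified name; __name__ is the short form.
        "type": type(obj).__name__,
        # sys.getrefcount() also counts the temporary reference held by its
        # own argument, so it is usually higher than the number gdb shows.
        "refcount": sys.getrefcount(obj),
        # In CPython, id() is the object's memory address.
        "address": hex(id(obj)),
    }

# Stand-in for self->processes at the first breakpoint hit: an empty list.
processes = []
for field, value in describe(processes).items():
    print("%-8s: %s" % (field, value))
```

This is handy for sanity-checking what you expect to see before you attach gdb, but it is no substitute for pyo once you are stopped inside C code.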


Cool, so self->processes is a list type, and its current value is an empty list. Our breakpoint is located within a loop, so let's iterate around and get an object added to this list.

(gdb) cont
Continuing.

Breakpoint 1, ProcessTable_init (self=0x4410e0, args=0x405030, kwds=0x0) at processtable.c:654
654 if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
(gdb) pyo self->processes
object : [<psi.process.Process object pid=16543>]
type : list
refcount: 1
address : 0x4b5940
$4 = void


Cool, the list now contains an object. Let's add another by looping again.

(gdb) cont
Continuing.

Breakpoint 1, ProcessTable_init (self=0x4410e0, args=0x405030, kwds=0x0) at processtable.c:654
654 if (PyList_Insert(self->processes, 0, (PyObject*)proc_obj)) {
(gdb) pyo self->processes
object : [<psi.process.Process object pid=16536>, <psi.process.Process object pid=16543>]
type : list
refcount: 1
address : 0x4b5940
$5 = void
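
The two stops above show what the source comment at lines 651-653 describes: each new object is inserted at index 0, so visiting processes from highest to lowest PID leaves the list in ascending order. A minimal Python sketch of the same idea, assuming the loop really does visit PIDs in descending order (the third PID is made up for illustration):

```python
# PIDs in the order the loop is assumed to visit them (highest first).
# 16543 and 16536 come from the session above; 16500 is hypothetical.
pids_in_reverse = [16543, 16536, 16500]

processes = []
for pid in pids_in_reverse:
    # list.insert(0, x) is the Python-level counterpart of
    # PyList_Insert(list, 0, x): the new item goes to the front.
    processes.insert(0, pid)

print(processes)
```

Inserting at the front of a list shifts every existing element, so for a long process table appending and reversing once at the end might be cheaper, but that is a design choice for the module author.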


So, self->processes is a list and currently contains 2 objects. Is it possible to fetch an element from the list and examine it? Sure is. We need to call the Python C functions that know how to deal with Python objects. gdb will allow us to do this.

(gdb) pyo PyObject_GetItem(self->processes,Py_BuildValue("i",0))
object : <psi.process.Process object pid=16536>
type : psi.process.Process
refcount: 3
address : 0x4dbf28
$6 = void


PyObject_GetItem(obj, y) is the C equivalent of obj[y] (or obj.__getitem__(y)). The "y" must also be a Python object; you cannot just give it a C int. So we use Py_BuildValue() to build a Python integer object. The above is the equivalent of self.processes[0]. (Note that you cannot have any spaces within the argument given to pyo, as arguments to gdb macros are split by whitespace and pyo will only use the first one ($arg0).)
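
To make the equivalence concrete, here is a small Python sketch of the three spellings of the same lookup. The list of bare PIDs is a hypothetical stand-in for self.processes, which really holds Process objects.

```python
import operator

# Hypothetical stand-in for self.processes; plain PIDs instead of Process objects.
processes = [16536, 16543]

by_subscript = processes[0]                   # obj[y]
by_method = processes.__getitem__(0)          # obj.__getitem__(y)
by_function = operator.getitem(processes, 0)  # function-call form, closest in shape to the C call

print(by_subscript, by_method, by_function)
```

In Python the index 0 is already an object, which is why only the C side needs the extra Py_BuildValue() step.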

So, how do we examine the Process object itself? We can easily look at an attribute of the object, which might be handy. Let's look at the "command" attribute of the Process object.

(gdb) pyo PyObject_GetAttr(PyObject_GetItem(self->processes,Py_BuildValue("i",0)),Py_BuildValue("s","command"))
object : 'gdb-i386-apple-d'
type : str
refcount: 3
address : 0x640bb0
$7 = void


And the same for the other object in the list.

(gdb) pyo PyObject_GetAttr(PyObject_GetItem(self->processes,Py_BuildValue("i",1)),Py_BuildValue("s","command"))
object : 'python'
type : str
refcount: 3
address : 0x63cf60
$8 = void
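
Those two nested calls are the gdb spelling of getattr(self.processes[i], "command"). A rough Python sketch, assuming a namedtuple as a hypothetical stand-in for psi.process.Process (the real class has more attributes and is implemented in C):

```python
from collections import namedtuple

# Hypothetical stand-in for psi.process.Process, modelling only two attributes.
Process = namedtuple("Process", ["pid", "command"])

# Values copied from the gdb session above.
processes = [
    Process(pid=16536, command="gdb-i386-apple-d"),
    Process(pid=16543, command="python"),
]

# getattr(obj, "name") is the Python-level counterpart of calling
# PyObject_GetAttr() with a Python string object as the attribute name.
commands = [getattr(processes[i], "command") for i in range(len(processes))]
print(commands)
```

As with the index, the attribute name has to be a Python string object on the C side, which is what Py_BuildValue("s","command") provides.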


Cool, so even though we are deep within a C extension module, we can still introspect our objects with relative ease.

Comments

gpshead
Jul. 3rd, 2007 11:27 pm (UTC)
more python gdb macros
Searching the web for Python gdb macros like these, I also came across some others that look handy:

http://www.mashebali.com/?Python_GDB_macros:The_Macros
chrismiles
Jul. 4th, 2007 07:27 am (UTC)
Re: more python gdb macros
Nice find. Thanks.

nilbus
Dec. 5th, 2008 11:53 pm (UTC)
Thanks :D You're the best. I appreciate this useful post
(Anonymous)
Jul. 5th, 2010 02:15 am (UTC)
Very nice!
Very exciting this article! I have learned something new!