How debuggers work: Part 1 – Basics

January 23rd, 2011 at 7:40 am

This is the first part in a series of articles on how debuggers work. I’m still not sure how many articles the series will contain and what topics it will cover, but I’m going to start with the basics.

In this part

I’m going to present the main building block of a debugger’s implementation on Linux – the ptrace system call. All the code in this article is developed on a 32-bit Ubuntu machine. Note that the code is very much platform specific, although porting it to other platforms shouldn’t be too difficult.


To understand where we’re going, try to imagine what it takes for a debugger to do its work. A debugger can start some process and debug it, or attach itself to an existing process. It can single-step through the code, set breakpoints and run to them, examine variable values and stack traces. Many debuggers have advanced features such as executing expressions and calling functions in the debugged process’s address space, and even changing the process’s code on-the-fly and watching the effects.

Although modern debuggers are complex beasts [1], it’s surprising how simple the foundation on which they are built is. Debuggers start with only a few basic services provided by the operating system and the compiler/linker; all the rest is just a simple matter of programming.

Linux debugging – ptrace

The Swiss army knife of Linux debuggers is the ptrace system call [2]. It’s a versatile and rather complex tool that allows one process to control the execution of another and to peek and poke at its innards [3]. ptrace can take a mid-sized book to explain fully, which is why I’m just going to focus on some of its practical uses in examples.

Let’s dive right in.

Stepping through the code of a process

I’m now going to develop an example of running a process in "traced" mode in which we’re going to single-step through its code – the machine code (assembly instructions) that’s executed by the CPU. I’ll show the example code in parts, explaining each, and at the end of the article you will find a link to download a complete C file that you can compile, execute and play with.

The high-level plan is to write code that splits into a child process that will execute a user-supplied command, and a parent process that traces the child. First, the main function:

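int main(int argc, char** argv)
{
    pid_t child_pid;

    if (argc < 2) {
        fprintf(stderr, "Expected a program name as argument\n");
        return -1;
    }

    child_pid = fork();
    if (child_pid == 0)
        run_target(argv[1]);        /* child: becomes the traced program */
    else if (child_pid > 0)
        run_debugger(child_pid);    /* parent: traces the child */
    else {
        perror("fork");
        return -1;
    }

    return 0;
}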

Pretty simple: we start a new child process with fork [4]. The if branch of the subsequent condition runs the child process (called "target" here), and the else if branch runs the parent process (called "debugger" here).

Here’s the target process:

void run_target(const char* programname)
{
    procmsg("target started. will run '%s'\n", programname);

    /* Allow tracing of this process */
    if (ptrace(PTRACE_TRACEME, 0, 0, 0) < 0) {
        perror("ptrace");
        return;
    }

    /* Replace this process's image with the given program */
    execl(programname, programname, (char *)NULL);
}

The most interesting line here is the ptrace call. ptrace is declared thus (in sys/ptrace.h):

long ptrace(enum __ptrace_request request, pid_t pid,
                 void *addr, void *data);

The first argument is a request, which may be one of many predefined PTRACE_* constants. The second argument specifies a process ID for some requests. The third and fourth arguments are address and data pointers, for memory manipulation. The ptrace call in the code snippet above makes the PTRACE_TRACEME request, which means that this child process asks the OS kernel to let its parent trace it. The request description from the man-page is quite clear:

Indicates that this process is to be traced by its parent. Any signal (except SIGKILL) delivered to this process will cause it to stop and its parent to be notified via wait(). Also, all subsequent calls to exec() by this process will cause a SIGTRAP to be sent to it, giving the parent a chance to gain control before the new program begins execution. A process probably shouldn’t make this request if its parent isn’t expecting to trace it. (pid, addr, and data are ignored.)

The part that interests us in this example is the sentence about exec(). Note that the very next thing run_target does after ptrace is invoke the program given to it as an argument with execl. This, as that sentence explains, causes the OS kernel to stop the process just before it begins executing the program in execl and send a signal to the parent.
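
To see this stop in isolation, here is a minimal sketch, separate from the tracer developed in this article. The helper name first_stop_signal is hypothetical. It runs a program in traced mode, reports which signal caused its first stop, and then kills the child, so the program itself never gets to run.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of the article's tracer): run `path` in
 * traced mode and return the signal that caused its first stop, or -1 on
 * error. The child is killed right away, so the program never really runs. */
int first_stop_signal(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: ask to be traced, then exec the target */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        execl(path, path, (char *)NULL);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: the kernel stops the child once exec has succeeded */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    if (!WIFSTOPPED(status))
        return -1; /* the child exited instead of stopping */

    int sig = WSTOPSIG(status);

    /* Clean up: kill the stopped child and reap it */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);

    return sig;
}
```

On a Linux system where tracing is permitted, first_stop_signal("/bin/sh") should return SIGTRAP, which is exactly the stop the man page describes.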

Thus, the time is ripe to see what the parent does:

void run_debugger(pid_t child_pid)
{
    int wait_status;
    unsigned icounter = 0;
    procmsg("debugger started\n");

    /* Wait for child to stop on its first instruction */
    wait(&wait_status);

    while (WIFSTOPPED(wait_status)) {
        icounter++;
        /* Make the child execute another instruction */
        if (ptrace(PTRACE_SINGLESTEP, child_pid, 0, 0) < 0) {
            perror("ptrace");
            return;
        }

        /* Wait for child to stop on its next instruction */
        wait(&wait_status);
    }

    procmsg("the child executed %u instructions\n", icounter);
}

Recall from above that once the child starts executing the exec call, it will stop and be sent the SIGTRAP signal. The parent here waits for this to happen with the first wait call. wait will return once something interesting happens, and the parent checks that it was because the child was stopped (WIFSTOPPED returns true if the child process was stopped by delivery of a signal).

What the parent does next is the most interesting part of this article. It invokes ptrace with the PTRACE_SINGLESTEP request, giving it the child process ID. What this does is tell the OS – please restart the child process, but stop it after it executes the next instruction. Again, the parent waits for the child to stop and the loop continues. The loop terminates when the status returned by wait no longer says that the child stopped. During a normal run of the tracer, this will be the status that tells the parent that the child process exited (WIFEXITED would return true on it).
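
The status macros are easy to mix up, so here is a small sketch of how a wait status can be decoded. The enum and function names are illustrative and do not appear in the tracer's source. The second function exists only to produce a real status to decode.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative names; the article's tracer only checks WIFSTOPPED */
enum child_state { CHILD_STOPPED, CHILD_EXITED, CHILD_KILLED, CHILD_UNKNOWN };

/* Decode the status filled in by wait() or waitpid() */
enum child_state classify_wait_status(int wait_status)
{
    if (WIFSTOPPED(wait_status))
        return CHILD_STOPPED;   /* stopped by a signal; WSTOPSIG() says which */
    if (WIFEXITED(wait_status))
        return CHILD_EXITED;    /* exited normally; WEXITSTATUS() is the code */
    if (WIFSIGNALED(wait_status))
        return CHILD_KILLED;    /* terminated by a signal; WTERMSIG() says which */
    return CHILD_UNKNOWN;
}

/* Fork a child that exits immediately with `code` and return its raw wait
 * status, or -1 on error. */
int wait_status_of_exiting_child(int code)
{
    int wait_status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);
    if (waitpid(pid, &wait_status, 0) < 0)
        return -1;
    return wait_status;
}
```

In the tracer's loop, every single-step yields a status for which WIFSTOPPED is true. The final status, produced when the traced program exits, is the one for which WIFEXITED is true, and that is what ends the loop.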

Note that icounter counts the number of instructions executed by the child process. So our simple example actually does something useful – given a program name on the command line, it executes the program and reports the number of CPU instructions it took to run from start to finish. Let’s see it in action.

A test run

I compiled the following simple program and ran it under the tracer:

#include <stdio.h>

int main()
{
    printf("Hello, world!\n");
    return 0;
}

To my surprise, the tracer took quite a while to run and reported that there were more than 100,000 instructions executed. For a simple printf call? What gives? The answer is very interesting [5]. By default, gcc on Linux links programs to the C runtime libraries dynamically. What this means is that one of the first things that runs when any program is executed is the dynamic library loader that looks for the required shared libraries. This is quite a lot of code – and remember that our basic tracer here looks at each and every instruction, not just of the main function, but of the whole process.

So, when I linked the test program with the -static flag (and verified that the executable gained some 500KB in weight, as is logical for a static link of the C runtime), the tracing reported only 7,000 instructions or so. This is still a lot, but makes perfect sense if you recall that libc initialization still has to run before main, and cleanup has to run after main. Besides, printf is a complex function.

Still not satisfied, I wanted to see something testable – i.e. a whole run in which I could account for every instruction executed. This, of course, can be done with assembly code. So I took this version of "Hello, world!" and assembled it:

section    .text
    ; The _start symbol must be declared for the linker (ld)
    global _start

_start:
    ; Prepare arguments for the sys_write system call:
    ;   - eax: system call number (sys_write)
    ;   - ebx: file descriptor (stdout)
    ;   - ecx: pointer to string
    ;   - edx: string length
    mov    edx, len
    mov    ecx, msg
    mov    ebx, 1
    mov    eax, 4

    ; Execute the sys_write system call
    int    0x80

    ; Execute sys_exit
    mov    eax, 1
    int    0x80

section   .data
msg db    'Hello, world!', 0xa
len equ    $ - msg

Sure enough. Now the tracer reported that 7 instructions were executed, which is something I can easily verify.

Deep into the instruction stream

The assembly-written program allows me to introduce you to another powerful use of ptrace – closely examining the state of the traced process. Here’s another version of the run_debugger function:

void run_debugger(pid_t child_pid)
{
    int wait_status;
    unsigned icounter = 0;
    procmsg("debugger started\n");

    /* Wait for child to stop on its first instruction */
    wait(&wait_status);

    while (WIFSTOPPED(wait_status)) {
        icounter++;
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, child_pid, 0, &regs);
        unsigned instr = ptrace(PTRACE_PEEKTEXT, child_pid, regs.eip, 0);

        procmsg("icounter = %u.  EIP = 0x%08x.  instr = 0x%08x\n",
                    icounter, regs.eip, instr);

        /* Make the child execute another instruction */
        if (ptrace(PTRACE_SINGLESTEP, child_pid, 0, 0) < 0) {
            perror("ptrace");
            return;
        }

        /* Wait for child to stop on its next instruction */
        wait(&wait_status);
    }

    procmsg("the child executed %u instructions\n", icounter);
}

The only difference is in the first few lines of the while loop. There are two new ptrace calls. The first one reads the value of the process’s registers into a structure. user_regs_struct is defined in sys/user.h. Now here’s the fun part – if you look at this header file, a comment close to the top says:

/* The whole purpose of this file is for GDB and GDB only.
   Don't read too much into it. Don't use it for
   anything other than GDB unless know what you are
   doing.  */

Now, I don’t know about you, but it makes me feel we’re on the right track :-) Anyway, back to the example. Once we have all the registers in regs, we can peek at the current instruction of the process by calling ptrace with PTRACE_PEEKTEXT, passing it regs.eip (the extended instruction pointer on x86) as the address. What we get back is the instruction [6]. Let’s see this new tracer run on our assembly-coded snippet:

$ simple_tracer traced_helloworld
[5700] debugger started
[5701] target started. will run 'traced_helloworld'
[5700] icounter = 1.  EIP = 0x08048080.  instr = 0x00000eba
[5700] icounter = 2.  EIP = 0x08048085.  instr = 0x0490a0b9
[5700] icounter = 3.  EIP = 0x0804808a.  instr = 0x000001bb
[5700] icounter = 4.  EIP = 0x0804808f.  instr = 0x000004b8
[5700] icounter = 5.  EIP = 0x08048094.  instr = 0x01b880cd
Hello, world!
[5700] icounter = 6.  EIP = 0x08048096.  instr = 0x000001b8
[5700] icounter = 7.  EIP = 0x0804809b.  instr = 0x000080cd
[5700] the child executed 7 instructions

OK, so now in addition to icounter we also see the instruction pointer and the instruction it points to at each step. How can we verify this is correct? By using objdump -d on the executable:

$ objdump -d traced_helloworld

traced_helloworld:     file format elf32-i386

Disassembly of section .text:

08048080 <.text>:
 8048080:     ba 0e 00 00 00          mov    $0xe,%edx
 8048085:     b9 a0 90 04 08          mov    $0x80490a0,%ecx
 804808a:     bb 01 00 00 00          mov    $0x1,%ebx
 804808f:     b8 04 00 00 00          mov    $0x4,%eax
 8048094:     cd 80                   int    $0x80
 8048096:     b8 01 00 00 00          mov    $0x1,%eax
 804809b:     cd 80                   int    $0x80

The correspondence between this and our tracing output is easily observed.
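
The only wrinkle is byte order: the tracer prints each peeked word as a single number, while objdump lists bytes in memory order. x86 is little-endian, so the lowest byte of the word comes first in memory. A small sketch of the conversion, with helper names made up for this illustration:

```c
#include <stdint.h>
#include <stdio.h>

/* Split a word returned by PTRACE_PEEKTEXT into bytes in memory order.
 * On little-endian x86 the least significant byte of the word is the byte
 * stored at the lowest address. */
void word_to_bytes(uint32_t word, unsigned char bytes[4])
{
    for (int i = 0; i < 4; i++)
        bytes[i] = (unsigned char)((word >> (8 * i)) & 0xff);
}

/* Print the bytes the way objdump lists them, lowest address first */
void print_word_bytes(uint32_t word)
{
    unsigned char b[4];

    word_to_bytes(word, b);
    printf("%02x %02x %02x %02x\n", b[0], b[1], b[2], b[3]);
}
```

For the first traced word, 0x00000eba, this gives ba 0e 00 00, the first four bytes of mov $0xe,%edx. For 0x01b880cd it gives cd 80 b8 01: the two-byte int $0x80 followed by the first two bytes of the next instruction, which shows that a peeked word does not necessarily line up with instruction boundaries.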

Attaching to a running process

As you know, debuggers can also attach to an already-running process. By now you won’t be surprised to find out that this is also done with ptrace, which can get the PTRACE_ATTACH request. I won’t show a code sample here since it should be very easy to implement given the code we’ve already gone through. For educational purposes, the approach taken here is more convenient (since we can stop the child process right at its start).
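
For readers who want a starting point anyway, here is a rough sketch of what attaching might look like. It is not taken from the article's source code, and both helper names are hypothetical. The second function only creates a process to attach to.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: attach to a running process, confirm that it
 * stopped, then let it go. Returns 0 on success, -1 on failure. */
int attach_and_detach(pid_t pid)
{
    int status;

    /* Become the tracer of `pid`; the kernel sends it a SIGSTOP */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) < 0)
        return -1;

    /* Wait until the tracee has actually stopped */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    /* A real debugger would inspect or single-step the process here */

    /* Stop tracing; the process resumes running on its own */
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) < 0)
        return -1;

    return 0;
}

/* Hypothetical helper: fork a child that just sleeps, so there is
 * something to attach to. Returns the child's PID, or -1 if fork failed. */
pid_t spawn_idle_child(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        for (;;)
            pause(); /* sleep until a signal arrives */
    }
    return pid;
}
```

Once attach_and_detach has seen the stop, the same PTRACE_GETREGS, PTRACE_PEEKTEXT and PTRACE_SINGLESTEP requests shown earlier apply. Note that some Linux distributions restrict which processes an unprivileged user may attach to, so attaching to an unrelated process can fail with a permission error.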

The code

The complete C source-code of the simple tracer presented in this article (the more advanced, instruction-printing version) is available here. It compiles cleanly with -Wall -pedantic --std=c99 on version 4.4 of gcc.
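
The snippets above call procmsg, a helper that is defined only in the full listing. A minimal sketch that would produce the [pid]-prefixed lines seen in the tracer output might look like this. The int return value is an addition for illustration, and the details may differ from the downloadable source.

```c
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Print a printf-style message prefixed with the calling process's ID,
 * e.g. "[5700] debugger started". Returns the number of characters
 * written, or -1 on an output error. */
int procmsg(const char *format, ...)
{
    va_list ap;
    int prefix_len, body_len;

    prefix_len = fprintf(stdout, "[%d] ", (int)getpid());
    if (prefix_len < 0)
        return -1;

    va_start(ap, format);
    body_len = vfprintf(stdout, format, ap);
    va_end(ap);
    if (body_len < 0)
        return -1;

    return prefix_len + body_len;
}
```

Besides stdio.h, stdarg.h and unistd.h, the snippets in this article rely on sys/ptrace.h for ptrace, sys/wait.h for wait and the status macros, and sys/user.h for user_regs_struct.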

Conclusion and next steps

Admittedly, this part didn’t cover much – we’re still far from having a real debugger in our hands. However, I hope it has already made the process of debugging at least a little less mysterious. ptrace is truly a versatile system call with many abilities, of which we’ve sampled only a few so far.

Single-stepping through the code is useful, but only to a certain degree. Take the C "Hello, world!" sample I demonstrated above. To get to main it would probably take a couple of thousand instructions of C runtime initialization code to step through. This isn’t very convenient. What we’d ideally want to have is the ability to place a breakpoint at the entry to main and step from there. Fair enough, and in the next part of the series I intend to show how breakpoints are implemented.


I’ve found the following resources and articles useful in the preparation of this article:

[1] I didn’t check but I’m sure the LOC count of gdb is at least in the six-figure range.
[2] Run man 2 ptrace for complete enlightenment.
[3] Peek and poke are well-known system programming jargon for directly reading and writing memory contents.
[4] This article assumes some basic level of Unix/Linux programming experience. I assume you know (at least conceptually) about fork, the exec family of functions and Unix signals.
[5] At least if you’re as obsessed with low-level details as I am :-)
[6] A word of warning here: as I noted above, a lot of this is highly platform specific. I’m making some simplifying assumptions – for example, x86 instructions don’t have to fit into 4 bytes (the size of unsigned on my 32-bit Ubuntu machine). In fact, many won’t. Peeking at instructions meaningfully requires us to have a complete disassembler at hand. We don’t have one here, but real debuggers do.

20 Responses to “How debuggers work: Part 1 – Basics”

  1. Nadav Says:

    Eli, great post!

    I wanted to point out that you can also read symbols from the elf file using libBFD and place breakpoints at places of interest. I wrote some code for that in here:

  2. Charles Says:

    Thanks Eli, that was brilliant. Having never looked at how debuggers worked before reading this article, I’ve come away feeling as though I’ve learned a lot about them.

    Looking forward to the next instalment!

  3. Girish Says:

    Very informative! I look forward to part 2. I would also be interested in knowing about the hardware support for implementing a ptrace like system call.

  4. brooksbp Says:

    Great article!

    Here are the main issues for compiling this with GCC 4.2.1 on OS X 10.6.6:

    * Use printf instead of procmsg (not completely sure if procmsg is defined for OS X).
    * Pass an extra parameter NULL in the call to execl(...).

  5. Richard Moore Says:

    Interesting post, thanks. :-)

  6. eliben Says:


    Thanks for the tip – I’ll look into it. I planned to use something like distorm for disassembly, but libBFD may be useful for other purposes. Ultimately I would like to explain how source locations are translated into addresses, so I don’t want to reduce the question from “how debuggers work” to “how libBFD works” :-)


    Thanks for these – I’m sure it will be useful to others trying to run this on a Mac. Regarding procmsg – it’s a function I defined, you can find it in the full source listing for this article.

  7. Adam Matan Says:

    Great article. looking forward for the next parts.

  8. Malle Says:

    Great article, very interesting topic. Looking forward to part 2.

    @brooksbp: Thanks for the “Mac translation” as well.

  9. Juri Lelli Says:

    Very interesting article, looking forward to part 2 and beyond!

  10. Nadav Says:

    Distorm is an awesome project by an awesome dude (Gil aka Arkon). One of the advantages of Distorm is its Python bindings. But, as you said, it does not have any elf/pe reading capabilities.

  11. huy Says:

    This is absolutely awesome post, high quality , very educating. I would like to point out that by understanding it we will know more about impact of capturing program back trace using gdb and pstack that is very popular technique used by administrator to watch the program behavior

  12. Mars Says:

    Interesting!! I plan to translate this series of articles, “how debuggers work”, into Traditional Chinese. May I?
    I will let you know the result when work is done. :)

  13. eliben Says:


    Sure thing, go ahead.

  14. Lukasz Piwko Says:

    There is a Polish translation of this article at .

  15. Lukasz Piwko Says:

    The address of the Polish translation has been changed to .

  16. Rainer Says:

    Great tutorial! What assembler did you use in order to build the Hello world example program?
    All I can see is that it’s not as (no AT&T syntax).

  17. eliben Says:


    That’s NASM. I like its clean, readable syntax.

  18. Jeet Says:

    Great article as always, Eli! I truly admire your dedicated effort towards writing such great articles with amazing clarity of thought.

    For other readers, or x86_64 system, in place of eip use rip. And procmsg, I am not sure where it came from. I used it as
    #define procmsg printf

  19. Raghu Says:

    Hi Eli,
    It is a great article, which helped me alot in understanding the functionality of debugger.
    I am new to Linux environment. I have a small doubt. Could you pls help me in clarifying this doubt.
    I have Linux Centos is installed. I was trying to play around with the example provided in this article.
    Here my doubt is i have written small example i.e sum of two number, which i would like to debug with help of the this debugger program, Instead of normal gdb.
    Here i am giving details, How i have approached.
    Step-1:Written Sum of 2 numbers program.
    Step-2:Copied debugger program which is provided in this article.
    Step-3:Compiled both the programs and generated the executable file.
    Step-4: Run the sum of 2 number program to read the inputs.
    Step-5: using command ps -ef, noted the process id (Example: PID-12738).
    Step-6:Run the debugger program by providing the PID i.e 12738
    Step-7: Output of the Debugger program says child executed with 0 instructions. which is not the output i was expecting.
    Step-8: Debugged the debugger program, i have observed that PID which i am passing is not going to run_debugger function, it is taking some other pid, which is created from the result of fork() function.
    Note: using -g option debug info is also generated for both debugger program and sum of 2 numbers program. i.e a.out file and sum_2_number.out files.

    i am sure that i am missing some thing, Could you pls correct me where i am done wrong.
    Here my objective is using the debugger program, i would like to debug the sum of 2 number program.
    It will great help that if you provide some inputs to read and display the symbols information of the sum of 2 numbers program.

    Thanks in advance for your help….!


  20. HappyPandaFace Says:

    If you’re getting the error:
    error: ‘struct user_regs_struct’ has no member named ‘eip’
    It’s because of your version of libc
    You can update libc to 2.19 which has eip or what I did (version 2.12) is just vim it and replace regs.eip with this command in vim:
