2.6. Debugging

2.6.1. Introduction to Available Debuggers

Using a debugger allows running the program under more controlled circumstances. Typically, it is possible to step through the program a line at a time, inspect the value of variables, change them, tell the debugger to run up to a certain point and then stop, and so on. It is also possible to attach to a program that is already running, or load a core file to investigate why the program crashed. It is even possible to debug the kernel, though that is a little trickier than the user applications we will be discussing in this section.

This section is intended to be a quick introduction to using debuggers and does not cover specialized topics such as debugging the kernel. For more information about that, refer to Chapter 10, Kernel Debugging.

The standard debugger supplied with FreeBSD 12.1 is called lldb (LLVM debugger). As it is part of the standard installation for that release, there is no need to do anything special to use it. It has good command help, accessible via the help command, as well as a web tutorial and documentation.

Note:

The lldb command is available for FreeBSD 11.3 from ports or packages as devel/llvm. This will install the default version of lldb (currently 9.0).

The other debugger available with FreeBSD is called gdb (GNU debugger). Unlike lldb, it is not installed by default on FreeBSD 12.1; to use it, install devel/gdb from ports or packages. The version installed by default on FreeBSD 11.3 is old; instead, install devel/gdb there as well. It has quite good on-line help, as well as a set of info pages.

Which one to use is largely a matter of taste. If you are familiar with only one, use that one. Anyone who wants to run a debugger from inside Emacs will need to use gdb, as lldb is unsupported by Emacs. Otherwise, try both and see which one you prefer.

2.6.2. Using lldb

2.6.2.1. Starting lldb

Start up lldb by typing

% lldb -- progname

2.6.2.2. Running a Program with lldb

Compile the program with -g to get the most out of using lldb. It will work without, but will only display the name of the function currently running, instead of the source code. If it displays a line like:

Breakpoint 1: where = temp`main, address = …

(without an indication of source code filename and line number) when setting a breakpoint, this means that the program was not compiled with -g.

Tip:

Most lldb commands have shorter forms that can be used instead. The longer forms are used here for clarity.

At the lldb prompt, type breakpoint set -n main. This tells the debugger to skip the preliminary set-up code in the program being run and to stop execution at the beginning of the program's own code. Now type process launch to actually start the program. It will start at the beginning of the set-up code and then be stopped by the debugger when it calls main().

To step through the program a line at a time, type thread step-over. When the program gets to a function call, step into it by typing thread step-in. Once in a function call, return from it by typing thread step-out or use up and down to take a quick look at the caller.

Here is a simple example of how to spot a mistake in a program with lldb. This is our program (with a deliberate mistake):

#include <stdio.h>

int bazz(int anint);

main() {
	int i;

	printf("This is my program\n");
	bazz(i);
	return 0;
}

int bazz(int anint) {
	printf("You gave me %d\n", anint);
	return anint;
}

This program is meant to set i to 5 and pass it to a function bazz(), which prints out the number it was given.

Compiling and running the program displays

% cc -g -o temp temp.c
% ./temp
This is my program
You gave me -5360

That is not what was expected! Time to see what is going on!

% lldb -- temp
(lldb) target create "temp"
Current executable set to 'temp' (x86_64).
(lldb) breakpoint set -n main				Skip the set-up code
Breakpoint 1: where = temp`main + 15 at temp.c:8:2, address = 0x00000000002012ef	lldb puts breakpoint at main()
(lldb) process launch					Run as far as main()
Process 9992 launching
Process 9992 launched: '/home/pauamma/tmp/temp' (x86_64)	Program starts running

Process 9992 stopped
* thread #1, name = 'temp', stop reason = breakpoint 1.1	lldb stops at main()
    frame #0: 0x00000000002012ef temp`main at temp.c:8:2
   5	main() {
   6		int i;
   7
-> 8		printf("This is my program\n");			Indicates the line where it stopped
   9		bazz(i);
   10		return 0;
   11	}
(lldb) thread step-over			Go to next line
This is my program						Program prints out
Process 9992 stopped
* thread #1, name = 'temp', stop reason = step over
    frame #0: 0x0000000000201300 temp`main at temp.c:9:7
   6		int i;
   7
   8		printf("This is my program\n");
-> 9		bazz(i);
   10		return 0;
   11	}
   12
(lldb) thread step-in			step into bazz()
Process 9992 stopped
* thread #1, name = 'temp', stop reason = step in
    frame #0: 0x000000000020132b temp`bazz(anint=-5360) at temp.c:14:29	lldb displays stack frame
   11	}
   12
   13	int bazz(int anint) {
-> 14		printf("You gave me %d\n", anint);
   15		return anint;
   16	}
(lldb)

Hang on a minute! How did anint get to be -5360? Was it not set to 5 in main()? Let us move up to main() and have a look.

(lldb) up		Move up call stack
frame #1: 0x000000000020130b temp`main at temp.c:9:2		lldb displays stack frame
   6		int i;
   7
   8		printf("This is my program\n");
-> 9		bazz(i);
   10		return 0;
   11	}
   12
(lldb) frame variable i			Show us the value of i
(int) i = -5360							lldb displays -5360

Oh dear! Looking at the code, we forgot to initialize i. We meant to put


main() {
	int i;

	i = 5;
	printf("This is my program\n");

but we left the i=5; line out. As we did not initialize i, it had whatever number happened to be in that area of memory when the program ran, which in this case happened to be -5360.

Note:

The lldb command displays the stack frame every time we go into or out of a function, even if we are using up and down to move around the call stack. This shows the name of the function and the values of its arguments, which helps us keep track of where we are and what is going on. (The stack is a storage area where the program stores information about the arguments passed to functions and where to go when it returns from a function call.)

2.6.2.3. Examining a Core File with lldb

A core file is basically a file which contains the complete state of the process when it crashed. In the good old days, programmers had to print out hex listings of core files and sweat over machine code manuals, but now life is a bit easier. Incidentally, under FreeBSD and other 4.4BSD systems, a core file is called progname.core instead of just core, to make it clearer which program a core file belongs to.

To examine a core file, specify the name of the core file in addition to the program itself. Instead of starting up lldb in the usual way, type lldb -c progname.core -- progname

The debugger will display something like this:

% lldb -c progname.core -- progname
(lldb) target create "progname" --core "progname.core"
Core file '/home/pauamma/tmp/progname.core' (x86_64) was loaded.
(lldb)

In this case, the program was called progname, so the core file is called progname.core. The debugger does not display why the program crashed or where. For this, use thread backtrace all. This will also show how the function where the program dumped core was called.

(lldb) thread backtrace all
* thread #1, name = 'progname', stop reason = signal SIGSEGV
  * frame #0: 0x0000000000201347 progname`bazz(anint=5) at temp2.c:17:10
    frame #1: 0x0000000000201312 progname`main at temp2.c:10:2
    frame #2: 0x000000000020110f progname`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1.c:76:7
(lldb)

SIGSEGV indicates that the program tried to access memory (usually to run code or to read or write data) at a location that does not belong to it, but does not give any specifics. For that, look at the source code at line 17 of file temp2.c, in bazz() (frame #0 is reported at temp2.c:17:10, that is, line 17, column 10). The backtrace also says that in this case, bazz() was called from main(), at line 10.

2.6.2.4. Attaching to a Running Program with lldb

One of the neatest features of lldb is that it can attach to a program that is already running. Of course, that requires sufficient permissions to do so. A common case is stepping through a program that forks: the debugger will only trace the parent, but it is the child that needs to be traced.

To do that, start up another lldb, use ps to find the process ID for the child, and do

(lldb) process attach -p pid

in lldb, and then debug as usual.

For that to work well, the code that calls fork to create the child needs to do something like the following (courtesy of the gdb info pages):


if ((pid = fork()) < 0)		/* _Always_ check this */
	error();
else if (pid == 0) {		/* child */
	int PauseMode = 1;

	while (PauseMode)
		sleep(10);	/* Wait until someone attaches to us */
	/* ... rest of the child's code ... */
} else {			/* parent */
	/* ... rest of the parent's code ... */
}

Now all that is needed is to attach to the child, set PauseMode to 0 with expr PauseMode = 0 and wait for the sleep() call to return.

2.6.3. Using gdb

2.6.3.1. Starting gdb

Start up gdb by typing

% gdb progname

although many people prefer to run it inside Emacs. To do this, type:

M-x gdb RET progname RET

Finally, for those finding its text-based command-prompt style off-putting, there is a graphical front-end for it (devel/xxgdb) in the Ports Collection.

2.6.3.2. Running a Program with gdb

Compile the program with -g to get the most out of using gdb. It will work without, but will only display the name of the function currently running, instead of the source code. A line like:

… (no debugging symbols found) …

when gdb starts up means that the program was not compiled with -g.

At the gdb prompt, type break main. This tells the debugger to skip the preliminary set-up code in the program being run and to stop execution at the beginning of the program's own code. Now type run to start the program. It will start at the beginning of the set-up code and then be stopped by the debugger when it calls main().

To step through the program a line at a time, type n. When at a function call, step into it by typing s. Once in a function call, return from it by typing finish (f on its own is short for frame, not finish), or use up and down to take a quick look at the caller.

Here is a simple example of how to spot a mistake in a program with gdb. This is our program (with a deliberate mistake):

#include <stdio.h>

int bazz(int anint);

main() {
	int i;

	printf("This is my program\n");
	bazz(i);
	return 0;
}

int bazz(int anint) {
	printf("You gave me %d\n", anint);
	return anint;
}

This program is meant to set i to 5 and pass it to a function bazz(), which prints out the number it was given.

Compiling and running the program displays

% cc -g -o temp temp.c
% ./temp
This is my program
You gave me 4231

That was not what we expected! Time to see what is going on!

% gdb temp
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc.
(gdb) break main				Skip the set-up code
Breakpoint 1 at 0x160f: file temp.c, line 9.	gdb puts breakpoint at main()
(gdb) run					Run as far as main()
Starting program: /home/james/tmp/temp		Program starts running

Breakpoint 1, main () at temp.c:9		gdb stops at main()
(gdb) n						Go to next line
This is my program				Program prints out
(gdb) s						step into bazz()
bazz (anint=4231) at temp.c:17			gdb displays stack frame
(gdb)

Hang on a minute! How did anint get to be 4231? Was it not set to 5 in main()? Let us move up to main() and have a look.

(gdb) up					Move up call stack
#1  0x1625 in main () at temp.c:11		gdb displays stack frame
(gdb) p i					Show us the value of i
$1 = 4231					gdb displays 4231

Oh dear! Looking at the code, we forgot to initialize i. We meant to put


main() {
	int i;

	i = 5;
	printf("This is my program\n");

but we left the i=5; line out. As we did not initialize i, it had whatever number happened to be in that area of memory when the program ran, which in this case happened to be 4231.

Note:

The gdb command displays the stack frame every time we go into or out of a function, even if we are using up and down to move around the call stack. This shows the name of the function and the values of its arguments, which helps us keep track of where we are and what is going on. (The stack is a storage area where the program stores information about the arguments passed to functions and where to go when it returns from a function call.)

2.6.3.3. Examining a Core File with gdb

A core file is basically a file which contains the complete state of the process when it crashed. In the good old days, programmers had to print out hex listings of core files and sweat over machine code manuals, but now life is a bit easier. Incidentally, under FreeBSD and other 4.4BSD systems, a core file is called progname.core instead of just core, to make it clearer which program a core file belongs to.

To examine a core file, start up gdb in the usual way. Instead of typing break or run, type

(gdb) core progname.core

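If the core file is not in the current directory, give its full path instead, for example core /path/to/progname.core. (The dir command only adds directories to the path searched for source files.)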

The debugger should display something like this:

% gdb progname
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.13 (i386-unknown-freebsd), Copyright 1994 Free Software Foundation, Inc.
(gdb) core progname.core
Core was generated by `progname'.
Program terminated with signal 11, Segmentation fault.
Cannot access memory at address 0x7020796d.
#0  0x164a in bazz (anint=0x5) at temp.c:17
(gdb)

In this case, the program was called progname, so the core file is called progname.core. We can see that the program crashed due to trying to access an area in memory that was not available to it in a function called bazz.

Sometimes it is useful to be able to see how a function was called, as the problem could have occurred a long way up the call stack in a complex program. bt causes gdb to print out a back-trace of the call stack:

(gdb) bt
#0  0x164a in bazz (anint=0x5) at temp.c:17
#1  0xefbfd888 in end ()
#2  0x162c in main () at temp.c:11
(gdb)

The end() function is called when a program crashes; in this case, the bazz() function was called from main().

2.6.3.4. Attaching to a Running Program with gdb

One of the neatest features of gdb is that it can attach to a program that is already running. Of course, that requires sufficient permissions to do so. A common case is stepping through a program that forks: the debugger will only trace the parent, but it is the child that needs to be traced.

To do that, start up another gdb, use ps to find the process ID for the child, and do

(gdb) attach pid

in gdb, and then debug as usual.

For that to work well, the code that calls fork to create the child needs to do something like the following (courtesy of the gdb info pages):


if ((pid = fork()) < 0)		/* _Always_ check this */
	error();
else if (pid == 0) {		/* child */
	int PauseMode = 1;

	while (PauseMode)
		sleep(10);	/* Wait until someone attaches to us */
	/* ... rest of the child's code ... */
} else {			/* parent */
	/* ... rest of the parent's code ... */
}

Now all that is needed is to attach to the child, set PauseMode to 0 with set var PauseMode = 0, and wait for the sleep() call to return!
