strace
Description
Strace is a handy tool for seeing how an application is interacting with the operating system using the system calls. It displays the used system calls and parameters. It also shows the return codes of the system calls. In addition strace can attach to a running process making it a suitable tool for debugging the system calls that an already running process is calling.
Strace is especially helpful for:
Figuring out why the program is failing to start or exits early.
Finding out what (configuration) files are used by the program.
Checking out whether the program is idle or wasting battery when it should not be doing anything.
Finding out where a frozen program got stuck.
Viewing an application's communications with the network.
Packages
source: strace
binary: strace
Usage Examples
Installing Strace
The strace
tool can be installed to the Internet
Tablet. In Scratchbox it is possible to use the Scratchbox-provided
strace
. Using strace
in an X86 target rather
than in an ARMEL target is preferable for the reason that in an ARMEL
target strace
will actually trace the system calls of the
QEMU
process (which is used to run the arm binary).
Using Strace
These examples can be run both in Scratchbox X86 environment and in the Internet Tablet device itself.
Getting a System Call Trace from an Application
To get a simple list of the system calls the application is calling,
strace is run with the path to the program as a parameter. Below
strace is run with the small gdb_example
program that is
introduced in the maemo debugging guide.
[sbox-DIABLO_X86: ~/src/testing/gdb_example] > strace ./gdb_example execve("./gdb_example", ["./gdb_example"], [/* 50 vars */]) = 0 brk(0) = 0x804a000 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001c000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/lib/tls/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/tls/i686/sse2", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/tls/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/tls/i686", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/tls/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/tls/sse2", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/tls/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/tls", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/i686/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/i686/sse2", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/i686/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/i686", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/sse2/libc.so.6", O_RDONLY) = -1 ENOENT (No such file or directory) stat64("/lib/sse2", 0xbff86948) = -1 ENOENT (No such file or directory) open("/lib/libc.so.6", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 Y\1\000"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=1213256, ...}) = 0 mmap2(NULL, 1222300, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4001d000 mmap2(0x40142000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x124) = 0x40142000 mmap2(0x40145000, 9884, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40145000 close(3) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40148000 set_thread_area({entry_number:-1 -> 6, base_addr:0x401486c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 mprotect(0x40142000, 4096, PROT_READ) = 0 fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 2), ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40149000 write(1, "Now in main().\n", 15Now in main(). ) = 15 write(1, "Now in example_1(1) function.\n", 30Now in example_1(1) function. ) = 30 write(1, "Now in example_2(parameter here)"..., 43Now in example_2(parameter here) function. ) = 43 write(1, "Now in example_3()\n", 19Now in example_3() ) = 19 exit_group(0) = ? [sbox-DIABLO_X86: ~/src/testing/gdb_example] >
strace
in an ARMEL target, the option
-f
needs to be given, so that the process run under the
QEMU is also traced.
Essentially, one line in the output refers to one system call made to the Linux kernel. Every line has the name of the system call, the parameters to the call, and finally the return value of the system call.
For example, the line:
write(1, "Now in main().\n", 15Now in main(). ) = 15
tells that the system call that was called from gdb_example was
write()
with parameters (1,"Now in main().\n",
15)
, next the output of the call is shown: "Now in
main()."
and finally the return code is displayed:
=15
.
Note: Failed system calls usually (not always) return a return code of -1.
An example of a failed stat64
call would be:
stat64("/lib/sse2", 0xbff86948) = -1 ENOENT (No such file or directory)
Here the stat64
return value of -1 is displayed together
with the symbolic name of ENOENT
error code. You can see
clearly that the program tried to access /lib/sse2
, but
failed for the reason that the file does not exist.
Studying the output of strace and comparing it to the source file
tells that, for example, the gdb_example program's
printf()
functions are each calling the
write()
system call. In addition, it shows all the
open()
and stat64()
and other system calls
in the list.
Attaching to a Running Process with Strace
One can attach to a running process with strace. This is helpful, especially when debugging a daemon type of an application, and it is necessary to see what system calls the daemon is calling without restarting the daemon.
In the following example, the top
application is started
inside Scratchbox and left running in one terminal. Then another
terminal will be opened and strace
is attached to the
top
application.
Start the
top
application inside one terminal in Scratchbox X86 target:[sbox-DIABLO_X86: ~] > top ## the screen clears and top starts to loop... top - 14:49:35 up 17 days, 23:38, 0 users, load average: 0.07, 0.02, 0.00 Tasks: 115 total, 2 running, 113 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si Mem: 775176k total, 754708k used, 20468k free, 126608k buffers Swap: 1622524k total, 19716k used, 1602808k free, 413752k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 12807 user 16 0 2184 1172 888 R 79.0 0.2 0:00.09 top 1 root 16 0 1632 536 448 S 0.0 0.1 0:01.56 init 2 root 34 19 0 0 0 R 0.0 0.0 0:00.08 ksoftirqd/0 3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0 4 root 10 -5 0 0 0 S 0.0 0.0 0:00.18 events/0 5 root 10 -5 0 0 0 S 0.0 0.0 0:00.01 khelper 6 root 10 -5 0 0 0 S 0.0 0.0 0:00.00 kthread 8 root 10 -5 0 0 0 S 0.0 0.0 0:00.11 kblockd/0 9 root 10 -5 0 0 0 S 0.0 0.0 0:00.05 kacpid ... etc ...
Start another terminal window, log in on the Scratchbox and run the following commands:
[sbox-DIABLO_X86: ~] > pidof top 12807 [sbox-DIABLO_X86: ~] > strace -p 12807
With the
-p
flag strace can be attached to a running process.The
pidof
command tells the process id (PID) of a running program. In this example, it is 12807 (in your machine the number will be different). The output of strace would look similar to this:Process 12807 attached - interrupt to quit select(1, [0], NULL, NULL, {2, 320000}) = 0 (Timeout) fcntl64(0, F_SETFL, O_RDWR|O_NONBLOCK) = 0 read(0, 0xbfeb191f, 1) = -1 EAGAIN (Resource temporarily unavailable) ioctl(0, TCFLSH, 0) = 0 fcntl64(0, F_SETFL, O_RDWR) = 0 gettimeofday({1195023907, 63212}, {4294967176, 0}) = 0 open("/proc", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 7 fstat64(7, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0 fcntl64(7, F_SETFD, FD_CLOEXEC) = 0 getdents64(7, /* 35 entries */, 1024) = 1008 getdents64(7, /* 37 entries */, 1024) = 1008 stat64("/proc/1", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0 open("/proc/1/stat", O_RDONLY) = 8 read(8, "1 (init) S 0 1 1 0 -1 4194560 43"..., 1023) = 187 close(8) = 0 open("/proc/1/statm", O_RDONLY) = 8 read(8, "738 463 133 21 0 361 0\n", 1023) = 23 close(8) = 0 stat64("/proc/2", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0 open("/proc/2/stat", O_RDONLY) = 8 read(8, "2 (kthreadd) S 0 0 0 0 -1 41024 "..., 1023) = 128 close(8) = 0 open("/proc/2/statm", O_RDONLY) = 8 read(8, "0 0 0 0 0 0 0\n", 1023) = 14 close(8) = 0 stat64("/proc/3", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0 open("/proc/3/stat", O_RDONLY) = 8 read(8, "3 (migration/0) S 2 0 0 0 -1 410"..., 1023) = 135 close(8) = 0 ... long list of output ...
Studying the output of strace, you can see what system calls the
top
program is calling during its main loop. In this case,top
is opening a lot of files under the/proc/
file system to get information about the running processes in the system.
How to Trace All the Running Applications
Occasionally it may be necessary to trace all the running applications
in the device. For example, it may be necessary to know which process
accesses some specific file, while not keeping it open all the time
(if it is kept open all the time, it can be found from
/proc/*/fd/
or with lsof
).
In the following example from the device, the open()
system calls are traced from all running programs started after the
DSME
process.
/home/user # strace -e trace=open \ $(cd /proc/; dsme=$(pidof dsme|cut -d' ' -f1); \ for pid in [0-9]*; do \ if [ $pid -gt $dsme ] && [ $pid != $$ ]; then \ echo "-p $pid"; \ fi; \ done)
Note: The above command does not start to trace the
DSME
. Accidentally suspending DSME
would
cause the device to reboot because DSME
keeps the device's
HW watchdog refreshed.
The output of the command will look similar to this:
/home/user # strace -e trace=open \ > $(cd /proc/; dsme=$(pidof dsme|cut -d' ' -f1); \ > for pid in [0-9]*; do \ > if [ $pid -gt $dsme ] && [ $pid != $$ ]; then \ > echo "-p $pid"; \ > fi; \ > done) Process 1019 attached - interrupt to quit Process 1036 attached - interrupt to quit Process 1039 attached - interrupt to quit ...snip... Process 13559 attached - interrupt to quit attach: ptrace(PTRACE_ATTACH, ...): No such process Process 332 attached - interrupt to quit Process 3606 attached - interrupt to quit attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted Process 434 attached - interrupt to quit Process 682 attached - interrupt to quit ...snip... Process 332 detached [pid 762] open("/dev/dspctl/ctl", O_RDWR) = 6 [pid 762] open("/dev/dspctl/ctl", O_RDWR) = 6 [pid 762] open("/dev/dspctl/ctl", O_RDWR) = 6 [pid 709] --- SIGALRM (Alarm clock) @ 0 (0) --- [pid 709] --- SIGALRM (Alarm clock) @ 0 (0) --- [pid 762] open("/dev/dspctl/ctl", O_RDWR) = 6 [pid 709] --- SIGALRM (Alarm clock) @ 0 (0) --- ... etc ...
The output shows which processes strace attaches to and what are the files those processes are opening.
strace
might not show all of them. So, basically the
output in this kind of a situation is only indicative. For more
accurate information, a specific single processes can be traced.
Other Strace Options
Here are examples of some of the useful strace options.
-f fork
The
-f fork
option tells strace to follow the execution to the forked process from the calling process. By default, strace does not follow forks.-e expr
The
-e
option is handy when it is necessary to limit the scope of the output. For example,[sbox-DIABLO_X86: ~/src/testing/gdb_example] > strace -e trace=network ./gdb_example
would only output those calls that are network-related. In this example, there would be no other output than the normal output of the program.
However,
[sbox-DIABLO_X86: ~/src/testing/gdb_example] > strace -e trace=file ./gdb_example
would only show which system calls are called related to accessing files.
-tt
The
-tt
option makes strace precede each system call with the time it was called. This could potentially be used for rough performance analysis.Start a process and trace all IPC and network related system calls it makes and print the time of day with microseconds for each traced system call.
[sbox-DIABLO_X86: ~/] > strace -tt -e trace=ipc,network ./some_program -programs -own -args