oprofile
Description
Oprofile is a low-overhead, system-wide profiler for Linux. It can be used to find CPU usage bottlenecks both across the whole system and within individual processes.
Packages
source: oprofile
binary: oprofile
Installing Oprofile
Configuring the device
In order to run oprofile on your device, the device's kernel needs to be oprofile-enabled. A pre-compiled kernel with oprofile support is provided in the kernel-diablo-oprofile package.
Getting a new kernel
The kernel package can be downloaded from the Diablo tools repository inside the Scratchbox environment (in an ARM target) with the command:
[sbox-arm: ~] > fakeroot apt-get install kernel-diablo-oprofile
This will install the kernel image to /boot/:
[sbox-arm: ~] > ls -l /boot/
total 1512
-rw-r--r-- 1 user users 1543892 Dec 20 10:02 zImage-oprofile-diablo-200850
The numbers at the end of the filename resemble a date and will be different if the kernel has been updated since this document was written.
Copy the image outside the Scratchbox chroot environment so that you can easily access it when you are ready to flash it:
[sbox-arm: ~] > cp /boot/zImage-oprofile-diablo-200850 /tmp
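Because the suffix of the image name changes between kernel updates, the file can be looked up instead of typed by hand. The following is a minimal sketch: the function name find_oprofile_kernel is hypothetical, and it assumes that images follow the zImage-oprofile-diablo-<suffix> naming shown above and that a lexically later suffix means a newer kernel.

```shell
# find_oprofile_kernel [DIR]
# Hypothetical helper: print the lexically last zImage-oprofile-diablo-* file
# in DIR (default: /boot). Assumes a later suffix means a newer kernel.
# The body runs in a subshell so its variables do not leak to the caller.
find_oprofile_kernel() (
    dir="${1:-/boot}"
    latest=""
    # Pathname expansion yields matches in sorted order, so the last
    # existing match is the one with the highest suffix.
    for image in "$dir"/zImage-oprofile-diablo-*; do
        [ -e "$image" ] && latest="$image"
    done
    [ -n "$latest" ] || exit 1
    printf '%s\n' "$latest"
)

# Example use inside Scratchbox (commented out so the block has no side effects):
# cp "$(find_oprofile_kernel /boot)" /tmp
```

The function returns a non-zero status when no matching image exists, so a calling script can stop before attempting to flash anything.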
Using the new kernel
There are two ways to use the kernel image. You can flash it to the device or you can just boot the device so that it uses this kernel until you reboot it again.
Option 1: Flashing the kernel
To flash your new kernel image, use the Linux flasher utility (as root):
$ flasher -f --kernel /tmp/zImage-oprofile-diablo-200850 -R
Flashing just the kernel does not destroy your other data. After the flasher has finished, your device is ready for oprofile.
Option 2: Boot with the new kernel
$ flasher --load --boot --kernel /tmp/zImage-oprofile-diablo-200850
With the boot option, the power cord must stay plugged in until you want to revert to the previous kernel.
Restoring the normal kernel
After you are done with oprofile you can restore the normal kernel. How you do this depends on which one of the options above you used to load the oprofile kernel.
Option 1: You chose to flash the oprofile kernel
In this case the normal kernel needs to be re-flashed:
$ flasher -f --flash-only kernel -F <FIASCO image> -R
# The FIASCO image is the whole product image with a name like:
# RX-44_DIABLO_3.2008.17-8_PR_COMBINED_MR0_ARM.bin
Note that just re-flashing the kernel part does not overwrite any of the other parts, so your data and settings will be intact.
Option 2: You chose to just boot with the new kernel
Restoring the old kernel is as easy as unplugging the power cable and power cycling the device.
Recognizing a suitable kernel
It's easy to see if your current kernel does not support oprofile by testing it with opcontrol:
Nokia-N810:~# opcontrol --status
modprobe: cannot parse modules.dep
modprobe: cannot parse modules.dep
Kernel doesn't support oprofile
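The same check can be scripted by looking for the failure message in the output of opcontrol --status. This is a minimal sketch, assuming the message text is exactly as shown above. The function name kernel_lacks_oprofile is hypothetical, and it reads the status text from standard input so that it can be tried without a device.

```shell
# kernel_lacks_oprofile
# Hypothetical helper: reads `opcontrol --status` output on stdin and
# succeeds (exit status 0) when it contains the failure message shown above.
kernel_lacks_oprofile() {
    grep -q "Kernel doesn't support oprofile"
}

# Example use on the device (commented out so the block has no side effects):
# if opcontrol --status 2>&1 | kernel_lacks_oprofile; then
#     echo "Boot or flash the oprofile-enabled kernel first." >&2
# fi
```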
Installing oprofile to the device
Provided that you have the Diablo tools repository in your APT sources.list, the easiest way to install oprofile is with apt:
Nokia-N810:~# apt-get install oprofile
This will also install binutils.
Installing debug symbols
To view useful profiling information at the function level, you have to install debugging symbols. Debugging symbols normally come in debug (-dbg) packages. The easiest way to install all the -dbg packages required for a given binary is to use the debug-dep-install script, which comes with the maemo-debug-scripts package:
Nokia-N810:~# apt-get install maemo-debug-scripts
Nokia-N810:~# debug-dep-install /usr/bin/osso-xterm.launch
Usage
On the device, type:
Nokia-N810:~# opcontrol --init
Nokia-N810:~# opcontrol --no-vmlinux
Nokia-N810:~# opcontrol -e=CPU_CYCLES:100000
100000 is the number of CPU cycles between sampling interrupts. Increasing this number lowers the accuracy but also decreases the CPU overhead. --no-vmlinux indicates that we are not interested in the kernel or do not have an unstripped kernel image.
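To get a feel for this trade-off, the sampling rate can be estimated by dividing the CPU clock frequency by the event count. The sketch below is plain arithmetic. The function name is hypothetical, the 400 MHz clock is an illustrative assumption rather than a value taken from this page, and the result is only a rough upper bound for a fully busy CPU.

```shell
# samples_per_second CLOCK_HZ COUNT
# Hypothetical helper: rough upper bound on samples taken per second when
# one sample is recorded every COUNT CPU cycles.
samples_per_second() {
    echo $(( $1 / $2 ))
}

# Illustrative figures (the 400 MHz clock is an assumption):
samples_per_second 400000000 100000     # prints 4000
samples_per_second 400000000 1000000    # prints 400
```

Under these assumptions, raising the count by a factor of ten cuts the number of samples, and with it the interrupt overhead, by the same factor.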
Start the use case you are interested in and type:
Nokia-N810:~# opcontrol --start
When you've finished, type:
Nokia-N810:~# opcontrol --stop
Now you've collected the data.
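When the use case is a single command, the whole sequence can be wrapped in one function. This is a minimal sketch, not part of oprofile: run, profile_session and the DRY_RUN switch are hypothetical names, and unlike the manual steps above it starts profiling before launching the command, so start-up costs are included in the profile.

```shell
# run CMD [ARGS...]
# Execute a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# profile_session COUNT CMD [ARGS...]
# Hypothetical wrapper: configure oprofile as described above, profile CMD
# from launch to exit, then stop profiling. Returns the exit status of CMD.
profile_session() {
    count="$1"
    shift
    run opcontrol --init &&
        run opcontrol --no-vmlinux &&
        run opcontrol -e=CPU_CYCLES:"$count" &&
        run opcontrol --start || return 1
    run "$@"
    status=$?
    run opcontrol --stop
    return "$status"
}

# Example use on the device (commented out so the block has no side effects):
# profile_session 100000 /usr/bin/osso-xterm.launch
```

Setting DRY_RUN=1 before calling profile_session prints the commands instead of running them, which is a cheap way to review the sequence first.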
Viewing profile reports
To see a basic per-process picture, run opreport:
Nokia-N810:~# opreport
CPU: ARM11 PMU, speed 0 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
     1677 69.1546 no-vmlinux
      240  9.8969 libc-2.5.so
      230  9.4845 busybox
      215  8.8660 ld-2.5.so
       58  2.3918 oprofiled
        3  0.1237 ophelp
        2  0.0825 libcrypt-2.5.so
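On a busy system the per-image list can get long, so it may help to keep only the images above a given share of samples. The filter below is a minimal sketch; the function name top_images is hypothetical, and it assumes the three-column layout (samples, percentage, image name) of the report above.

```shell
# top_images MIN_PERCENT
# Hypothetical filter: reads plain `opreport` output on stdin and prints
# "image percent" for every data row whose percentage is at least MIN_PERCENT.
# Header lines are skipped because their first field is not a plain number.
top_images() {
    awk -v min="$1" '$1 ~ /^[0-9]+$/ && $2 + 0 >= min + 0 { print $3, $2 }'
}

# Example use on the device (commented out so the block has no side effects):
# opreport 2>/dev/null | top_images 5
```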
To see a more detailed per-symbol analysis, use opreport -l:
Nokia-N810:~# opreport -l|more
warning: /no-vmlinux could not be found.
BFD: /usr/lib/debug/lib/ld-2.5.so: warning: sh_link not set for section `.ARM.exidx'
BFD: /usr/lib/debug/lib/libc-2.5.so: warning: sh_link not set for section `.ARM.exidx'
CPU: ARM11 PMU, speed 0 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        app name                 symbol name
1695     65.3179  no-vmlinux               (no symbols)
255       9.8266  busybox                  (no symbols)
222       8.5549  oprofiled                (no symbols)
43        1.6570  libc-2.5.so              strchr
37        1.4258  ld-2.5.so                check_match.0
32        1.2331  ld-2.5.so                do_lookup_x
17        0.6551  ld-2.5.so                _dl_relocate_object
17        0.6551  libc-2.5.so              _dl_addr
16        0.6166  ld-2.5.so                strcmp
14        0.5395  ld-2.5.so                _dl_lookup_symbol_x
13        0.5010  ld-2.5.so                __udivsi3
13        0.5010  ld-2.5.so                _dl_fixup
12        0.4624  libc-2.5.so              _int_malloc
10        0.3854  libc-2.5.so              strcmp
8         0.3083  ophelp                   (no symbols)
7         0.2697  libc-2.5.so              strpbrk
7         0.2697  libc-2.5.so              vfprintf
Profiling with callgraphs
Quite often a basic flat profile is not enough. In such cases a callgraph profile can be used. To profile code with callgraphs:
Add -fno-omit-frame-pointer to GCC options and recompile all the code (binaries, libraries) involved. You can do this without changing the package build rules by setting the appropriate scratchbox environment variable (documented in the Scratchbox installation /scratchbox/doc/variables.txt file) before re-building the packages:
[sbox-arm: ~] > export SBOX_BLOCK_COMPILER_ARGS="-fomit-frame-pointer"
[sbox-arm: ~] > export SBOX_EXTRA_COMPILER_ARGS="-fno-omit-frame-pointer"
[sbox-arm: ~] > cd package-1/
[sbox-arm: ~/package-1] > dpkg-buildpackage -rfakeroot
[sbox-arm: ~/package-1] > cd ../package-2/
[sbox-arm: ~/package-2] > dpkg-buildpackage -rfakeroot
...
Install the rebuilt code and debug packages on the device.
Initialize oprofile as usual, but add the -c parameter:
Nokia-N810:~# opcontrol --init
Nokia-N810:~# opcontrol --no-vmlinux
Nokia-N810:~# opcontrol -e=CPU_CYCLES:100000 -c
Add -c to opreport:
Nokia-N810:~# opreport -l -c
Viewing reports from a PC
opreport -l, and especially opreport -c -l, can take a long time (around 10 minutes) when run on N800/N810 devices. Therefore, it often makes sense to run opreport in Scratchbox instead.
Configure the Scratchbox target so that its binaries and libraries exactly match those on the device.
Collect profiling data as usual.
Copy the contents of /var/lib/oprofile from the device to the corresponding directory in the Scratchbox target.
In Scratchbox, run apt-get install maemo-debug-scripts (this step must not be skipped).
Install the debug packages either with debug-dep-install or by hand.
Note: the binaries and libraries in the scratchbox target must match what's in the device, otherwise you will get bogus results.
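One way to check that the files really match is to compare checksums. The sketch below is not part of oprofile or maemo-debug-scripts: both function names are hypothetical, and it assumes that md5sum (including its -c option) is available on the device and in the Scratchbox target and that the files live at the same absolute paths on both sides.

```shell
# make_checksum_list FILE...
# Hypothetical helper, run on the device: print an md5sum list for the
# binaries and libraries that appear in the profile.
make_checksum_list() {
    md5sum "$@"
}

# verify_checksum_list LIST
# Hypothetical helper, run in the Scratchbox target: succeed only if every
# file named in LIST exists at the same path with the same checksum.
verify_checksum_list() {
    md5sum -c "$1" >/dev/null 2>&1
}

# Example use (commented out so the block has no side effects):
# On the device:  make_checksum_list /usr/bin/osso-xterm.launch > /tmp/device.md5
# In Scratchbox:  verify_checksum_list /tmp/device.md5 || echo "files differ" >&2
```

The list file has to be copied from the device to Scratchbox along with the profiling data.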
Oprofile with kcachegrind
kcachegrind is a GUI tool for viewing performance data interactively. It comes with many modern Linux distributions.
To use it:
Collect the callgraph oprofile data (see above) and install the same packages in Scratchbox as well.
Copy the profile data to the Scratchbox session as described above.
Install the kcachegrind-converters package on the host (Debian, Ubuntu).
In Scratchbox, run opreport -gdf | op2calltree (you may want to copy the op2calltree script somewhere on the target first).
The resulting files can now be opened with kcachegrind on the host, provided you set it to display all files (the file extensions are not the ones it expects).