oprofile

Description

Oprofile is a low-overhead, system-wide profiler for Linux. It can be used to find CPU usage bottlenecks both across the whole system and within individual processes.

Packages

source: oprofile

binary: oprofile

Installing Oprofile

Configuring the device

In order to run oprofile on your device, the device's kernel needs to be oprofile-enabled. A pre-compiled kernel with oprofile support is provided in the kernel-diablo-oprofile package.

Getting a new kernel

The kernel package can be downloaded and installed from the Diablo tools repository inside the Scratchbox environment (in an ARM target) with the command:

[sbox-arm: ~] > fakeroot apt-get install kernel-diablo-oprofile

This will install the kernel image to /boot/:

[sbox-arm: ~] > ls -l /boot/
total 1512
-rw-r--r--  1 user users 1543892 Dec 20 10:02 zImage-oprofile-diablo-200850

The numbers at the end of the filename resemble a date and will differ if the kernel has been updated since this document was written.

Copy the image to a location outside the Scratchbox chroot environment so that you can easily access it when you are ready to flash it:

[sbox-arm: ~] > cp /boot/zImage-oprofile-diablo-200850 /tmp
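
Because the numeric suffix changes whenever the kernel package is updated, scripts are better off not hardcoding the file name. The following is a minimal sketch and not part of the oprofile or Scratchbox tooling: latest_oprofile_kernel is a hypothetical helper name, and it assumes the zImage-oprofile-diablo-<number> naming shown above and that a larger number means a newer image.

```shell
# Hypothetical helper: print the newest oprofile kernel image in a directory.
# Assumption: images are named zImage-oprofile-diablo-<number> and a larger
# number means a newer image.
latest_oprofile_kernel() {
    dir=${1:-/boot}
    newest=
    for f in "$dir"/zImage-oprofile-diablo-*; do
        [ -e "$f" ] || continue   # the pattern matched nothing
        newest=$f                 # patterns expand in sorted order, so the last match wins
    done
    # Fail (non-zero status) when no image was found.
    [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For example, cp "$(latest_oprofile_kernel)" /tmp copies whichever image is currently installed in /boot.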

Using the new kernel

There are two ways to use the kernel image. You can either flash it to the device, or boot the device with it once so that the kernel is used only until the next reboot.

  • Option 1: Flashing the kernel

    To flash your new kernel image, use the Linux flasher utility (as root):

    $ flasher -f --kernel /tmp/zImage-oprofile-diablo-200850 -R
    

    Flashing just the kernel does not destroy your other data. After the flasher has finished, your device is ready for oprofile.

  • Option 2: Boot with the new kernel

    $ flasher --load --boot --kernel /tmp/zImage-oprofile-diablo-200850
    

    With the boot option, the power cord has to stay plugged in until you want to revert to the previous kernel.

Restoring the normal kernel

After you are done with oprofile you can restore the normal kernel. How you do this depends on which one of the options above you used to load the oprofile kernel.

  • Option 1: You chose to flash the oprofile kernel

    In this case the normal kernel needs to be re-flashed:

    $ flasher -f --flash-only kernel -F <FIASCO image> -R
    
    
    # The FIASCO image is the whole product image with a name like:
    # RX-44_DIABLO_3.2008.17-8_PR_COMBINED_MR0_ARM.bin
    

    Note that just re-flashing the kernel part does not overwrite any of the other parts, so your data and settings will be intact.

  • Option 2: You chose to just boot with the new kernel

    Restoring the old kernel is as easy as unplugging the power cable and power cycling the device.

Recognizing a suitable kernel

It's easy to see if your current kernel does not support oprofile by testing it with opcontrol:

Nokia-N810:~# opcontrol --status
modprobe: cannot parse modules.dep
modprobe: cannot parse modules.dep
Kernel doesn't support oprofile
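
The same check can be scripted. The sketch below assumes that the failure message is exactly the one shown above and treats any other output as a supported kernel, which is a simplification. kernel_supports_oprofile is a hypothetical helper name.

```shell
# Hypothetical helper: read "opcontrol --status" output on stdin and succeed
# unless it contains the failure message shown above.
# Assumption: any output without that message means the kernel is usable.
kernel_supports_oprofile() {
    ! grep -q "Kernel doesn't support oprofile"
}
```

On the device it could be used as: opcontrol --status 2>&1 | kernel_supports_oprofile || echo "boot or flash the oprofile kernel first"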

Installing oprofile to the device

Provided that you have the Diablo tools repository in your APT sources.list, the easiest way to install oprofile is with apt:

Nokia-N810:~# apt-get install oprofile

This will also install binutils.

Installing debug symbols

To view any useful profiling information at the function level, you have to install debugging symbols. Debugging symbols normally come in debugging (-dbg) packages. The easiest way to install all the -dbg packages required for a given binary is to use the debug-dep-install script, which comes with the maemo-debug-scripts package:

Nokia-N810:~# apt-get install maemo-debug-scripts
Nokia-N810:~# debug-dep-install /usr/bin/osso-xterm.launch

Usage

  1. On the device, type:

    Nokia-N810:~# opcontrol --init
    Nokia-N810:~# opcontrol --no-vmlinux
    Nokia-N810:~# opcontrol -e=CPU_CYCLES:100000
    

    100000 is the number of CPU cycles between sampling interrupts. Increasing this number lowers the accuracy but also decreases the CPU overhead. For example, on a CPU running at 400 MHz this value gives roughly 4000 samples per second while the CPU is busy. --no-vmlinux indicates that we are either not interested in the kernel or do not have an unstripped kernel image.

  2. Start the use case you are interested in and type:

    Nokia-N810:~# opcontrol --start
    
  3. When you've finished, type:

    Nokia-N810:~# opcontrol --stop
    

Now you've collected the data.
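
When the use case is a single command, the steps above can be wrapped in a small function. This is only a sketch: profile_run and the OPCONTROL override are hypothetical names introduced here, and unlike the manual procedure it starts sampling just before launching the command, so the command's startup is included in the profile.

```shell
# Sketch: run one command with oprofile sampling active around it.
# profile_run and OPCONTROL are hypothetical; OPCONTROL only exists so the
# function can be tried out with a stand-in for the real opcontrol.
profile_run() {
    op=${OPCONTROL:-opcontrol}
    # Same setup sequence as in step 1, then start sampling.
    "$op" --init &&
    "$op" --no-vmlinux &&
    "$op" -e=CPU_CYCLES:100000 &&
    "$op" --start || return 1
    rc=0
    "$@" || rc=$?      # run the use case and remember its exit status
    "$op" --stop       # stop sampling even if the command failed
    return "$rc"
}
```

For example, profile_run /usr/bin/osso-xterm.launch would profile that binary from launch until it exits.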

Viewing profile reports

To see a basic per-process picture, run opreport:

Nokia-N810:~# opreport
CPU: ARM11 PMU, speed 0 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00 (No unit mask) count 100000
CPU_CYCLES:100000|
  samples|      %|
------------------
     1677 69.1546 no-vmlinux
      240  9.8969 libc-2.5.so
      230  9.4845 busybox
      215  8.8660 ld-2.5.so
       58  2.3918 oprofiled
        3  0.1237 ophelp
        2  0.0825 libcrypt-2.5.so
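
On a busy system this list can get long. As a sketch, the report can be filtered down to the images that account for at least a given share of the samples. images_above is a hypothetical helper name, and the filter assumes the three-column "samples, percentage, image" row layout shown above.

```shell
# Hypothetical helper: read opreport output on stdin and print
# "percentage image" for every row whose share is at least MIN percent.
# Assumption: data rows look like "<samples> <percent> <image>"; header
# lines are skipped because their first field is not a plain number.
images_above() {
    awk -v min="$1" '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9.]+$/ && $2 + 0 >= min + 0 { print $2, $3 }
    '
}
```

For example, opreport | images_above 5 lists only the images with at least 5% of the samples.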

For a more detailed, per-symbol analysis, use opreport -l:

Nokia-N810:~# opreport -l|more
warning: /no-vmlinux could not be found.
BFD: /usr/lib/debug/lib/ld-2.5.so: warning: sh_link not set for section `.ARM.exidx'
BFD: /usr/lib/debug/lib/libc-2.5.so: warning: sh_link not set for section `.ARM.exidx'
CPU: ARM11 PMU, speed 0 MHz (estimated)
Counted CPU_CYCLES events (clock cycles counter) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        app name                 symbol name
1695     65.3179  no-vmlinux               (no symbols)
255       9.8266  busybox                  (no symbols)
222       8.5549  oprofiled                (no symbols)
43        1.6570  libc-2.5.so              strchr
37        1.4258  ld-2.5.so                check_match.0
32        1.2331  ld-2.5.so                do_lookup_x
17        0.6551  ld-2.5.so                _dl_relocate_object
17        0.6551  libc-2.5.so              _dl_addr
16        0.6166  ld-2.5.so                strcmp
14        0.5395  ld-2.5.so                _dl_lookup_symbol_x
13        0.5010  ld-2.5.so                __udivsi3
13        0.5010  ld-2.5.so                _dl_fixup
12        0.4624  libc-2.5.so              _int_malloc
10        0.3854  libc-2.5.so              strcmp
8         0.3083  ophelp                   (no symbols)
7         0.2697  libc-2.5.so              strpbrk
7         0.2697  libc-2.5.so              vfprintf
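
To focus on one binary or library, the symbol report can be filtered by its "app name" column. The following is a sketch only: symbols_of is a hypothetical helper name, and it assumes the four-column layout shown above with app names that contain no spaces.

```shell
# Hypothetical helper: read "opreport -l" output on stdin and print
# "samples percentage symbol" for the rows belonging to one app.
# Assumption: columns are "<samples> <percent> <app> <symbol...>".
symbols_of() {
    awk -v app="$1" '
        $1 ~ /^[0-9]+$/ && $3 == app {
            line = $1 " " $2
            for (i = 4; i <= NF; i++) line = line " " $i   # keep "(no symbols)" whole
            print line
        }
    '
}
```

For example, opreport -l | symbols_of ld-2.5.so shows only the dynamic linker's symbols.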

Profiling with callgraphs

Quite often a basic flat profile alone is not informative enough. In such cases a callgraph profile can be used. In order to profile code with callgraphs:

  1. Add -fno-omit-frame-pointer to the GCC options and recompile all the code involved (binaries and libraries). You can do this without changing the package build rules by setting the appropriate Scratchbox environment variables (documented in the /scratchbox/doc/variables.txt file of the Scratchbox installation) before rebuilding the packages:

    [sbox-arm: ~] > export SBOX_BLOCK_COMPILER_ARGS="-fomit-frame-pointer"
    [sbox-arm: ~] > export SBOX_EXTRA_COMPILER_ARGS="-fno-omit-frame-pointer"
    [sbox-arm: ~] > cd package-1/
    [sbox-arm: ~/package-1] > dpkg-buildpackage -rfakeroot
    [sbox-arm: ~/package-1] > cd ../package-2/
    [sbox-arm: ~/package-2] > dpkg-buildpackage -rfakeroot
    ...
    
  2. Install the rebuilt code and debug packages on the device.

  3. Initialize oprofile as usual, but add the -c parameter:

    Nokia-N810:~# opcontrol --init
    Nokia-N810:~# opcontrol --no-vmlinux
    Nokia-N810:~# opcontrol -e=CPU_CYCLES:100000 -c
    
  4. Add -c to opreport:

    Nokia-N810:~# opreport -l -c
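
Step 1 above rebuilds each package by hand. When several packages are involved, the rebuild can be looped, as in this sketch. rebuild_with_frame_pointers and the BUILD_CMD override are hypothetical names introduced here; the two SBOX_ variables and the default build command are the ones from step 1.

```shell
# Sketch: rebuild every given package directory with frame pointers enabled.
# The function body is a subshell, so the exports and directory changes do
# not leak into the calling shell.
# BUILD_CMD is a hypothetical override for the build command.
rebuild_with_frame_pointers() (
    export SBOX_BLOCK_COMPILER_ARGS="-fomit-frame-pointer"
    export SBOX_EXTRA_COMPILER_ARGS="-fno-omit-frame-pointer"
    for pkg in "$@"; do
        # Stop at the first package that fails to build.
        ( cd "$pkg" && ${BUILD_CMD:-dpkg-buildpackage -rfakeroot} ) || exit 1
    done
)
```

Inside the Scratchbox ARM target, rebuild_with_frame_pointers package-1 package-2 would rebuild both packages in order.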
    

Viewing reports from a PC

opreport -l, and especially opreport -c -l, can take quite a long time (10 minutes) when run on N800/N810 devices. Therefore, it often makes sense to run opreport in Scratchbox instead.

  1. Configure the Scratchbox target so that its binaries and libraries match the device's exactly.

  2. Collect profiling data as usual.

  3. Copy the contents of /var/lib/oprofile from the device to the corresponding directory in the Scratchbox target.

  4. In Scratchbox, run apt-get install maemo-debug-scripts (this may not be omitted).

  5. Install the debug packages, either with debug-dep-install or by hand.
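
One way to carry out the copy in step 3 is to pack the sample directory into a tar archive on the device and unpack it in the Scratchbox target. This is a sketch under assumptions: pack_samples and unpack_samples are hypothetical helper names, the optional ROOT argument exists only to try the functions outside a real device, and how the archive travels between the two machines is left open.

```shell
# Sketch: move /var/lib/oprofile between machines as a tar archive.
# Pass ARCHIVE as an absolute path, since tar changes directory with -C.

# On the device. usage: pack_samples ARCHIVE [ROOT]
pack_samples() {
    tar -C "${2:-}/var/lib" -cf "$1" oprofile
}

# In the Scratchbox target. usage: unpack_samples ARCHIVE [ROOT]
unpack_samples() {
    mkdir -p "${2:-}/var/lib" &&
    tar -C "${2:-}/var/lib" -xf "$1"
}
```

For example, run pack_samples /tmp/oprofile-data.tar on the device, transfer the file, and run unpack_samples /tmp/oprofile-data.tar inside the Scratchbox target.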

Note: the binaries and libraries in the Scratchbox target must match those on the device; otherwise you will get bogus results.
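
A quick way to check that the files match is to compare checksums. The sketch below is an assumption-laden example rather than an established procedure: record_checksums and verify_checksums are hypothetical helper names, it relies on an md5sum command being available on both sides, and you have to choose which files to list (typically the binaries and libraries that show up in the report).

```shell
# Sketch: detect mismatching binaries between the device and Scratchbox.

# On the device. usage: record_checksums LISTFILE FILE...
record_checksums() {
    out=$1
    shift
    md5sum "$@" > "$out"
}

# In the Scratchbox target. usage: verify_checksums LISTFILE [ROOT]
# Fails (non-zero status) if any listed file differs or is missing.
# ROOT only matters when the list contains relative paths.
verify_checksums() {
    ( cd "${2:-/}" && md5sum -c "$1" )
}
```

For example, record_checksums /tmp/device.md5 /lib/libc-2.5.so /lib/ld-2.5.so on the device, followed by verify_checksums /tmp/device.md5 in the Scratchbox target after copying the list over.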

Oprofile with kcachegrind

kcachegrind is a useful GUI tool for viewing performance data interactively. It comes with many modern Linux distributions.

To use it:

  1. Collect the callgraph oprofile data (see above) and also install the same packages in Scratchbox.

  2. Copy the profile data to the Scratchbox session as described above.

  3. Install the kcachegrind-converters package on the host (Debian, Ubuntu).

  4. In Scratchbox, run opreport -gdf | op2calltree (you might want to copy the op2calltree script somewhere on the target).

  5. The resulting files can now be opened with kcachegrind on the host, provided you set it to display all files (the file extensions are wrong).

Links

oprofile man page

See Also

oprofileui