
        The Compaq Simple Instruction Profiler for Handhelds

  ------------------------------------------------------------------------
                  
Overview

Profiling is done in two steps:

  1. Data is gathered about applications as they run on the client
machine. This step is called profiling.
  2. The acquired data is processed. This step may be done on a host
computer or on a client.

  ------------------------------------------------------------------------

Installation

You must have the profiler module (profiler.o) loaded on your client
machine in order to profile.  When the module is inserted, a major number
is dynamically allocated for it.  Before trying to use the profiler, make
sure the device node has been created:
  mknod /dev/prof c <MAJ_NUM> 0
where <MAJ_NUM> is the allocated major number (listed in /proc/devices).
You must also have the profiler daemon (profd).
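
As a rough illustration of looking up <MAJ_NUM>, here is a minimal Python
sketch that scans text in the /proc/devices layout for a character device
and prints the matching mknod command.  The device name "profiler" is an
assumption; this document does not say under which name the module
registers itself, so check /proc/devices on your client.

```python
def find_char_major(devices_text, name):
    """Return the major number of character device `name`, or None."""
    in_char_section = False
    for line in devices_text.splitlines():
        stripped = line.strip()
        if stripped == "Character devices:":
            in_char_section = True
            continue
        if stripped == "Block devices:":
            in_char_section = False
            continue
        if in_char_section and stripped:
            # Each entry looks like "<major> <name>".
            major, _, dev_name = stripped.partition(" ")
            if dev_name.strip() == name and major.isdigit():
                return int(major)
    return None


# Sample text in the layout Linux uses for /proc/devices.
# The "profiler" entries are hypothetical.
SAMPLE = """Character devices:
  1 mem
  4 ttyS
254 profiler

Block devices:
  1 ramdisk
254 profiler_blk
"""

major = find_char_major(SAMPLE, "profiler")
print("mknod /dev/prof c %d 0" % major)
```

On a real client you would pass the contents of /proc/devices instead of
SAMPLE.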

In order to get the greatest advantage out of profiling, you will also need
a set of unstripped versions of every binary/library/kernel that you are
running on the client machine that you are interested in getting detailed
information about.  For example, if you do not have such information, the
profiler may tell you that 4% of the time was spent in libc.so.6.  If you
have an unstripped version of libc.so.6, it will be able to give a function
by function breakdown of the time spent in libc.so.6.

At the current time, there is no canonical directory of unstripped
libraries.  When there is, it can be added to the iscan Perl script so
that the script will scan it automatically.

Notes:

  1. Make sure the iprof, iscan, and imageid programs are in your
     path before attempting to process profile data.  The iprof and
     iscan scripts can be found in tools/profiling/utilities; imageid
     is built in tools/profiling/daemon (arm, alpha, or x86 versions can
     be compiled).

  2. profd can be built, if needed, from the tools/profiling/daemon
     directory.  Make sure you have built libutilities.a in tools/util
     first.  You will have to set the LINUXHOME variable in the Makefiles
     to the location of your Linux directory if it is not where the
     Makefiles expect it.

  3. Directories containing the libraries, applications, and Linux
     kernels that you would like scanned should be added to the iscan
     Perl script (or they can be passed to it as arguments).  For
     example, in my script I have:

       @libs = (
         "/skiff/local/lib/gcc-lib/arm-linux/2.95.2/",
         "/wrl/proj/itsy/kerr/ipaq/lib",
       );
       
       @apps = (
       );
       
       @linuxdirs = (
         "/wrl/proj/itsy/kerr/linuxes/linux-2.4.0-test8-rmk5-np2-hh2/kernel",
         "/wrl/proj/itsy/kerr/linuxes/linux-2.4.0-test10-rmk2-np2/kernel"
       );

       
     You can now process the profile data.

  ------------------------------------------------------------------------

Profiling (i.e., Gathering Data)

The Itsy profiler is an interrupt-driven statistical profiler. The
profiling functionality is provided by the profd daemon.

To start profiling, create an empty "database" directory (e.g., db) on
the client for storing profiles, and then invoke profd. For example:

     % mkdir db
     % profd db &

profd accepts a number of command-line options. These are:

   * a "-f" option to specify the sampling frequency in samples per second
     (defaults to 256);
   * a "-r" option to specify whether or not to randomize the sampling
     period (on by default);
   * a "-p" option to separate out data by process id (pid); and
   * a "-v" option that will result in verbose logging.
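
Because the sampling frequency is expressed in samples per second, a
sample count can be converted into an approximate amount of time.  The
Python sketch below does that arithmetic.  It assumes each sample stands
for 1/frequency seconds on average; with a randomized sampling period the
individual intervals vary, so treat the result as an estimate.

```python
DEFAULT_FREQUENCY = 256  # samples per second, the documented profd default


def samples_to_seconds(samples, frequency=DEFAULT_FREQUENCY):
    """Estimate the time represented by `samples` at `frequency` samples/s."""
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return samples / frequency


print(samples_to_seconds(256))        # 1.0
print(samples_to_seconds(1024, 512))  # 2.0
```

Pass the frequency that was actually given to profd with "-f" if it
differs from the default.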

To stop profiling, kill the profd process. profd will catch the signal,
save its in-memory sample data to files in "db", and terminate.

Here is an example directory listing after running profd on an Itsy for a
couple of minutes:

/bin/ls -l db
total 30
-rw-r--r--   1 root     root          134 Jun  4 12:05 bash-0e3383fb.prof
-rw-r--r--   1 root     root          130 Jun  4 12:05 gmanager-d98893c8.prof
-rw-r--r--   1 root     root          191 Jun  4 12:05 kaffe-dce80653.prof
-rw-r--r--   1 root     root         2079 Jun  4 12:05 kernel-00000000.prof
-rw-r--r--   1 root     root        13022 Jun  4 12:05 kernel.syms
-rw-r--r--   1 root     root          305 Jun  4 12:05 ldso1-4631481a.prof
-rw-r--r--   1 root     root          408 Jun  4 12:05 libawtso-73723a7c.prof
-rw-r--r--   1 root     root          155 Jun  4 12:05 libcso4627-5f673247.prof
-rw-r--r--   1 root     root          411 Jun  4 12:05 libcso6-0fa854e0.prof
-rw-r--r--   1 root     root          211 Jun  4 12:05 libggiso-bb74823f.prof
-rw-r--r--   1 root     root          717 Jun  4 12:05 libkaffevmso-e86f76dc.prof
-rw-r--r--   1 root     root          234 Jun  4 12:05 libnativeso-50d434a5.prof
-rw-r--r--   1 root     root          220 Jun  4 12:05 nomap-00000000.prof
-rw-r--r--   1 root     root          122 Jun  4 12:05 open-b03fac0d.prof
-rw-r--r--   1 root     root          138 Jun  4 12:05 tcsh-80dc1857.prof

The filename format is name-imageid[-pid].prof, where name is the
executable image whose unique identifier (i.e., signature) is imageid, and
pid is the process id of the executable.  Note that the imageid is
generated by iscan, and pid is included only if the -p option is given to
profd.  Also, note that profd scans the entire system starting at "/" upon
startup.

Each .prof profile is simply a plain ASCII file (future versions could use
compression via zlib).  The profile starts with a short header containing
key/value pairs that indicate the image pathnames, profiler version, etc.
This header is followed by sample data, arranged in lines.  Each line has
the format "offset count", where "offset" is a hex file offset (i.e., a
byte offset into the executable image, suitable for lseek-ing) and "count"
is the number of samples that landed on the instruction at that offset.

Finally, note that the SA1100 drains its pipeline before handling an
interrupt (we believe), so instructions may not be uniformly sampled.
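
The filename and file layout described above can be read with a few lines
of Python.  This is only a sketch under stated assumptions, not the parser
iprof uses: it assumes the imageid is always 8 hex digits (as in the
listing), that a sample line is exactly a hex offset followed by a decimal
count, and that every other non-empty line belongs to the header.  The
header line in the example is made up, since the exact key/value syntax is
not specified here.

```python
import re

# Assumption: imageid is always 8 hex digits, as in the listing above.
NAME_RE = re.compile(
    r"^(?P<name>.+?)-(?P<imageid>[0-9a-f]{8})(?:-(?P<pid>\d+))?\.prof$"
)
# Assumption: a sample line is exactly "<hex offset> <decimal count>".
SAMPLE_RE = re.compile(r"^(?:0x)?(?P<offset>[0-9a-fA-F]+)\s+(?P<count>\d+)$")


def parse_prof_name(filename):
    """Split 'name-imageid[-pid].prof' into (name, imageid, pid or None)."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError("not a profile filename: %r" % filename)
    pid = match.group("pid")
    return (match.group("name"), match.group("imageid"),
            int(pid) if pid else None)


def parse_prof_text(text):
    """Return (header_lines, {offset: count}) for the text of a .prof file."""
    header, samples = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = SAMPLE_RE.match(line)
        if match:
            offset = int(match.group("offset"), 16)
            count = int(match.group("count"))
            samples[offset] = samples.get(offset, 0) + count
        else:
            # Anything that is not a sample line is kept as header text.
            header.append(line)
    return header, samples


# Hypothetical file contents; the header key "image" is invented.
EXAMPLE = "image /bin/bash\n1a2c 3\n1a30 5\n"
header, samples = parse_prof_text(EXAMPLE)
print(parse_prof_name("bash-0e3383fb.prof"), header, sum(samples.values()))
```

A header line that happens to look like "hex decimal" would be misread as
a sample by this sketch, so a real parser should rely on the actual header
syntax instead.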

  ------------------------------------------------------------------------

Processing the Data

Data processing can be done on any machine that has access to the
unstripped versions of the binaries/libraries that you are interested in
getting detailed information about.  If necessary, copy the entire profile
directory created by profd to this machine for processing.  The most basic
level of information available is a summary of where the cycles (i.e.,
samples) were spent on a per-image basis.  Here is typical output:

iprof *.prof
--------------------------------------------------------------
  cycles         %     cum%  image
   67448    58.37%   58.37%  crafty
   28278    24.47%   82.84%  kernel
    8477     7.34%   90.18%  libkaffevm.so
    4951     4.28%   94.47%  libc.so.6
    4008     3.47%   97.93%  libawt.so
    1321     1.14%   99.08%  kaffe
     876     0.76%   99.84%  nomap
      49     0.04%   99.88%  profd
      49     0.04%   99.92%  libggi.so
      32     0.03%   99.95%  libnative.so
      25     0.02%   99.97%  ld.so.1
      17     0.01%   99.98%  pppd
      10     0.01%   99.99%  libui.so
       2     0.00%   99.99%  libm.so.6
       1     0.00%  100.00%  libc.so.4.6.27
       1     0.00%  100.00%  open
       1     0.00%  100.00%  liboss.so
       1     0.00%  100.00%  bash
       1     0.00%  100.00%  gmanager
       1     0.00%  100.00%  libio.so
  115549       Total
--------------------------------------------------------------

All you need to generate this is a set of profile data. profd (see above)
will generate these files for you. Note that:

  1. "kernel" includes time spent in loaded modules

  2. "nomap" includes all instructions not associated with image files (e.g.
     jit'ed Java code)
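
The %, and cum% columns in the summary follow directly from the per-image
sample totals.  The Python sketch below shows one way to compute them; it
is an illustration of the arithmetic, not iprof's own code, and the image
totals in the example are hypothetical values chosen to be easy to check
by hand.

```python
def summarize(totals):
    """Return (cycles, percent, cumulative percent, image) rows, largest first."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    rows, running = [], 0
    # Sort by descending sample count, breaking ties by image name.
    for image, cycles in sorted(totals.items(), key=lambda kv: (-kv[1], kv[0])):
        running += cycles
        rows.append((cycles,
                     100.0 * cycles / grand_total,
                     100.0 * running / grand_total,
                     image))
    return rows


# Hypothetical per-image totals.
TOTALS = {"app": 50, "kernel": 30, "libc.so.6": 20}
for cycles, pct, cum, image in summarize(TOTALS):
    print("%8d  %6.2f%%  %6.2f%%  %s" % (cycles, pct, cum, image))
```

The per-image totals themselves would come from summing the counts in
each .prof file.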

Also available is a per-function breakdown (for all images for which
unstripped versions are available). Because we keep mostly stripped images
on the Itsy, the profiler generates a signature of the image at run time,
and compares this signature to the versions available when iprof digests
the data. A per-user map from signatures to file locations needs to be
generated before running the profiler in the per-function mode:

      iscan > MAPFILE

Note that iscan searches a set of default directories for unstripped
binaries. You can specify other directories on the command line, separating
each with a space. Also note that you will have to regenerate this map file
every time the images that you are profiling change.

A simple function breakdown example is:

iprof -f -mapfile map crafty-c642cf09.prof
--------------------------------------------------------------
/wrl/proj/itsy/profile/apps/crafty

  cycles         %     cum%        addr function
   13830    20.50%   20.50%  0x02004934 Evaluate
    5497     8.15%   28.65%  0x02016224 Attacked
    5024     7.45%   36.10%  0x0201173c MakeMove
    4576     6.78%   42.89%  0x0201a7d4 ClearHashTables
    4297     6.37%   49.26%  0x0200bdd4 GenerateCaptures
    4202     6.23%   55.49%  0x02001f64 Search
    3765     5.58%   61.07%  0x02038248 Iterate
...
       1     0.00%  100.00%  0x0202053c EPDTokenize
   67448    Total
--------------------------------------------------------------
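
How iprof attributes samples to functions is not described in this
document.  As a hedged illustration of the general idea, the Python sketch
below credits each sampled address to the function with the greatest start
address not exceeding it, using a sorted symbol table and binary search.
It assumes the file offsets from the .prof file have already been
translated into the same address space as the symbol table, and it ignores
function sizes.  The start addresses come from the listing above; the
sample addresses and counts are made up.

```python
import bisect


def attribute(samples, symbols):
    """Sum sample counts per function.

    samples: {address: count}; symbols: [(start_address, name), ...].
    """
    table = sorted(symbols)
    starts = [start for start, _ in table]
    totals = {}
    for address, count in samples.items():
        # Index of the last symbol whose start address is <= address.
        index = bisect.bisect_right(starts, address) - 1
        name = table[index][1] if index >= 0 else "(unknown)"
        totals[name] = totals.get(name, 0) + count
    return totals


SYMBOLS = [
    (0x02001f64, "Search"),
    (0x02004934, "Evaluate"),
    (0x02016224, "Attacked"),
]
# Made-up samples: address -> count.
SAMPLES = {0x02001f70: 4, 0x02004934: 10, 0x02004a00: 2, 0x02000000: 1}
print(attribute(SAMPLES, SYMBOLS))
```

Addresses below the first symbol are reported as "(unknown)" rather than
being credited to an unrelated function.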

If you want a function breakdown of kernel code, you will need to specify
the kernel map file to iprof with the "-kernelmapfile KERNELMAPFILE"
flag.  KERNELMAPFILE is the file kernel.syms, which profd generates
automatically and stores in the same directory as the *.prof files.

iprof has been designed to work with jit'ed code as well.  If an
appropriate map file is generated, a function-by-function breakdown can be
done.  An early version of kaffe was capable of dumping such a file, but
at this time it probably does not work.  If anyone is interested in
profiling jit'ed code, they will have to do some hacking.


Please note that the versions of the files that you are running on your
client machine may not match anything the profiler can find.


Finally, the command line

iprof -f -mapfile map -kernelmapfile kernel.syms *.prof

will generate the most complete listing possible: the number of cycles
attributable to each profiled function, sorted from the most frequently
sampled to the least frequently sampled.  Here kernel.syms is the kernel
map file.

  ------------------------------------------------------------------------

This profiler was originally written for the Itsy Pocket Computer, a
joint project between Compaq's Western Research Laboratory and Systems
Research Center, by Carl Waldspurger and Deborah A. Wallach.

     Use consistent with the GNU GPL is permitted,
     provided that this copyright notice is
     preserved in its entirety in all copies and derived works.

     COMPAQ COMPUTER CORPORATION MAKES NO WARRANTIES, EXPRESSED OR IMPLIED,
     AS TO THE USEFULNESS OR CORRECTNESS OF THIS CODE OR ITS
     FITNESS FOR ANY PARTICULAR PURPOSE.

  ------------------------------------------------------------------------
