Thursday, April 28, 2011

Exploring Performance Analyzer experiments

I was recently profiling a script to see where the time went, and I ended up wanting to extract the profiles for just a single component. The structure of an analyzer experiment is that there's a single root directory (test.1.er) and inside that there's a single level of subdirectories representing all the child processes. Each subdirectory is a profile of a single process - these can all be loaded as individual experiments. Inside every experiment directory there is a log.xml file. This file contains a summary of what the experiment contains.

The name of the executable that was run is held on an xml "process" line. So the following script can extract a list of all the profiles of a particular application.

$ grep myapp `find test.1.er -name 'log.xml'`|grep process | sed 's/\:.*//' > myapp_profiles

Once we have a list of every time my application was run, I can now extract the times from that list, and sort them using the following line:

$ grep exit `cat <myapp_files`|sed 's/.*tstamp=\"//'|sed 's/\".*//'|sort -n 

Once I have the list of times I can use then locate an experiment with a particular runtime - it's probably going to be the longest runtime:

$ grep exit `cat <myapp_files`|grep 75.9 

Catching the macro bug

I have to admit a dislike for macros. I've seen plenty of codes where it has been a Herculean task to figure out exactly what source code generated the particular assembly code. So perhaps I'm biased to begin with. However, I recently hit another annoyance with macros. The following code looks pretty benign:

#include <stdio.h>
#include <sys/time.h>

int timercmp(struct timeval *end, struct timeval *begin,struct timeval *result)
{
  printf("TIMERCMP");
}

However, at compile time it produces the following error.

cc error.c
"error.c", line 4: syntax error before or at: struct
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_usec
"error.c", line 4: syntax error before or at: ->
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_usec
"error.c", line 4: syntax error before or at: )
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
"error.c", line 4: identifier redefined: result
        current : function(pointer to struct timeval {long tv_sec, long tv_usec}) returning pointer to struct timeval {long tv_sec, long tv_usec}
        previous: function(pointer to struct timeval {long tv_sec, long tv_usec}) returning pointer to struct timeval {long tv_sec, long tv_usec} : "error.c", line 4
"error.c", line 4: syntax error before or at: ->
"error.c", line 4: warning: old-style declaration or incorrect type for: tv_sec
cc: acomp failed for error.c

The C++ compiler produces fewer errors:

 CC error.c
"error.c", line 4: Error: No direct declarator preceding "(".
1 Error(s) detected.

Of course, the problem is that timercmp is a macro defined in sys/time.h. This is revealed when the preprocessed source is examined:

$ cc -P error.c
$ tail error.i

int  ( ( ( struct timeval * end ) -> tv_sec == ( struct timeval * begin ) -> tv_sec ) ? ( ( struct timeval * end ) -> tv_usec struct timeval * result ( struct timeval * begin ) -> tv_usec ) : ( ( struct timeval * end ) -> tv_sec struct timeval * result ( struct timeval * begin ) -> tv_sec ) )
{
  printf("TIMERSUB");
}

Now, we can narrow the problem down more rapidly by trying to compile the preprocessed code. This takes us to the exact line with the problem, and it's obvious from inspection exactly what is going on:

$ cc error.i
"error.i", line 1135: syntax error before or at: struct
"error.i", line 1135: syntax error before or at: )

Monday, April 25, 2011

Using pragma opt

The Studio compiler has the ability to control the optimisation level that is applied to particular functions in an application. This can be useful if the functions are designed to work at a specific optimisation level, or if the application fails at a particular optimisation level, and you need to figure out where the problem lies.

The optimisation levels are controlled through pragma opt. The following steps need to be followed to use the pragma:

  • The directive needs to be inserted into the source file. The format of the directive is #pragma opt /level/ (/function/). This needs to be inserted into the code before the start of the function definition, but after the function header.
  • The code needs to be compiled with the flag -xmaxopt=level. This sets the maximum optimisation level for all functions in the file - including those tagged with #pragma opt.

We can see this in action using the following code snippet. This contains two identical functions, both return the square of a global variable. However, we are using #pragma opt to control the optimisation level of the function f().

int f();
int g();

#pragma opt 2 (f)

int d;

int f()
{
  return d*d;
}

int g()
{
  return d*d;
}

The code is compiled with the flag -xmaxopt=5, this specifies the maximum optimisation level that can be applied to any functions in the file.

$ cc -O -xmaxopt=5 -S opt.c

If we compare the disassembly for the functions f() and g(), we can see that g() is more optimal as it does not reload the global data.

                        f:
/* 000000          0 */         sethi   %hi(d),%o5

!   10                !  return d*d;

/* 0x0004         10 */         ldsw    [%o5+%lo(d)],%o4 ! volatile    // First load of d
/* 0x0008            */         ldsw    [%o5+%lo(d)],%o3 ! volatile    // Second load of d
/* 0x000c            */         retl    ! Result =  %o0
/* 0x0010            */         mulx    %o4,%o3,%o0

                        g:
/* 000000         14 */         sethi   %hi(d),%o5
/* 0x0004            */         ld      [%o5+%lo(d)],%o4               // Single load of d

!   15                !  return d*d;

/* 0x0008         15 */         sra     %o4,0,%o3
/* 0x000c            */         retl    ! Result =  %o0
/* 0x0010            */         mulx    %o3,%o3,%o0

Friday, April 1, 2011

Profiling scripts

One feature that crept into the Oracle Solaris Studio 12.2 release was the ability for the performance analyzer to follow scripts. It is necessary to set the environment variable SP_COLLECTOR_SKIP_CHECKEXEC to use this feature - as shown below.

bash-3.00$ file `which which`
/bin/which:     executable /usr/bin/csh script
bash-3.00$ collect which
Target `which' is not a valid ELF executable
bash-3.00$ export SP_COLLECTOR_SKIP_CHECKEXEC=1
bash-3.00$ collect which
Creating experiment database test.1.er ...