Thursday, February 5, 2015

Digging into microstate accounting

Solaris has support for microstate accounting. This gives huge insight into where an application and its threads are spending their time. It breaks down time into the (obvious) user and system, but also allows you to see the time spent waiting on page faults and other useful-to-know states.

This level of detail is available through the usage file in /proc/pid, there's a corresponding file for each lwp in /proc/pid/lwp/lwpid/lwpusage. You can find more details about the /proc file system in documentation, or reading my recent article about tracking memory use.

Here's an example of using it to report idle time, ie time when the process wasn't busy:

#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>
#include <fcntl.h>
#include <procfs.h>

void busy()
{
  for (int i=0; i<100000; i++)
  {
   double d = i;
   while (d>0) { d=d *0.5; }
  }
}

void lazy()
{
  sleep(10);
}

double convert(timestruc_t ts)
{
  return ts.tv_sec + ts.tv_nsec/1000000000.0;
}

void report_idle()
{
  prusage_t prusage;
  int fd;
  fd = open( "/proc/self/usage", O_RDONLY);
  if (fd == -1) { return; }
  read( fd, &prusage, sizeof(prusage) );
  close( fd );
  printf("Idle percent = %3.2f\n",
  100.0*(1.0 - (convert(prusage.pr_utime) + convert(prusage.pr_stime))
               /convert(prusage.pr_rtime) ) );
}

void main()
{
  report_idle();
  busy();
  report_idle();
  lazy();
  report_idle();
}

The code has two functions that take time. The first does some redundant FP computation (that cannot be optimised out unless you tell the compiler to do FP optimisations), this part of the code is CPU bound. When run the program reports low idle time for this section of the code. The second routine calls sleep(), so the program is idle at this point waiting for the sleep time to expire, hence this section is reported as being high idle time.