Thursday, May 30, 2013

Binding to the current processor

Just hacked up a snippet of code that stops a thread from migrating to a different CPU while it runs. This should help the thread get, and keep, memory that is local to that CPU, which in turn should reduce run-to-run variance.

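#include <stdio.h>
#include <sys/types.h>
#include <sys/processor.h>
#include <sys/procset.h>

/* Bind the calling thread (LWP) to the CPU it is currently running on. */
void bindnow()
{
  processorid_t proc = getcpuid();
  /* processor_bind() returns non-zero on failure; the old binding is not needed */
  if (processor_bind(P_LWPID, P_MYID, proc, NULL))
    { printf("Warning: Binding failed\n"); }
  else
    { printf("Bound to CPU %i\n", proc); }
}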

Tuesday, May 28, 2013

One executable, many platforms

Different processors have different optimal sequences of code. Fortunately, most of the time the differences are minor, and we can easily accommodate them by generating generic code.

If you needed more than this, then the "old" model was to use dynamic string tokens to pick the best library for the platform. This works well, and was the mechanism that libc.so used. However, the downside is that you now need to ship a bundle of libraries with the application; this can get (and look) a bit messy.

There's a "new" approach that uses a family of capability functions. The idea here is that multiple versions of the routine are linked into the executable, and the runtime linker picks the best one for the platform that the application is running on. The routines are denoted with a suffix, after a percent sign, indicating the platform. For example, here's the family of memcpy() implementations in libc:

$ elfdump -H /usr/lib/libc.so.1 2>&1 |grep memcpy
      [10]  0x0010627c 0x00001280  FUNC LOCL  D    0 .text          memcpy%sun4u
      [11]  0x001094d0 0x00000b8c  FUNC LOCL  D    0 .text          memcpy%sun4u-opl
      [12]  0x0010a448 0x000005f0  FUNC LOCL  D    0 .text          memcpy%sun4v-hwcap1
...

It takes a bit of effort to produce a family of implementations. Imagine we want to print something different when an application is run on a sun4v machine. First of all we'll have a bit of code that prints out the compile-time defined string that indicates the platform we're running on:

#include <stdio.h>
static char name[]=PLATFORM;

/* Returns nothing, so declared void to match the extern declaration in main.c */
void platform()
{
  printf("Running on %s\n",name);
}

To compile this code we need to provide the definition for PLATFORM, suitably escaped so that the quotes survive the shell. We will need to provide two versions: a generic version that can always run, and a platform-specific version that runs on sun4v platforms:

$ cc -c -o generic.o p.c -DPLATFORM=\"Generic\"
$ cc -c -o sun4v.o   p.c -DPLATFORM=\"sun4v\"
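
If the shell escaping of the quotes gets awkward, one alternative is to let the preprocessor add them. This is only a sketch and not what the example in this post uses: the macro names STRINGIFY and EXPAND_AND_STRINGIFY and the accessor platform_name() are made up for illustration, and it assumes PLATFORM is passed as a bare token, for example -DPLATFORM=sun4v.

```c
#include <stdio.h>

/* Fall back to a bare token when no -DPLATFORM=... is given. */
#ifndef PLATFORM
#define PLATFORM Generic
#endif

/* Two-level macro: the outer level expands PLATFORM, the inner
   level turns the result into a string literal. */
#define STRINGIFY(x) #x
#define EXPAND_AND_STRINGIFY(x) STRINGIFY(x)

static const char name[] = EXPAND_AND_STRINGIFY(PLATFORM);

/* Hypothetical accessor so that callers can inspect the string. */
const char *platform_name(void)
{
  return name;
}

void platform(void)
{
  printf("Running on %s\n", name);
}
```

One caveat with this approach: if the token you pass is itself a macro that the compiler predefines, it will be expanded before it is turned into a string, so the escaped-quotes form is the safer choice for arbitrary text.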

Now we have a specialised version of the routine platform() but it has the same name as the generic version, so we cannot link the two into the same executable. So what we need to do is to tag it as being the version we want to run on sun4v platforms.

This is a two step process. The first step is that we tag the object file as being a sun4v object file. This step is only necessary if the compiler has not already tagged the object file. The compiler will tag the object file appropriately if it uses instructions from a particular architecture - for example if you compiled explicitly targeting T4 using -xtarget=t4. However, if you need to tag the object file, then you can use a mapfile to add the appropriate hardware capabilities:

$mapfile_version 2

CAPABILITY sun4v {
        MACHINE=sun4v;
};

We can then ask the linker to apply these hardware capabilities from the mapfile to the object file:

$ ld -r -o sun4v_1.o -Mmapfile.sun4v sun4v.o

You can see that the capabilities have been applied using elfdump:

$ elfdump -H sun4v_1.o

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_ID       sun4v
       [1]  CA_SUNW_MACH     sun4v

The second step is to take these capabilities and apply them to the functions. We do this using the linker option -z symbolcap:

$ ld -r -o sun4v_2.o -z symbolcap sun4v_1.o

You can now see that the platform function has been tagged as being for sun4v hardware:

$ elfdump -H sun4v_2.o

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID       sun4v
       [2]  CA_SUNW_MACH     sun4v

  Symbols:
     index    value      size      type bind oth ver shndx          name
      [24]  0x00000010 0x00000070  FUNC LOCL  D    0 .text          platform%sun4v

And finally you can combine the object files into a single executable. The main() routine of the executable calls platform() which will print out a different message depending on the platform. Here's the source to main():

extern void platform();

int main()
{
  platform();
}

Here's what happens when the program is compiled and run on a non-sun4v platform:

$ cc -o main -O main.c sun4v_2.o generic.o
$ ./main
Running on Generic

Here's the same executable running on a sun4v platform:

$ ./main
Running on sun4v
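
For comparison, here is roughly what you would have to write by hand to get similar behaviour without symbol capabilities. It's a minimal sketch: the names platform_fn, platform_generic(), platform_sun4v() and select_platform() are invented for this example, and it assumes that uname() reports the machine as "sun4v" on those platforms. It is not how the runtime linker does its selection.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

typedef const char *(*platform_fn)(void);

static const char *platform_generic(void) { return "Generic"; }
static const char *platform_sun4v(void)   { return "sun4v"; }

/* Choose an implementation from a machine string such as the one
   reported by uname -m. Anything unrecognised gets the generic one. */
platform_fn select_platform(const char *machine)
{
  if (machine != NULL && strcmp(machine, "sun4v") == 0) {
    return platform_sun4v;
  }
  return platform_generic;
}

/* Look up the machine string on the first call, then call through
   the saved pointer. */
void platform(void)
{
  static platform_fn impl = NULL;
  if (impl == NULL) {
    struct utsname u;
    /* uname() returns a negative value on failure */
    impl = select_platform(uname(&u) < 0 ? NULL : u.machine);
  }
  printf("Running on %s\n", impl());
}
```

The cost of doing it by hand is that every implementation needs its own name, every call goes through a pointer, and the selection logic lives in your source (the lazy initialisation here is also not thread safe). With symbol capabilities the callers just call platform() and the choice is made at link time.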