Monday, December 30, 2013


Before I start, this is not about security, it's probably the antithesis of security. So I'd recommend starting by reading about how using privileges can break the security of your system.

There are three tools that I regularly use that require escalated privileges: dtrace, cpustat, and busstat. You can read up on the way that Solaris manages privileges. But if you know what you want to do, the process to figure out how to get the necessary privileges is reasonable straightforward.

To find out what privileges you have you can use the ppriv -v $$ command. This will report all the privileges for the current shell.

To find out what privileges are stopping you from running a command, you should run it under ppriv -eD command. For example:

ppriv -eD cpustat -c instruction_counts 1 1
cpustat[13222]: missing privilege "sys_resource" (euid = 84945, syscall = 128) needed at rctl_rlimit_set+0x98
cpustat[13222]: missing privilege "cpc_cpu" (euid = 84945, syscall = 5) needed at kcpc_open+0x4

It is also possible to list all the privileges on the system using ppriv -l. This is helpful if the privilege is has a name that maps onto what you want to do. The privileges for dtrace are good examples of this:

$ ppriv -l|grep dtrace

You can then use usermod -K ... to assign the necessary privileges to a user. For example:

$ usermod -K defaultpriv=basic,sys_resource,cpc_cpu username

Information about privileges for users is recorded in /etc/user_attr, so it is possible to directly edit that file to add or remove privileges.

Using this approach you can determine that busstat needs sys_config, cpustat needs sys_resource and cpc_cpu, and dtrace needs dtrace_kernel, dtrace_proc, and dtrace_user.

Thursday, August 29, 2013

Timezone troubles when dual booting

I have a laptop that dual boots Solaris and Windows XP. When I switched between the two OSes I would have to reset the clock because the time would be eight hours out. This has been naggging at me for a while, so I dug into what was going on.

It seems that Windows assumes that the Real-Time Clock (RTC) in the bios is using local time. So it will read the clock and display whatever time is shown there.

Solaris on the other hand assumes that the clock is in Universal Time Format (UTC), so you have to apply a time zone transformation in order to get to the local time.

Obviously, if you adjust the clock to make one correct, then the other is wrong.

To me, it seems more natural to have the clock in a laptop work on UTC - because when you travel the local time changes. There is a registry setting in Windows that, when set to 1, tells the machine to use UTC:


However, it has some problems and is potentially not robust over sleep. So we have to work the other way, and get Solaris to use local time. Fortunately, this is a relatively simple change running the following as root (pick the appropriate timezone for your location):

rtc -z US/Pacific

Tuesday, August 27, 2013

My Oracle Open World and JavaOne schedule

I've got my schedule for Oracle Open World and JavaOne:

Note that on Thursday I have about 30 minutes between my two talks, so expect me to rush out of the database talk in order to get to the Java talk.

Friday, August 9, 2013

How to use a lot of threads ....

The SPARC M5-32 has 1,536 virtual CPUs. There is a lot that you can do with that much resource and a bunch of us sat down to write a white paper discussing the options(revised link).

There are a couple of key observations in there. First of all it is quite likely that such a large system will not end up running a single instance of a single application. Therefore it is useful to understand the various options for virtualising the system. The second observation is that there are a number of details to take care of when writing an application that is expected to scale to large numbers of threads.

Anyhow, I hope you find it a useful read.

Monday, July 8, 2013

JavaOne and Oracle Open World 2013

I'll be up at both JavaOne and Oracle Open World presenting. I have a total of three presentations:

  • Mixed-Language Development: Leveraging Native Code from Java
  • Performance Tuning Where Java Meets the Hardware
  • Developing Efficient Database Applications with C and C++

I'm excited by these opportunities - particularly working with Charlie Hunt diving into Java Performance.

Thursday, May 30, 2013

Binding to the current processor

Just hacked up a snippet of code to stop a thread migrating to a different CPU while it runs. This should help the thread get, and keep, local memory. This in turn should reduce run-to-run variance.

#include <sys/processor.h>

void bindnow()
  processorid_t proc = getcpuid();
  if (processor_bind(P_LWPID, P_MYID, proc, 0)) 
    { printf("Warning: Binding failed\n"); } 
    { printf("Bound to CPU %i\n", proc); }

Tuesday, May 28, 2013

One executable, many platforms

Different processors have different optimal sequences of code. Fortunately, most of the time the differences are minor, and we can easily accommodate them by generating generic code.

If you needed more than this, then the "old" model was to use dynamic string tokens to pick the best library for the platform. This works well, and was the mechanism that used. However, the downside is that you now need to ship a bundle of libraries with the application; this can get (and look) a bit messy.

There's a "new" approach that uses a family of capability functions. The idea here is that multiple versions of the routine are linked into the executable, and the runtime linker picks the best for the platform that the application is running on. The routines are denoted with a suffix, after a percentage sign, indicating the platform. For example here's the family of memcpy() implementations in libc:

$ elfdump -H /usr/lib/ 2>&1 |grep memcpy
      [10]  0x0010627c 0x00001280  FUNC LOCL  D    0 .text          memcpy%sun4u
      [11]  0x001094d0 0x00000b8c  FUNC LOCL  D    0 .text          memcpy%sun4u-opl
      [12]  0x0010a448 0x000005f0  FUNC LOCL  D    0 .text          memcpy%sun4v-hwcap1

It takes a bit of effort to produce a family of implementations. Imagine we want to print something different when an application is run on a sun4v machine. First of all we'll have a bit of code that prints out the compile-time defined string that indicates the platform we're running on:

#include <stdio.h>
static char name[]=PLATFORM;

double platform()
  printf("Running on %s\n",name);

To compile this code we need to provide the definition for PLATFORM - suitably escaped. We will need to provide two versions, a generic version that can always run, and a platform specific version that runs on sun4v platforms:

$ cc -c -o generic.o p.c -DPLATFORM=\"Generic\"
$ cc -c -o sun4v.o   p.c -DPLATFORM=\"sun4v\"

Now we have a specialised version of the routine platform() but it has the same name as the generic version, so we cannot link the two into the same executable. So what we need to do is to tag it as being the version we want to run on sun4v platforms.

This is a two step process. The first step is that we tag the object file as being a sun4v object file. This step is only necessary if the compiler has not already tagged the object file. The compiler will tag the object file appropriately if it uses instructions from a particular architecture - for example if you compiled explicitly targeting T4 using -xtarget=t4. However, if you need to tag the object file, then you can use a mapfile to add the appropriate hardware capabilities:

$mapfile_version 2


We can then ask the linker to apply these hardware capabilities from the mapfile to the object file:

$ ld -r -o sun4v_1.o -Mmapfile.sun4v sun4v.o

You can see that the capabilities have been applied using elfdump:

$ elfdump -H sun4v_1.o

Capabilities Section:  .SUNW_cap

 Object Capabilities:
     index  tag               value
       [0]  CA_SUNW_ID       sun4v
       [1]  CA_SUNW_MACH     sun4v

The second step is to take these capabilities and apply them to the functions. We do this using the linker option -zsymbolcap

$ ld -r -o sun4v_2.o -z symbolcap sun4v_1.o

You can now see that the platform function has been tagged as being for sun4v hardware:

$ elfdump -H sun4v_2.o

Capabilities Section:  .SUNW_cap

 Symbol Capabilities:
     index  tag               value
       [1]  CA_SUNW_ID       sun4v
       [2]  CA_SUNW_MACH     sun4v

     index    value      size      type bind oth ver shndx          name
      [24]  0x00000010 0x00000070  FUNC LOCL  D    0 .text          platform%sun4v

And finally you can combine the object files into a single executable. The main() routine of the executable calls platform() which will print out a different message depending on the platform. Here's the source to main():

extern void platform();

int main()

Here's what happens when the program is compiled and run on a non-sun4v platform:

$ cc -o main -O main.c sun4v_2.o generic.o
$ ./main
Running on Generic

Here's the same executable running on a sun4v platform:

$ ./main
Running on sun4v

Monday, April 1, 2013

OpenMP and language level parallelisation

The C11 and C++11 standards introduced some very useful features into the language. In particular they provided language-level access to threading and synchronisation primitives. So using the new standards we can write multithreaded code that compiles and runs on standard compliant platforms. I've tackled translating Windows and POSIX threads before, but not having to use a shim is fantastic news.

There's some ideas afoot to do something similar for higher level parallelism. I have a proposal for consideration at the April meetings - leveraging the existing OpenMP infrastructure.

Pretty much all compilers use OpenMP, a large chunk of shared memory parallel programs are written using OpenMP. So, to me, it seems a good idea to leverage the existing OpenMP library code, and existing developer knowledge. The paper is not arguing that we need take the OpenMP syntax - that is something that can be altered to fit the requirements of the language.

What do you think?

Thursday, March 14, 2013

The pains of preprocessing

Ok, so I've encountered this twice in 24 hours. So it's probably worth talking about it.

The preprocessor does a simple text substitution as it works its way through your source files. Sometimes this has "unanticipated" side-effects. When this happens, you'll normally get a "hey, this makes no sense at all" error from the compiler. Here's an example:

$ more c.c
#include <ucontext.h>
#include <stdio.h>

int main()
  int FS;

$ CC c.c
$ CC c.c
"c.c", line 6: Error: Badly formed expression.
"c.c", line 7: Error: The left operand must be an lvalue.
2 Error(s) detected.

A similar thing happens with g++:

$  /pkg/gnu/bin/g++ c.c
c.c: In function 'int main()':
c.c:6:7: error: expected unqualified-id before numeric constant
c.c:7:6: error: lvalue required as left operand of assignment

The Studio C compiler gives a bit more of a clue what is going on. But it's not something you can rely on:

$ cc c.c
"c.c", line 6: syntax error before or at: 1
"c.c", line 7: left operand must be modifiable lvalue: op "="

As you can guess the issue is that FS gets substituted. We can find out what happens by examining the preprocessed source:

$ CC -P c.c
$ tail c.i
int main ( )
int 1 ;
1 = 0 ;
printf ( "FS=%i" , 1 ) ;

You can confirm this using -xdumpmacros to dump out the macros as they are defined. You can combine this with -H to see which header files are included:

$ CC -xdumpmacros c.c 2>&1 |grep FS
#define _SYS_ISA_DEFS_H
#define _FILE_OFFSET_BITS 32
#define REG_FSBASE 26
#define REG_FS 22
#define FS 1

If you're using gcc you should use the -E option to get preprocessed source, and the -dD option to get definitions of macros and the include files.