Omegahat Statistical Computing

Ideas for statistical computing


Posted by omegahat on September 1, 2010

Over the past 10 years, I have been torn between building a new statistical computing environment
and trying to overhaul R. There are many issues on both sides. But the key thing is to
enable new and better things in statistical computing rather than just making the existing things
easier and more user-friendly.

If we are to continue with R for the next few years, it is essential that it get faster.
There are many aspects to this. One is compiling interpreted R code into something faster.
LLVM is a compiler toolkit that facilitates generating machine code. So in the past few days
I have looked into this and developed an R package that provides R-bindings to some of
the LLVM functionality.

The package is available from, as are several examples
of its use.
I used the package to implement a compiled version of one of Luke Tierney’s compilation examples,
which uses a loop in R to add 1 to each element of a vector. The compiled version runs about
100 times faster than the interpreted R code. It is still slower than x + 1 in R, which is
implemented in C and does more. The compiled version is also faster than bytecode interpreter
approaches. So this is a promising start.
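For readers who want to see the kind of code being discussed, here is a minimal sketch in plain R of the explicit loop next to the vectorized x + 1. It does not use the package at all, and the function name add_one_loop is made up for illustration; it is not taken from Luke Tierney's examples. Recent versions of R byte-compile functions by default, so the gap you measure today will likely differ from the one described above.

```r
# Hypothetical helper (the name is illustrative, not from the post):
# add 1 to each element of x using an explicit R-level loop.
add_one_loop <- function(x) {
  for (i in seq_along(x)) {
    x[i] <- x[i] + 1
  }
  x
}

x <- as.numeric(1:100000)

# Both forms compute the same result; only the speed differs.
stopifnot(identical(add_one_loop(x), x + 1))

# Rough timings; the numbers vary by machine and R version.
print(system.time(add_one_loop(x)))
print(system.time(x + 1))
```

The loop is the natural target for compilation: each iteration is a simple typed operation on a numeric vector, which is exactly what machine code handles well.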

Of course, it would be nicer to leverage an existing compiler! (Think SBCL and building on top of LISP).


3 Responses to “Rllvm”

  1. Charlie said

    Hell yes!

    This is an idea that has been kicking around in the back of my skull for about 6 months now, causing much loss of productivity. Unfortunately, I have no experience in compiler design, nor enough free time to learn, which made digging into the source of the R interpreter rather difficult.

    Duncan, you are a beast. I salute you!

  2. Jason said

    I was just wondering if this project (OmegaHat as a whole) is still active?

    • omegahat said

      Hi Jason

      Yes, it is still going, but I am very busy with a project that is just about to end.
      And then I am back to the idea of compiling parts of the R language and other DSLs related
      to R into machine code. There are still good things that we can do that JITs, etc. may not be able
      to do, such as recognizing opportunities for better memory management in a script, and so on.
      I am always looking for people to contribute, explore and experiment.
