Posted by omegahat on September 1, 2010
Over the past 10 years, I have been torn between building a new statistical computing environment
and trying to overhaul R. There are many issues on both sides. But the key thing is to
enable new and better things in statistical computing rather than just making the existing things
easier and more user-friendly.
If we are to continue with R for the next few years, it is essential that it get faster.
There are many aspects to this. One is compiling interpreted R code into something faster.
LLVM is a compiler toolkit that facilitates generating machine code. So in the past few days
I have looked into this and developed an R package that provides R bindings to some of
the LLVM functionality.
The package is available from http://www.omegahat.org/Rllvm, as are several examples
of its use.
I used the package to implement a compiled version of one of Luke Tierney’s compilation examples,
which uses a loop in R to add 1 to each element of a vector. The compiled version is about
100 times faster than the interpreted R code. It is still slower than x + 1
in R, which is implemented in C and does more. But the compiled version is also faster than bytecode interpreter approaches, so this is a reasonably promising start.
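For readers who have not seen it, here is a rough sketch of the kind of loop being discussed, set against the vectorized x + 1. This is not Luke Tierney's exact code and it does not use the Rllvm package; the function name add_one_loop is made up for illustration, and the timings will vary by machine and R version.

```r
# Hypothetical illustration: an interpreted R loop that adds 1 to each
# element of a vector, one element at a time.
add_one_loop <- function(x) {
  for (i in seq_along(x)) {
    x[i] <- x[i] + 1
  }
  x
}

x <- as.numeric(1:1e6)

# Time the interpreted loop and the built-in vectorized addition.
loop_time <- system.time(y_loop <- add_one_loop(x))
vec_time  <- system.time(y_vec <- x + 1)

# Both approaches should produce exactly the same result.
stopifnot(identical(y_loop, y_vec))

cat("loop:       ", loop_time[["elapsed"]], "seconds\n")
cat("vectorized: ", vec_time[["elapsed"]], "seconds\n")
```

The loop version is the sort of code a compiler can help with: the work per element is trivial, so most of the time goes to interpreter overhead rather than to the addition itself.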
Of course, it would be nicer to leverage an existing compiler! (Think SBCL and building on top of LISP).