Ted Leung on the air: Open Source, Java, Python, and ...
In the comments to my post about IronPython and JRuby there were some comments about C libraries for Python or Ruby which would be unusable in in a CLR or JVM based implementation. This is correct, of course, and it is a problem for people trying to port software across implementations.
Earlier this week, Joel Spolsky made some comments about Ruby performance which triggered a bunch of posts, including a lesson from Avi on the 20 year old technique of in-line method caching. David Heinemeier Hansson weighed in with a post titled "Outsourcing the performance-intensive functions", where he argues that one of the benefits of scripting languages is that you can "cheat" by calling functions written in some other language.
Of course, that capability isn't limited to scripting languages. Other languages like Smalltalk, Lisp, Dylan, and others have foreign functions interfaces that let them talk to C code, and SWIG, which is a favorite tool for making it easy to link bind C libraries to scripting languages, also works for those languages. Reusing existing C code is a fine and worthwhile thing to do.
I don't agree, however, that users of dynamic languages should just agree to outsource high performance functions to C. The whole point of using a dynamic language is developer productivity, and that should be the case for performance critical code as well. And it's not like this is an impossible task either. There are implementations of dynamic languages which are very efficient, and applying those techniques to "scripting languages" is worthwhile endeavor, which is being pursued by folks like the PyPy team. As Avi also points out, the StrongTalk VM has now been open sourced, which may make it easier for language implementors to adopt some of the rich body of work that has been done on dynamic language performance.
Will there be cases where even the most advanced implementation techniques won't yield enough performance? Sure. That's why C compilers have a feature called in-line assembly code. But you rarely see it used. Having to rewrite my performance critical dynamic language code in C should be a rarity. The better the VM's get, the more rare those occasions will be, and that's a good thing. Let's not throw up our hands and say "yeah, you're right, we're slow, but it doesn't matter because we can cheat".
Several days ago I alluded (obscurely) to the possibility of 8 core Mac Pro's based on an upcoming quad-core Xeon. Tuesday, Anandtech demonstrated that the existing Mac Pro hardware is capable of the feat (memory bus speed notwithstanding). Their benchmarks also show that many applications are not able to exploit 4 cores very well, never mind 8. Now, where did I leave that Erlang disk image....