Back in July, I gave an OSCON talk that was a survey of language constructs for concurrency. That talk has been making the rounds lately. Jacob Kaplan-Moss made referred to it in a major section of his excellent keynote Snakes on the Web, and Tim Bray has cited it as a reference in his Concur.next series. It seems like a good time for me to explain some of the talk content in writing and add my perspective on the current conversations.
The Cambrian period was marked by a rapid diversification of lifeforms. I think that we are in a similar situation with concurrency today. Although many of the ideas that are being tossed around for concurrency have been around for some time, I don’t think that we really have a broad body of experience with any of them. So I’m less optimistic than Tim and Bruce Tate, at least on time frame. I think that we have a lot of interesting languages, embodying a number of interesting ideas for working with concurrency. I think that some of those languages have gained enough interest/adoption that we are now in a position to get a credible amount of experience so that we can start evaluating these ideas on their merits. But I think that the window for doing that is pretty large, on the order of 5 to 10 years.
What kinds of problems
The kinds of problems I am interested in are general purpose programming problems. I’m specifically not interested in scientific, numeric, highly regular kinds of computations or data parallel computations. Unlike Tim, I do think that web systems are a valid problem domain. I see this being driven by the need to drive down latency to provide good user response time, not to provide additional scalability (although it probably will).
It’s not like Java
Erik Engbrecht, one of Tim’s commenters said:
To get Java, you basically take Smalltalk and remove all of the powerful concepts from it while leaving in the benign ones that everyday developers use.
I think there’s something to be learned from that.
This presupposes that you know what all the good concepts are and what the benign ones are. It doesn’t seem like we are at that point. When Java was created, both Lisp and Smalltalk had existed for quite sometime and it was possible to do this kind of surgery. I don’t have a clear sense of what actually works well, much less what is powerful or benign.
The hardware made me do it
From where I sit, this is all about exploiting multicore hardware, and when I say this I mean machines with more than 4 or 8 hardware threads (I say threads, not cores – actual parallelism is what is important). The Sun T5440 is a 256 thread box. Intel’s Nehalem EX will let you build a 128 thread box later this year. Those are multicore boxes. If you look at experiments, you see that systems that seem to work well at 2 or 4 threads don’t’ work well at 16 or 64 threads. Since there’s not a huge amount of that kind of hardware around yet, it’s hard for people to run experiments at larger sizes. Experiments being run on 2 thread MacBook Pro’s are probably not good indicators of what happens at even 8 threads.. This is partially because dealing with more hardware threads requires more administrative overhead, and as the functional programming people found out, that overhead is very non-trivial. The point is, you have to run on actual hardware to have believable numbers. This makes it hard for me to take certain kinds of systems seriously, like concurrency solutions running on language implementations with Global Interpreter Locks. See David Beazley’s presentation on Python’s Global Interpreter Lock, for an example.
Comments on specific languages
At this point I am more interested in paradigms and constructs as opposed to particular languages. However, the only way to get real data on those is for them to be realized in language designs and implementations.
- Haskell – Functional Laziness aside, the big concurrency thing in Haskell is Software Transactional Memory (STM). There are other features in Haskell, but STM is the big one. STM is an active research field in computer science, and I’ve read a decent number of papers trying to make heads from tails. Among the stack that I have read, it seems to be running about even between the papers touting the benefits of STM and the the papers saying that STM cannot scale and will not work in practice. The jury is very much out on this one, at least in my mind.
- Erlang – I like Erlang. It’s been in production use for a long time, and real systems have been built using it. In addition to writing some small programs and reviewing some papers by Erlang’s designers, I spent a few days at the Erlang Factory earlier this year trying to get a better sense of what was really happening in the Erlang community. While there’s lots of cool stuff happening in Erlang, I observed two things. First, the biggest Erlang systems I heard described (outside of Facebook’s) are pretty small compared to a big system today. Second, and more importantly, SMP support in Erlang is still relatively new. Ulf Wiger’s DAMP09 presentation has a lot of useful information in it. On the other hand, BEAM, the Erlang VM is architected specifically for the Erlang process/actor model. This feels important to me, but we need some experimental evidence.
- Clojure – Clojure as a ton of interesting ideas in it. Rich Hickey has really done his homework, and I have a lot of respect for the work that he is doing. Still it’s the early days for Clojure, and I want to see more data. I know Rich has run some stuff on one of those multiple hundred core Azul boxes, but as far as I know, there’s not a lot of other data.
- Scala – The big thing in Scala for concurrency is Actors, but if you compare to Erlang, Actors are the equivalent of Erlang processes. A lot of the leverage that you get in Erlang comes from OTP, and to get that in Scala, you need to look at Jonas Boner’s highly interesting Akka Actor Kernel project. Akka also includes an implementation of dataflow variables, so Akka would give you a system with Actors, supervision, STM, and Dataflow (when it’s done).
- libdispatch/Grand Central Dispatch – Several of Tim’s commenters brought up Apple’s Grand Central Dispatch, now open sourced as libdispatch. This is a key technology for taking advantage of multicore in Snow Leopard. GCD relies on programmers to create dispatch queues which are then managed by the operating system. Programmers can send computations to these queues via blocks (closures), which are a new extension to Objective-C. When I look at Apple’s guide to migrating to GCD from threads, I do see a model that I prefer to threads, but it is not as high level as some of the others. Also, the design seems oriented towards very loosely coupled computations. It will be several years before we can really know how well GCD is working. I am typing this post on a 16 thread Nehalem Mac Pro, and I rarely see even half of the CPU meters really light up, even when I am running multiple compute intensive tasks. Clearly more software needs to take advantage of this technology before we have verdict on its effectiveness in production.
- .Net stuff like F#/Axum, etc – There is some concurrency work happening over on the CLR, most notably in F# and Axum. I spent some time at Lang.NET earlier this year, and got a chance to learn a bit about these two technologies. If you look at paradigms, the concurrency stuff looks very much like Erlang or Scala, with the notable exception of join patterns, which are on Martin Odersky’s list for Scala. I will admit to not being very up to speed on these, mostly for lack of Windows and the appropriate tools.
Jacob’s take away from my talk at OSCON was “we’re screwed”. That’s not what I wanted to convey. I don’t see a clear winner at the moment, and we have a lot of careful experimentation and measuring to do. We are quite firmly in the Cambrian, and I’m not in a hurry to get out – these things need to bake a bit longer, as well as having some more experimentation.
In addition to my talk, and Tim’s wiki page, if you are really interested in this space, I think that you should read Concepts, Techniques, and Models of Computer Programming by Peter van Roy and Seif Haridi. No book can be up to date with the absolute latest developments, but this book has the best treatment that I’ve seen in terms of trying to stratify the expressiveness of sequential and concurrent programming models.