Scalability != concurrency

Sam Ruby is writing about Russell Beattie writing about Java and Erlang.

Russell thinks Java needs an overhaul. I think that Java has reached the point where technical, community, and business forces will exert pressure on the language to evolve in a uniformly bad manner.

Russell wrote:

The reason people are looking at Erlang is not because its beautiful syntax, great documentation, or up-to-date libraries. Trust me. It’s because the Erlang VM can run for long periods of time, scaling linearly across cores or processors filling the same niche that Java does right now on the server.

Actually, I am looking at Erlang as a solution for anywhere (including the client) where concurrency will be an issue. By the way, it is not VMs that scale linearly, but computational problems. And there are some problems which just can’t scale linearly, no matter what VM we put them on.
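One common way to put a number on "can't scale linearly" is Amdahl's law, which neither Russell nor Sam mentions by name; I am bringing it in here as an illustration. The sketch below assumes a program whose work splits cleanly into a serial part and a perfectly parallel part. The function name and the 95% figure are made up for the example.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction: share of the work that can run in parallel (0.0 to 1.0).
    workers: number of cores or processors applied to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many workers we add;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workload: 95% parallel, 5% serial.
    for workers in (1, 2, 8, 64, 1024):
        print(workers, round(amdahl_speedup(0.95, workers), 2))
```

Under this model a job that is 95% parallel can never run more than 20 times faster, however many cores it gets, and that ceiling belongs to the problem rather than to the VM.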

Sam goes on to make the point which is the title of this post.

Next, to dispel a few myths. Slashdot is written in Perl, seems to handle the load, and also seems to stay up. While there are a number of BitTorrent implementations, the original and (to the best of my knowledge) the most pervasive version is written in Python. Yahoo is a mix, but a good portion of it is written in PHP, with critical functions written in C. Twitter is written in Ruby, had early scalability issues, but seems to be past them. These are all examples of massively scalable applications.

Scalability is not the same thing as concurrency. It is certainly possible to scale a program written in any language – that’s a given. Especially when scaling = throwing more hardware at it. But there’s got to be a better way of doing it. Question is whether the better way is worth the price of admission.

But as far as Erlang vs Java, the real kicker is here:

Unlike the CLR which was designed to be multi-language, and unlike the JVM which is in the process of being repurposed to be multi-language also, Erlang’s VM is designed from the ground up assuming that objects typically are immutable and serializable.

Which is what makes the situation with Java so bad. Not only is the language bad, the VM is fatally flawed when it comes to actor style concurrency (which is why for all its niceties, Scala will suffer the same problems as Java). There’s a real problem here — ask yourself why there is a market for these things, if all that is needed is to throw even more boxes at the problem.
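For anyone who hasn't run into the term, here is a rough sketch of what actor-style concurrency looks like as a programming model. It is written in Python purely for illustration and uses only the standard library; the class and message names are invented for the example. It assumes one thread per actor, a queue as the mailbox, and frozen dataclasses standing in for immutable messages. It says nothing about how efficiently a given VM can run this model, which is the actual argument above.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposit:
    """Immutable message: add an amount to the balance."""
    amount: int


@dataclass(frozen=True)
class GetBalance:
    """Immutable message: ask for the balance; reply_to acts as a return address."""
    reply_to: queue.Queue


@dataclass(frozen=True)
class Stop:
    """Immutable message: tell the actor to shut down."""


class AccountActor:
    """Owns its state privately; the only way in is through the mailbox."""

    def __init__(self) -> None:
        self._mailbox: queue.Queue = queue.Queue()
        self._balance = 0  # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message) -> None:
        self._mailbox.put(message)

    def join(self) -> None:
        self._thread.join()

    def _run(self) -> None:
        # Messages are handled one at a time, so no locks are needed on _balance.
        while True:
            message = self._mailbox.get()
            if isinstance(message, Deposit):
                self._balance += message.amount
            elif isinstance(message, GetBalance):
                message.reply_to.put(self._balance)
            elif isinstance(message, Stop):
                return


def demo() -> int:
    actor = AccountActor()

    def sender() -> None:
        for _ in range(1000):
            actor.send(Deposit(1))

    senders = [threading.Thread(target=sender) for _ in range(4)]
    for t in senders:
        t.start()
    for t in senders:
        t.join()

    reply: queue.Queue = queue.Queue()
    actor.send(GetBalance(reply))
    balance = reply.get(timeout=5)
    actor.send(Stop())
    actor.join()
    return balance


if __name__ == "__main__":
    print(demo())
```

Four threads send deposits at once, yet the actor never needs a lock, because nothing outside it can touch the balance and the messages cannot change after they are sent. Erlang's VM is built around that assumption; here it is only a convention the code happens to follow.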

In the comments, Sam wrote:

The biggest problem I have with Erlang is clearly an addressable one: the documentation of the libraries, and the lack of good samples that can be quickly found by Google/MSN-Live/Yahoo!/Ask searches. And many of the libraries appear to be abandoned at 0.n versions.

This is actually 2 problems. There’s the issue with the libraries, and there’s the issue with the community that did/didn’t produce the libraries. We don’t just need a technology, we need a community. Hmm, Erlang lab, anyone?

11 thoughts on “Scalability != concurrency”

  1. ingo

    What does “designed from the ground-up assuming that objects typically are immutable and serializable” mean for the VM?

  2. Isaac Gouy

    “…the VM is fatally flawed when it comes to actor style concurrency”

    It would be nice to see some explanation of that assertion.

  3. Ted Leung Post author

    Isaac,

    Well you could start with the fact that you don’t need any of the machinery for volatile, or for enforcing the Java Memory Model. Immutability and shared nothing get you a lot.

  4. Isaac Gouy

    Does needing the machinery for volatile or for enforcing the memory model make the VM fatally flawed for actor style concurrency – or just a pain?

    It surprises me that someone managed to implement Occam style concurrency for JVM – but they have Integrating and Extending JCSP.

    “Fatally flawed” sounds so very definite, so very conclusive.

  5. Ted Leung Post author

    If the machinery for that gets in the way of making the VM efficient for actors, then I consider that a fatal flaw. We know that any computational model can emulate any other. But efficiency matters.

  6. Jasen Halmes

    I think that your core argument is that you can’t achieve effective concurrency with today’s JVM architecture? That may be true, but there are alternate ways to achieve concurrency without depending on the JVM design. When you use a multiple commodity box solution you get the scale and the potential for concurrency. With an application fabric, like the Appistry EAF you can abstract the hardware from the Java application. You therefore have the ability to take advantage of massive concurrency (and for a much cheaper price tag than buying bigger boxes).

    For example, Google uses the map/reduce pattern to scale out their problems concurrently and recombine the answers. Appistry provides a productized platform to get that kind of massive scale for Java (and C/C++/.NET).

    -jasen
