Monthly Archive for May, 2010

Thoughts on Open Source and Platform as a Service

The question

Last week there were some articles, blog posts and tweets about the relationship between Platform as a Service (PaaS) offerings and open source. The initial framing of the conversation was around PaaS and the LAMP (Linux/Apache/MySQL/{PHP/Perl/Python/Ruby}) stack. An article on InfoQ gives the jumping off points to posts by Geva Perry and James Uruqhardt. There’s a lot of discussion which I’m not going to recapitulate, but Uruqhardt’s post ends with the question

I’d love to hear your thoughts on the subject. Has cloud computing reduced the relevance of the LAMP stack, and is this indicative of what cloud computing will do to open-source platform projects in general?

Many PaaS offerings are based on open source software. Heroku is based on Ruby and is now doing a beta of Node.js. Google’s App Engine was originally based on Python, and later on Java (the open sourceness of Java can be debated). Joyent’s Smart Platform is based on Javascript and is open source. Of the major PaaS offerings, only Force.net and Azure are based on proprietary software. I don’t have hard statistics on market share or number of applications, but from where I sit, open source software still looks pretty relevant.

Also I think it’s instructive to look at how cloud computing providers are investing in open source software. Rackspace is a big sponsor of the Drizzle project, and of Cassandra, both directly and indirectly through its investment in Riptano. EngineYard hired key JRuby committers away from Sun. Joyent has hired the lead developer of node.js, and VMWare bought SpringSource and incorporated it into VMForce. That doesn’t sound to me like open source software is less relevant.

Cloud computing is destined to become a commodity

The end game for cloud computing is to attain commodity status. I expect to see markets in the spirit of CloudExchange, but instead of trading in EC2 spot instances, you will trade in the ability to run an application with specific resource requirements. In order for this to happen, there needs to be interoperability. In the limit, that is going to make it hard for PaaS vendors to build substantial platform lockin, because businesses will want the ability to bid out their application execution needs. Besides, as Tim O’Reilly has been pointing out for years, there’s a much more substantial lock in to be had by holding a business’s data as opposed to a platform lock. This is all business model stuff, and the vendors need to work this out prior to large scale adoption of PaaS.

Next Generation Infrastructure Software

The more interesting question for developers has to do with infrastructure software. In my mind LAMP is really a proxy for “infrastructure software” If you’ve been paying any attention at all to the development of web application software, you know that there is a lot happening with various kinds of infrastructure software. Kiril Shenynkman, one of the commenters on Geva Perry’s post wrote:

Yes, yes, yes. PHP is huge. Yes, yes, yes. MySQL has millions of users. But, the “MP” part of LAMP came into being when we were hosting, not cloud computing. There are alternative application service platforms to PHP and alternatives to MySQL (and SQL in general) that are exciting, vibrant, and seem to have the new developer community’s ear. Whether it’s Ruby, Groovy, Scala, or Python as a development language or Mongo, Couch, Cassandra as a persistence layer, there are alternatives. MySQL’s ownership by Oracle is a minus, not a plus. I feel times are changing and companies looking to put their applications in the cloud have MANY attractive alternatives, both as stacks or as turnkey services s.a. Azure and App Engine.

How many of the technologies that Shenykman lists are open source? All of them.   

Look at Twitter and Facebook, companies whose application architecture is very different from traditional web applications. They’ve developed a variety of new pieces of infrastructure. Interestingly enough, many of these technology pieces are now open source (Twitter, Facebook). Open source is being used in two ways in these situations. It is being used as a distribution mechanism, to propagate these new infrastructure pieces throughout the industry. But more importantly (and for those observing more closely, quite imperfectly), open source is being used as a development methodology. The use of open source as a development methodology (also known as commons-based peer production) is definitely contributing to these innovative technologies. Open source projects are driving innovation (this also happened in the Java space. Witness the disasters of EJB 1.0 and 2.0 which lead to the development of EJB 3.0 using open source technologies like Hibernate, and which provided the impetus for the development of Spring). Infrastructure software is a commons, and should be developed as a commons. The cloud platform vendors can (and are) harvesting these innovations into their platforms, and then finding other axes on which to compete. I want this to continue. As I mentioned in my DjangoCon keynote last year, I also want open source projects to spend more time thinking about how to be relevant in a cloud world.   

My question on PaaS is this: Who will build a PaaS that consolidates innovations from the open source community, and will remain flexible enough to continue to integrate those innovations as they continue to happen?