Tag Archive for 'internet'

Strange Loop 2012

I think that the most ringing endorsement that I can give Strange Loop is that it has been a very long time since I experienced so much agony when trying to pick which talks to go to during any given block.

Emerging Languages Camp

This year Strange Loop hosted the Emerging Languages Camp (ELC), which previously had been hosted at OSCON. I liked the fact that it was its own event, not yet another track in the OSCON panoply. That, coupled with a very PLT-oriented audience this year, made Strange Loop a much better match for ELC than OSCON.

I definitely went into ELC interested in a particular set of talks. There is a lot of buzz around big data, and some of the problems around big data and data management more generally. Also I did my graduate work around implementing “database programming languages”, so there was some academic interest to go along with the practical necessity. There were three talks that fell into that bucket: Bandicoot: code reuse for the relational model, The Reemergence of Datalog, and Julia: A Fast Dynamic Language for Technical Computing.

I found Bandicoot a little disappointing. I think that the mid-90s work of Buneman’s group at UPenn on Structural Recursion as a Query Language and Comprehension Syntax would be a better basis for a modular and reusable system for programming relations.

Logic programming may be making a resurgence via the work on core.logic in Clojure and the influence of Datalog on Cascalog, Datomic and Bloom. The Reemergence of Datalog was a tutorial on Datalog for those who had never seen it before, as well as a survey of Datalog usage in those modern day systems.
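For readers who, like the tutorial's audience, have never seen Datalog: the classic example is a recursive "ancestor" relation defined over "parent" facts. What follows is only a rough Python sketch of naive bottom-up evaluation, not anything shown in the talk; the function name and the sample facts are made up for illustration.

```python
def ancestors(parent_facts):
    """Naive bottom-up evaluation of the two Datalog rules:

        ancestor(X, Y) :- parent(X, Y).
        ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).

    parent_facts is a set of (parent, child) tuples (hypothetical data).
    """
    derived = set(parent_facts)  # rule 1: every parent is an ancestor
    while True:
        # rule 2: join parent(X, Y) with the ancestor(Y, Z) facts found so far
        new = {(x, z)
               for (x, y) in parent_facts
               for (y2, z) in derived
               if y == y2}
        if new <= derived:       # fixpoint reached: nothing new was derived
            return derived
        derived |= new


facts = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
result = ancestors(facts)
print(sorted(result))
```

Real Datalog engines use smarter strategies (semi-naive evaluation, indexing), but the idea of applying rules until nothing new appears is the same.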

Julia is a language that sits in the same conceptual space as R, SAS, SPSS, and so forth. The problem with most of those systems is that they were designed by statisticians and not programmers. So while they are great for statistical analysis, they are less good for statistical programming. Julia aims to improve on this, while adding support for distributed computation and a very high performance implementation. There’s no decisive winner in the technical computing space, and it seems like Julia might have a chance to shine.

There were, of course, some other interesting language talks at ELC.   

Dave Herman from Mozilla talked about Rust for the first time (at least to a large group). Rust is being developed as a systems programming language. There are some interesting ideas in it, particularly a very Erlang-like concurrency model. At the same time, there were some scary things. Part of what Rust is trying to do is achieve performance, and part of how this happens is via explicit specification of memory/variable lifetimes. Syntactically this is accomplished via punctuation prefixes, and I was wondering if the code was going to look very Perl-ish. Servo is a browser engine that is being written in Rust, and looking at the source code of a real application will help me to see whether my Perlishness concern is valid.

Elixir: Modern Programming for the Erlang VM looks like a very nice way to program atop BEAM (the Erlang VM). Eliminating the Prolog-inspired syntax goes a long way, and it appears that Elixir also addresses some of the issues around using strings in Erlang. It wasn’t clear to me that all of the string issues have been addressed, but I was definitely impressed with what I saw.

Strange Loop Talks and Unsessions

I’m going to cover these by themes. I’m not sure these are the actual themes of the conference, but they are the themes that emerged from the talks that I went to.

First, and unsurprisingly, a data theme. The opening keynote, In Memory Databases: the Future is Now! was by Mike Stonebraker. It’s been a long time since I saw Stonebraker speak – I think that the last time was when I was in graduate school. He was basically making the case that transaction processing (TP) is not going away, and that there might be applications for a new generation of TP systems in some of the places where the various NoSQL systems are now being used. Based on that hypothesis/assumption, he then went on to describe the trends in modern systems and how they would lead to a different design, much of which is embodied in VoltDB. This was a very controversial talk, at least for some people. I considered the trend/system analysis part to be reasonable in a TP setting. I’m not sure that I agree with his views on the applicability of TP, but I’m fairly sure that time will sort all of that out. I think that this is an important point for the NoSQL folks to keep in mind. When the original work on RDBMS was done, it was mocked, called impractical, not useful and so forth. It took many years of research and technology development. I think that we should expect to see something similar with NoSQL, although I have no idea how long that timeline will be.

Nathan Marz’s talk Runaway Complexity in Big Data… and a plan to stop it. was basically making the case for, and explaining, the hybrid/combined batch/realtime architecture that he pioneered at BackType, and which is now in production at Twitter. That same architecture led to Cascalog and Storm, which are pretty interesting systems. Marz is working on a book with Manning that will go into the details of his approach.
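To make the batch/realtime split a bit more concrete, here is a very rough Python sketch of how I picture the pieces fitting together. It is my own toy illustration, not code from the talk: the page-view counting example, the event shape, and all of the names are assumptions.

```python
from collections import Counter


def batch_view(master_events):
    """Batch layer: recompute counts from the full, immutable event log.

    Slow, but simple, and easy to rerun from scratch if a bug is found.
    """
    return Counter(event["url"] for event in master_events)


class RealtimeView:
    """Realtime layer: incrementally counts only the events that
    arrived after the last batch run."""

    def __init__(self):
        self.counts = Counter()

    def update(self, event):
        self.counts[event["url"]] += 1


def query(url, batch, realtime):
    """Answer a query by merging the batch and realtime views."""
    return batch[url] + realtime.counts[url]


# Hypothetical data: three events already in the master dataset ...
master = [{"url": "/a"}, {"url": "/a"}, {"url": "/b"}]
batch = batch_view(master)

# ... and one event that arrived since the last batch run.
realtime = RealtimeView()
realtime.update({"url": "/a"})

print(query("/a", batch, realtime))
```

The appeal of the split is that the realtime layer only has to be right for a short window; once the next batch run absorbs the recent events, the realtime state can be thrown away.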

The other interesting data talks revolved around Datomic. Unfortunately, I was unable to attend Rich Hickey’s The Database as a Value, so I didn’t get to hear him speak directly about Datomic. There are several Datomic related videos floating around, so I’ll be catching up on those. I was able to attend the evening unsession Datomic Q&A / Hackfest. This session was at 9pm, and was standing room only. I didn’t have quite enough background on Datomic to follow all of what was said, but I was very interested by what I saw: the time model, the immutability of data which leads to interesting scalability, the use of Datalog. I’m definitely going to be looking into it some more. The one thing that troubles me is that it is not open source. I have no problem with a paid supported version, but it’s hard to make the argument for proprietary system or infrastructure software nowadays.

Another theme, which carried over from ELC, was logic programming. I had already heard Friedman and Byrd speak at last fall’s Clojure/conj, and I was curious to see where they have taken miniKanren since then. In their talk, Relational Programming in miniKanren, they demonstrated some of what they showed previously, and then they ran out of material. So on the fly, they decided to implement a type inferencer for simple lambda terms live on stage. Not only were they able to finish it, but since it was a logic program, they were also able to run it in reverse, which was pretty impressive. I was hoping that they might have some additional work on constraints to talk about, but other than disequality constraints, they didn’t discuss anything. Afterwards on Twitter, Alex Payne pointed out that there are some usability issues with miniKanren’s APIs. I think that this is true, but it’s also true that this is a research system. You might look at something like Clojure’s core.logic for a system that’s being implemented for practitioners.

David Nolen did an unsession Core Logic: A Tutorial Reconstruction where he walked the audience through the operation of core.logic, and by extension, miniKanren, since the two systems are closely related. He pointed out that he read parts of “The Reasoned Schemer” 8 times until he understood it enough to implement it, and then he found that he didn’t really understand it until after the implementation was done. There was also a large crowd in this session, and Christopher Petrelli made a video recording on his phone, since InfoQ wasn’t recording the unsessions.

The final talk in the logic programming theme was Oleg Kiselyov’s talk Guess lazily! Making a program guess and guess well. Kiselyov has been around for a long time and written or coauthored many important papers related to Scheme and continuations. I’ve been following (off and on) his work for a long time, but this is the first time I was at a conference where he was speaking. I was shocked to find that the room was packed. His talk was about how to defer making the inevitable choices required by non-determinism, especially in the context of logic type systems. His examples were in OCaml, which I had some trouble following, but after Friedman and Byrd the day before, he apparently felt compelled to write a type inferencer that could be run backwards as well. His code was a bit longer than the miniKanren version.

The next theme is what I’d call effective use of functional programming. The first talk was Stuart Sierra’s Functional Design Patterns. This was a very worthwhile talk, which I won’t attempt to summarize since the slides are available. Needless to say, he found a number of examples that could be called design patterns. This was one of the talks where I need to sit down and look at the patterns and think on them for a while. That’s hard to do during the talk (and the conference, really). Some things require pondering, and this is one of them.

The other talk in this category was Graph: composable production systems in Clojure, which described the Prismatic team’s approach to composing systems in Clojure. What they have is an abstraction that allows them to declaratively specify how the parts of the system are connected. For a while it just looked to me like a way to encode a data flow graph in a Clojure abstraction. The aha moment was when he showed how they use Clojure metadata to annotate the arguments, or pipe connectors if you will. The graphs can be compiled in a variety of ways including Clojure lazy maps, which present some interesting possibilities. Unfortunately, I had to leave halfway through the talk, so I missed the examples of how they apply this abstraction in their system.
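Since I missed the examples, here is my own loose approximation of the idea in Python rather than Clojure. This is not Prismatic's code or API: I am using Python parameter names as a stand-in for the annotated arguments, and run_graph and stats_graph are names I made up. Each node is a function, and the names of its parameters declare which other nodes (or inputs) it depends on.

```python
import inspect


def run_graph(graph, inputs):
    """Eagerly evaluate a declarative graph.

    graph maps a node name to a function; each function's parameter
    names say which inputs or other nodes it needs. Nodes are run as
    soon as all of their dependencies have values.
    """
    values = dict(inputs)
    pending = dict(graph)
    while pending:
        ready = [name for name, fn in pending.items()
                 if all(p in values for p in inspect.signature(fn).parameters)]
        if not ready:
            raise ValueError(f"unsatisfiable or cyclic nodes: {sorted(pending)}")
        for name in ready:
            fn = pending.pop(name)
            args = {p: values[p] for p in inspect.signature(fn).parameters}
            values[name] = fn(**args)
    return values


# A small statistics graph: the wiring is implied by the argument names.
stats_graph = {
    "n": lambda xs: len(xs),
    "mean": lambda xs, n: sum(xs) / n,
    "mean_square": lambda xs, n: sum(x * x for x in xs) / n,
    "variance": lambda mean, mean_square: mean_square - mean * mean,
}

stats = run_graph(stats_graph, {"xs": [1, 2, 3, 6]})
print(stats["variance"])
```

Because the graph is just data, the same description could in principle be evaluated eagerly (as here), lazily, or in parallel, which I take to be the point of the different compilation strategies mentioned in the talk.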

Theme number four was programming environments. I hesitate to use the term IDE, because it connotes a class of tools that is loved by some, reviled by others, and when you throw that term around, it seems to limit people’s imagination. I contributed to the Kickstarter for Light Table, so I definitely wanted to attend Chris Granger’s talk Behind the Mirror: The birth of Light Table. Chris gave a philosophical preamble before showing off the current version of Light Table. He demonstrated adding support for Git in a short amount of code, and went on to demonstrate a mode for developing games. He said that they are planning to release version 1 sometime in May, and that Light Table will be open source. I also learned that Kickstarter money is counted as revenue, so they have lost a significant amount of the donations to taxes, which is part of the reason that Kodowa participated in Y Combinator, and is trying to raise some money to get a bigger team.

Not long after the Light Table Kickstarter, this video by Bret Victor made the rounds. It went really well with all the buzz about Light Table, and Alex Miller, the organizer of Strange Loop, went out and persuaded Bret to come and talk. Bret’s title was Taking off the Blindfold, and I found this to be a very well motivated talk. In the talk, Bret talked about the kinds of properties that our programming tools should have. The talk was very philosophical despite the appearance of a number of toy demos of environment features.

During both of these talks there was a lot of chatter. Some was harking back to the Smalltalk (but sadly, not the Lisp Machine) environments, while some questioned the value of a more visual style of tools (those emacs and vi graybeards). When I first got into computers I read a book called “Interactive Programming Environments” and ever since I’ve always been wishing for better tools. I am glad to see some experimentation come back into this space.

Some old friends are busy making hay in the Node.js and Javascript communities, and it probably horrifies them that I have ClojureScript as a theme, but so be it. I went to two ClojureScript talks. One was David Nolen’s ClojureScript: Better Semantics at Low Prices!, which was really a state of the union of ClojureScript. The second was Kevin Lynagh’s Building visual data driven UI’s with ClojureScript. Visualization is becoming more and more important and ClojureScript’s C2 library looks really appealing.

It’s fitting that the last theme should be Javascript. Well, maybe. I went to two Javascript talks, and both of them were keynotes, so I didn’t actually choose them. But Javascript is so important these days that it really is a theme. In fact, it’s so much of a theme, that I’ve been going to Javascript conferences for the last 2 years. It’s been several years since I saw Lars Bak speak. His talk on Pushing the Limits of Web Browsers was in two parts. Or so I think. I arrived just as he was finishing the first part which seemed like an account of the major things that the V8 team has learned during their amazing journey of speeding up Javascript. The second part of his talk was about Dart. I didn’t know that Bak was the lead of the Dart project, but that doesn’t change how I feel about Dart. I see the language, I understand the rationale, and I just can’t get excited about it.

I’ve been to enough of those Javascript only talks to hear Brendan Eich talk about The State of Javascript. Brendan opened by giving a brief history of how Javascript got to be the way it is, and then launched into a list of the improvements coming in ECMAScript 6 (ES6). That was all well and good, and towards the end, after the ES6 stuff, he threw in some items that were new, like the sweet.js hygienic macro project, and the lljs typed JavaScript project. It seemed like this was a good update for this audience, who seemed unaware of all the goings on over in JavaScript land. From a PLT point of view, I guess that’s understandable, but at the same time, JavaScript is too important to ignore.

Final Thoughts

Strange Loop has grown to over 1000 people, much larger than when I attended in 2010 (I had to miss 2011). I think that Alex Miller is doing a great job of running the conference, and of finding interesting and timely speakers. This was definitely the best conference that I attended this year, and probably the last 2-3 years as well.

If you’re looking for more information on what happened at Strange Loop 2012:

Slides: https://github.com/strangeloop/strangeloop2012/tree/master/slides

Other Strange Loop Coverage: https://github.com/strangeloop/strangeloop2012/wiki/Coverage

JSConf 2012

This year JSConf was in Scottsdale, Arizona, which provided some welcome relief from the cold, wet, Seattle winter/spring.

News

One of the biggest pieces of news was that Mozilla gave all attendees a Nexus S smartphone running a developer version of the Boot to Gecko (B2G) phone operating system. When I say developer, I mean, camera support was broken, things were crashing, that sort of thing. These phones were a big hit among the attendees. They contributed to knocking the conference wifi out temporarily, and I saw several groups of people who were working on projects for the phone. My experience at Google I/O had soured me on the idea of giving away free devices. In the case of Google I/O, device giveaways have become an expectation, and there is some proportion of people who sign up for that conference based on the hope of getting a free device. Still, Mozilla is going to need all the help that they can get, and people seemed to take the challenge to heart. I did find it interesting that the Mozilla folks were speaking of B2G as a great feature phone software stack. This is a realistic way of climbing up the stairs in the mobile phone market. It’s hard to imagine a real competitor to iOS and Android, but I’m glad to see an effort in this direction. There’s WebOS, Windows Phone 7, and B2G all using some variant of the open web stack. It seems like there ought to be some collaboration between B2G and WebOS’s Enyo framework.

Talks

There were a bunch of talks on the internals of Javascript Virtual Machines. From a computer science point of view, these talks are interesting. I heard a lot of these kinds of talks at PyCon and during my days at Sun. It seemed that most of the audience appreciated this material, so the selections were good. The part of this that I found disturbing is wrapped up in one of the questions, which was basically: how can we write our code to take advantage of how the VM works? Given the number of VMs that Javascript code might execute on, this seems like a path fraught with peril.

Also on the language front, there was more representation from functional programming. There was a talk on Roy, and David Nolen gave a talk that was billed as being about ClojureScript, but was really more about having a sense of play regarding all this technical work. Closely related to the functional programming was GPU programming. Jarred Nichols talked about implementing a Javascript interpreter in OpenCL. Stephan Herhut from Intel talked about the RiverTrail parallel extensions to Javascript which do data parallel computing using operations taken from functional programming. The extensions compile to OpenCL, which I found interesting. I wonder how many more languages we’ll see compiling to OpenCL or partially compiling to OpenCL.

Paul Irish did a nice presentation on tools which gave a great overview of the state of the practice in the various areas related to web application development. There were several tools that I didn’t know about. The presentation is all HTML5 but has some very nice visuals and animation. I’d love to know the name of the package that he used.

Ever since Node.js came out, I’ve been enamored of the idea that you could share/move some amount of code back and forth between the client and the server, much as code used to move back in the days of NeWS. Yahoo’s Mojito is an investigation in this space. It relies heavily on YUI, which I haven’t used. I’m looking forward to looking into the code and seeing how it all fits together.

The team at Bitovi made a special lunchtime presentation about CanJS, which is another MVC framework for Javascript. CanJS is in the same space as Backbone, Knockout, and so forth. Its claims to fame are reduction of certain kinds of memory leaks, size, and speed. From the benchmark slides it seems worth a look.

Keynotes

Dan Ingalls delivered the closing keynote on the first day. I met Dan briefly when I worked at Sun, and I was familiar with his work on the Lively Kernel. The Lively Kernel is the answer to the question “what if we tried to build Squeak Smalltalk in Javascript”. It is much more than a language, it is an environment for building programs and simulations. I’m of two minds about this work. On the one hand, there’s depression that we still haven’t managed to catch up to the work that Ingalls and his contemporaries pioneered 30 years ago, and that today’s practitioners are completely oblivious to this work (a comment on Twitter confused Lively with an advanced version of the NeXT Interface Builder; the causality is reversed). On the other hand, although the Lively Kernel is written in Javascript and runs in a browser, it’s not really connected to today’s world, and so its applicability to solving web problems is limited. Nonetheless, Ingalls received a well deserved standing ovation. He is among the pioneers of our field, and as his generation is starting to pass on, it feels good to be able to personally honor them for their contributions to the field.

I have no idea how Chris Williams convinced Rick Falkvinge, the founder of the first (Swedish) Pirate Party, to come and speak at JSConf. The topic of his talk was the politics of the net generation. Falkvinge told the story of how he came to found the Pirate Party in Sweden, and described the success that the party is having in Europe. He claimed that about every 40 years, we have a new key idea. Apparently the theme for the period which is now ending was sustainability, and the claim is that the theme for the next 40 years will be free speech and openness. He credits this theme with the rise of the various Pirate parties in Europe, pointing to the European protests around ACTA and the US protest around SOPA as additional corroborating evidence. Falkvinge claims that the Pirate Party has widened the scope of politics and given young people a way to vote for the issues that they care about. I wish that something similar was happening in American politics.

Hallway

As always, JSConf had a rich hallway/party track. I had lots of great conversations with people on topics including the Javascript APIs for Windows 8, the mismatch between many concurrency models and real shared memory hardware, and proper use and optimization of CSS. I think that facilitating the hallway track is one of the areas where JSConf excels. The venues are always nice, and this year there were hallway conversations in pools, around campfires, as well as the usual hotel lobbies and restaurants/bars/lounges. I was also happy to be able to introduce Matthew Podwysocki, who has been doing excellent work on RX.js, to David Nolen, who has been working on ClojureScript. I think that there can be some nice synergy between these two projects, and I’m eager to see if they agree.

The best roundup of JSConf coverage appears to be on Lanyrd.

Strata 2012

Here’s a roundup of last week’s Strata conference.

Jumpstart

This year, the O’Reilly team introduced a new tutorial day track, called “Jumpstart”. This track was more oriented towards the business side of big data, and I think that the word MBA actually appeared in the marketing. I think that the track was a success, and was very appropriate. The effect of the next generation of data oriented technologies and applications is going to be very significant, and will have a big impact on the way that businesses operate. It’s very important that technologists and business people work closely in order to produce the best results.

There were two talks that stood out for me. The first was Avinash Kaushik’s What Marketers can learn from Analysis. Kaushik is a very entertaining and dynamic speaker, and he has had a lot of experience working to help companies use analytics effectively. In his world, processing and storage are 10% of what you need, and analysts (humans) are the other 90%. In other words, technology is not nearly as important as having people who can ask the right questions and verify hypotheses experimentally. And even good analysis is not enough. Organizations must be able to act on the results of analysis. I have been (and will continue to be) interested in the ability to use data as quickly as it is collected. Some people call this a “real-time” data capability, although in computer science terms, this is a misnomer. One of the best quotes from Kaushik’s talk was “If you do not have the capacity to take real time action, why do we need real time data?”. Without the ability to act, all the data collection and analysis in the world is fruitless. Kaushik’s claim was that we must remove all humans from the process in order to achieve this. Back to analysis, Kaushik feels that the three key skills of data analysis are: the scientific method, design of experiments, and statistical analysis.

The second talk was 3 Skills of a Data Driven CEO by Diego Saenz. I liked his notion that a company’s data is a raw material, just like any other raw material that might be used by a company. Raw materials must be collected, mined, purified, and transformed before they can turn into a product, and so with a company’s data. The most important information that I got out of this talk was the case study that he presented on Bob McDonald, the CEO of Procter & Gamble. P&G has built a business wide real time information system called Business Sphere. One manifestation of Business Sphere is a pair of 8 foot high video screens that sit in the conference room used by the CEO for his regular staff meeting. Real time data on any aspect of the company’s operations can be displayed on these screens, discussed and acted upon at the CEO staff level. Also of note is that a data analyst attends the CEO staff meeting in order to facilitate discussion and questions about the data. I remember back in the 2000s when Cisco talked about how they could close their books in a day. Now we have the world’s largest consumer products company with a real time data dashboard in the CEO’s conference room. The bar is being raised on all companies in all industries.

Talks

I felt that the talks in the regular conference were weaker than last year. Part of that may be due to my talk selection – there were lots of tracks, and in some cases it was hard to figure out which talks to pick. I tend to seek out unusual content, which means more risk in terms of a “quality” talk. The advent of the O’Reilly all access pass has taken some of the risk out, since that pass gives you access to the full video archive of the entire conference. The topic of video archives is probably content for another blog post. I know that there are some talks that I missed that I want to watch the videos for, but apparently, I’ll need to wait several weeks. It will be interesting to contrast that with this week’s mostly volunteer run PyCon, which has a great track record of getting all their videos up on the web during the conference, for no fee.

Talks which were easy to remember included Sam Shah’s Collaborative Filtering with MapReduce, which included a description of how to implement collaborative filtering on Hadoop, but more importantly discussed many of the issues around building a production worthy version of such a system. It’s one thing to implement a core algorithm. It’s another to have all the rest of the infrastructure so that the algorithm can be used for production tasks.
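To give a flavor of the "core algorithm" half of that statement, here is a toy, in-memory Python sketch of item co-occurrence counting written in a map/reduce style. It is not the implementation described in the talk and it does not use Hadoop; the function names and the sample browsing history are invented for illustration.

```python
from collections import Counter
from itertools import permutations


def map_phase(user_items):
    """Map: for each user, emit ((item_a, item_b), 1) for every ordered
    pair of distinct items in that user's history."""
    for items in user_items.values():
        for a, b in permutations(sorted(set(items)), 2):
            yield (a, b), 1


def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each (item_a, item_b) key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def recommend(item, counts, k=2):
    """Return the k items that most often co-occur with `item`,
    breaking ties alphabetically."""
    scores = {b: c for (a, b), c in counts.items() if a == item}
    return sorted(scores, key=lambda b: (-scores[b], b))[:k]


# Hypothetical histories: which items each user has looked at.
history = {
    "u1": ["hadoop", "pig", "hive"],
    "u2": ["hadoop", "pig"],
    "u3": ["hadoop", "hive", "storm"],
    "u4": ["pig", "hadoop"],
}

counts = reduce_phase(map_phase(history))
top = recommend("hadoop", counts)
print(top)
```

Everything that makes this production worthy (skewed keys, popular items swamping the counts, incremental updates, serving the results) is exactly the part that a sketch like this leaves out.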

A large portion of the data that people are interested in analyzing is coming from social networks. I attended Marcel Salathé’s Understanding Social Contagion in the hopes of gaining some greater insight into virality. Salathé works at an infectious disease center and he spent a long time comparing biological contagion with internet virality. I didn’t find this to be particularly enlightening. However, in the last third of the talk, he started talking about some of the experimental work that his group had done, which was a little more interesting. The code for his system is available on github.

I really enjoyed DJ Patil’s talk Data Jujitsu: The Art of Turning Data into Product. According to Patil, data jujitsu is using data elements in an iterative way to solve otherwise impossible data problems. A lot of his advice had to do with starting small and simple, and moving problems to where they were easiest to solve, particularly in conjunction with human input. As an example, he discussed the problem of entity resolution in one of the LinkedIn products, and described how they moved the problem from the server side, where it was hard, to the client side, where it was easy if you asked the user a simple question. The style he discussed was iterative, opportunistic, and “lazy”.

Jeremy Howard from Kaggle talked about From Predictive Modelling to Optimization: The Next Frontier. Many companies are now building a lifetime value model of a customer, and some companies are even starting to build predictive models. Howard’s claim was that the next steps in the progression are to take these models and use them to build simulations. Once we have simulations, we can then use optimization algorithms on the inputs to the simulation, and optimize the results in the direction

Keynotes

Last year, I was pretty unhappy with a number of the keynotes, which were basically vendor pitches. This year things were much better, although there were one or two offenders. Microsoft was NOT one of the offenders. Dave Campbell’s Do We Have The Tools We Need To Navigate The New World Of Data? was one of the better Microsoft keynotes that I’ve seen at an O’Reilly event (i.e. out of the Microsoft ecosystem). The talk included good non-Microsoft specific discussion of the problems, references to academic papers (each with at least one Microsoft author), and a friendly, collegial, non-patronizing tone. I hope that we’ll see more of this from Redmond.

Avinash Kaushik had a keynote spot, and one of the most entertaining but insightful slides was an infamous quote from Donald Rumsfeld:

[T]here are known knowns; there are things we know we know.

We also know there are known unknowns; that is to say we know there are some things we do not know.

But there are also unknown unknowns – there are things we do not know we don’t know.

Kaushik was very keen on “unknown unknowns”. These are the kind of things that we are looking to find, and which analytics and big data techniques might actually help discover. He demonstrated a way of sorting data which leaves out the extremes, and leaves the rest of the data, which is likely where the unknown unknowns are hiding.
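I didn’t capture the exact mechanics of his approach, so the following Python sketch is only my guess at the general shape of "sort, then set the extremes aside". The 20% cut-off, the function name, and the keyword data are all made up.

```python
def middle_band(rows, key, trim=0.2):
    """Sort rows by `key` and drop the top and bottom `trim` fraction,
    returning the unglamorous middle of the data.

    The trim fraction is an arbitrary assumption, not a value from the talk.
    """
    ordered = sorted(rows, key=key)
    cut = int(len(ordered) * trim)
    return ordered[cut:len(ordered) - cut]


# Hypothetical search keywords with visit counts.
keywords = [
    ("brand name", 9500), ("login", 7200), ("pricing", 310),
    ("api docs", 280), ("csv export", 150), ("sso setup", 120),
    ("refund", 90), ("changelog", 60), ("typo query", 2), ("zzz", 1),
]

middle = middle_band(keywords, key=lambda row: row[1])
print([name for name, _ in middle])
```

The head of the list is what everyone already watches, and the very tail is mostly noise; the band in between is where I would expect the surprises to be.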

I’ve been a fan of Hal Varian ever since I read his book “Information Rules: A Strategic Guide to the Network Economy” back during the dot-com boom. On the one hand, his talk Using Google Data for Short-term Economic Forecasting, was basically a commercial for Google Insights for Search. On the other hand, the way that he used it and showed how it was pretty decent for economic data was interesting. There were several talks that included the use of Google Insights for Search. It’s a tool that I’ve never paid much attention to, but I think that I’m going to rectify that.

The App

This is the first O’Reilly conference I’ve attended where they had a mobile app. There were iPad, iPhone, and Android versions. I only installed the iPad version, and I really liked it. I used it a lot when I was sitting in sessions to retrieve information about speakers, leave ratings and so forth. I’d love to see links to supplemental materials appear there. I also liked the fact that the app synced to the O’Reilly site, so that my personal schedule was reflected there. I didn’t like the fact that the app synced to the O’Reilly website because the WiFi at the conference was slow, and I often found myself waiting for those updates to finish before I could use the app. The other interesting thing was that I preferred the daily paper schedule when I was walking the hall between sessions. Part of this was due to having to wait for those updates, but part of it was that there was no view in the app that corresponded to the grid/track view of the paper schedule. More work to do here, but a great start.

Final thoughts

This year’s attendance was over 2300, up from 1400 last year, and I saw badges from all sorts of companies. It is apparent to me that the use of data and analytics being discussed at Strata is going to be the new normal for business.

Web 2.0 Summit

Last week I attended the Web 2.0 Summit in San Francisco. The theme this year was “The Data Frame”, an attempt to look at the “Points of Control” theme from last year through the lens of data.

Data Frame talks

Most of the good data frame stuff was in the short “High Order Bit” and “Pivot” talks. The interviews with big company CEOs are generally of little value, because CEOs at large companies have been heavily media trained, and it is rare to get them to say anything really interesting.

Genevieve Bell from Intel posed the question “Who is data and if it were a person what would it be like?” Her answers included:

  • Data keeps it real – it will resist being digitized
  • Data loves a good relationship – what happens when data is intermediated
  • Data has a country (context is important)
  • Data is feral (privacy, security, etc.)
  • Data has responsibilities
  • Data wants to look good
  • Data doesn’t last forever (and shouldn’t in some cases)

One Kings Lane was one of the startups described by Kleiner Perkins’ Aileen Lee. The interesting thing about their presentation was their realtime dashboard of purchasing activity during one of their flash sales events. You can see the demo at 6:03 in the video from the session.

Mary Meeker has moved from Morgan Stanley to Kleiner Perkins, but her Internet Trends presentation is still a tour de force of statistics and trends. It’s interesting to watch how her list of trends is changing over time.

Alyssa Henry from Amazon talked about AWS from the perspective of S3, and her talk was mostly statistics and customer experiences. One of her closing sentences stuck in my mind: “What would you do if every developer in your organization had access to a supercomputer”. Hilary Mason has talked about how people sitting at home in their pajamas now have access to big data crunching capability. Alyssa’s remark pushes that idea further, suggesting that access to supercomputing resources is now at the same level as access to a personal computer.

TrialPay is a startup in the online payment space. Their interesting twist is that they will provide payment services free of charge, without a transaction fee. They are willing to do this because they collect the data about the payment, and can then use / sell information about payment behaviors and so on (apparently Visa and Mastercard plan to do something similar).

I am not a fan of talks that are product launches or feature launches on existing products, so I was all set to ignore Susan Wojcicki’s talk on Google Analytics. But then I saw this picture in her slides:

Edward Tufte has made this diagram famous, calling it “probably the best statistical graphic ever drawn”. I remember seeing this graphic in one of his seminars and wondering how to bring this type of visualization to a computer. I appreciated the graphic, but I wasn’t sure how many times one would need to graph death marches. The Google Analytics team found a way to apply this visualization to conversion and visitor falloffs. Sure enough, those visualizations are now in my Google Analytics account. Wojcicki also demonstrated that analytics are now being updated in “real time”. Clearly, there’s no need to view instant feedback from analytics as a future item.

Last year there was a panel on education reform. This year, Salman Khan, the creator of the Khan Academy, spoke. Philosophically I’m in agreement with what Khan is trying to do – provide a way for every student to attain mastery of a topic before moving on. What was more interesting was that he came with some actual data from a whole school pilot of Khan Academy materials. Their data shows that it is possible for children assigned to a remedial math class to jump to the same level as students in an advanced math class. They have a very nice set of analytic tools that work with their videos, which should lead to a more data based discussion of how to help more kids succeed in learning what they need to learn to be successful in life.

Anne Wojcicki (yes, she and Susan are sisters) talked about the work they are doing at 23andMe. She gave an example of a rare form of Parkinson’s disease, where they were able to assemble a sizable number of people with the genetic predisposition, and present that group to medical researchers who are working on treatments for Parkinson’s. It was an interesting story of online support groups, gene sequencing, and preventative medicine.

It seems worth pointing out that almost all the talks that I listed in this section were by women.

Inspirational Talks

There were some talks which didn’t fit the data frame theme that well, but I found them interesting or inspirational anyway.

Flipboard CEO Mike McCue made an impassioned plea that we learn when to ignore the data, and build products that have emotion in them. He contrasted the Jaguar XJSS and the Honda Insight as products built with emotion and built on data, respectively. He went on to say that tablets are important because the content becomes the interface. He believes that the future of the web is to be more like print, putting content first, because the content has a soul. Great content is about art, art creates emotion, and emotion defies the data. It was a great, thoughtful talk.

Alison Lewis from Coca Cola talked about their new, high tech, internet connected Freestyle soda machine. A number of futuristic internet scenarios seem to involve soda machines, so it was interesting to hear what actual soda companies are doing in this space. The geek in me thinks that the machine is cool, although I rarely drink soft drinks. I went to the Facebook page for the machine to see what was up, and discovered that the only places in Seattle that had them were places where I would never go to eat.

IBM’s David Barnes talked about IBM’s smart cities initiative, which involves instrumenting the living daylights out of a city. Power, water, transportation grid, everything. His main points were:

  1. Cities will have healthier immune systems (the health web)
  2. City buildings will sense and respond like living organisms – water, power, etc. systems
  3. Cars and city buses will run on empty
  4. Smarter systems will quench cities’ thirst and save energy
  5. Cities will respond to a crisis – even before receiving an emergency call

He left us with a challenge to “Look at the organism that is the city. What can we do to improve and create a smarter city?”. I have questions about how long it would take to actually build a smart city or, worse, retrofit an existing city, but this is a long term, challenge type of project. I’m glad to see that there are companies out there that are still willing to take that big long view.

Final Thoughts

I really liked the short talk formats that were used this year. It forced many of the speakers to really be crisp and interesting, or at least crisp, and I really liked the volume of what got presented. One thing seems true, that from the engineering audience of Strata to the executive audience at Web 2.0, data and data related topics are at the top of everyone’s mind.

And there in addition to ponies and unicorns, be dragons.

Surge 2011

Last week I was in Baltimore attending OmniTI’s Surge Conference. I can’t remember exactly when I first met OmniTI CEO Theo Schlossnagle, but it was at an ApacheCon after he had delivered one of his 3 hour tutorials on Scalable Internet Architectures, back in the early 2000s. Theo’s been at this scalability business for a long time, and I was sad to have missed the first Surge, which was held last year.

Talks

Ben Fried, Google’s CIO, started the conference (and one of the major themes) with a “disaster porn” talk. He described a system that he built in a previous life, for a major Wall Street company. The system had to be very scalable to accommodate the needs of traders. One day, the system started failing, and ended up costing his employer a significant amount of money. In the ensuing effort to get the system working again, he ended up with the people from the various specializations (development, operations, networking, etc.) all stuck in a very large room with a lot of whiteboards. It turned out that no one really understood how the entire system worked, and that issues at the boundaries of the specialties were causing many of the problems. The way that they had scaled up their organization was to specialize, but that specialization caused them to lose an end to end view of the system. Their organization of their people had led to some of the problems they were experiencing, and was impeding their ability to solve the problems. The quote that I most remember was “specialization is an industrial age notion and needs to be discounted in spaces where we operate at the boundary of the known versus unknown”. The lessons that Fried learned on that project have influenced the way that Google works (Site Reliability Engineers as an example), and are similar to the ideas being espoused by the “DevOps” movement. His description of the solution was to “reward and recognize generalist skill and end to end knowledge”. There was a pretty lively Q&A around this notion of generalists.

Mark Imbriaco’s talk was titled “Anatomy of a Failure” in the program, but he actually presented a very detailed account of how Heroku responds to incidents. My background isn’t in operations, so I found this to be pretty interesting and useful. I particularly liked the idea of playbooks to be followed when incidents occur, and that alert messages actually contain links to the necessary playbooks. The best quote from Mark’s talk was probably “Automation is also a great way to distribute failure across an entire system”.

Raymond Blum presented the third of three Google talks that were shoehorned into a single session. He described the kind of problems involved in doing backups at Google scale. Backup is one of those problems that needs to be solved, but is mostly unglamorous. Unless you are Google, that is. Blum talked about how they actually read their backup tapes to be sure that they work, their strategy of backing up to data centers in different geographies, and clever usage of MapReduce to parallelize the backup and restore process. He cited the Gmail outage earlier this year as a way of grasping the scale of the problem of backing up a service like Gmail, much less all of Google. One way to know if a talk succeeds is if it provokes thoughts. Based on my conversations with other attendees, this one succeeded.

David Pacheco and Bryan Cantrill talked about “Realtime Cloud Analytics with Node.js”. This work is an analog of the work that they did on the analytics for the “Fishworks”/Sun Storage 7000 products, except instead of measuring a storage appliance, they are doing analytics for Joyent’s cloud offering. This is basically a system which talks to DTrace on every machine, and then reports the requested metrics to an analytics service once a second. The most interesting part of the talk was listening to two guys who are hard core C programmers / kernel developers walk us through their decision to write the system in JavaScript on Node.js instead of using C. They also discussed the areas where they expected there to be performance problems, and were surprised when those problems never appeared. When it came time for the demo, it was quite funny to see one of the inventors of DTrace being publicly nervous about running DTrace on every machine in the Joyent public cloud. “Automation is also a great way to distribute failure across an entire system”. But everything was fine, and people were impressed with the analytics.

Fellow ASF member Geir Magnusson’s talk was named “When Business Models Attack”. The title alludes to the two systems that Geir described, both of which are designed specifically to handle extreme numbers of users. Geir was the VP of Platform and Architecture at Gilt Groupe, and one description of their model is that every day at Noon is Black Friday. So the Gilt system has to count on handling peak numbers of users every day at a particular time. Geir’s new employer, Function(x), also has a business model that depends on large numbers of users. The challenge is to design systems that will handle big usage spikes as a matter of course, not as a rarity. One of the architectures that Geir described involved writing data into a Riak cluster in order to absorb the write traffic, and then using a Node.js based process to do a “write-behind” of that data into a relational database.
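The write-behind idea is easier to see in code. What follows is only a minimal sketch of the general pattern, not the architecture Geir described: an in-memory deque stands in for the Riak cluster, SQLite stands in for the relational database, and the class name, table name, and batch size are all made up for illustration.

```python
import sqlite3
from collections import deque


class WriteBehindBuffer:
    """Accept writes quickly, then flush them to a relational store in batches."""

    def __init__(self, conn, batch_size=100):
        self.conn = conn
        self.batch_size = batch_size
        # Assumption: a deque stands in for the fast, write-absorbing store.
        self.pending = deque()
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (user_id TEXT, item TEXT)"
        )

    def write(self, user_id, item):
        # Fast path: an in-memory append, with no database round trip.
        self.pending.append((user_id, item))

    def flush(self):
        """Move up to batch_size pending writes into the database."""
        batch = []
        while self.pending and len(batch) < self.batch_size:
            batch.append(self.pending.popleft())
        if batch:
            with self.conn:  # commits on success, rolls back on error
                self.conn.executemany(
                    "INSERT INTO orders VALUES (?, ?)", batch
                )
        return len(batch)


def demo():
    conn = sqlite3.connect(":memory:")
    buf = WriteBehindBuffer(conn, batch_size=2)
    for i in range(5):
        buf.write(f"user{i}", "shoes")  # the "spike" of incoming writes
    flushed = []
    while buf.pending:
        flushed.append(buf.flush())  # the slower background drain
    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return flushed, count
```

In a real system the drain would run in a separate process on its own schedule, which is what lets the front end keep accepting writes at peak rate while the relational database catches up.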

Takeaways

There were several technology themes that I encountered during the course of the 2 days:

  • Many of the talks that I attended involved the use of some kind of messaging system (most frequently RabbitMQ). Messaging is an important component in connecting systems that are operating at different rates, which is frequently the case in systems operating at high scale.
  • Many people are using Amazon EC2, and liking it, but there were a lot of jokes about the reliability of EC2.
  • I was surprised by how many people appear to be using Node.js. This is not a JavaScript or dynamic language oriented community. There’s an inclination towards C, systems programming, and systems administration. Hardly an audience where you’d expect to see lots of Node usage, but I think that it’s notable that Node is finding some uptake.

One thing that I especially liked about Surge was the focus on learning from failure, otherwise known as a “fascination with disaster porn”. Most of the time you only hear about things that worked, but hearing about what didn’t work is at least as instructive, and in some cases more instructive. This is something that (thus far) is unique to Surge.

W3C Web and TV Workshop

Last week I attended the Third W3C Web and TV Workshop (disclosure: I was a member of the program committee). This was the third in a series of three workshops that the W3C has organized around the intersection of web technologies and television. The purpose of the workshops is to bring these two communities together and help them understand and work with each other. The W3C has formed an interest group for member companies who are interested in working on issues related to the web and television.

Some of the topics discussed at the workshop included multi-screen experiences (there were 2.5 sessions on this topic, including some demonstrations), synchronized metadata, codecs (particularly around adaptive bit rate streaming over HTTP), and (inevitably) content protection/DRM.   

Given the advent of the iPad and other tablets, it should be no surprise that multi-screen experiences were a big topic. Apple has done some interesting work with AirPlay, but the general technology infrastructure for enabling multi-screen experiences is a mess. There are issues ranging from the “bottom”, related to the discovery of the various devices, through the negotiation of which devices have which roles, up to the mechanism for synchronizing content and metadata amongst these devices. There’s a lot of work to be done here, and some of that will be done in conjunction with other industry groups like DLNA and so forth. I’m most interested in the upper levels, which should be helping with synchronizing the experience and facilitating inter device/application communication.   

There was also significant discussion around synchronized metadata, which is highly relevant to multi-screen experiences, although there was more discussion/demonstration of end experiences as opposed to technologies that could be standardized to facilitate those experiences. Sylvia Pfeiffer gave an interesting demo of WebVTT using the Captionator polyfill. One of the best things about this discussion was that one of my colleagues from ESPN later explained to me the details of how captioning is done in their broadcast and internet workflows.

It’s impossible to talk about television without talking about video, and the two largest topics around video and the web are codecs and content protection. Most of the discussion around codecs revolved around the work at MPEG on Dynamic Adaptive Streaming over HTTP (DASH). There are at least three solutions in the market for streaming video via HTTP, all mutually incompatible for dumb reasons. DASH is an attempt to standardize that mechanism, while remaining silent on the question of which codec is used to produce the video file being streamed.
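To make the “adaptive” part concrete, here is a rough sketch of the kind of decision a streaming client makes before fetching each segment over HTTP. It is not taken from the DASH specification; the function name, the bitrate ladder, and the safety factor are assumptions chosen for illustration.

```python
def pick_rendition(bitrates_kbps, throughput_kbps, safety=0.8):
    """Return the highest bitrate that fits within a fraction of measured throughput.

    Falls back to the lowest available bitrate when nothing fits.
    """
    budget = throughput_kbps * safety  # leave headroom for throughput swings
    candidates = [b for b in sorted(bitrates_kbps) if b <= budget]
    return candidates[-1] if candidates else min(bitrates_kbps)


# Hypothetical ladder of encodings of the same video, in kbit/s.
LADDER = [400, 1200, 2500, 5000]
```

With this ladder, a client measuring about 3500 kbit/s would request the 2500 kbit/s encoding for the next segment, and one measuring 300 kbit/s would drop to the 400 kbit/s encoding rather than stall.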

On the content protection front, there was the usual disconnect between the web world and the tv world. For me, the discussion here really centers around the ability to use the HTML5 video tag to deliver “premium” content. Today that content is delivered via the object tag and associated browser plugins. The problem is that each plugin works differently, so your web application code has to deal with all the possibilities that it might encounter. There appears to be some interest in standardizing a small and narrow set of APIs that web applications could use to interact with a content protection mechanism. Unsurprisingly, there was very little interest in standardizing a content protection mechanism for HTML5, especially since there isn’t agreement on a standard video codec.

Recently the W3C has been working very hard at getting consumer/content side companies to participate in its activities. Because the workshop was open to anyone, not just W3C member companies, there were a lot of attendees who were not from the traditional W3C constituencies. Personally, I think that this is a good thing, and not just in the Web and TV space. It will be interesting to see how much progress can be made – the Apple and Google native application models are this generation’s Flash and Silverlight. I hope that we can find a way to build the next generation of television experiences atop the Open Web technology stack.

Google I/O 2011

Google I/O has a different feel than many of the conferences that I attend. Like Apple’s WWDC, there is a distinctly vendor partisan tone to the entire show — having the show in the same location as WWDC probably reinforces that. Unlike WWDC, the web focused portion of Google I/O helps to blunt that feeling, and the fact that lots of things are open or being open sourced also helps with the partisan feeling.

I’m going to split this writeup into two parts, the two keynotes, and the rest of the talks.

Android Keynote

The first keynote was the Android keynote and opened with a recap of Android’s marketplace accomplishments over the last year. The tone was decidedly less combative towards Apple than last year. There weren’t many platform technology announcements. There was the expected discussion of features for the next version of Android, but I didn’t really see much that was new. There was a very nice head tracking demo that involved front facing cameras and OpenGL – I believe this will be a platform feature, which is cool. Much was made of Music and Movies, but this is mostly an end user and business development story. The ability to buy/stream without a cable is nice, but as long as devices need to be plugged in to recharge (which in my case is every day), I don’t find this to be as compelling as those who were clapping loudly. What I did find interesting was the creation of a Council that will specify how quickly devices will be updated to a particular release of Android, and how long a device will be supported. This is pretty much an admission that fragmentation is real and a problem that needs addressing. I hope that it works.

The most interesting announcement during the Android keynote was the open accessories initiative. This is in direct contrast to Apple’s tight control over the iOS device connector. Google’s initiative is based on the open source Arduino hardware platform, and they showed some cool integration with an exercise bike, control over a home made labyrinth board, and some very interesting home automation work. As part of the home automation stuff, they showed an NFC enabled CD package being swiped against a home audio device, which then caused the CD to be loaded into the Google music service. This is cool, but I don’t know if CDs will be around long enough for NFC enabled packaging to become pervasive. I’m very curious to see how the accessories initiative will play out, especially versus the iOS device connector. If this were to take off, Apple could build support for the specs into future iOS devices, although they would have to swallow their pride first. This will be very interesting to watch.

Chrome Keynote

Day two’s keynote was about Chrome, and the open web, although the focus was on Google’s contributions. Adoption of Chrome is going really nicely – 160M users. There was a demonstration of adding speech input by adding a single attribute to an element (done via the Chrome Developer Tools). Performance got several segments. The obligatory JavaScript performance slide went up, showing a 20x improvement since 2008, and the speaker said he hoped to stop showing such slides, claiming that the bottlenecks in the system were now in other parts of the browser. This was a perfect segue to show hardware accelerated CSS transforms as well as hardware accelerated Canvas and WebGL.

I’ve been curious whether the Chrome web store is really a good idea or not, and we got some statistics to ponder. Apparently people spend twice as much time in applications when they are obtained via the web store, and people perform 2.5x the number of transactions. I wish there were some more information on these stats. Of course this is all before in-app purchasing, which was announced, along with a very small 5% cut for Google.   

Of course, no discussion of an app store should be without a killer app, so Google brought Rovio onto the stage to announce that Angry Birds is now available for the web, although it’s called Angry Birds for Chrome, and has special levels just for Chrome users. Apparently Chrome’s implementation of Open Web technologies has advanced to the point where doing a no compromises version of Angry Birds is possible. Another indication of how far the Open Web has come is “3 Dreams of Black”, which is a cool interactive media piece that is part film, part 3D virtual world. I’m keeping a pretty close eye on the whole HTML5 space, but this piece really shows how the next generation of the web is coming together as a medium all its own.

The final portion of the keynote was about ChromeOS and the notebooks or “Chromebooks” that run it. A lot of the content in this section was a repeat of content from Google’s Chrome Update event in December, but there were a few new things. Google has been hard at work solving some of the usage problems discovered during the CR-48 beta. This includes the trackpad (which was awful), Movies and Music, local file storage, and offline access. The big news for I/O is that Google has decided that ChromeOS is ready to be installed on laptops which will be sold as “Chromebooks”. Samsung and Acer have signed up to manufacture the devices. Google will also rent Chromebooks to businesses ($28/mo per user) and schools ($20/mo per user). This is the latest round of the network computer vision, and it’s going to be interesting to see whether the windows of technology readiness and user mindset are overlapping or not. The Chrome team appears to have the best marketing team at Google, and in their classic style, they’ve produced a video which they hope will persuade people of the Chromebook value proposition.

Talks

On to the talks.

“Make the Web Faster” by Richard Rabbat, Joshua Marantz, and Håkon Wium Lie was a double header talk covering mod_pagespeed and WebP. mod_pagespeed is a module for the Apache HTTP server, which speeds up web pages by using filters to rewrite pages and resources before they are delivered to the client. These rewrites are derived from the rules tested by the client side Page Speed tool. The other half of the talk was about WebP which is a new format for images. Microsoft also proposed a new web image format several years ago, but it didn’t go anywhere.   

Nick Pelly and Jeff Hamilton presented “How to NFC”. The NFC landscape is complicated and there are lots of options because of hardware types and capabilities. The examples that were shown were reasonably straightforward, but the whole time I found myself thinking that NFC is way more complicated than it should be. Having written device drivers in a previous life, I shouldn’t be surprised, but I still am. It seems obvious to me that the concept of NFC is a great one. The technical end of things seems tractable, if annoying. The business model issues are still unclear to me. I hope that it all comes together.

I really enjoyed Eric Bidelman and Arne Roomann-Kurrik’s HTML5 Showcase.   They showed some neat demos of things that you can do in HTML5. I particularly liked this one using 3D CSS. They also did some entertaining stuff with a command line interface. All of the source code to their demos is available – the link is in the slides.

I wasn’t able to get to Paul Irish’s talk on the Chrome Developer Tools at JSConf – there was quite a bit of Twitter buzz about it. I wasn’t too worried because I knew that the talk would be given again at Google I/O. For this version Paul teamed up with Pavel Feldman. There are a lot of really cool features going into the Chrome Developer Tools. My favorite new features are the live editing of CSS and JavaScript, revisions, saving modified resources, and remote debugging. The slide deck has pointers to the rest of the new features. If they go much further, they are going to turn the Developer Tools into an IDE (which they said they didn’t want to do).

Ray Cromwell and Phillip Rogers did a talk titled “Kick-ass Game Programming with Google Web Toolkit”, which was a talk about ForPlay, which is a library for writing games that they developed on top of GWT. This is the library that Rovio used to do Angry Birds for Chrome. If you implement your game using GWT, ForPlay can compile your game into an HTML5 version, an Android native app version, a Flash version, and a desktop Java version. They also showed a cool feature where you could modify the code of the game in Eclipse, save it, and then switch to a running instance of the Java version of the game, and see the changes reflected instantly.   

Postscript

Google has an undeniably large footprint in the mobile and open web spaces. I/O is a good way to keep abreast of what is happening at the Googleplex.

NodeConf 2011

Although I was definitely interested in JSConf (writeup), NodeConf was the part of the week that I was really looking forward to. I’ve written a few small prototypes using Node and some networking / web swiss army knife code, so I was really curious to see what people are doing with Node, whether they were running into the same issues that I was, and overall just get a sense of the community.

Talks

Ryan Dahl’s keynote covered the plans for the next version of Node. The next release is focused on Windows, and the majority of the time was spent on the details of how one might implement Node on Windows. Since I’m not a Windows user, that means an entire release with nothing for me (besides bug fixes). At the same time, Ryan acknowledged the need for some kind of multiple Node on a single machine facility, which would appear in a subsequent release. I can see the wisdom of making sure that the Windows implementation works well before tackling clustering or whatever it ends up being called. This is the third time I’ve heard Ryan speak, and this week is the first time I’ve spent any time talking with him directly. Despite all the hype swirling around Node, Ryan is quiet, humble, and focused on making a really good piece of software.

Guillermo Rauch talked about Socket.io, giving an overview of features and talking about what is coming next. Realtime apps and devices are a big part of my interest in Node, and Socket.io is providing an important piece of functionality towards that goal.

Henrik Joreteg’s talk was about Building Realtime Single Page applications, again in the sweet spot of my interest in Node. Henrik has built a framework called Capsule which combines Socket.io and Backbone.js to do real time synchronization of model states between the client and server. I’m not sure I believe the scalability story as far as the single root model, but there’s definitely some interesting stuff in there.

Brendan Eich talked about Mozilla’s SpiderNode project, where they’ve taken Mozilla’s SpiderMonkey JavaScript engine and implemented V8’s API around it as a veneer (V8Monkey) and then plugged that into Node. There are lots of reasons why this might be interesting. Brendan listed some of the reasons in his post. For me, it means a chance to see how some proposed JS.Next features might ease some of the pain of writing large programs in a completely callback oriented style. The generator examples Brendan showed are interesting, and I’d be interested in seeing some larger examples. Pythonistas will rightly claim that the combination of generators and callbacks is a been there / done that idea, but I am happy to see some recognition that callbacks cause pain. There are some other benefits of SpiderMonkey in Node such as access to a new debugging API that is in the works, and (at the moment) the ability to switch runtimes between V8 and SpiderMonkey via a command line switch. I would be fine if Mozilla decided to really take a run at making a “production quality” SpiderNode. Things are still early during this cycle of server side JavaScript, and I think we should be encouraging experimentation rather than consolidation.
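For readers who haven’t seen the generators plus callbacks combination, here is a small sketch of it in Python, where it is the been there / done that idea mentioned above. These are not the examples Brendan showed; `read_user`, `read_orders`, and `run` are hypothetical names, and the “asynchronous” operations call their callbacks immediately to keep the sketch self-contained.

```python
def read_user(user_id):
    """Hypothetical callback-style operation, packaged as a function taking a callback."""
    def thunk(callback):
        callback({"id": user_id, "name": f"user-{user_id}"})
    return thunk


def read_orders(user):
    """A second hypothetical operation that depends on the first one's result."""
    def thunk(callback):
        callback([f"{user['name']}-order-{n}" for n in (1, 2)])
    return thunk


def run(gen_func, on_done):
    """Drive a generator: start each yielded operation and send its result back in."""
    gen = gen_func()

    def step(value=None):
        try:
            thunk = gen.send(value)  # resume the generator with the last result
        except StopIteration as stop:
            on_done(stop.value)  # the generator's return value
            return
        thunk(step)  # the operation's callback resumes the generator

    step()


def workflow():
    # Reads top to bottom, although every step completes through a callback.
    user = yield read_user(7)
    orders = yield read_orders(user)
    return orders
```

Calling `run(workflow, print)` prints the list of orders. The point is that `workflow` contains no nested callbacks, because the nesting has moved into the small `run` driver.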

One of the things that I’ve enjoyed the most during my brief time with Node is npm, the package management system. npm went 1.0 shortly before NodeConf, so Isaac Schlueter, the primary author of npm, described the changes. When I started using Node I knew that big changes were in the works for npm, so I was using a mix of npm managed packages and linking stuff into the Node search path directly. Now I’m using npm. When I work in Python I’m always using a virtualenv and pip, but I don’t like the fact that those two systems are loosely coupled. I find that npm is doing exactly what I want and I’m both happy and impressed.

I’ve been using Matt Ranney’s node_redis in several of my projects, and it has been a good piece of code, so I was interested to hear what he had to say about debugging large node clusters. Most of what he described was pretty standard stuff for working in clustered environments. He did present a trick for using the REPL on a remote system to aid in debugging, but this is a trick that other dynamic language communities have been doing for some time.

Felix Geisendörfer’s talk was titled “How to test Asynchronous Code”. Unfortunately his main points were 1. No I/O (which takes out the asynchrony), 2. TDD, and 3. Discipline. He admitted in his talk that he was really advocating unit testing and mocking. While this is good and useful, it’s not really serious testing against the asynchronous aspects of the code, and I don’t really know of any way to do good testing of the non-determinism introduced by asynchrony. Felix released several pieces of code, including a test framework, a test runner, and some faking/mocking code.
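The “no I/O” style of test looks roughly like the sketch below. It is written in Python rather than with the code Felix released, and `FakeStore` and `greeting_for` are invented for illustration. The fake invokes callbacks immediately, which is exactly why a test like this says nothing about ordering or timing.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for an asynchronous key/value client."""

    def __init__(self, data):
        self.data = data
        self.calls = []  # record lookups so a test can inspect them

    def get(self, key, callback):
        self.calls.append(key)
        if key in self.data:
            callback(None, self.data[key])
        else:
            callback(KeyError(key), None)


def greeting_for(store, user_id, callback):
    """Code under test: look up a name, then pass a greeting to the callback."""
    def on_result(err, name):
        if err is not None:
            callback(err, None)
        else:
            callback(None, f"Hello, {name}!")

    store.get(f"user:{user_id}", on_result)


def run_tests():
    store = FakeStore({"user:1": "Ada"})
    seen = []
    greeting_for(store, 1, lambda err, value: seen.append((err, value)))
    greeting_for(store, 2, lambda err, value: seen.append((type(err), value)))
    return store.calls, seen
```

Both the success path and the error path are exercised, and the recorded calls show which keys were requested, but nothing here can fail because of an unlucky interleaving.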

Charlie Robbins from Nodejitsu talked about Node.js in production, and described some techniques that Nodejitsu uses to manage their hosted Node environment. Many of these techniques are embodied in Haibu, which is the system that Nodejitsu uses to manage their installation. Charlie pushed the button to publish the github repository for Haibu at the end of his talk.

Issues with Node

The last talk of the day was a panel of various Node committers and, depending on the question, relevant folks from the broader Node community. There were two audience questions that I wanted to cover.

The first was what kinds of applications Node.js is not good for. The consensus of the panel was you wouldn’t want to use Node for applications involving lots of numeric computation, especially decimal or floating point, and that longer running computations were a bad fit as well. Several people also said that databases (as in implementing a database) were a problem space that Node would be bad at. Despite the hype surrounding Node on Twitter and in the blogosphere, I think that the core members of the Node community are pretty realistic about what Node is good for and where it could be usefully applied.

The second issue had to do with Joyent’s publication of a trademark policy for Node. One of the big Node events in the last year was Joyent’s hiring of Ryan Dahl, and subsequently a few other Node contributors. Joyent is basing its Platform as a Service offering on Node, and is mixing its Node committers with some top notch systems people who used to be at Sun, including some of the founding members of the DTrace team. Joyent has also taken over “ownership” of the Node.js codebase from Ryan Dahl, and that, in combination with the trademark policy is causing concern in the broader Node community.

All things being equal, I would prefer to see Node.js in the hands of a foundation. At the same time, I understand Joyent’s desire to try and make money from Node. I know a number of people at Joyent personally, and I have no reason to suspect their motives. However, with the backdrop of Oracle’s acquisition of Sun, and the way that Oracle is handling Sun’s open source projects, I think that it’s perfectly reasonable to have questions about Joyent or any other company “owning” an open source project. Let’s look at the ways that an open source project is controlled. There’s 1) licensing 2) intellectual property/patents 3) trademarks 4) governance. Now, taking them one at a time:

  1. Licensing – Node.JS is licensed under the MIT license. There are no viral/reciprocal terms to prevent forking (or taking a fork private). Unfortunately, there are no patent provisions in the MIT license. This applies to #2 below. The MIT license is one of the most liberal licenses around – it’s hard to see anything nefarious in its selection, and forking as a nuclear option in the case of bad behavior by Joyent or an acquirer is not a problem. This is the same whether Node is at a foundation or at Joyent.
  2. Intellectual Property – Code which is contributed to Node is governed by the Node Contributor License Agreement, which appears to be partially derived from the Apache Individual and Corporate Contributor license agreements (Joyent’s provision of an on-line form is something that I wish the ASF would adopt – we are living in the 21st century after all). Contributed IP is licensed to Node, but the copyright is not assigned as in the case of the FSF. Since all contributors retain their rights to their contributions, the IP should be clean. The only hitch would be if Joyent’s contributions were not licensed back on these terms as well, but given the use of the MIT license for the entire codebase, I don’t think that’s the case. As far as I can tell, there isn’t much difference between having Node at a foundation or having it at Joyent.
  3. Trademark – Trademark law is misunderstood by lots of people, and the decision to obtain a trademark can be a controversial one for an open source project. Whether or not Node.js should have been trademarked is a separate discussion. Given that there will be a trademark for Node.js, what is the difference between having Node at a foundation or at Joyent? Trademark law says that you have to defend your trademark or risk losing it. That applies to foundations as well as for profit companies. The ASF has sent cease and desist letters to companies which are misusing Apache trademarks. The requirement to defend the mark does not change between a non-profit and a for-profit. Joyent’s policy is actually more liberal than the ASF trademark policy. The only difference between a foundation and a company would be the decision to provide a license for use of the trademark as opposed to disallowing a use altogether. If a company or other organization is misusing the Node.js trademark, they will have to either obtain a license or stop using the mark. That’s the same regardless of who owns the mark. What may be different is whether or not a license is granted or usage is forbidden. In the event of acquisition by a company unfriendly to the community, the community would lose the trademarks – see the Hudson/Jenkins situation to see what that scenario looks like.   
  4. Governance – Node.js is run on a “benevolent dictator for life” model of governance. Python and Perl are examples of community/foundation based open source projects which have this model of governance. The risk here is that Ryan Dahl is an employee of Joyent, and could be instructed to do things a certain way, which I consider unlikely. I suppose that at a foundation you could try to write additional policy about removal of the dictator in catastrophic scenarios, but I’m not aware of any projects that have such a policy. The threat of forking is the other balance to a dictator gone rogue, and aside from the loss of the trademark, there are no substantial roadblocks to a fork if one became necessary.

To riff on the 2010 Web 2.0 Summit, these are the four “points of control” for open source projects. As I said, my first choice would have been a foundation, and for now I can live with the situation as it is, but I am also not a startup trying to use the Node name to help gain visibility.

Final thoughts

On the whole, I was really pleased with Nodeconf. I did pick up some useful information, but more importantly I got some sense of the community / ecosystem, which is really important. While the core engine of Node.js is important, it’s the growth and flourishing of the community and ecosystem that matter the most. As with most things Node, we are still in the early days but things seem promising.

The best collections of JSConf/NodeConf slides seem to be in gists rather than Lanyrd, so here’s a link to the most up to date one that I could find.

Update: corrected misspelling of Henrik Joreteg’s name. And incorrectly calling Matt Ranney Mark.

JSConf 2011

Last year when I attended JSConf I had some ideas about the importance of Javascript. I was concerned in a generic way about building “richer” applications in the browser and Javascript’s role in building those applications. Additionally, I was interested in the possibility of using Javascript on the server, and was starting to learn about Node.js.

A year later, I have some more refined ideas. The fragmentation of mobile platforms means that open web technologies are the only way to deliver applications across the spectrum of telephones, tablets, televisions and what have you, without incurring the pain of multi platform development. The types of applications that are most interesting to me are highly interactive with low latency user interfaces – note that I am intentionally avoiding the use of the word “native”. Demand for these applications is going to raise the bar on the skill sets of web developers. I think that we will see more applications where the bulk of the interface and logic are in the browser, and where the server becomes a REST API endpoint. The architecture of “New Twitter” is in this vein. API endpoints have far less of a need for HTML templating and server side MVC frameworks. But those low latency applications are going to mean that servers are doing more asynchronous delivery of data, whether that is via existing Comet like techniques or via Websockets (once it finally stabilizes). Backend systems are going to partition into parts that do asynchronous delivery of data, and other parts which run highly computationally intensive jobs.

I’ll save the discussion of the server parts for my Nodeconf writeup, but now I’m ready to report on JSConf.

Talks

Here are some of the talks that I found interesting or entertaining.

Former OSAF colleague Adam Christian talked about Jellyfish, which is a tool for executing Javascript in a variety of environments from Node to desktop browsers to mobile browsers. One great application for Jellyfish is testing, and Jellyfish sprang out of the work that Adam and others did on Windmill.

It’s been a while since I looked at Bespin/Skywriter/Ace, and I was pleased to see that it seems to be progressing quite nicely. I particularly liked the Github support.

I enjoyed Mary Rose Cook’s account of how writing a 2D platform game using Javascript caused her to have a falling in love like experience with programming. It’s nice to be reminded of the sheer fun and art of making something using code.

Unfortunately I missed Andrew Dupont’s talk on extending built-ins. The talk was widely acclaimed on Twitter, and fortunately the slides are available. More on this (perhaps) once I get some time to read the slide deck.

Mark Headd showed some cool telephony apps built using Node.js, including simple control of a web browser via cell phone voice commands or text messages. The code that he used is available, and uses Asterisk, Tropo, Couchbase, and a few other pieces of technology.

Dethe Elza showed off Waterbear, which is a Scratch-like environment running in the browser. It’s not solely targeted at Javascript, which I have mixed feelings about. My girls have done a bunch of Scratch programming, so I am glad to see that environment coming to languages that are more widely used.

The big topics

There were four talks in the areas that I am really concerned about, and I missed one of them, which was Rebecca Murphey’s talk on Modern Javascript, which appeared to be derived from some blog posts that she has written on the topic. I think that the problems she is pointing out – ability to modularize, dependency management, and intentional interoperability – are going to be major impediments to building large applications in the browser, never mind on the server.

Dave Herman from Mozilla did a presentation on a module system for the next version of Javascript (which people refer to as JS.next). The design looks reasonable to me, and you can actually play with it in Narcissus, Mozilla’s meta circular Javascript interpreter, which is a testbed for JS.next ideas. One thing that’s possible with the design is to run different module environments in the same page, which Dave demonstrated by running Javascript, Coffeescript, and Scheme syntaxed code in different parts of a page.

The last two talks of the conference were also focused on the topic of JS.next.

Jeremy Ashkenas was scheduled to talk about Coffeescript, but he asked Brendan Eich to join him and talk about some of the new features that have been approved or proposed for JS.next. Many of these ideas look similar to ideas that are in Coffeescript. Jeremy then went on to try and explain what he’s trying to do in Coffeescript, and encouraged people to experiment with their own language extensions. He and Brendan are calling programs like the Coffeescript compiler “transpilers” – compilers which compile into Javascript. I’ve written some Coffeescript code just to get a feel for it, and parts of the experience reminded me of the days when C++ programs went through CFront, which then translated them into C which was then compiled. I didn’t care for that experience then, and I didn’t care for it this time, although the fact that most of what Coffeescript does is pure syntax means that the generated code is easy to associate back to the original Coffeescript. There appears to be considerable angst around Coffeescript, at least in the Javascript community. Summarizing that angst and my own experience with Coffeescript is enough for a separate post. Instead I’ll just say that I like many of the language ideas in Coffeescript, but I’d prefer not to see Coffeescript code in libraries used by the general Javascript community. If individuals or organizations choose to adopt Coffeescript, that’s fine by me, but having Coffeescript go into the wild in library code means that pressure will build to adapt Javascript libraries to be Coffeescript friendly, which will be detrimental to efforts to move to JS.next.

The last talk was given by Alex Russell, and included a triple head fake where Alex was ostensibly to talk about feature detection, although only after a too long comedic delay involving Dojo project lead Pete Higgins. A few minutes into the content on feature detection, Alex “threw up his hands”, and pulled out the real topic of his talk, which is the work that he’s been doing on Traceur, which is Google’s transpiler for experimenting with JS.next features. Alex then left the stage and a member of the Traceur team gave the rest of the talk. I am all in favor of cleverness to make a talk interesting, but I would have to say that the triple head fake didn’t add anything to the presentation. Instead, it dissipated the energy from the Brendan / Jeremy talk, and used up time that could have been used to better motivate the technical details that were shown. The Traceur talk ended up being less energetic and less focused than the talk before it, which is a shame because the content was important. While improving the syntax of JS.next is important, it’s even more important to fix the problems that prevent large scale code reuse and interoperability. The examples being given in the Traceur talk were those kinds of examples, but they were buried by a lack of energy, and the display of the inner workings of the transpiler.

I am glad to see that the people working on JS.next are trying to implement their ideas to the point where they could be used in large Javascript programs. I would much rather that the ECMAScript committee had actual implementation reports to base their decisions on, rather than designing features on paper in a committee (update: I am not meaning to imply that TC39 is designing by committee — see the comment thread for more on that. ). It is going to be several more years before any of these features get standardized, so in the meantime we’ll be working with the Javascript that we have, or in some lucky cases, with the recently approved ECMAScript 5.

Final Thoughts

If your interests are different than mine, here is a list of pointers to all the slides (I hope someone will help these links make it onto the Lanyrd coverage page for JSConf 2011).

JSConf is very well organized: there are lots of social events, and there are lots of nice touches. I did feel that last year’s program was stronger than this year’s. There are lots of reasons why this might be the case, including what happened in Javascript in 2010/11, who was able to submit a talk, and a change in my focus and interests. Chris Williams has a very well reasoned description of how he selects speakers for JSConf. In general I really agree with what he’s trying to do. One thing that might help is to keep all the sessions to 30 minutes, which would allow more speakers, and also reduce the loss if a talk doesn’t live up to expectations.

On the whole, I definitely got a lot out of the conference, and as far as I can tell if you want to know what is happening or about to happen in the Javascript world, JSConf is the place to be.

South by Southwest Interactive 2011

Back in 2006, Julie made the trek to Austin for South By Southwest Interactive (SXSWi) because she was organizing a panel. This year, I finally got a chance to go. In recent years, I’ve been to a lot of conferences. Many of them have been O’Reilly conferences, and the rest have been conferences organized by various open source communities. What almost all of them have in common is that they are developer centric. What is intriguing about SXSWi, to use John Gruber’s words, is that it is a conference where both developers and designers are welcome (As are a whole pile of people working in the social media space). One of the reasons that I decided to go this year was to try to get some perspective from a different population of people.   

SXSWi is a very large conference with this year’s attendance at around 14000 people. There are conferences which are bigger (Oracle OpenWorld, JavaOne in its heyday, or ComicCon San Diego), but not many. If you mix in the Film conference, which runs at the same time, you have a lot of people in Austin. Any way you slice it, it’s a huge conference. According to “old-timers” that I spoke to, the scale is new, and I would say it’s the source of almost all of the problems that I had with the conference.

Talks

Common wisdom in recent years is that SXSWi is more about the networking than the panel / talk content. I did find a number of interesting talks.

I’ve been loosely aware of Jane McGonigal’s work on games for quite some time, but I’ve never actually been able to hear her speak until now. Gamification is a big topic in some circles right now. I think that Jane’s approach to gaming is deeper and has much longer term impact than just incorporating some of the types of game mechanics that are currently in vogue. I also really appreciated the scientific evidence that she presented about games. I’m looking forward to reading her book “Reality Is Broken: Why Games Make Us Better and How They Can Change the World”.

I had no idea who Felicia Day was when I got to SXSWi. Like all conferences, I did my real planning for each day of SXSWi the night before, doing the usual research on speakers that I was unfamiliar with. Felicia’s story resonated with me because she was homeschooled (like my daughters), went on to be very successful academically and then went into the entertainment business. She is among the leaders in bringing original video content to the internet instead of going through the traditional channels of broadcast television or movie studios. It’s a path that seems more and more likely to widen (witness Netflix’s licensing of “House of Cards”, or Google’s acquisition of Next New Networks). I learned all of that before I sat in the keynote. By the time that I left the keynote, I found myself charmed by her humility and down to earthness, and impressed by the way that she has built a real relationship with her fans in such a way that she can rally them for support when needed.

For the last year or so I’ve been seeing reviews for “The Power of Pull: How Small Moves, Smartly Made, Can Set Big Things in Motion” by John Hagel, John Seely Brown and Lang Davison. It sounded like the authors have found an interesting structuring for some of the changes that I’ve observed by being in the middle of open source software, blogging, and so forth. I still haven’t gotten around to reading that book (the stack is tall – well actually, the directory on the iPad is full), but I was glad for the chance to hear John Hagel talk about shaping strategies, his theory on how to make big changes by leveraging the resources of an entire market or ecosystem rather than taking on all the risk in a solo fashion. His talk was on the last day of the conference, and I was wiped out by then, so I need a refresher and some additional think time on his ideas.

Much to my surprise, there were a number of really interesting talks on the algorithmic side of Data Science/Big Data. Many of these talks were banished to the AT&T Conference center at UT Austin, which was really far from the Austin Convention Center and very inconvenient to get to. I wasn’t able to make it to many of these talks because of this. Having venues so far away (the AT&T Center, the Sheraton, and the Hyatt) pretty much dooms the talks that get assigned to those venues. It’s not a total loss, since these days it’s pretty easy to find the speakers of the talks and contact them for more information. But that’s a much higher friction effort than going to their talk, having a chance to talk to them afterwards or over dinner, and going from there. I did really enjoy the talk Machines Trading Stocks on News. I am not a financial services guy, and there was no algorithmic heavy lifting on display, but the talk still provided a really interesting look at the issues around analyzing semistructured data and then acting on it. As usual, the financial guys are quietly doing some seriously sophisticated stuff, while the internet startup guys get all the attention. In a related vein, I also went to How to Personalize Without Being Creepy, which had a good discussion of the state of the art of integrating personalization into products. There was no statistical machine learning on display, but the product issues around personalization are at least as important as the particulars of personalization technology.

One of the nice things about having such a huge conference is that you get some talks from interesting vectors. Our middle daughter has decided that she wants to go to Mars when she grows up. Now it’s quite some time between now and then, but just in case, I stopped into the talk on Participatory Space Exploration and collected a bunch of references that she can go chase. I was also able to chat with the folks from NASA afterwards and pick up some good age appropriate pointers.

There were some interesting sounding talks that I wasn’t able to get into because the rooms were full. And as I’ve mentioned there were also some talks that I wasn’t able to go to because they were located too far away. As a first time SXSWi attendee but a veteran tech conference attendee and speaker, I’d say that SXSWi is groaning under its own scale at this point. It’s affecting the talks, the “evening track” and pretty much everything else. This is definitely a case of bigger is not better.

Party Scene

I am used to conferences with an active “evening track”, and of course, this includes parties. SXSWi is like no other event that I’ve been to. The sheer number of parties, both public and private is staggering. I’ve never had to wait in line to get into parties before, and there are very few VIP lists, whereas at SXSWi both lines and VIP lists seem to be the order of the day. Part of that is due to the scale, and I’m sure that part of that is SXSW’s reputation as a party or euphemistically, networking, conference. The other issue that I had with the parties is that the atmosphere at many of them just wasn’t conducive to meeting people. I went to several parties where the music was so loud that my ears were ringing within a short time. It’s great that there was good music (a benefit of SXSW), and lots of free sponsor alcohol, but that isn’t really my style.

Despite all that, I did have some good party experiences. I accidentally/serendipitously met a group of folks who are responsible for social media presences at big brands in the entertainment sector, so I got some good insight into the kind of problems that they face and the back channel on business arrangements with some of the bigger social networks. I definitely got some serious schooling on how to use Foursquare. At another party, I got a ground’s eye view of which parts of Microsoft’s Azure PaaS offering are real, and which are not. I’m not planning to be an Azure user any time soon, but it’s always nice to know what is hype and what is reality. I also really enjoyed the ARM party. It was a great chance to see what people are doing with ARM processors these days. This video that I saw at the TI table made me realize just how close we are to seeing some pretty cool stuff. Nikon USA and Vimeo sponsored a fun party at an abandoned power plant. The music was really loud, but the light was cool and I made some decent pictures.

Other activities

There are activities of all kinds going on during SXSW. I wasn’t able to do a lot of them because they conflicted with sessions, but I was able to go on a pair of photowalks, which was kind of fun. The photowalk with Trey Ratcliff was pretty fun. As usual, scale was an issue, because we pretty much clogged up streets and venues wherever we went. I’ve started to put some of those photos up on Flickr, but I decided to finish this post rather than finish the post production on the pictures.

App Round Up

One of the things that makes SXSWi what it is is that you have a large group of people who are willing to try a new technology or application. It’s conventional wisdom that SXSWi provided launching pads for Twitter and Foursquare, so now every startup is trying to get you to try their application during the week of the conference. While by no means foolproof or definitive, this is a unique opportunity to observe how people might use a piece of technology.

Before flying down to Austin, I downloaded a bunch of new apps on my iPhone and iPad – so many that I had to make a SXSW folder. I had no preconceived notions about which of these new apps I was going to use.

There were also two web applications that I ended up using quite a bit: Lanyrd’s SXSW guide, and Plancast. Lanyrd launched last year as kind of a directory for conferences, and I’ve been using it to keep track of my conference schedule for a good number of months. For SXSWi, they created a SXSW specific part of the site that included all the panels, along with useful information like the Twitter handles and bios of the speakers. Although SXSW itself had a web application with the schedule, I found that Lanyrd worked better for the way that I wanted to use the schedule. This is despite the fact that SXSW had an iPhone app while Lanyrd’s app has yet to ship. With Lanyrd covering the sessions, I used Plancast (and along the way Eventbrite) to manage the parties. Plancast had all the parties in their system, including the Alaska direct flight from Seattle to Austin that I was on. Many of the parties were using Eventbrite to limit attendance, and while I had used Eventbrite here and there in the past, this finally got me to actually create an account there and use it. Eventbrite and Plancast integrate in a nice way, and it all worked pretty well for me.

Of all the ballyhooed applications that I downloaded, I really only ended up using two. There were a huge number of group chat/small group broadcast applications competing for attention. The one that I ended up using was GroupMe, mostly because the people I wanted to keep up with were using it. Beyond the simple group chat/broadcast functionality, it has some other nice features like voice conference calling that I didn’t really make use of during SXSW. Oddly enough, I first started using Twitter when I was working with a distributed team, and I always wished that Twitter had some kind of group facility. It’s nice that GroupMe and its competitors exist, but I also can’t help feeling like Twitter missed an opportunity here. Facebook’s acquisition of Beluga suggests as much.

The other application that I ended up using was Hashable. Hashable’s marketing describes it as “A fun and useful way to track your relationships”. I’d describe my usage of it as a way to exchange business cards moderately quickly using Twitter handles. A lot of my Hashable use centered around using my Belkin Mini Surge Protector Dual USB Charger to multiply the power outlets at the back of the ballrooms. I’ve made a lot of friends with that little device. In any case, I used Hashable as a quick way to swap information with my new power strip friends. While I used it, I’m ambivalent about it. I like that it can be keyed off of either email address or Twitter handle – I always used Twitter handle. My official business cards don’t have a space for the handle, which is annoying here in the 21st century. However, the profile that it records is not that detailed, so any business card information that is going to a new contact isn’t that detailed. It seems obvious to me that there ought to be some kind of connection to LinkedIn, but there’s no space for that. So I couldn’t really use Hashable as a replacement for a business card because all the information isn’t there. It’s also more clumsy to take notes about a #justmet on the iPhone keyboard than to write on the back of a card. The difficulty of typing on the iPhone keyboard also makes it time consuming and kind of antisocial to use. In a world where everyone used Hashable, and phones were NFC equipped, you can imagine a more streamlined exchange, but even then, the right app would have to be open on the phone. Long term, that’s an interface issue that phones are going to run into. Selecting the right functionality at the right time is getting to be harder and harder – pages of folders of apps means that everything gets on the screen, but it doesn’t mean that accessing them is fast.

In a similar vein, there were QR codes plastered all over pamphlets, flyers, and posters, but as @larrywright asked me on Twitter, I didn’t see very many people scanning them. Maybe people were scanning all that literature in their rooms after being out till 2am. There’s still an interface problem there.

In addition to all the hot new applications, there were the “old” standbys, Foursquare and Twitter.

I am a purpose driven Foursquare user. I use Foursquare when I want people to know where I am. I’ve never really been into the gamification aspects of Foursquare, but I figured that SXSWi was the place to give that aspect of Foursquare more of a try. Foursquare rolled out a truckload of badges for SXSWi, and sometimes it seemed like you could check into every individual square foot of the Austin Convention Center and surrounding areas. So I did do a lot more checking in, mostly because there were more places to check in, and secondarily because I was trying to rack up some points. Not that the points ever turned into any tangible value for me. But as has been true at other conferences, the combination of checking on Foursquare and posting those checkins to Twitter did in fact result in some people actually tracking me down and visiting.

If you only allowed me one application, it would still be Twitter. If I wanted to know what was happening, Twitter was the first place I looked. Live commentary on the talks was there. I ended up coordinating several serendipitous meetings with people from Twitter. Twitter clients with push notifications made things both easy and timely. While I’m very unhappy with Twitter’s recent decree on new Twitter clients, the service is still without equal for the things that I use it for.

One word on hardware. There were lots of iPad 2s floating around. I’m not going to do a commentary on that product here. For a conference like SXSWi, the iPad is the machine of choice. After the first day, I locked my laptop in the hotel safe. I would be physically much more worn out if I had hauled that laptop around. The iPad did everything that I needed it to do, even when I forgot to charge it one night.

Interesting Tech

While SXSWi is not a hard core technology conference, I did manage to see some very interesting technology. I’ve already mentioned the TI OMAP5 product line at the ARM party. I took a tour of the exhibit floor with Julie Steele from O’Reilly, and one of the interesting things that we saw was an iPhone app called Neer. Neer is an application that lets you set to-dos based on location. This is sort of an interesting idea, but the more interesting point came out after I asked about Neer’s impact on the phone’s battery life. I had tried an application called Future Checkin, which would monitor your location and check you into places on Foursquare, because I was so bad about remembering to check in. It turned out that this destroyed the battery life on my phone, so I stopped using it. When I asked the Neer folks how they dealt with this, they told me that they use the phone’s accelerometer to detect when the phone is actually moving, and they only ping the GPS when they know you are moving, thus saving a bunch of battery life. This is a clever use of multiple sensors to get the job done, and I suspect that we’re really only at the beginning of seeing how the various sensors in mobile devices will be put to use. It turns out that the people working on Neer are part of a Qualcomm lab that is focused on driving the usage of mobile devices. I’d say they are doing their job.
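
To make the accelerometer trick a little more concrete, here is a minimal sketch in Python of gating an expensive GPS read behind a cheap motion check. This is not Neer's code, which I have not seen. The function names (is_moving, gated_fixes), the read_gps callback, and the 0.15 g threshold are all assumptions made up for illustration, and a real app would use the platform's sensor APIs and probably smooth the readings over a window.

```python
import math
from typing import Callable, Iterable, List, Optional, Tuple

Accel = Tuple[float, float, float]  # one accelerometer reading, in g
Fix = Tuple[float, float]           # (latitude, longitude)

GRAVITY_G = 1.0  # magnitude of a reading when the phone is at rest


def is_moving(sample: Accel, threshold_g: float = 0.15) -> bool:
    """Guess that the phone is moving when the acceleration magnitude
    deviates from 1 g by more than threshold_g (an assumed value)."""
    magnitude = math.sqrt(sum(axis * axis for axis in sample))
    return abs(magnitude - GRAVITY_G) > threshold_g


def gated_fixes(
    samples: Iterable[Accel],
    read_gps: Callable[[], Fix],
) -> List[Optional[Fix]]:
    """For each cheap accelerometer sample, call the expensive read_gps
    only if the sample suggests motion; otherwise record None."""
    return [read_gps() if is_moving(sample) else None for sample in samples]


if __name__ == "__main__":
    gps_calls = 0

    def fake_gps() -> Fix:
        # Stand-in for a real GPS read; counts how often it is invoked.
        global gps_calls
        gps_calls += 1
        return (30.2635, -97.7396)

    # At rest, being carried, at rest again.
    readings = [(0.0, 0.0, 1.0), (0.5, 0.3, 1.1), (0.0, 0.01, 0.99)]
    print(gated_fixes(readings, fake_gps), "GPS reads:", gps_calls)
```

The point of the design is that the accelerometer is cheap enough to sample continuously, so the GPS radio only has to wake up for the middle reading in the example rather than for all three.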

The other thing that Julie and I stumbled upon was 3taps, which is trying to build a Data Commons. The whole issue of data openness, provenance, governance, and so forth is going to be a big issue in the next several years, and I expect to see lots of attempts to figure this stuff out.

The last interesting piece of technology that I learned about comes from Acunu. The Acunu folks have developed a new low-level data store for NoSQL storage engines, particularly engines like Cassandra. The performance gains are quite impressive. The engine will be open source and should be available in a few months.

In conclusion

SXSWi is a huge conference and it took a lot out of me, more than any other conference that I’ve been to. While I definitely got some value out of the conference, I’m not sure that the value I got corresponded to the amount of energy that I had to put in. Some of that is my own fault. If I were coming back to SXSWi, here are some things that I would do:

  • Work harder at being organized about the schedule and setting up meetings with people prior to the conference
  • Skip many of the parties and try to organize get togethers with people outside of the parties
  • Eat reasonably – SXSW has no official lunch or dinner breaks – this makes it too easy to go too long without eating, which leads to problems.
  • Always sit at the back of the room and make friends over the power outlets

Lanyrd is collecting various types of coverage of the conference whether that is slide decks, writeups, or audio recordings.   

I like the idea of SXSWi, and I like the niche that it occupies, but I think that scale has overtaken the conference and is detracting from the value of it. Long time attendees told me that repeatedly when I asked. I would love to see some alternatives to SXSWi, so that we don’t have to put our eggs all in one basket.