
Strange Loop 2013

It’s been a while since I have written a post, or been to a conference. I wish that I had time to write more and that I could write about what I am up to. In lieu of that, here is a report on Strange Loop 2013, which is the only conference that I am attending all year.

Emerging Languages Camp

One does not attend Strange Loop without attending Emerging Languages Camp (ELC). I view the camp as a kind of roulette. Of course it’s unlikely that many, or perhaps any, of the languages presented in the forum will ever see widespread adoption. That’s not really the point. The camp is about getting language designers and implementors together, and giving them a forum to exchange ideas. Much of what has happened in languages recently is that we’ve been taking old ideas and trying to recast them in a modern context, either because computing platforms are now fast enough to bear the inefficiencies of those ideas, or because the computing community as a whole appears to be of a mind to accept them. ELC is also just a place for people who want to experiment with languages. Here is some of what stood out to me:

  • Gershwin – an embedding of Forth semantics in Clojure – apparently this can be helpful when one is using the Clojure threading macros all over the place (see the sketch after this list)
  • Noether – a highly stratified approach to language design. Unfortunately this talk dwelt too much on the philosophy and mathematical semantics and ran out of time before covering the details of the actual language. It’s a shame that we didn’t get to see the full content.
  • Qbrt bytecode – This was an interesting look at a byte code system for a different design space, where the byte code was representing somewhat high level functionality as opposed to machine level instructions.
  • J – J is cool because it’s basically APL 2.0. But now we also have R, Julia, and other languages. This is a space that has many contenders but no clear leader. My problem with J is that it looks even more write-only than Perl.
  • BODOL – This talk wasn’t interesting so much for the language, which the presenter acknowledged was “another toy lisp”, as for the presentation, which involved HTML based slides being displayed inside of Emacs. I felt like I was back in 6.001 watching the construction of a lisp, but the presentation quality was really high, which made for a great talk.
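
To make the Gershwin point concrete, here is a minimal sketch of how Clojure’s threading macro already gives code a concatenative, Forth-like reading. This is plain Clojure of my own, not Gershwin syntax:

```clojure
(require '[clojure.string :as str])

;; -> threads the value through each call as the first argument,
;; so the pipeline reads top to bottom like stack-based code.
(-> "  strange loop  "
    str/trim
    str/upper-case
    (str/split #" "))
;; => ["STRANGE" "LOOP"]
```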

In addition to the talks I had a number of dinner and hotel lobby discussions around ELC related topics. The ELC attendees are a great bunch.

Sessions

Jenny Finkel from Prismatic gave a great overview of how they use machine learning in the product. As a user this was a great explanation of how the system really works. Machine learning in keynotes is tough because it’s not something that everyone has studied, so it is hard to get the level right. I definitely enjoyed it. The most useful information in the talk was when she said that coding on an iPad version has begun. I will definitely be using the living daylights out of that when it comes out.

This year there was definitely a theme around making it easier to deal with asynchronous / event-driven systems. I was able to make it to two of the talks – there were several more. Matthew Podwysocki and Donna Malayeri from Microsoft presented on the Rx extensions, which I’ve written plenty about on this blog already. This time they came with some cool Kinect based demos. Nice to see something other than web systems. The other talk that I saw was Rich Hickey’s talk on core.async. As usual, Rich brought a great mix of theoretical concerns around core.async’s CSP based model while melding them with real world system building. I consider this to be a hallmark of Rich’s style, and he is one of the very, very few people who is really able to fuse theory and practice into a whole. And of course, Clojure is the manifestation of that fusion. I’ve got a bunch of David Nolen’s posts on core.async in tabs in my browser, but just haven’t had the time to sit down and read them. I feel a little more caught up now.
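
For readers who, like me, are still catching up on core.async, here is a minimal sketch of its CSP-style channels. The code is my own illustration, not anything from the talk:

```clojure
(require '[clojure.core.async :refer [chan go >! <!!]])

;; A channel decouples producer from consumer, CSP-style.
(def ch (chan))

;; go blocks are lightweight processes that park (rather than
;; block a thread) on channel operations.
(go (>! ch (+ 1 2)))

;; A blocking take on the main thread.
(println (<!! ch)) ; prints 3
```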

Another talk that I really enjoyed was Martin Odersky’s keynote on “The Trouble with Types”. The beginning of the talk was about the usual comparison between static and dynamic typing, and the end of the talk was about his work on Dotty, which is his new work on types based on “projecting higher kinded functional types onto names in modules”. He hopes that Dotty will eventually become the foundation for a future version of Scala. The interesting part for me happened in the middle of the talk, because that was the part where he admitted to some of the problems with types, like needing a debugger for your type system because it has become so computationally powerful. The aha moment for me was around his discussion of how orthogonality in the type system design had contributed to the problems that he saw with Scala’s type system. It is a tenet among computer scientists that orthogonality is desirable. It was one of the foundational arguments in the CISC vs RISC computer architecture wars, and it is a badge of honor among language designers to have as much orthogonality as possible. However, orthogonality leads to a potentially large surface area of possibilities and interactions, which users need to master and be aware of, and which implementors need to implement completely and efficiently. On reflection, this should be obvious, but the lights went on for me during the talk.

I stopped in to see the latest installment of Friedman and Byrd’s journey with miniKanren. I was very interested to see their exploration of the Chomsky hierarchy (of computational complexity). As usual, this is related to my interest in Clojure’s core.logic. They “cheated” a little bit in what they showed, but it was still interesting.

Avi Bryant gave a great talk on the applicability of abstract algebra (well, mostly monoids) to the type of streaming calculations that are needed in many analytics systems. He showed how this provides a foundation for solutions like HyperLogLog, min-hash, and count-min sketch.
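
The core trick is that an associative operation with an identity lets you split a stream into chunks, reduce each chunk independently, and merge the partial results in any grouping or order. Here is a toy Clojure illustration using summation, the simplest monoid; HyperLogLog and the sketches are fancier monoids in the same spirit:

```clojure
;; Chunks of a stream, perhaps reduced on different machines.
(def chunks [[1 2 3] [4 5] [6]])

;; Reduce each chunk independently...
(def partials (map #(reduce + 0 %) chunks))

;; ...then merge the partials. Associativity guarantees the same
;; answer as reducing the whole stream in one pass.
(reduce + 0 partials) ; => 21
```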

Crista Lopes gave a talk called Exercises in Style which was about styles of programming. She observed that within teams, one can often figure out who wrote a piece of code just by looking at it. Another observation was that in art, students are taught what comprises the various styles and how to produce them. I thought that this was leading to a very promising discussion. She then presented the same program written in 9 different styles that she had observed. The first 3-4 styles (which have numbers and not names yet) were really what I was expecting to see. As we moved to more styles, they started to look like language paradigms, which I think is less interesting. Lopes is working on a book on this topic and has 33 styles that she’s identified so far. I’ll be keeping my eye out for that.

Another theme at Strange Loop this year was diversity. You saw it both in the speaker roster and in the session content. I didn’t get a chance to ask him, but I am pretty sure that Alex Miller made a very concerted effort to invite women speakers, because there was a much higher number of women speakers than last year, and also much higher than any other conference that I can remember. On the content side, there were several sessions. There was a good presentation on the history of women in computing. I definitely learned a bunch of things. The focus was on the early history of computing, which was great, but I was disappointed that several prominent women in recent history were omitted. That’s the problem with histories: invariably someone gets left out. Some history is better than no history, especially on this topic. Alex also invited Jen Myers to do one of the keynotes. I’m not sure how to summarize this presentation because there are too many different angles. There was the angle about diversity, there was the angle about boosting education, there was the angle of making something good because we are good as people. So rather than try, I’ll just insert a Ray Bradbury quote that Jen used in her talk. This version is longer than the version in the talk, but it speaks to me.

I think it’s part of the nature of man to start with romance and build to a reality. There’s hardly a scientist or an astronaut I’ve met who wasn’t beholden to some romantic before him who led him to doing something in life.

I think it’s so important to be excited about life. In order to get the facts we have to be excited to go out and get them, and there’s only one way to do that — through romance. We need this thing which makes us sit bolt upright when we are nine or ten and say, ‘I want to go out and devour the world, I want to do these things.’ The only way you start like that is with this kind of thing we are talking about today. We may reject it later, we may give it up, but we move on to other romances then. We find, we push the edge of science forward, and I think we romance on beyond that into the universe ever beyond. We’re talking not about Alpha Centauri. We’re talking of light-years. We have sitting here on the stage a person who has made the film* with the greatest metaphor for the coming billion years. That film is going to romance generations to come and will excite the people to do the work so that we can live forever. That’s what it’s all about. So we start with the small romances that turn out to be of no use. We put these tools aside to get another romantic tool. We want to love life, to be excited by the challenge, to life at the top of our enthusiasm. The process enables us to gather more information. Darwin was the kind of romantic who could stand in the middle of a meadow like a statue for eight hours on end and let the bees buzz in and out of his ear. A fantastic statue standing there in the middle of nature, and all the foxes wandering by and wondering what the hell he was doing there, and they sort of looked at each other and examined the wisdom in each other’s eyes. But this is a romantic man — when you think of any scientist in history, he was a romancer of reality.

Alex has historically done a great job of getting great speakers for Strange Loop, and not just recent stars, but pioneers and old timers. This year we had Chuck Moore, Dan Friedman, and Douglas Hofstadter. This year’s closing keynote was Hofstadter, whose book “I Am a Strange Loop” was the inspiration for the name of the conference. Hofstadter’s talk was an exploration of that topic, and was everything that you could have hoped for given Hofstadter’s amazing works of literature. What one could not have hoped for, however, was what followed. Alex commissioned David Stutz to do a multimedia performance based on Hofstadter’s work. “Thrown for a Loop: A Carnival of Consciousness” was a performance that involved theater, a brass quintet, Macintosh driven multimedia (including forays into Emacs and Clojure’s nREPL), and an aerialist. You will have to go and watch the video when it comes out, because I don’t have the words to describe it.

Miscellanea

Thursday night of Strange Loop we were treated to a conference party at the City Museum in St. Louis, which can only be described as part architectural museum and part playground, or as @petrellic put it, “A habit rail for adults”. This was one of the most amazing venues that I have ever been to for a conference party. The three hours that we were allotted vanished quickly as we all explored the mysteries of the museum and its paths, trails, tunnels, stairways, and slides.

I’ve always had a little trouble describing what Strange Loop is to my coworkers and friends. I found a tagline, courtesy of @samberan: “Strange Loop is TED for programmers”.

Conference organizers take note: Strange Loop has seriously raised the already high bar on you.

Strange Loop 2012

I think that the most ringing endorsement that I can give Strange Loop is that it has been a very long time since I experienced so much agony when trying to pick which talks to go to during any given block.

Emerging Languages Camp

This year Strange Loop hosted the Emerging Languages Camp (ELC), which previously had been hosted at OSCON. I liked the fact that it was its own event, not yet another track in the OSCON panoply. That, coupled with a very PLT oriented audience this year, made Strange Loop a much better match for ELC than OSCON.

I definitely went into ELC interested in a particular set of talks. There is a lot of buzz around big data, and some of the problems around big data and data management more generally. Also I did my graduate work around implementing “database programming languages”, so there was some academic interest to go along with the practical necessity. There were three talks that fell into that bucket: Bandicoot: code reuse for the relational model, The Reemergence of Datalog, and Julia: A Fast Dynamic Language for Technical Computing.

I found Bandicoot a little disappointing. I think that the mid-90s work of Buneman’s group at UPenn on Structural Recursion as a Query Language and Comprehension Syntax would be a better basis for a modular and reusable system for programming relations.

Logic programming may be making a resurgence via the work on core.logic in Clojure and the influence of Datalog on Cascalog, Datomic, and Bloom. The Reemergence of Datalog was a tutorial on Datalog for those who had never seen it before, as well as a survey of Datalog usage in those modern day systems.
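
For those who have never seen Datalog, here is a small query in Datomic’s dialect. The :person/* attributes and the db value are hypothetical, but the :find/:where shape is the real query API:

```clojure
(require '[datomic.api :as d])

;; Find the names of everyone who follows Alice.
;; Each :where clause is an [entity attribute value] pattern, and
;; shared logic variables (?p, ?q) join the clauses together.
(d/q '[:find ?name
       :where
       [?p :person/follows ?q]
       [?q :person/name "Alice"]
       [?p :person/name ?name]]
     db)
```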

Julia is a language that sits in the same conceptual space as R, SAS, SPSS, and so forth. The problem with most of those systems is that they were designed by statisticians and not programmers. So while they are great for statistical analysis, they are less good for statistical programming. Julia aims to improve on this, while adding support for distributed computation and a very high performance implementation. There’s no decisive winner in the technical computing space, and it seems like Julia might have a chance to shine.

There were, of course, some other interesting language talks at ELC.   

Dave Herman from Mozilla talked about Rust for the first time (at least to a large group). Rust is being developed as a systems programming language. There are some interesting ideas in it, particularly a very Erlang like concurrency model. At the same time, there were some scary things. Part of what Rust is trying to do is achieve performance, and part of how this happens is via explicit specification of memory/variable lifetimes. Syntactically this is accomplished via punctuation prefixes, and I was wondering if the code was going to look very Perl-ish. Servo is a browser engine being written in Rust, and looking at the source code of a real application will help me to see whether my Perlishness concern is valid.

Elixir: Modern Programming for the Erlang VM looks like a very nice way to program atop BEAM (the Erlang VM). Eliminating the prolog inspired syntax goes a long way, and it appears that Elixir also addresses some of the issues around using strings in Erlang. It wasn’t clear to me that all of the string issues have been addressed, but I was definitely impressed with what I saw.

Strange Loop Talks and Unsessions

I’m going to cover these by themes. I’m not sure these are the actual themes of the conference, but they are the themes that emerged from the talks that I went to.

First, and unsurprisingly, a data theme. The opening keynote, In Memory Databases: the Future is Now!, was by Mike Stonebraker. It’s been a long time since I saw Stonebraker speak – I think that the last time was when I was in graduate school. He was basically making the case that transaction processing (TP) is not going away, and that there might be applications for a new generation of TP systems in some of the places where the various NoSQL systems are now being used. Based on that hypothesis/assumption, he then went on to describe the trends in modern systems and how they would lead to different designs, much of which is embodied in VoltDB. This was a very controversial talk, at least for some people. I considered the trend/system analysis part to be reasonable in a TP setting. I’m not sure that I agree with his views on the applicability of TP, but I’m fairly sure that time will sort all of that out. I think that this is an important point for the NoSQL folks to keep in mind. When the original work on RDBMS was done, it was mocked, called impractical, not useful, and so forth. It took many years of research and technology development before relational systems became dominant. I think that we should expect to see something similar with NoSQL, although I have no idea how long that timeline will be.

Nathan Marz’s talk, Runaway Complexity in Big Data… and a plan to stop it, was basically making the case for, and explaining, the hybrid batch/realtime architecture that he pioneered at BackType, and which is now in production at Twitter. That same architecture led to Cascalog and Storm, which are pretty interesting systems. Marz is working on a book with Manning that will go into the details of his approach.
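
The essence of the hybrid architecture, as I understand it, is that a query merges a periodically recomputed batch view with a small realtime view covering whatever the batch layer hasn’t absorbed yet. A sketch of that merge in Clojure terms (the view maps here are hypothetical):

```clojure
;; batch-view: precomputed by the slow, high-latency batch layer.
;; realtime-view: incrementally maintained over recent data only.
(defn pageviews [batch-view realtime-view url]
  (+ (get batch-view url 0)
     (get realtime-view url 0)))

(pageviews {"/home" 10000} {"/home" 42} "/home") ; => 10042
```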

The other interesting data talks revolved around Datomic. Unfortunately, I was unable to attend Rich Hickey’s The Database as a Value, so I didn’t get to hear him speak directly about Datomic. There are several Datomic related videos floating around, so I’ll be catching up on those. I was able to attend the evening unsession Datomic Q&A / Hackfest. This session was at 9pm, and was standing room only. I didn’t have quite enough background on Datomic to follow all of what was said, but I was very interested by what I saw: the time model, the immutability of data which leads to interesting scalability, the use of Datalog. I’m definitely going to be looking into it some more. The one thing that troubles me is that it is not open source. I have no problem with a paid supported version, but it’s hard to make the argument for proprietary system or infrastructure software nowadays.

Another theme, which carried over from ELC, was logic programming. I had already heard Friedman and Byrd speak at last fall’s Clojure/conj, and I was curious to see where they have taken miniKanren since then. In their talk, Relational Programming in miniKanren, they demonstrated some of what they showed previously, and then they ran out of material. So on the fly, they decided to implement a type inferencer for simple lambda terms live on stage. Not only were they able to finish it, but since it was a logic program, they were also able to run it in reverse, which was pretty impressive. I was hoping that they might have some additional work on constraints to talk about, but other than disequality constraints, they didn’t discuss anything. Afterwards on Twitter, Alex Payne pointed out that there are some usability issues with miniKanren’s APIs. I think that this is true, but it’s also true that this is a research system. You might look at something like Clojure’s core.logic for a system that’s being implemented for practitioners.
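
Running a logic program “in reverse” is easier to show than to explain. Here is a small example of the same idea using appendo from Clojure’s core.logic, rather than their type inferencer:

```clojure
(require '[clojure.core.logic :refer [run* appendo]])

;; Forward: what is the result of appending [1 2] and [3]?
(run* [zs] (appendo [1 2] [3] zs))   ; => ((1 2 3))

;; In reverse: what list, appended to [3], yields [1 2 3]?
(run* [xs] (appendo xs [3] [1 2 3])) ; => ((1 2))
```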

David Nolen did an unsession Core Logic: A Tutorial Reconstruction where he walked the audience through the operation of core.logic, and by extension, miniKanren, since the two systems are closely related. He pointed out that he read parts of “The Reasoned Schemer” 8 times until he understood it enough to implement it, and then he found that he didn’t really understand it until after the implementation was done. There was also a large crowd in this session, and Christopher Petrelli made a video recording on his phone, since InfoQ wasn’t recording the unsessions.

The final talk in the logic programming theme was Oleg Kiselyov’s talk Guess lazily! Making a program guess and guess well. Kiselyov has been around for a long time and written or coauthored many important papers related to Scheme and continuations. I’ve been following (off and on) his work for a long time, but this is the first time I was at a conference where he was speaking. I was shocked to find that the room was packed. His talk was about how to defer making the inevitable choices required by non-determinism, especially in the context of logic type systems. His examples were in OCaml, which I had some trouble following, but after Friedman and Byrd the day before, he apparently felt compelled to write a type inferencer that could be run backwards as well. His code was a bit longer than the miniKanren version.

The next theme is what I’d call effective use of functional programming. The first talk was Stuart Sierra’s Functional Design Patterns. This was a very worthwhile talk, which I won’t attempt to summarize since the slides are available. Needless to say, he found a number of examples that could be called design patterns. This was one of the talks where I need to sit down and look at the patterns and think on them for a while. That’s hard to do during the talk (and the conference, really). Some things require pondering, and this is one of them.

The other talk in this category was Graph: composable production systems in Clojure, which described the Prismatic team’s approach to composing systems in Clojure. What they have is an abstraction that allows them to declaratively specify how the parts of the system are connected. For a while it just looked to me like a way to encode a data flow graph in a Clojure abstraction. The aha moment was when he showed how they use Clojure metadata to annotate the arguments, or pipe connectors if you will. The graphs can be compiled in a variety of ways, including Clojure lazy maps, which present some interesting possibilities. Unfortunately, I had to leave half way through the talk, so I missed the examples of how they apply this abstraction in their system.
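
Graph was later open-sourced as part of Prismatic’s plumbing library. Roughly, a graph is a map from output names to keyword functions whose argument names declare their dependencies; here is a sketch along the lines of the library’s own example:

```clojure
(require '[plumbing.core :refer [fnk]]
         '[plumbing.graph :as graph])

;; Each fnk names the inputs it needs; the graph compiler works
;; out the wiring and a valid execution order.
(def stats
  {:n  (fnk [xs]   (count xs))
   :m  (fnk [xs n] (/ (reduce + xs) n))
   :m2 (fnk [xs n] (/ (reduce + (map #(* % %) xs)) n))
   :v  (fnk [m m2] (- m2 (* m m)))})

(def stats-eager (graph/compile stats))
(into {} (stats-eager {:xs [1 2 3 6]}))
;; => {:n 4, :m 3, :m2 25/2, :v 7/2}
```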

Theme number four was programming environments. I hesitate to use the term IDE, because it connotes a class of tools that is loved by some, reviled by others, and when you throw that term around, it seems to limit people’s imagination. I contributed to the Kickstarter for Light Table, so I definitely wanted to attend Chris Granger’s talk Behind the Mirror: The birth of Light Table. Chris gave a philosophical preamble before showing off the current version of Light Table. He demonstrated adding support for Git in a short amount of code, and went on to demonstrate a mode for developing games. He said that they are planning to release version 1 sometime in May, and that Light Table will be open source. I also learned that Kickstarter money is counted as revenue, so they have lost a significant amount of the donations to taxes, which is part of the reason that Kodowa participated in Y Combinator, and is trying to raise some money to get a bigger team.

Not long after the Light Table kickstarter, this video by Bret Victor made the rounds. It went really well with all the buzz about Light Table, and Alex Miller, the organizer of Strange Loop, went out and persuaded Bret to come and talk. Bret’s title was Taking off the Blindfold, and I found this to be a very well motivated talk. In it, Bret talked about the kinds of properties that our programming tools should have. The talk was very philosophical despite the appearance of a number of toy demos of environment features.

During both of these talks there was a lot of chatter. Some was harking back to the Smalltalk (but sadly, not the Lisp Machine) environments, while some questioned the value of a more visual style of tools (those emacs and vi graybeards). When I first got into computers I read a book called “Interactive Programming Environments”, and ever since I’ve wished for better tools. I am glad to see some experimentation come back into this space.

Some old friends are busy making hay in the Node.js and Javascript communities, and it probably horrifies them that I have ClojureScript as a theme, but so be it. I went to two ClojureScript talks. One was David Nolen’s ClojureScript: Better Semantics at Low Prices!, which was really a state of the union of ClojureScript. The second was Kevin Lynagh’s Building visual data driven UI’s with ClojureScript. Visualization is becoming more and more important, and ClojureScript’s C2 library looks really appealing.

It’s fitting that the last theme should be Javascript. Well, maybe. I went to two Javascript talks, and both of them were keynotes, so I didn’t actually choose them. But Javascript is so important these days that it really is a theme. In fact, it’s so much of a theme that I’ve been going to Javascript conferences for the last 2 years. It’s been several years since I saw Lars Bak speak. His talk on Pushing the Limits of Web Browsers was in two parts. Or so I think. I arrived just as he was finishing the first part, which seemed like an account of the major things that the V8 team has learned during their amazing journey of speeding up Javascript. The second part of his talk was about Dart. I didn’t know that Bak was the lead of the Dart project, but that doesn’t change how I feel about Dart. I see the language, I understand the rationale, and I just can’t get excited about it.

I’ve been to enough of those Javascript only talks to hear Brendan Eich talk about The State of Javascript. Brendan opened by giving a brief history of how Javascript got to be the way it is, and then launched into a list of the improvements coming in ECMAScript 6 (ES6). That was all well and good, and towards the end, after the ES6 stuff, he threw in some items that were new, like the sweet.js hygienic macro project and the LLJS typed JavaScript project. It seemed like this was a good update for this audience, who seemed unaware of all the goings on over in JavaScript land. From a PLT point of view, I guess that’s understandable, but at the same time, JavaScript is too important to ignore.

Final Thoughts

Strange Loop has grown to over 1000 people, much larger than when I attended in 2010 (I had to miss 2011). I think that Alex Miller is doing a great job of running the conference, and of finding interesting and timely speakers. This was definitely the best conference that I attended this year, and probably the last 2-3 years as well.

If you’re looking for more information on what happened at Strange Loop 2012:

Slides: https://github.com/strangeloop/strangeloop2012/tree/master/slides

Other Strange Loop Coverage: https://github.com/strangeloop/strangeloop2012/wiki/Coverage

JSConf 2012

This year JSConf was in Scottsdale Arizona, which provided some welcome relief from the cold, wet, Seattle winter/spring.

News

One of the biggest pieces of news was that Mozilla gave all attendees a Nexus S smartphone running a developer version of the Boot to Gecko (B2G) phone operating system. When I say developer, I mean, camera support was broken, things were crashing, that sort of thing. These phones were a big hit among the attendees. They contributed to knocking the conference wifi out temporarily, and I saw several groups of people who were working on projects for the phone. My experience at Google I/O had soured me on the idea of giving away free devices. In the case of Google I/O, device giveaways have become an expectation, and there is some proportion of people who sign up for that conference based on the hope of getting a free device. Still, Mozilla is going to need all the help that they can get, and people seemed to take the challenge to heart. I did find it interesting that the Mozilla folks were speaking of B2G as a great feature phone software stack. This is a realistic way of climbing up the stairs in the mobile phone market. It’s hard to imagine a real competitor to iOS and Android, but I’m glad to see an effort in this direction. There’s WebOS, Windows Phone 7, and B2G all using some variant of the open web stack. It seems like there ought to be some collaboration between B2G and WebOS’s Enyo framework.

Talks

There were a bunch of talks on the internals of Javascript Virtual Machines. From a computer science point of view, these talks are interesting. I heard a lot of these kinds of talks at PyCon and during my days at Sun. It seemed that most of the audience appreciated this material, so the selections were good. The part of this that I found disturbing is wrapped up in one of the questions, which was basically: how can we write our code to take advantage of how the VM works? Given the number of VMs that Javascript code might execute on, this seems like a path fraught with peril.

Also on the language front, there was more representation from functional programming. There was a talk on Roy, and David Nolen gave a talk that was billed as being about Clojurescript, but was really more about having a sense of play regarding all this technical work. Closely related to the functional programming was GPU programming. Jarred Nichols talked about implementing a Javascript interpreter in OpenCL. Stephan Herhut from Intel talked about the RiverTrail parallel extensions to Javascript which do data parallel computing using operations taken from functional programming. The extensions compile to OpenCL, which I found interesting. I wonder how many more languages we’ll see compiling to OpenCL or partially compiling to OpenCL.

Paul Irish did a nice presentation on tools which gave a great overview of the state of the practice in the various areas related to web application development. There were several tools that I didn’t know about. The presentation is all HTML5 but has some very nice visuals and animation. I’d love to know the name of the package that he used.

Ever since Node.js came out, I’ve been enamored of the idea that you could share/move some amount of code back and forth between the client and the server, much as code used to move back in the days of NeWS. Yahoo’s Mojito is an investigation in this space. It relies heavily on YUI, which I haven’t used. I’m looking forward to looking into the code and seeing how it all fits together.

The team at Bitovi made a special lunchtime presentation about CanJS, which is another MVC framework for Javascript. CanJS is in the same space as backbone, knockout, and so forth. Its claims to fame are reduction of certain kinds of memory leaks, size, and speed. From the benchmark slides it seems worth a look.

Keynotes

Dan Ingalls delivered the closing keynote on the first day. I met Dan briefly when I worked at Sun, and I was familiar with his work on the Lively Kernel. The Lively Kernel is the answer to the question “what if we tried to build Squeak Smalltalk in Javascript”. It is much more than a language, it is an environment for building programs and simulations. I’m of two minds about this work. On the one hand, it’s depressing that we still haven’t managed to catch up to the work that Ingalls and his contemporaries pioneered 30 years ago, and that today’s practitioners are completely oblivious to this work (a comment on Twitter confused Lively with an advanced version of the NeXT Interface Builder — the causality is reversed). On the other hand, although the Lively Kernel is written in Javascript and runs in a browser, it’s not really connected to today’s world, and so its applicability to solving web problems is limited. Nonetheless, Ingalls received a well deserved standing ovation. He is among the pioneers of our field, and as his generation is starting to pass on, it feels good to be able to personally honor them for their contributions to the field.

I have no idea how Chris Williams convinced Rick Falkvinge, the founder of the first (Swedish) Pirate Party to come and speak at JSConf. The topic of his talk was the politics of the net generation. Falkvinge told the story of how he came to found the Pirate party in Sweden, and described the success that the party is having in Europe. He claimed that about every 40 years, we have a new key idea. Apparently the theme for the period which is now ending was sustainability, and the claim is that the theme for the next 40 years will be free speech and openness. He credits this theme with the rise of the various Pirate parties in Europe, pointing to the European protests around ACTA and the US protest around SOPA as additional corroborating evidence. Falkvinge claims that the Pirate party has widened the scope of politics and given young people a way to vote for the issues that they care about. I wish that something similar was happening in American politics.

Hallway

As always, JSConf had a rich hallway/party track. I had lots of great conversations with people on topics including the Javascript APIs for Windows 8, the mismatch between many concurrency models and real shared memory hardware, and proper use and optimization of CSS. I think that facilitating the hallway track is one of the areas where JSConf excels. The venues are always nice, and this year there were hallway conversations in pools and around campfires, as well as in the usual hotel lobbies and restaurants/bars/lounges. I was also happy to be able to introduce Matthew Podwysocki, who has been doing excellent work on RxJS, and David Nolen, who has been working on ClojureScript. I think that there can be some nice synergy between these two projects, and I’m eager to see if they agree.

The best roundup of JSConf coverage appears to be on Lanyrd.

Strata 2012

Here’s a roundup of last week’s Strata conference.

Jumpstart

This year, the O’Reilly team introduced a new tutorial day track called “Jumpstart”. This track was more oriented towards the business side of big data, and I think that the word MBA actually appeared in the marketing. I think that the track was a success, and was very appropriate. The effect of the next generation of data oriented technologies and applications is going to be very significant, and will have a big impact on the way that businesses operate. It’s very important that technologists and business people work closely in order to produce the best results.

There were two talks that stood out for me. The first was Avinash Kaushik’s What Marketers can learn from Analysis. Kaushik is a very entertaining and dynamic speaker, and he has had a lot of experience working to help companies use analytics effectively. In his world, processing and storage are 10% of what you need, and analysts, the humans, are the other 90%. In other words, technology is not nearly as important as having people who can ask the right questions and verify hypotheses experimentally. And even good analysis is not enough. Organizations must be able to act on the results of analysis. I have been (and will continue to be) interested in the ability to use data as quickly as it is collected. Some people call this a “real-time” data capability, although in computer science terms, this is a misnomer. One of the best quotes from Kaushik’s talk was “If you do not have the capacity to take real time action, why do we need real time data?”. Without the ability to act, all the data collection and analysis in the world is fruitless. Kaushik’s claim was that we must remove all humans from the process in order to achieve this. Back to analysis: Kaushik feels that the three key skills of data analysis are the scientific method, design of experiments, and statistical analysis.

The second talk was 3 Skills of a Data Driven CEO by Diego Saenz. I liked his notion that a company’s data is a raw material, just like any other raw material that might be used by a company. Raw materials must be collected, mined, purified, and transformed before they can turn into a product, and so it is with a company’s data. The most important information that I got out of this talk was the case study that he presented on Bob McDonald, the CEO of Procter & Gamble. P&G has built a business wide real time information system called Business Sphere. One manifestation of Business Sphere is a pair of 8 foot high video screens that sit in the conference room used by the CEO for his regular staff meeting. Real time data on any aspect of the company’s operations can be displayed on these screens, discussed, and acted upon at the CEO staff level. Also of note is that a data analyst attends the CEO staff meeting in order to facilitate discussion and questions about the data. I remember back in the 2000s when Cisco talked about how they could close their books in a day. Now we have the world’s largest consumer products company with a real time data dashboard in the CEO’s conference room. The bar is being raised on all companies in all industries.

Talks

I felt that the talks in the regular conference were weaker than last year. Part of that may be due to my talk selection – there were lots of tracks, and in some cases it was hard to figure out which talks to pick. I tend to seek out unusual content, which means more risk in terms of a “quality” talk. The advent of the O’Reilly all-access pass has taken some of the risk out, since that pass gives you access to the full video archive of the entire conference. The topic of video archives is probably content for another blog post. I know that there are some talks that I missed that I want to watch the videos for, but apparently, I’ll need to wait several weeks. It will be interesting to contrast that with this week’s mostly volunteer run PyCon, which has a great track record of getting all their videos up on the web during the conference, for no fee.

Talks which were easy to remember included Sam Shah’s Collaborative Filtering with MapReduce, which included a description of how to implement collaborative filtering on Hadoop, but more importantly discussed many of the issues around building a production worthy version of such a system. It’s one thing to implement a core algorithm. It’s another to have all the rest of the infrastructure so that the algorithm can be used for production tasks.
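
As a reminder of what the core algorithm involves, here is a toy item co-occurrence count in Clojure, the heart of item-based collaborative filtering; Shah’s point was that the production plumbing around a step like this is the hard part:

```clojure
;; Each basket holds the items one user interacted with.
(def baskets [[:a :b :c] [:a :b] [:b :c]])

(defn pairs [items]
  (for [x items, y items :when (not= x y)] [x y]))

;; Count how often each pair of items co-occurs; high counts feed
;; the "people who liked X also liked Y" recommendations.
(frequencies (mapcat pairs baskets))
;; => {[:a :b] 2, [:a :c] 1, [:b :a] 2,
;;     [:b :c] 2, [:c :a] 1, [:c :b] 2}
```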

A large portion of the data that people are interested in analyzing is coming from social networks. I attended Marcel Salathé’s Understanding Social Contagion in the hopes of gaining some greater insight into virality. Salathé works at an infectious disease center, and he spent a long time comparing biological contagion with internet virality. I didn’t find this to be particularly enlightening. However, in the last third of the talk, he started talking about some of the experimental work that his group had done, which was a little more interesting. The code for his system is available on github.

I really enjoyed DJ Patil’s talk Data Jujitsu: The Art of Turning Data into Product. According to Patil, data jujitsu is using data elements in an iterative way to solve otherwise impossible data problems. A lot of his advice had to do with starting small and simple, and moving problems to where they were easiest to solve, particularly in conjunction with human input. As an example, he discussed the problem of entity resolution in one of the LinkedIn products, and described how they moved the problem from the server side, where it was hard, to the client side, where it was easy if you asked the user a simple question. The style he discussed was iterative, opportunistic, and “lazy”.

Jeremy Howard from Kaggle talked about From Predictive Modelling to Optimization: The Next Frontier. Many companies are now building a lifetime value model of a customer, and some companies are even starting to build predictive models. Howard’s claim was that the next steps in the progression are to take these models and use them to build simulations. Once we have simulations, we can then use optimization algorithms on the inputs to the simulation, and optimize the results in the direction that we want.

Keynotes

Last year, I was pretty unhappy with a number of the keynotes, which were basically vendor pitches. This year things were much better, although there were one or two offenders. Microsoft was NOT one of the offenders. Dave Campbell’s Do We Have The Tools We Need To Navigate The New World Of Data? was one of the better Microsoft keynotes that I’ve seen at an O’Reilly event (i.e. out of the Microsoft ecosystem). The talk included good non-Microsoft specific discussion of the problems, references to academic papers (each with at least one Microsoft author), and a friendly, collegial, non-patronizing tone. I hope that we’ll see more of this from Redmond.

Avinash Kaushik had a keynote spot, and one of the most entertaining, yet insightful, slides was an infamous quote from Donald Rumsfeld:

[T]here are known knowns; there are things we know we know.

We also know there are known unknowns; that is to say we know there are some things we do not know.

But there are also unknown unknowns – there are things we do not know we don’t know.

Kaushik was very keen on “unknown unknowns”. These are the kind of things that we are looking to find, and which analytics and big data techniques might actually help discover. He demonstrated a way of sorting data which leaves out the extremes and keeps the rest of the data, which is likely where the unknown unknowns are hiding.

I’ve been a fan of Hal Varian ever since I read his book “Information Rules: A Strategic Guide to the Network Economy” back during the dot-com boom. On the one hand, his talk Using Google Data for Short-term Economic Forecasting was basically a commercial for Google Insights for Search. On the other hand, the way that he used it and showed how it was pretty decent for economic data was interesting. There were several talks that included the use of Google Insights for Search. It’s a tool that I’ve never paid much attention to, but I think that I’m going to rectify that.

The App

This is the first O’Reilly conference I’ve attended where they had a mobile app. There were iPad, iPhone, and Android versions. I only installed the iPad version, and I really liked it. I used it a lot when I was sitting in sessions to retrieve information about speakers, leave ratings and so forth. I’d love to see links to supplemental materials appear there. I also liked the fact that the app synced to the O’Reilly site, so that my personal schedule was reflected there. I didn’t like the fact that the app synced to the O’Reilly website because the WiFi at the conference was slow, and I often found myself waiting for those updates to finish before I could use the app. The other interesting thing was that I preferred the daily paper schedule when I was walking the hall between sessions. Part of this was due to having to wait for those updates, but part of it was that there was no view in the app that corresponded to the grid/track view of the paper schedule. More work to do here, but a great start.

Final thoughts

This year’s attendance was over 2300, up from 1400 last year, and I saw badges from all sorts of companies. It is apparent to me that the use of data and analytics being discussed at Strata is going to be the new normal for business.

Web 2.0 Summit

Last week I attended the Web 2.0 Summit in San Francisco. The theme this year was “The Data Frame”, an attempt to look at the “Points of Control” theme from last year through the lens of data.

Data Frame talks

Most of the good data frame stuff was in the short “High Order Bit” and “Pivot” talks. The interviews with big company CEOs are generally of little value, because CEOs at large companies have been heavily media trained, and it is rare to get them to say anything really interesting.

Genevieve Bell from Intel posed the question “Who is data and if it were a person what would it be like?” Her answers included:

  • Data keeps it real – it will resist being digitized
  • Data loves a good relationship – what happens when data is intermediated?
  • Data has a country (context is important)
  • Data is feral (privacy, security, etc.)
  • Data has responsibilities
  • Data wants to look good
  • Data doesn’t last forever (and shouldn’t in some cases)

One Kings Lane was one of the startups described by Kleiner Perkins’ Aileen Lee. The interesting thing about their presentation was their realtime dashboard of purchasing activity during one of their flash sales events. You can see the demo at 6:03 in the video from the session.

Mary Meeker has moved from Morgan Stanley to Kleiner Perkins, but her Internet Trends presentation is still a tour de force of statistics and trends. It’s interesting to watch how her list of trends is changing over time.

Alyssa Henry from Amazon talked about AWS from the perspective of S3, and her talk was mostly statistics and customer experiences. One of her closing sentences stuck in my mind: “What would you do if every developer in your organization had access to a supercomputer?” Hilary Mason has talked about how people sitting at home in their pajamas now have access to big data crunching capability. Alyssa’s remark pushes that idea further: access to supercomputing resources is now at the same level as access to a personal computer.

TrialPay is a startup in the online payment space. Their interesting twist is that they will provide payment services free of charge, without a transaction fee. They are willing to do this because they collect the data about the payment, and can then use / sell information about payment behaviors and so on (apparently Visa and Mastercard plan to do something similar).

I am not a fan of talks that are product launches or feature launches on existing products, so I was all set to ignore Susan Wojcicki’s talk on Google Analytics. But then I saw this picture in her slides:

Edward Tufte has made this diagram famous, calling it “probably the best statistical graphic ever drawn”. I remember seeing this graphic in one of his seminars and wondering how to bring this type of visualization to a computer. I appreciated the graphic, but I wasn’t sure how many times one would need to graph death marches. The Google Analytics team found a way to apply this visualization to conversion and visitor falloffs. Sure enough, those visualizations are now in my Google Analytics account. Wojcicki also demonstrated that analytics are now being updated in “real time”. Clearly, there’s no need to view instant feedback from analytics as a future item.

Last year there was a panel on education reform. This year, Salman Khan, the creator of the Khan academy spoke. Philosophically I’m in agreement with what Khan is trying to do – provide a way for every student to attain mastery of a topic before moving on. What was more interesting was that he came with some actual data from a whole school pilot of Khan Academy materials. Their data shows that it is possible for children assigned to a remedial math class to jump to the same level as students in an advanced math class. They have a very nice set of analytic tools that work with their videos, which should lead to a more data based discussion of how to help more kids succeed in learning what they need to learn to be successful in life.

Anne Wojcicki (yes, she and Susan are sisters) talked about the work they are doing at 23andMe. She gave an example of a rare form of Parkinson’s disease, where they were able to assemble a sizable number of people with the genetic predisposition, and present that group to medical researchers who are working on treatments for Parkinson’s. It was an interesting story of online support groups, gene sequencing, and preventative medicine.

It seems worth pointing out that almost all the talks that I listed in this section were by women.

Inspirational Talks

There were some talks which didn’t fit the data frame theme that well, but I found them interesting or inspirational anyway.

Flipboard CEO Mike McCue made an impassioned plea that we learn when to ignore the data, and build products that have emotion in them. He contrasted the Jaguar XJSS and the Honda Insight as products built with emotion and built on data, respectively. He went on to say that tablets are important because the content becomes the interface. He believes that the future of the web is to be more like print, putting content first, because the content has a soul. Great content is about art, art creates emotion, and emotion defies the data. It was a great, thoughtful talk.

Alison Lewis from Coca Cola talked about their new, high tech, internet connected Freestyle soda machine. A number of futuristic internet scenarios seem to involve soda machines, so it was interesting to hear what actual soda companies are doing in this space. The geek in me thinks that the machine is cool, although I rarely drink soft drinks. I went to the Facebook page for the machine to see what was up, and discovered that the only places in Seattle that had them were places where I would never go to eat.

IBM’s David Barnes talked about IBM’s smart cities initiative, which involves instrumenting the living daylights out of a city. Power, water, transportation grid, everything. His main points were:

  1. Cities will have healthier immune systems (the health web)
  2. City buildings will sense and respond like living organisms – water, power, and other systems
  3. Cars and city buses will run on empty
  4. Smarter systems will quench cities’ thirst and save energy
  5. Cities will respond to a crisis – even before receiving an emergency call

He left us with a challenge to “Look at the organism that is the city. What can we do to improve and create a smarter city?”. I have questions about how long it would take to actually build a smart city or, worse, retrofit an existing city, but this is a grand challenge type of long term project. I’m glad to see that there are companies out there that are still willing to take that big long view.

Final Thoughts

I really liked the short talk formats that were used this year. It forced many of the speakers to really be crisp and interesting, or at least crisp, and I really liked the volume of what got presented. One thing seems true, that from the engineering audience of Strata to the executive audience at Web 2.0, data and data related topics are at the top of everyone’s mind.

And there in addition to ponies and unicorns, be dragons.

Surge 2011

Last week I was in Baltimore attending OmniTI’s Surge Conference. I can’t remember exactly when I first met OmniTI CEO Theo Schlossnagle, but it was at an ApacheCon after he had delivered one of his 3 hour tutorials on Scalable Internet Architectures, back in the early 2000s. Theo’s been at this scalability business for a long time, and I was sad to have missed the first Surge, which was held last year.

Talks

Ben Fried, Google’s CIO, started the conference (and one of the major themes) with a “disaster porn” talk. He described a system that he built in a previous life, for a major Wall Street company. The system had to be very scalable to accommodate the needs of traders. One day, the system started failing, and ended up costing his employer a significant amount of money. In the ensuing effort to get the system working again, he ended up with all the people from the various specializations (development, operations, networking, etc.) stuck in a very large room with a lot of whiteboards. It turned out that no one really understood how the entire system worked, and that issues at the boundaries of the specialties were causing many of the problems. The way that they had scaled up their organization was to specialize, but that specialization caused them to lose an end to end view of the system. Their organization of their people had led to some of the problems they were experiencing, and was impeding their ability to solve the problems. The quote that I most remember was “specialization is an industrial age notion and needs to be discounted in spaces where we operate at the boundary of the known versus unknown”. The lessons that Fried learned on that project have influenced the way that Google works (Site Reliability Engineers as an example), and are similar to the ideas being espoused by the “DevOps” movement. His description of the solution was to “reward and recognize generalist skill and end to end knowledge”. There was a pretty lively Q&A around this notion of generalists.

Mark Imbriaco’s talk was titled “Anatomy of a Failure” in the program, but he actually presented a very detailed account of how Heroku responds to incidents. My background isn’t in operations, so I found this to be pretty interesting and useful. I particularly liked the idea of playbooks to be followed when incidents occur, and that alert messages actually contain links to the necessary playbooks. The best quote from Mark’s talk was probably “Automation is also a great way to distribute failure across an entire system”.

Raymond Blum presented the third of three Google talks that were shoe horned into a single session. He described the kind of problems involved in doing backups at Google scale. Backup is one of those problems that needs to be solved, but is mostly unglamorous. Unless you are Google, that is. Blum talked about how they actually read their backup tapes to be sure that they work, their strategy of backing up to data centers in different geographies, and clever usage of map reduce to parallelize the backup and restore process. He cited the Gmail outage earlier this year as a way of grasping the scale of the problem of backing up a service like Gmail, much less all of Google. One way to know if a talk succeeds is if it provokes thoughts. Based on my conversations with other attendees, this one succeeded.

David Pacheco and Bryan Cantrill talked about “Realtime Cloud Analytics with Node.js”. This work is an analog of the work that they did on the analytics for the “Fishworks”/Sun Storage 7000 products, except instead of measuring a storage appliance, they are doing analytics for Joyent’s cloud offering. This is basically a system which talks to DTrace on every machine, and then reports the requested metrics to an analytics service once a second. The most interesting part of the talk was listening to two guys who are hard core C programmers / kernel developers walk us through their decision to write the system in Javascript on Node.js instead of using C. They also discussed the areas where they expected there to be performance problems, and were surprised when those problems never appeared. When it came time for the demo, it was quite funny to see one of the inventors of DTrace being publicly nervous about running DTrace on every machine in the Joyent public cloud. But everything was fine, and people were impressed with the analytics.

Fellow ASF member Geir Magnusson’s talk was named “When Business Models Attack”. The title alludes to the two systems that Geir described, both of which are designed specifically to handle extreme numbers of users. Geir was the VP of Platform and Architecture at Gilt Groupe, and one description of their model is that every day at noon is Black Friday. So the Gilt system has to count on handling peak numbers of users every day at a particular time. Geir’s new employer, Function(x), also has a business model that depends on large numbers of users. The challenge is to design systems that will handle big usage spikes as a matter of course, not as a rarity. One of the architectures that Geir described involved writing data into a Riak cluster in order to absorb the write traffic, and then using a Node.js based process to do a “write-behind” of that data into a relational database.
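
The write-behind pattern itself is easy to sketch. Here is its shape in Clojure with core.async standing in for the Riak-plus-Node.js pieces Geir described; persist! is a hypothetical stand-in, and their actual stack was different:

```clojure
(require '[clojure.core.async :refer [chan go-loop <! >!!]])

(defn persist! [row]              ; stand-in for a real RDBMS write
  (println "writing" row))

;; A large buffer absorbs the noon spike of writes...
(def writes (chan 100000))

;; ...while a consumer drains it into the database at whatever
;; rate the database can sustain.
(go-loop []
  (when-some [row (<! writes)]
    (persist! row)
    (recur)))

(>!! writes {:user 42, :item "sku-1"})
```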

Takeaways

There were several technology themes that I encountered during the course of the 2 days:

  • Many of the talks that I attended involved the use of some kind of messaging system (most frequently RabbitMQ). Messaging is an important component in connecting systems that are operating at different rates, which is frequently the case in systems operating at high scale.
  • Many people are using Amazon EC2, and liking it, but there were a lot of jokes about the reliability of EC2.
  • I was surprised by how many people appear to be using Node.js. This is not a Javascript or dynamic language oriented community. There’s an inclination towards C, systems programming, and systems administration. Hardly an audience where you’d expect to see lots of Node usage, but I think that it’s notable that Node is finding some uptake.

One thing that I especially liked about Surge was the focus on learning from failure, otherwise known as a “fascination with disaster porn”. Most of the time you only hear about things that worked, but hearing about what didn’t work is at least as instructive, and in some cases more instructive. This is something that (thus far) is unique to Surge.

W3C Web and TV Workshop

Last week I attended the Third W3C Web and TV Workshop (disclosure: I was a member of the program committee). This was the third in a series of three workshops that the W3C has organized around the intersection of web technologies and television. The purpose of the workshops is to bring these two communities together and help them understand and work with each other. The W3C has formed an interest group for member companies who are interested in working on issues related to the web and television.

Some of the topics discussed at the workshop included multi-screen experiences (there were 2.5 sessions on this topic, including some demonstrations), synchronized metadata, codecs (particularly around adaptive bit rate streaming over HTTP), and (inevitably) content protection/DRM.   

Given the advent of the iPad and other tablets, it should be no surprise that multi-screen experiences were a big topic. Apple has done some interesting work with AirPlay, but the general technology infrastructure for enabling multi-screen experiences is a mess. There are issues ranging from the “bottom”, related to the discovery of the various devices, through the negotiation of which devices have which roles, up to the mechanism for synchronizing content and metadata amongst these devices. There’s a lot of work to be done here, and some of it will be done in conjunction with other industry groups like DLNA. I’m most interested in the upper levels, which should help with synchronizing the experience and facilitating inter-device/application communication.

There was also significant discussion around synchronized metadata, which is highly relevant to multi-screen experiences, although there was more discussion/demonstration of end experiences as opposed to technologies that could be standardized to facilitate those experiences. Sylvia Pfeiffer gave an interesting demo of WebVTT using the Captionator polyfill. One of the best things about this discussion was that one of my colleagues from ESPN later explained to me the details of how captioning is done in their broadcast and internet workflows.
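For those who haven’t seen WebVTT, the basic mechanics are easy to show. A sketch (the file name is made up); in browsers without native support, a polyfill like Captionator parses the .vtt file and renders the cues itself:

```javascript
// Attach a WebVTT caption track to an HTML5 video element.
// 'captions.vtt' is a made-up file name for illustration.
const video = document.querySelector('video');
const track = document.createElement('track');
track.kind = 'captions';
track.label = 'English';
track.srclang = 'en';
track.src = 'captions.vtt';
track.default = true;
video.appendChild(track);
```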

It’s impossible to talk about television without talking about video, and the two largest topics around video and the web are codecs and content protection. Most of the discussion around codecs revolved around the work at MPEG on Dynamic Adaptive Streaming over HTTP (DASH). There are at least three solutions in the market for streaming video via HTTP (Apple’s HLS, Microsoft’s Smooth Streaming, and Adobe’s HDS), all mutually incompatible for dumb reasons. DASH is an attempt to standardize that mechanism, while remaining silent on the question of which codec is used to produce the video file being streamed.

On the content protection front, there was the usual disconnect between the web world and the TV world. For me, the discussion here really centers on the ability to use the HTML5 video tag to deliver “premium” content. Today that content is delivered via the object tag and associated browser plugins. The problem is that each plugin works differently, so your web application code has to deal with all the possibilities that it might encounter. There appears to be some interest in standardizing a small, narrow set of APIs that web applications could use to interact with a content protection mechanism. Unsurprisingly, there was very little interest in standardizing a content protection mechanism for HTML5, especially since there isn’t agreement on a standard video codec.

Recently the W3C has been working very hard at getting consumer/content side companies to participate in its activities. Because the workshop was open to anyone, not just W3C member companies, there were a lot of attendees who were not from the traditional W3C constituencies. Personally, I think that this is a good thing, and not just in the Web and TV space. It will be interesting to see how much progress can be made – the Apple and Google native application models are this generation’s Flash and Silverlight. I hope that we can find a way to build the next generation of television experiences atop the Open Web technology stack.

Google I/O 2011

Google I/O has a different feel than many of the conferences that I attend. Like Apple’s WWDC, there is a distinctly vendor-partisan tone to the entire show — having the show in the same location as WWDC probably reinforces that. Unlike WWDC, the web-focused portion of Google I/O helps to blunt that tone, as does the fact that lots of things are open or being open sourced.

I’m going to split this writeup into two parts: the two keynotes, and then the rest of the talks.

Android Keynote

The first keynote was the Android keynote and opened with a recap of Android’s marketplace accomplishments over the last year. The tone was decidedly less combative towards Apple than last year. There weren’t many platform technology announcements. There was the expected discussion of features for the next version of Android, but I didn’t really see much that was new. There was a very nice head tracking demo that involved front facing cameras and OpenGL – I believe this will be a platform feature, which is cool. Much was made of Music and Movies, but this is mostly an end user and business development story. The ability to buy/stream without a cable is nice, but as long as devices need to be plugged in to recharge (which in my case is every day), I don’t find this to be as compelling as those who were clapping loudly. What I did find interesting was the creation of a Council that will specify how quickly devices will be updated to a particular release of Android, and how long a device will be supported. This is pretty much an admission that fragmentation is real and a problem that needs addressing. I hope that it works.

The most interesting announcement during the Android keynote was the open accessories initiative. This is in direct contrast to Apple’s tight control over the iOS device connector. Google’s initiative is based on the open source Arduino hardware platform, and they showed some cool integration with an exercise bike, control over a homemade labyrinth board, and some very interesting home automation work. As part of the home automation stuff, they showed an NFC enabled CD package being swiped against a home audio device, which then caused the CD to be loaded into the Google music service. This is cool, but I don’t know if CDs will be around long enough for NFC enabled packaging to become pervasive. I’m very curious to see how the accessories initiative will play out, especially versus the iOS device connector. If this were to take off, Apple could build support for the specs into future iOS devices, although they would have to swallow their pride first. This will be very interesting to watch.

Chrome Keynote

Day two’s keynote was about Chrome and the open web, although the focus was on Google’s contributions. Adoption of Chrome is going really nicely – 160M users. There was a demonstration of adding speech input by adding a single attribute to an element (done via the Chrome Developer Tools). Performance got several segments. The obligatory Javascript performance slide went up, showing a 20x improvement since 2008, and the speaker said he hoped to stop showing such slides, claiming that the bottlenecks in the system were now in other parts of the browser. This was a perfect segue to show hardware accelerated CSS transforms as well as hardware accelerated Canvas and WebGL.

I’ve been curious whether the Chrome web store is really a good idea or not, and we got some statistics to ponder. Apparently people spend twice as much time in applications when they are obtained via the web store, and people perform 2.5x the number of transactions. I wish there were some more information on these stats. Of course this is all before in-app purchasing, which was announced, along with a very small 5% cut for Google.   

Of course, no discussion of an app store would be complete without a killer app, so Google brought Rovio onto the stage to announce that Angry Birds is now available for the web, although it’s called Angry Birds for Chrome, and has special levels just for Chrome users. Apparently Chrome’s implementation of Open Web technologies has advanced to the point where doing a no-compromises version of Angry Birds is possible. Another indication of how far the Open Web has come is “3 Dreams of Black“, which is a cool interactive media piece that is part film, part 3D virtual world. I’m keeping a pretty close eye on the whole HTML5 space, but this piece really shows how the next generation of the web is coming together as a medium all its own.

The final portion of the keynote was about ChromeOS and the notebooks, or “Chromebooks”, that run it. A lot of the content in this section was a repeat of content from Google’s Chrome Update event in December, but there were a few new things. Google has been hard at work solving some of the usage problems discovered during the CR-48 beta. This includes the trackpad (which was awful), Movies and Music, local file storage, and offline access. The big news for I/O is that Google has decided that ChromeOS is ready to be installed on laptops which will be sold as “Chromebooks”. Samsung and Acer have signed up to manufacture the devices. Google will also rent Chromebooks to businesses ($28/mo per user) and schools ($20/mo per user). This is the latest round of the network computer vision, and it’s going to be interesting to see whether the windows of technology readiness and user mindset overlap or not. The Chrome team appears to have the best marketing at Google, and in their classic style, they’ve produced a video which they hope will persuade people of the Chromebook value proposition.

Talks

On to the talks.

“Make the Web Faster” by Richard Rabbat, Joshua Marantz, and Håkon Wium Lie was a double-header talk covering mod_pagespeed and WebP. mod_pagespeed is a module for the Apache HTTP server, which speeds up web pages by using filters to rewrite pages and resources before they are delivered to the client. These rewrites are derived from the rules tested by the client side Page Speed tool. The other half of the talk was about WebP, which is a new format for images. Microsoft also proposed a new web image format (JPEG XR) several years ago, but it didn’t go anywhere.

Nick Pelly and Jeff Hamilton presented “How to NFC”. The NFC landscape is complicated and there are lots of options because of hardware types and capabilities. The examples that were shown were reasonably straightforward, but the whole time I found myself thinking that NFC is way more complicated than it should be. Having written device drivers in a previous life, I shouldn’t be surprised, but I still am. It seems obvious to me that the concept of NFC is a great one. The technical end of things seems tractable, if annoying. The business model issues are still unclear to me. I hope that it all comes together.

I really enjoyed Eric Bidelman and Arne Roomann-Kurrik’s HTML5 Showcase.   They showed some neat demos of things that you can do in HTML5. I particularly liked this one using 3D CSS. They also did some entertaining stuff with a command line interface. All of the source code to their demos is available – the link is in the slides.

I wasn’t able to get to Paul Irish’s talk on the Chrome Developer Tools at JSConf – there was quite a bit of Twitter buzz about it. I wasn’t too worried because I knew that the talk would be given again at Google I/O. For this version Paul teamed up with Pavel Feldman. There are a lot of really cool features going into the Chrome Developer tools. My favorite new features are the live editing of CSS and Javascript, revisions, saving modified resources, and remote debugging. The slide deck has pointers to the rest of the new features. If they go much further, they are going to turn the Developer Tools into an IDE (which they said they didn’t want to do).

Ray Cromwell and Phillip Rogers did a talk titled “Kick-ass Game Programming with Google Web Toolkit”, about ForPlay, a library for writing games that they developed on top of GWT. This is the library that Rovio used to do Angry Birds for Chrome. If you implement your game using GWT, ForPlay can compile your game into an HTML5 version, an Android native app version, a Flash version, and a desktop Java version. They also showed a cool feature where you could modify the code of the game in Eclipse, save it, and then switch to a running instance of the Java version of the game, and see the changes reflected instantly.

Postscript

Google has an undeniably large footprint in the mobile and open web spaces. I/O is a good way to keep abreast of what is happening at the Googleplex.

NodeConf 2011

Although I was definitely interested in JSConf (writeup), Nodeconf was the part of the week that I was really looking forward to. I’ve written a few small prototypes using Node and some networking / web Swiss Army knife code, so I was really curious to see what people are doing with Node, whether they were running into the same issues that I was, and overall just to get a sense of the community.

Talks

Ryan Dahl’s keynote covered the plans for the next version of Node. The next release is focused on Windows, and the majority of the time was spent on the details of how one might implement Node on Windows. Since I’m not a Windows user, that means an entire release with nothing for me (besides bug fixes). At the same time, Ryan acknowledged the need for some kind of facility for running multiple Node processes on a single machine, which would appear in a subsequent release. I can see the wisdom of making sure that the Windows implementation works well before tackling clustering, or whatever it ends up being called. This is the third time I’ve heard Ryan speak, and this week is the first time I’ve spent any time talking with him directly. Despite all the hype swirling around Node, Ryan is quiet, humble, and focused on making a really good piece of software.

Guillermo Rauch talked about Socket.io, giving an overview of features and talking about what is coming next. Realtime apps and devices are a big part of my interest in Node, and Socket.io is providing an important piece of functionality towards that goal.
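For anyone who hasn’t seen it, the canonical Socket.io hello-world of this era was only a few lines (server side; the port and event names are arbitrary):

```javascript
// Roughly the standard Socket.io 0.x server example.
var io = require('socket.io').listen(8080);

io.sockets.on('connection', function (socket) {
  socket.emit('news', { hello: 'world' });   // push data to this client
  socket.on('my event', function (data) {    // receive data from this client
    console.log(data);
  });
});
```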

Henrik Joreteg’s talk was about Building Realtime Single Page applications, again in the sweet spot of my interest in Node. Henrik has built a framework called Capsule which combines Socket.io and Backbone.js to do real time synchronization of model states between the client and server. I’m not sure I believe the scalability story as far as the single root model, but there’s definitely some interesting stuff in there.
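Capsule’s actual API is different from this, but the underlying idea can be sketched as wiring Backbone change events to a socket in both directions (the event names here are made up):

```javascript
// NOT Capsule's API -- just the underlying idea of realtime model sync.
var socket = io.connect('http://localhost:8080');  // socket.io client
var model = new Backbone.Model();

// Client -> server: ship local changes up.
model.on('change', function (m) {
  socket.emit('model:change', m.toJSON());
});

// Server -> client: apply remote changes locally. Passing {silent: true}
// is a naive loop guard; a real framework tracks where each change originated.
socket.on('model:change', function (attrs) {
  model.set(attrs, { silent: true });
});
```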

Brendan Eich talked about Mozilla’s SpiderNode project, where they’ve taken Mozilla’s SpiderMonkey Javascript Engine and implemented V8’s API around it as a veneer (V8Monkey) and then plugged that into Node. There are lots of reasons why this might be interesting. Brendan listed some of the reasons in his post. For me, it means a chance to see how some proposed JS.Next features might ease some of the pain of writing large programs in a completely callback oriented style. The generator examples Brendan showed are interesting, and I’d be interested in seeing some larger examples. Pythonistas will rightly claim that the combination of generators and callbacks is a been there / done that idea, but I am happy to see some recognition that callbacks cause pain. There are some other benefits of SpiderMonkey in Node such as access to a new debugging API that is in the works, and (at the moment) the ability to switch runtimes between V8 and SpiderMonkey via a command line switch. I would be fine if Mozilla decided to really take a run at making a “production quality” SpiderNode. Things are still early during this cycle of server side JavaScript, and I think we should be encouraging experimentation rather than consolidation.
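To make the generator idea concrete: Brendan’s examples used SpiderMonkey’s pre-standard dialect, but the shape of the pattern looks like this in the generators that eventually shipped in standard JavaScript:

```javascript
// The generators-instead-of-callback-pyramids idea, sketched with
// standard JS generators (not SpiderMonkey's 2011 dialect).
const fs = require('fs');

// Drive a generator: each yield suspends until the Node-style callback fires.
function run(genFn) {
  const it = genFn(resume);
  function resume(err, result) {
    if (err) it.throw(err);
    else it.next(result);
  }
  it.next(); // start executing up to the first yield
}

run(function* (resume) {
  // Asynchronous reads that nonetheless read sequentially on the page.
  const a = yield fs.readFile('a.txt', 'utf8', resume);
  const b = yield fs.readFile('b.txt', 'utf8', resume);
  console.log(a.length + b.length);
});
```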

One of the things that I’ve enjoyed the most during my brief time with Node is npm, the package management system. npm went 1.0 shortly before NodeConf, so Isaac Schlueter, the primary author of npm, described the changes. When I started using Node I knew that big changes were in the works for npm, so I was using a mix of npm managed packages and linking stuff into the Node search path directly. Now I’m using npm. When I work in Python I’m always using a virtualenv and pip, but I don’t like the fact that those two systems are loosely coupled. I find that npm is doing exactly what I want, and I’m both happy and impressed.

I’ve been using Matt Ranney’s node_redis in several of my projects, and it has been a good piece of code, so I was interested to hear what he had to say about debugging large Node clusters. Most of what he described was pretty standard stuff for working in clustered environments. He did present a trick for using the REPL on a remote system to aid in debugging, but this is a trick that other dynamic language communities have been using for some time.
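The remote REPL trick is worth showing because it is so small. A sketch (not necessarily Matt’s exact code; the port is arbitrary, and you would never expose this beyond localhost):

```javascript
// Expose a REPL over a TCP socket to inspect a running server's state.
const net = require('net');
const repl = require('repl');

const stats = { requests: 0 }; // some live state worth poking at

net.createServer((socket) => {
  const r = repl.start({ prompt: 'debug> ', input: socket, output: socket });
  r.context.stats = stats;          // expose live objects to the REPL
  r.on('exit', () => socket.end());
}).listen(5001, '127.0.0.1');

// From the same machine: `nc 127.0.0.1 5001`, then inspect `stats`.
```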

Felix Geisendorfer’s talk was titled “How to Test Asynchronous Code”. Unfortunately his main points were 1) no I/O (which takes out the asynchrony), 2) TDD, and 3) discipline. He admitted in his talk that he was really advocating unit testing and mocking. While this is good and useful, it’s not really serious testing of the asynchronous aspects of the code, and I don’t really know of any way to do good testing of the non-determinism introduced by asynchrony. Felix released several pieces of code, including a test framework, a test runner, and some faking/mocking code.
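Felix’s own frameworks do more than this, but his “no I/O” point fits in a few lines: inject the I/O dependency so a unit test can substitute a deterministic fake (all the names here are made up):

```javascript
// Dependency-injected I/O: the unit test supplies a fake, so there is
// no filesystem and no asynchrony left to be nondeterministic.
function loadConfig(readFile, path, cb) {
  readFile(path, 'utf8', (err, data) => {
    if (err) return cb(err);
    cb(null, JSON.parse(data));
  });
}

// Production would pass require('fs').readFile; the test passes a fake.
const fakeReadFile = (path, enc, cb) => cb(null, '{"port": 8080}');
loadConfig(fakeReadFile, 'ignored.json', (err, config) => {
  console.assert(!err && config.port === 8080, 'parses injected JSON');
});
```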

Charlie Robbins from Nodejitsu talked about Node.js in production, and described some techniques that Nodejitsu uses to manage their hosted Node environment. Many of these techniques are embodied in Haibu, which is the system that Nodejitsu uses to manage their installation. Charlie pushed the button to publish the github repository for Haibu at the end of his talk.

Issues with Node

The last talk of the day was a panel of various Node committers, along with relevant folks from the broader Node community who joined in depending on the question. There were two audience questions that I wanted to cover.

The first was what kinds of applications Node.js is not good for. The consensus of the panel was that you wouldn’t want to use Node for applications involving lots of numeric computation, especially decimal or floating point, and that longer running computations were a bad fit as well. Several people also said that databases (as in implementing a database) were a problem space that Node would be bad at. Despite the hype surrounding Node on Twitter and in the blogosphere, I think that the core members of the Node community are pretty realistic about what Node is good for and where it could be usefully applied.

The second issue had to do with Joyent’s publication of a trademark policy for Node. One of the big Node events in the last year was Joyent’s hiring of Ryan Dahl, and subsequently a few other Node contributors. Joyent is basing its Platform as a Service offering on Node, and is mixing its Node committers with some top notch systems people who used to be at Sun, including some of the founding members of the DTrace team. Joyent has also taken over “ownership” of the Node.js codebase from Ryan Dahl, and that, in combination with the trademark policy, is causing concern in the broader Node community.

All things being equal, I would prefer to see Node.js in the hands of a foundation. At the same time, I understand Joyent’s desire to try and make money from Node. I know a number of people at Joyent personally, and I have no reason to suspect their motives. However, with the backdrop of Oracle’s acquisition of Sun, and the way that Oracle is handling Sun’s open source projects, I think that it’s perfectly reasonable to have questions about Joyent or any other company “owning” an open source project. Let’s look at the ways that an open source project is controlled. There’s 1) licensing, 2) intellectual property/patents, 3) trademarks, and 4) governance. Taking them one at a time:

  1. Licensing – Node.js is licensed under the MIT license. There are no viral/reciprocal terms to prevent forking (or taking a fork private). Unfortunately, there are no patent provisions in the MIT license, which also bears on #2 below. The MIT license is one of the most liberal licenses around – it’s hard to see anything nefarious in its selection, and forking remains available as a nuclear option in the case of bad behavior by Joyent or an acquirer. This is the same whether Node is at a foundation or at Joyent.
  2. Intellectual Property – Code which is contributed to Node is governed by the Node Contributor License Agreement, which appears to be partially derived from the Apache Individual and Corporate Contributor License Agreements (Joyent’s provision of an on-line form is something that I wish the ASF would adopt – we are living in the 21st century after all). Contributed IP is licensed to Node, but the copyright is not assigned as in the case of the FSF. Since all contributors retain their rights to their contributions, the IP should be clean. The only hitch would be if Joyent’s contributions were not licensed back on these terms as well, but given the use of the MIT license for the entire codebase, I don’t think that’s the case. As far as I can tell, there isn’t much difference between having Node at a foundation or having it at Joyent.
  3. Trademark – Trademark law is misunderstood by lots of people, and the decision to obtain a trademark can be a controversial one for an open source project. Whether or not Node.js should have been trademarked is a separate discussion. Given that there will be a trademark for Node.js, what is the difference between having Node at a foundation or at Joyent? Trademark law says that you have to defend your trademark or risk losing it. That applies to foundations as well as for profit companies. The ASF has sent cease and desist letters to companies which are misusing Apache trademarks. The requirement to defend the mark does not change between a non-profit and a for-profit. Joyent’s policy is actually more liberal than the ASF trademark policy. The only difference between a foundation and a company would be the decision to provide a license for use of the trademark as opposed to disallowing a use altogether. If a company or other organization is misusing the Node.js trademark, they will have to either obtain a license or stop using the mark. That’s the same regardless of who owns the mark. What may be different is whether or not a license is granted or usage is forbidden. In the event of acquisition by a company unfriendly to the community, the community would lose the trademarks – see the Hudson/Jenkins situation to see what that scenario looks like.   
  4. Governance – Node.js is run on a “benevolent dictator for life” model of governance. Python and Perl are examples of community/foundation based open source projects which have this model of governance. The risk here is that Ryan Dahl is an employee of Joyent, and could be instructed to do things a certain way, which I consider unlikely. I suppose that at a foundation you could try to write additional policy about removal of the dictator in catastrophic scenarios, but I’m not aware of any projects that have such a policy. The threat of forking is the other balance to a dictator gone rogue, and aside from the loss of the trademark, there are no substantial roadblocks to a fork if one became necessary.

To riff on the 2010 Web 2.0 Summit, these are the four “points of control” for open source projects. As I said, my first choice would have been a foundation, and for now I can live with the situation as it is, but I am also not a startup trying to use the Node name to help gain visibility.

Final thoughts

On the whole, I was really pleased with Nodeconf. I did pick up some useful information, but more importantly I got some sense of the community / ecosystem, which is really important. While the core engine of Node.js is important, it’s the growth and flourishing of the community and ecosystem that matter the most. As with most things Node, we are still in the early days, but things seem promising.

The best collections of JSConf/NodeConf slides seem to be in gists rather than Lanyrd, so here’s a link to the most up to date one that I could find.

Update: corrected misspelling of Henrik Joreteg’s name. And incorrectly calling Matt Ranney Mark.

JSConf 2011

Last year when I attended JSConf I had some ideas about the importance of Javascript. I was concerned in a generic way about building “richer” applications in the browser and Javascript’s role in building those applications. Additionally, I was interested in the possibility of using Javascript on the server, and was starting to learn about Node.js.

A year later, I have some more refined ideas. The fragmentation of mobile platforms means that open web technologies are the only way to deliver applications across the spectrum of telephones, tablets, televisions, and what have you, without incurring the pain of multi platform development. The types of applications that are most interesting to me are highly interactive with low latency user interfaces – note that I am intentionally avoiding the use of the word “native”. Demand for these applications is going to raise the bar on the skill sets of web developers. I think that we will see more applications where the bulk of the interface and logic are in the browser, and where the server becomes a REST API endpoint. The architecture of “New Twitter” is in this vein. API endpoints have far less need for HTML templating and server side MVC frameworks. But those low latency applications are going to mean that servers do more asynchronous delivery of data, whether that is via existing Comet-like techniques or via Websockets (once that finally stabilizes). Backend systems are going to partition into parts that do asynchronous delivery of data, and other parts which run highly computationally intensive jobs.

I’ll save the discussion of the server parts for my Nodeconf writeup, but now I’m ready to report on JSConf.

Talks

Here are some of the talks that I found interesting or entertaining.

Former OSAF colleague Adam Christian talked about Jellyfish, which is a tool for executing Javascript in a variety of environments from Node to desktop browsers to mobile browsers. One great application for Jellyfish is testing, and Jellyfish sprang out of the work that Adam and others did on Windmill.

It’s been a while since I looked at Bespin/Skywriter/Ace, and I was pleased to see that it seems to be progressing quite nicely. I particularly liked the Github support.

I enjoyed Mary Rose Cook’s account of how writing a 2D platform game using Javascript caused her to have a falling-in-love-like experience with programming. It’s nice to be reminded of the sheer fun and art of making something using code.

Unfortunately I missed Andrew Dupont’s talk on extending built-ins. The talk was widely acclaimed on Twitter, and fortunately the slides are available. More on this (perhaps) once I get some time to read the slide deck.

Mark Headd showed some cool telephony apps built using Node.js, including simple control of a web browser via cell phone voice commands or text messages. The code that he used is available, and uses Asterisk, Tropo, Couchbase, and a few other pieces of technology.

Dethe Elza showed off Waterbear, which is a Scratch-like environment running in the browser. It’s not solely targeted at Javascript, which I have mixed feelings about. My girls have done a bunch of Scratch programming, so I am glad to see that kind of environment coming to languages that are more widely used.

The big topics

There were four talks in the areas that I am really concerned about, and I missed one of them: Rebecca Murphey’s talk on Modern Javascript, which appeared to be derived from some blog posts that she has written on the topic. I think that the problems she is pointing out – the ability to modularize, dependency management, and intentional interoperability – are going to be major impediments to building large applications in the browser, never mind on the server.

Dave Herman from Mozilla did a presentation on a module system for the next version of Javascript (which people refer to as JS.next). The design looks reasonable to me, and you can actually play with it in Narcissus, Mozilla’s metacircular Javascript interpreter, which is a testbed for JS.next ideas. One thing that’s possible with the design is to run different module environments in the same page, which Dave demonstrated by running Javascript, Coffeescript, and Scheme-syntaxed code in different parts of a page.
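Dave’s Harmony-era syntax differed in its details, but for a sense of the direction, here is the shape that eventually shipped as ES modules:

```javascript
// math.js -- a module declares its exports statically
export function add(x, y) {
  return x + y;
}

// app.js -- imports resolve before any code runs, which is what makes
// static checking and cross-module tooling possible
import { add } from './math.js';
console.log(add(2, 3)); // 5
```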

The last two talks of the conference were also focused on the topic of JS.next.

Jeremy Ashkenas was scheduled to talk about Coffeescript, but he asked Brendan Eich to join him and talk about some of the new features that have been approved or proposed for JS.next. Many of these ideas look similar to ideas that are in Coffeescript. Jeremy then went on to try to explain what he’s trying to do with Coffeescript, and encouraged people to experiment with their own language extensions. He and Brendan are calling programs like the Coffeescript compiler “transpilers” – compilers which compile into Javascript. I’ve written some Coffeescript code just to get a feel for it, and parts of the experience reminded me of the days when C++ programs went through CFront, which translated them into C that was then compiled. I didn’t care for that experience then, and I didn’t care for it this time, although the fact that most of what Coffeescript does is pure syntax means that the generated code is easy to associate back to the original Coffeescript. There appears to be considerable angst around Coffeescript, at least in the Javascript community. Summarizing that angst and my own experience with Coffeescript is enough for a separate post. Instead I’ll just say that I like many of the language ideas in Coffeescript, but I’d prefer not to see Coffeescript code in libraries used by the general Javascript community. If individuals or organizations choose to adopt Coffeescript, that’s fine by me, but having Coffeescript go into the wild in library code means that pressure will build to adapt Javascript libraries to be Coffeescript friendly, which will be detrimental to efforts to move to JS.next.
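For a sense of how thin the translation layer is, here is a tiny example: the CoffeeScript source is in the comment, and below it is roughly what the compiler emits (modulo the function wrapper CoffeeScript adds around each file):

```javascript
// CoffeeScript source:
//
//   square = (x) -> x * x
//   console.log square 4
//
// compiles to roughly this JavaScript -- mostly pure syntax, which is
// why the output stays easy to map back to the input:
var square;

square = function (x) {
  return x * x;
};

console.log(square(4));
```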

The last talk was given by Alex Russell, and included a triple head fake where Alex was ostensibly there to talk about feature detection, though only after a too-long comedic delay involving Dojo project lead Pete Higgins. A few minutes into the content on feature detection, Alex “threw up his hands” and pulled out the real topic of his talk: the work that he’s been doing on Traceur, which is Google’s transpiler for experimenting with JS.next features. Alex then left the stage and a member of the Traceur team gave the rest of the talk. I am all in favor of cleverness to make a talk interesting, but I would have to say that the triple head fake didn’t add anything to the presentation. Instead, it dissipated the energy from the Brendan / Jeremy talk, and used up time that could have been used to better motivate the technical details that were shown. The Traceur talk ended up being less energetic and less focused than the talk before it, which is a shame because the content was important. While improving the syntax of JS.next is important, it’s even more important to fix the problems that prevent large scale code reuse and interoperability. The examples given in the Traceur talk were those kinds of examples, but they were buried by a lack of energy and by the display of the inner workings of the transpiler.

I am glad to see that the people working on JS.next are trying to implement their ideas to the point where they could be used in large Javascript programs. I would much rather that the ECMAScript committee had actual implementation reports to base their decisions on, rather than designing features on paper in a committee (update: I am not meaning to imply that TC39 is designing by committee — see the comment thread for more on that). It is going to be several more years before any of these features get standardized, so in the meantime we’ll be working with the Javascript that we have, or in some lucky cases, with the recently approved ECMAScript 5.

Final Thoughts

If your interests are different than mine, here is a list of pointers to all the slides (I hope someone will help these links make it onto the Lanyrd coverage page for JSConf 2011).

JSConf is very well organized, there are lots of social events, and there are lots of nice touches. I did feel that last year’s program was stronger than this year’s. There are lots of reasons why this might be the case, including what happened in Javascript in 2010/11, who was able to submit a talk, and a change in my focus and interests. Chris Williams has a very well reasoned description of how he selects speakers for JSConf. In general I really agree with what he’s trying to do. One thing that might help is to keep all the sessions to 30 minutes, which would allow more speakers, and also reduce the loss if a talk doesn’t live up to expectations.

On the whole, I definitely got a lot out of the conference, and as far as I can tell, if you want to know what is happening or about to happen in the Javascript world, JSConf is the place to be.