NodeConf 2011

Although I was definitely interested in JSConf (writeup), NodeConf was the part of the week that I was really looking forward to. I’ve written a few small prototypes using Node and some networking / web swiss army knife code, so I was really curious to see what people are doing with Node, whether they were running into the same issues that I was, and what the community was like overall.

Talks

Ryan Dahl’s keynote covered the plans for the next version of Node. The next release is focused on Windows, and the majority of the time was spent on the details of how one might implement Node on Windows. Since I’m not a Windows user, that means an entire release with nothing for me (besides bug fixes). At the same time, Ryan acknowledged the need for some kind of facility for running multiple Node processes on a single machine, which would appear in a subsequent release. I can see the wisdom of making sure that the Windows implementation works well before tackling clustering or whatever it ends up being called. This is the third time I’ve heard Ryan speak, and this week is the first time I’ve spent any time talking with him directly. Despite all the hype swirling around Node, Ryan is quiet, humble, and focused on making a really good piece of software.

Guillermo Rauch talked about Socket.io, giving an overview of features and talking about what is coming next. Realtime apps and devices are a big part of my interest in Node, and Socket.io is providing an important piece of functionality towards that goal.

Henrik Joreteg’s talk was about Building Realtime Single Page applications, again in the sweet spot of my interest in Node. Henrik has built a framework called Capsule which combines Socket.io and Backbone.js to do real time synchronization of model states between the client and server. I’m not sure I believe the scalability story as far as the single root model, but there’s definitely some interesting stuff in there.

Brendan Eich talked about Mozilla’s SpiderNode project, where they’ve taken Mozilla’s SpiderMonkey Javascript Engine and implemented V8’s API around it as a veneer (V8Monkey) and then plugged that into Node. There are lots of reasons why this might be interesting. Brendan listed some of the reasons in his post. For me, it means a chance to see how some proposed JS.Next features might ease some of the pain of writing large programs in a completely callback oriented style. The generator examples Brendan showed are interesting, and I’d be interested in seeing some larger examples. Pythonistas will rightly claim that the combination of generators and callbacks is a been there / done that idea, but I am happy to see some recognition that callbacks cause pain. There are some other benefits of SpiderMonkey in Node such as access to a new debugging API that is in the works, and (at the moment) the ability to switch runtimes between V8 and SpiderMonkey via a command line switch. I would be fine if Mozilla decided to really take a run at making a “production quality” SpiderNode. Things are still early during this cycle of server side JavaScript, and I think we should be encouraging experimentation rather than consolidation.
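To make the generator idea concrete, here is a minimal sketch of the pattern in Python, since that is where the been there / done that version lives. It is not the code Brendan showed. The names `run`, `fetch_user`, `fetch_posts`, and `load_profile` are hypothetical, and the two fetch functions invoke their callbacks immediately instead of doing real I/O. The point is that the generator body reads top to bottom while a small driver deals with the callbacks.

```python
def run(make_task, done):
    """Drive a generator whose yielded values are callback-taking operations."""
    task = make_task()

    def step(value=None, error=None):
        try:
            # Resume the generator with the result (or error) of the last operation.
            if error is not None:
                op = task.throw(error)
            else:
                op = task.send(value)
        except StopIteration as stop:
            # The generator returned; hand its return value to the caller.
            done(stop.value)
            return
        # Each yielded value is a function that accepts callback(value, error).
        op(step)

    step()


# Hypothetical callback-style operations. Real ones would do I/O and call
# back later; these call back immediately to keep the sketch self-contained.
def fetch_user(user_id):
    def op(callback):
        callback({"id": user_id, "name": "user%d" % user_id}, None)
    return op


def fetch_posts(user):
    def op(callback):
        name = user["name"]
        callback(["%s-post-1" % name, "%s-post-2" % name], None)
    return op


def load_profile():
    # Reads like sequential code even though each step goes through a callback.
    user = yield fetch_user(7)
    posts = yield fetch_posts(user)
    return {"user": user["name"], "posts": posts}


results = []
run(load_profile, results.append)
```

Errors travel the same road: the driver throws them into the generator, so an ordinary `try`/`except` around a `yield` replaces the error argument that every callback would otherwise have to check.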

One of the things that I’ve enjoyed the most during my brief time with Node is npm, the package management system. npm went 1.0 shortly before NodeConf, so Isaac Schlueter, the primary author of npm, described the changes. When I started using Node I knew that big changes were in the works for npm, so I was using a mix of npm managed packages and linking stuff into the Node search path directly. Now I’m using npm. When I work in Python I’m always using a virtualenv and pip, but I don’t like the fact that those two systems are loosely coupled. I find that npm is doing exactly what I want and I’m both happy and impressed.

I’ve been using Matt Ranney’s node_redis in several of my projects, and it has been a good piece of code, so I was interested to hear what he had to say about debugging large node clusters. Most of what he described was pretty standard stuff for working in clustered environments. He did present a trick for using the REPL on a remote system to aid in debugging, but this is a trick that other dynamic language communities have been doing for some time.

Felix Geisendorfer’s talk was titled “How to test Asynchronous Code”. Unfortunately his main points were 1. No I/O (which takes out the asynchrony), 2. TDD, and 3. Discipline. He admitted in his talk that he was really advocating unit testing and mocking. While this is good and useful, it’s not really serious testing against the asynchronous aspects of the code, and I don’t really know of any way to do good testing of the non-determinism introduced by asynchrony. Felix released several pieces of code, including a test framework, a test runner, and some faking/mocking code.
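Here is a minimal sketch of what the “No I/O” advice looks like in practice, written in Python for brevity rather than in Node, and not taken from Felix’s libraries. `count_lines` and `make_fake_reader` are hypothetical names. The code under test receives its callback-style reader as a parameter, so the test can hand it a fake that answers immediately from a dictionary.

```python
def count_lines(path, read_file, callback):
    """Count the lines in a file using a callback-style reader supplied by the caller."""
    def on_read(error, data):
        if error is not None:
            callback(error, None)
            return
        callback(None, len(data.splitlines()))

    read_file(path, on_read)


def make_fake_reader(files):
    """Build a reader that serves canned contents and records calls. No real I/O."""
    calls = []

    def read_file(path, callback):
        calls.append(path)
        if path in files:
            callback(None, files[path])
        else:
            callback(FileNotFoundError(path), None)

    return read_file, calls


# The "test": both the success path and the error path run without touching disk.
reader, calls = make_fake_reader({"/etc/motd": "hello\nworld\n"})
seen = []
count_lines("/etc/motd", reader, lambda err, n: seen.append((err, n)))
count_lines("/missing", reader, lambda err, n: seen.append((type(err).__name__, n)))
```

The fake calls back synchronously and in a fixed order, which is exactly why a test like this says nothing about the timing and interleaving problems that real asynchrony introduces.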

Charlie Robbins from Nodejitsu talked about Node.js in production, and described some techniques that Nodejitsu uses to manage their hosted Node environment. Many of these techniques are embodied in Haibu, which is the system that Nodejitsu uses to manage their installation. Charlie pushed the button to publish the github repository for Haibu at the end of his talk.

Issues with Node

The last talk of the day was a panel of Node committers and, depending on the question, relevant folks from the broader Node community. There were two audience questions that I want to cover.

The first was what kinds of applications Node.js is not good for. The consensus of the panel was you wouldn’t want to use Node for applications involving lots of numeric computation, especially decimal or floating point, and that longer running computations were a bad fit as well. Several people also said that databases (as in implementing a database) were a problem space that Node would be bad at. Despite the hype surrounding Node on Twitter and in the blogosphere, I think that the core members of the Node community are pretty realistic about what Node is good for and where it could be usefully applied.

The second issue had to do with Joyent’s publication of a trademark policy for Node. One of the big Node events in the last year was Joyent’s hiring of Ryan Dahl, and subsequently a few other Node contributors. Joyent is basing its Platform as a Service offering on Node, and is mixing its Node committers with some top notch systems people who used to be at Sun, including some of the founding members of the DTrace team. Joyent has also taken over “ownership” of the Node.js codebase from Ryan Dahl, and that, in combination with the trademark policy, is causing concern in the broader Node community.

All things being equal, I would prefer to see Node.js in the hands of a foundation. At the same time, I understand Joyent’s desire to try and make money from Node. I know a number of people at Joyent personally, and I have no reason to suspect their motives. However, with the backdrop of Oracle’s acquisition of Sun, and the way that Oracle is handling Sun’s open source projects, I think that it’s perfectly reasonable to have questions about Joyent or any other company “owning” an open source project. Let’s look at the ways that an open source project is controlled. There are four: 1) licensing, 2) intellectual property/patents, 3) trademarks, and 4) governance. Now, taking them one at a time:

  1. Licensing – Node.js is licensed under the MIT license. There are no viral/reciprocal terms to prevent forking (or taking a fork private). Unfortunately, there are no patent provisions in the MIT license. This applies to #2 below. The MIT license is one of the most liberal licenses around – it’s hard to see anything nefarious in its selection, and forking as a nuclear option in the case of bad behavior by Joyent or an acquirer is not a problem. This is the same whether Node is at a foundation or at Joyent.
  2. Intellectual Property – Code which is contributed to Node is governed by the Node Contributor License Agreement, which appears to be partially derived from the Apache Individual and Corporate Contributor license agreements (Joyent’s provision of an on-line form is something that I wish the ASF would adopt – we are living in the 21st century after all). Contributed IP is licensed to Node, but the copyright is not assigned as in the case of the FSF. Since all contributors retain their rights to their contributions, the IP should be clean. The only hitch would be if Joyent’s contributions were not licensed back on these terms as well, but given the use of the MIT license for the entire codebase, I don’t think that’s the case. As far as I can tell, there isn’t much difference between having Node at a foundation or having it at Joyent.
  3. Trademark – Trademark law is misunderstood by lots of people, and the decision to obtain a trademark can be a controversial one for an open source project. Whether or not Node.js should have been trademarked is a separate discussion. Given that there will be a trademark for Node.js, what is the difference between having Node at a foundation or at Joyent? Trademark law says that you have to defend your trademark or risk losing it. That applies to foundations as well as for profit companies. The ASF has sent cease and desist letters to companies which are misusing Apache trademarks. The requirement to defend the mark does not change between a non-profit and a for-profit. Joyent’s policy is actually more liberal than the ASF trademark policy. The only difference between a foundation and a company would be the decision to provide a license for use of the trademark as opposed to disallowing a use altogether. If a company or other organization is misusing the Node.js trademark, they will have to either obtain a license or stop using the mark. That’s the same regardless of who owns the mark. What may be different is whether or not a license is granted or usage is forbidden. In the event of acquisition by a company unfriendly to the community, the community would lose the trademarks – see the Hudson/Jenkins situation to see what that scenario looks like.   
  4. Governance – Node.js is run on a “benevolent dictator for life” model of governance. Python and Perl are examples of community/foundation based open source projects which have this model of governance. The risk here is that Ryan Dahl is an employee of Joyent, and could be instructed to do things a certain way, which I consider unlikely. I suppose that with a foundation you could try to write additional policy about removal of the dictator in catastrophic scenarios, but I’m not aware of any projects that have such a policy. The threat of forking is the other balance to a dictator gone rogue, and aside from the loss of the trademark, there are no substantial roadblocks to a fork if one became necessary.

To riff on the 2010 Web 2.0 Summit, these are the four “points of control” for open source projects. As I said, my first choice would have been a foundation, and for now I can live with the situation as it is, but I am also not a startup trying to use the Node name to help gain visibility.

Final thoughts

On the whole, I was really pleased with NodeConf. I did pick up some useful information, but more importantly I got some sense of the community / ecosystem, which is really important. While the core engine of Node.js is important, it’s the growth and flourishing of the community and ecosystem that matter the most. As with most things Node, we are still in the early days but things seem promising.

The best collections of JSConf/NodeConf slides seem to be in gists rather than Lanyrd, so here’s a link to the most up to date one that I could find.

Update: corrected the misspelling of Henrik Joreteg’s name, and my incorrectly calling Matt Ranney “Mark”.

JSConf 2011

Last year when I attended JSConf I had some ideas about the importance of Javascript. I was concerned in a generic way about building “richer” applications in the browser and Javascript’s role in building those applications. Additionally, I was interested in the possibility of using Javascript on the server, and was starting to learn about Node.js.

A year later, I have some more refined ideas. The fragmentation of mobile platforms means that open web technologies are the only way to deliver applications across the spectrum of telephones, tablets, televisions and what have you, without incurring the pain of multi platform development. The types of applications that are most interesting to me are highly interactive with low latency user interfaces – note that I am intentionally avoiding the use of the word “native”. Demand for these applications is going to raise the bar on the skill sets of web developers. I think that we will see more applications where the bulk of the interface and logic are in the browser, and where the server becomes a REST API endpoint. The architecture of “New Twitter” is in this vein. API endpoints have far less of a need for HTML templating and server side MVC frameworks. But those low latency applications are going to mean that servers are doing more asynchronous delivery of data, whether that is via existing Comet like techniques or via Websockets (once that finally stabilizes). Backend systems are going to partition into parts that do asynchronous delivery of data, and other parts which run highly computationally intensive jobs.

I’ll save the discussion of the server parts for my Nodeconf writeup, but now I’m ready to report on JSConf.

Talks

Here are some of the talks that I found interesting or entertaining.

Former OSAF colleague Adam Christian talked about Jellyfish, which is a tool for executing Javascript in a variety of environments from Node to desktop browsers to mobile browsers. One great application for Jellyfish is testing, and Jellyfish sprang out of the work that Adam and others did on Windmill.

It’s been a while since I looked at Bespin/Skywriter/Ace, and I was pleased to see that it seems to be progressing quite nicely. I particularly liked the Github support.

I enjoyed Mary Rose Cook’s account of how writing a 2D platform game using Javascript caused her to have a falling in love like experience with programming. It’s nice to be reminded of the sheer fun and art of making something using code.

Unfortunately I missed Andrew Dupont’s talk on extending built-ins. The talk was widely acclaimed on Twitter, and fortunately the slides are available. More on this (perhaps) once I get some time to read the slide deck.

Mark Headd showed some cool telephony apps built using Node.js, including simple control of a web browser via cell phone voice commands or text messages. The code that he used is available, and uses Asterisk, Tropo, Couchbase, and a few other pieces of technology.

Dethe Elza showed off Waterbear, which is a Scratch-like environment running in the browser. It’s not solely targeted at Javascript, which I have mixed feelings about. My girls have done a bunch of Scratch programming, so I am glad to see that environment coming to languages that are more widely used.

The big topics

There were four talks in the areas that I am really concerned about, and I missed one of them, which was Rebecca Murphey’s talk on Modern Javascript, which appeared to be derived from some blog posts that she has written on the topic. I think that the problems she is pointing out – ability to modularize, dependency management, and intentional interoperability – are going to be major impediments to building large applications in the browser, never mind on the server.

Dave Herman from Mozilla did a presentation on a module system for the next version of Javascript (which people refer to as JS.next). The design looks reasonable to me, and you can actually play with it in Narcissus, Mozilla’s meta circular Javascript interpreter, which is a testbed for JS.next ideas. One thing that’s possible with the design is to run different module environments in the same page, which Dave demonstrated by running Javascript, Coffeescript, and Scheme syntaxed code in different parts of a page.

The last two talks of the conference were also focused on the topic of JS.next.

Jeremy Ashkenas was scheduled to talk about Coffeescript, but he asked Brendan Eich to join him and talk about some of the new features that have been approved or proposed for JS.next. Many of these ideas look similar to ideas that are in Coffeescript. Jeremy then went on to try and explain what he’s trying to do in Coffeescript, and encouraged people to experiment with their own language extensions. He and Brendan are calling programs like the Coffeescript compiler “transpilers” – compilers which compile into Javascript. I’ve written some Coffeescript code just to get a feel for it, and parts of the experience reminded me of the days when C++ programs went through CFront, which then translated them into C which was then compiled. I didn’t care for that experience then, and I didn’t care for it this time, although the fact that most of what Coffeescript does is pure syntax means that the generated code is easy to associate back to the original Coffeescript. There appears to be considerable angst around Coffeescript, at least in the Javascript community. Summarizing that angst and my own experience with Coffeescript is enough for a separate post. Instead I’ll just say that I like many of the language ideas in Coffeescript, but I’d prefer not to see Coffeescript code in libraries used by the general Javascript community. If individuals or organizations choose to adopt Coffeescript, that’s fine by me, but having Coffeescript go into the wild in library code means that pressure will build to adapt Javascript libraries to be Coffeescript friendly, which will be detrimental to efforts to move to JS.next.

The last talk was given by Alex Russell, and included a triple head fake where Alex was ostensibly to talk about feature detection, although only after a too long comedic delay involving Dojo project lead Pete Higgins. A few minutes into the content on feature detection, Alex “threw up his hands”, and pulled out the real topic of his talk, which is the work that he’s been doing on Traceur, which is Google’s transpiler for experimenting with JS.next features. Alex then left the stage and a member of the Traceur team gave the rest of the talk. I am all in favor of cleverness to make a talk interesting, but I would have to say that the triple head fake didn’t add anything to the presentation. Instead, it dissipated the energy from the Brendan / Jeremy talk, and used up time that could have been used to better motivate the technical details that were shown. The Traceur talk ended up being less energetic and less focused than the talk before it, which is a shame because the content was important. While improving the syntax of JS.next is important, it’s even more important to fix the problems that prevent large scale code reuse and interoperability. The examples being given in the Traceur talk were those kinds of examples, but they were buried by a lack of energy, and the display of the inner workings of the transpiler.

I am glad to see that the people working on JS.next are trying to implement their ideas to the point where they could be used in large Javascript programs. I would much rather that the ECMAScript committee had actual implementation reports to base their decisions on, rather than designing features on paper in a committee (update: I am not meaning to imply that TC39 is designing by committee — see the comment thread for more on that. ). It is going to be several more years before any of these features get standardized, so in the meantime we’ll be working with the Javascript that we have, or in some lucky cases, with the recently approved ECMAScript 5.

Final Thoughts

If your interests are different than mine, here is a list of pointers to all the slides (I hope someone will help these links make it onto the Lanyrd coverage page for JSConf 2011).

JSConf is very well organized, there are lots of social events, and there are lots of nice touches. I did feel that last year’s program was stronger than this year’s. There are lots of reasons why this might be the case, including what happened in Javascript in 2010/11, who was able to submit a talk, and a change in my focus and interests. Chris Williams has a very well reasoned description of how he selects speakers for JSConf. In general I really agree with what he’s trying to do. One thing that might help is to keep all the sessions to 30 minutes, which would allow more speakers, and also reduce the loss if a talk doesn’t live up to expectations.

On the whole, I definitely got a lot out of the conference, and as far as I can tell if you want to know what is happening or about to happen in the Javascript world, JSConf is the place to be.

South by Southwest Interactive 2011

Back in 2006, Julie made the trek to Austin for South By Southwest Interactive (SXSWi) because she was organizing a panel. This year, I finally got a chance to go. In recent years, I’ve been to a lot of conferences. Many of them have been O’Reilly conferences, and the rest have been conferences organized by various open source communities. What almost all of them have in common is that they are developer centric. What is intriguing about SXSWi, to use John Gruber’s words, is that it is a conference where both developers and designers are welcome (as are a whole pile of people working in the social media space). One of the reasons that I decided to go this year was to try to get some perspective from a different population of people.

SXSWi is a very large conference with this year’s attendance at around 14000 people. There are conferences which are bigger (Oracle OpenWorld, JavaOne in its heyday, or ComicCon San Diego), but not many. If you mix in the Film conference, which runs at the same time, you have a lot of people in Austin. Any way you slice it, it’s a huge conference. According to “old-timers” that I spoke to, the scale is new, and I would say it’s the source of almost all of the problems that I had with the conference.

Talks

Common wisdom in recent years is that SXSWi is more about the networking than the panel / talk content. I did find a number of interesting talks.

I’ve been loosely aware of Jane McGonigal’s work on games for quite some time, but I’ve never actually been able to hear her speak until now. Gamification is a big topic in some circles right now. I think that Jane’s approach to gaming is deeper and has much longer term impact than just incorporating some of the types of game mechanics that are currently in vogue. I also really appreciated the scientific evidence that she presented about games. I’m looking forward to reading her book “Reality Is Broken: Why Games Make Us Better and How They Can Change the World”.

I had no idea who Felicia Day was when I got to SXSWi. Like all conferences, I did my real planning for each day of SXSWi the night before, doing the usual research on speakers that I was unfamiliar with. Felicia’s story resonated with me because she was homeschooled (like my daughters), went on to be very successful academically and then went into the entertainment business. She is among the leaders in bringing original video content to the internet instead of going through the traditional channels of broadcast television or movie studios. It’s a path that seems more and more likely to widen (witness Netflix’s licensing of “House of Cards”, or Google’s acquisition of Next New Networks). I learned all of that before I sat in the keynote. By the time that I left the keynote, I found myself charmed by her humility and down to earthness, and impressed by the way that she has built a real relationship with her fans in such a way that she can rally them for support when needed.

For the last year or so I’ve been seeing reviews for “The Power of Pull: How Small Moves, Smartly Made, Can Set Big Things in Motion” by John Hagel, John Seely Brown and Lang Davison. It sounded like the authors have found an interesting structuring for some of the changes that I’ve observed by being in the middle of open source software, blogging, and so forth. I still haven’t gotten around to reading that book (the stack is tall – well actually, the directory on the iPad is full), but I was glad for the chance to hear John Hagel talk about shaping strategies, his theory on how to make big changes by leveraging the resources of an entire market or ecosystem rather than taking on all the risk in a solo fashion. His talk was on the last day of the conference, and I was wiped out by then, so I need a refresher and some additional think time on his ideas.

Much to my surprise, there were a number of really interesting talks on the algorithmic side of Data Science/Big Data. Many of these talks were banished to the AT&T Conference center at UT Austin, which was really far from the Austin Convention Center and very inconvenient to get to. I wasn’t able to make it to many of these talks because of that. Having venues so far away (the AT&T Center, the Sheraton, and the Hyatt) pretty much dooms the talks that get assigned to them. It’s not a total loss, since these days it’s pretty easy to find the speakers of the talks and contact them for more information. But that’s a much higher friction effort than going to their talk, having a chance to talk to them afterwards or over dinner, and going from there. I did really enjoy the talk Machines Trading Stocks on News. I am not a financial services guy, and there was no algorithmic heavy lifting on display, but the talk still provided a really interesting look at the issues around analyzing semistructured data and then acting on it. As usual, the financial guys are quietly doing some seriously sophisticated stuff, while the internet startup guys get all the attention. In a related vein, I also went to How to Personalize Without Being Creepy, which had a good discussion of the state of the art of integrating personalization into products. There was no statistical machine learning on display, but the product issues around personalization are at least as important as the particulars of personalization technology.

One of the nice things about having such a huge conference is that you get some talks from interesting vectors. Our middle daughter has decided that she wants to go to Mars when she grows up. Now it’s quite some time between now and then, but just in case, I stopped into the talk on Participatory Space Exploration and collected a bunch of references that she can go chase. I was also able to chat with the folks from NASA afterwards and pick up some good age appropriate pointers.

There were some interesting sounding talks that I wasn’t able to get into because the rooms were full. And as I’ve mentioned there were also some talks that I wasn’t able to go to because they were located too far away. As a first time SXSWi attendee but a veteran tech conference attendee and speaker, I’d say that SXSWi is groaning under its own scale at this point. It’s affecting the talks, the “evening track” and pretty much everything else. This is definitely a case of bigger is not better.

Party Scene

I am used to conferences with an active “evening track”, and of course, this includes parties. SXSWi is like no other event that I’ve been to. The sheer number of parties, both public and private, is staggering. At other conferences I’ve never had to wait in line to get into parties, and there are very few VIP lists, whereas at SXSWi both lines and VIP lists seem to be the order of the day. Part of that is due to the scale, and I’m sure that part of that is SXSW’s reputation as a party (or, euphemistically, networking) conference. The other issue that I had with the parties is that the atmosphere at many of them just wasn’t conducive to meeting people. I went to several parties where the music was so loud that my ears were ringing within a short time. It’s great that there was good music (a benefit of SXSW), and lots of free sponsor alcohol, but that isn’t really my style.

Despite all that, I did have some good party experiences. I accidentally/serendipitously met a group of folks who are responsible for social media presences at big brands in the entertainment sector, so I got some good insight into the kind of problems that they face and the back channel on business arrangements with some of the bigger social networks. I definitely got some serious schooling on how to use Foursquare. At another party, I got a ground’s eye view of which parts of Microsoft’s Azure PaaS offering are real, and which are not. I’m not planning to be an Azure user any time soon, but it’s always nice to know what is hype and what is reality. I also really enjoyed the ARM party. It was a great chance to see what people are doing with ARM processors these days. This video that I saw at the TI table made me realize just how close we are to seeing some pretty cool stuff. Nikon USA and Vimeo sponsored a fun party at an abandoned power plant. The music was really loud, but the light was cool and I made some decent pictures.

Other activities

There are activities of all kinds going on during SXSW. I wasn’t able to do a lot of them because they conflicted with sessions, but I was able to go on a pair of photowalks, which was kind of fun. The photowalk with Trey Ratcliff was pretty fun. As usual, scale was an issue, because we pretty much clogged up streets and venues wherever we went. I’ve started to put some of those photos up on Flickr, but I decided to finish this post rather than finish the post production on the pictures.

App Round Up

One of the things that makes SXSWi distinctive is that you have a large group of people who are willing to try a new technology or application. It’s conventional wisdom that SXSWi provided launching pads for Twitter and Foursquare, so now every startup is trying to get you to try their application during the week of the conference. While by no means foolproof or definitive, this is a unique opportunity to observe how people might use a piece of technology.

Before flying down to Austin, I downloaded a bunch of new apps on my iPhone and iPad – so many that I had to make a SXSW folder. I had no preconceived notions about which of these new apps I was going to use.

There were also two web applications that I ended up using quite a bit: Lanyrd’s SXSW guide, and Plancast. Lanyrd launched last year as kind of a directory for conferences, and I’ve been using it to keep track of my conference schedule for a good number of months. For SXSWi, they created a SXSW specific part of the site that included all the panels, along with useful information like the Twitter handles and bios of the speakers. Although SXSW itself had a web application with the schedule, I found that Lanyrd worked better for the way that I wanted to use the schedule. This is despite the fact that SXSW had an iPhone app while Lanyrd’s app has yet to ship. With Lanyrd covering the sessions, I used Plancast (and along the way Eventbrite) to manage the parties. Plancast had all the parties in their system, including the Alaska direct flight from Seattle to Austin that I was on. Many of the parties were using Eventbrite to limit attendance, and while I had used Eventbrite here and there in the past, this finally got me to actually create an account there and use it. Eventbrite and Plancast integrate in a nice way, and it all worked pretty well for me.

Of all the ballyhooed applications that I downloaded, I really only ended up using two. There were a huge number of group chat/small group broadcast applications competing for attention. The one that I ended up using was GroupMe, mostly because the people I wanted to keep up with were using it. Beyond the simple group chat/broadcast functionality, it has some other nice features like voice conference calling that I didn’t really make use of during SXSW. Oddly enough, I first started using Twitter when I was working with a distributed team, and I always wished that Twitter had some kind of group facility. It’s nice that GroupMe and its competitors exist, but I also can’t help feeling like Twitter missed an opportunity here. Facebook’s acquisition of Beluga suggests as much.

The other application that I ended up using was Hashable. Hashable’s marketing describes it as “A fun and useful way to track your relationships”. I’d describe my usage of it as a way to exchange business cards moderately quickly using Twitter handles. A lot of my Hashable use centered around using my Belkin Mini Surge Protector Dual USB Charger to multiply the power outlets at the back of the ballrooms. I’ve made a lot of friends with that little device. In any case, I used Hashable as a quick way to swap information with my new power strip friends. While I used it, I’m ambivalent about it. I like that it can be keyed off of either email address or Twitter handle – I always used Twitter handle. My official business cards don’t have a space for the handle, which is annoying here in the 21st century. However, the profile that it records is not that detailed, so any business card information that is going to a new contact isn’t that detailed. It seems obvious to me that there ought to be some kind of connection to LinkedIn, but there’s no space for that. So I couldn’t really use Hashable as a replacement for a business card because all the information isn’t there. It’s also more clumsy to take notes about a #justmet on the iPhone keyboard than to write on the back of a card. The difficulty of typing on the iPhone keyboard also makes it time consuming and kind of antisocial to use. In a world where everyone used Hashable, and phones were NFC equipped, you can imagine a more streamlined exchange, but even then, the right app would have to be open on the phone. Long term, that’s an interface issue that phones are going to run into. Selecting the right functionality at the right time is getting to be harder and harder – pages of folders of apps means that everything gets on the screen, but it doesn’t mean that accessing them is fast.

In a similar vein, there were QR codes plastered all over pamphlets, flyers, and posters, but as @larrywright asked me on Twitter, I didn’t see very many people scanning them. Maybe people were scanning all that literature in their rooms after being out till 2am. There’s still an interface problem there.

In addition to all the hot new applications, there were the “old” standbys, Foursquare and Twitter.

I am a purpose driven Foursquare user. I use Foursquare when I want people to know where I am. I’ve never really been into the gamification aspects of Foursquare, but I figured that SXSWi was the place to give that aspect of Foursquare more of a try. Foursquare rolled out a truckload of badges for SXSWi, and sometimes it seemed like you could check into every individual square foot of the Austin Convention Center and surrounding areas. So I did do a lot more checking in, mostly because there were more places to check in, and secondarily because I was trying to rack up some points. Not that the points ever turned into any tangible value for me. But as has been true at other conferences, the combination of checking in on Foursquare and posting those checkins to Twitter did in fact result in some people actually tracking me down and visiting.

If you only allowed me one application, it would still be Twitter. If I wanted to know what was happening, Twitter was the first place I looked. Live commentary on the talks was there. I ended up coordinating several serendipitous meetings with people from Twitter. Twitter clients with push notifications made things both easy and timely. While I’m very unhappy with Twitter’s recent decree on new Twitter clients, the service is still without equal for the things that I use it for.

One word on hardware. There were lots of iPad 2s floating around. I’m not going to do a commentary on that product here. For a conference like SXSWi, the iPad is the machine of choice. After the first day, I locked my laptop in the hotel safe. I would be physically much more worn out if I had hauled that laptop around. The iPad did everything that I needed it to do, even when I forgot to charge it one night.

Interesting Tech

While SXSWi is not a hard core technology conference, I did manage to see some very interesting technology. I’ve already mentioned the TI OMAP5 product line at the ARM party. I took a tour of the exhibit floor with Julie Steele from O’Reilly, and one of the interesting things that we saw was an iPhone app called Neer. Neer is an application that lets you set to-dos based on location. This is sort of an interesting idea, but the more interesting point came out after I asked about Neer’s impact on the phone’s battery life. I had tried an application called Future Checkin, which would monitor your location and check you into places on Foursquare, because I was so bad about remembering to check in. It turned out that this destroyed the battery life on my phone, so I stopped using it. When I asked the Neer folks how they dealt with this, they told me that they use the phone’s accelerometer to detect when the phone is actually moving, and they only ping the GPS when they know you are moving, thus saving a bunch of battery life. This is a clever use of multiple sensors to get the job done, and I suspect that we’re really only at the beginning of seeing how the various sensors in mobile devices will be put to use. It turns out that the people working on Neer are part of a Qualcomm lab that is focused on driving the usage of mobile devices. I’d say they are doing their job.
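To make the accelerometer trick concrete, here is a minimal Python sketch of the general idea: only wake the expensive GPS when a cheap sensor says the phone is moving. This is my own illustration, not Neer’s code. The `Sample` type, the motion threshold, and the minimum polling interval are all assumptions made up for the example.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List

GRAVITY = 9.81          # m/s^2, what an accelerometer reads when the phone is at rest
MOTION_THRESHOLD = 0.6  # hypothetical: deviation from gravity that counts as "moving"


@dataclass
class Sample:
    """One accelerometer reading (hypothetical type for this sketch)."""
    t: float   # timestamp in seconds
    ax: float  # acceleration along each axis, m/s^2
    ay: float
    az: float


def is_moving(sample: Sample, threshold: float = MOTION_THRESHOLD) -> bool:
    """Treat the phone as moving when total acceleration departs from gravity."""
    magnitude = math.sqrt(sample.ax ** 2 + sample.ay ** 2 + sample.az ** 2)
    return abs(magnitude - GRAVITY) > threshold


def gps_poll_times(samples: Iterable[Sample], min_interval: float = 30.0) -> List[float]:
    """Return the timestamps at which the GPS would be queried.

    The GPS is only consulted while the accelerometer reports motion,
    and never more often than once per min_interval seconds.
    """
    polls: List[float] = []
    last_poll = None
    for s in samples:
        if not is_moving(s):
            continue  # phone is still: leave the GPS radio off
        if last_poll is None or s.t - last_poll >= min_interval:
            polls.append(s.t)
            last_poll = s.t
    return polls
```

A phone sitting on a desk generates no GPS queries at all in this scheme, which is where the battery savings would come from. A real implementation would also need to smooth out noisy readings, which this sketch ignores.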

The other thing that Julie and I stumbled upon was 3taps, which is trying to build a Data Commons. The whole issue of data openness, provenance, governance, and so forth is going to be a big issue in the next several years, and I expect to see lots of attempts to figure this stuff out.

The last interesting piece of technology that I learned about comes from Acunu. The Acunu folks have developed a new low-level data store for NoSQL storage engines, particularly engines like Cassandra. The performance gains are quite impressive. The engine will be open source and should be available in a few months.

In conclusion

SXSWi is a huge conference and it took a lot out of me, more than any other conference that I’ve been to. While I definitely got some value out of the conference, I’m not sure that the value I got corresponded to the amount of energy that I had to put in. Some of that is my own fault. If I were coming back to SXSWi, here are some things that I would do:

  • Work harder at being organized about the schedule and setting up meetings with people prior to the conference
  • Skip many of the parties and try to organize get togethers with people outside of the parties
  • Eat reasonably – SXSW has no official lunch or dinner breaks, which makes it too easy to go too long without eating, and that leads to problems.
  • Always sit at the back of the room and make friends over the power outlets

Lanyrd is collecting various types of coverage of the conference whether that is slide decks, writeups, or audio recordings.   

I like the idea of SXSWi, and I like the niche that it occupies, but I think that scale has overtaken the conference and is detracting from the value of it. Long time attendees told me that repeatedly when I asked. I would love to see some alternatives to SXSWi, so that we don’t have to put our eggs all in one basket.

Strata 2011

I spent three days last week at O’Reilly’s Strata Conference. This is the first year of the conference, which is focused on topics around data. The tag line of the conference was “Making Data Work”, but the focus of the content was on “Big Data”.

The state of the data field

Big Data as a term is kind of undefined in an “I’ll know it when I see it” kind of way. As an example, I saw tweets asking how much data one needed to have in order to qualify as having a Big Data problem. Whatever the complete meaning is, if one exists, there is a huge amount of interest in this area. O’Reilly planned for 1200 people, but actual attendance was 1400, and due to the level of interest, there will be another Strata in September 2011, this time in New York. Another term that was used frequently was data science, or more often data scientists, people who have a set of skills that make them well suited to dealing with data problems. These skills include programming, statistics, machine learning, and data visualization, and depending on who you ask, there will be additions or subtractions from that list. Moreover, this skill set is in high demand. There was a very full job board, and many presentations ended with the words “we’re hiring”. And as one might suspect, the venture capitalists are sniffing around. At the venture capital panel, one person said that he believed there was a 10-25 year run in data problems and the surrounding ecosystem.

The Strata community is a multidisciplinary community. There were talks on infrastructure for supporting big data (Hadoop, Cassandra, Esper, custom systems), algorithms for machine learning (although not as many as I would have liked), the business and ethics of possessing large data sets, and all kinds of visualizations. In the executive summit, there were also a number of presentations from traditional business intelligence, analytics, and data warehousing folks. It is very unusual to have all these communities in one place and talking to each other. One side effect of this, especially for a first time conference, is that it is difficult to assess the quality of speakers and talks. There were a number of talks which had good looking abstracts, but did not live up to those aspirations in the actual presentation. I suspect that it is going to take several iterations to identify the best speakers and the right areas – par for a new conference in a multidisciplinary field.

General Observations

I did my graduate work in object databases, which is a mix of systems, databases, and programming languages. I also did a minor in AI, although it was in the days before machine learning really became statistically oriented. I’m looking forward to going a bit deeper into all these areas as I look around in the space.

One theme that appeared in many talks was the importance of good, clean data. In fact, Bob Page from eBay showed a chart comparing 5 different learning algorithms, and it was clear that having a lot of data made up for differences in the algorithms, making the quality and volume of the data more important than the details of the algorithms being used. That’s not to say that algorithms are unimportant, just that high quality data is more important. It seems obvious that having access to good data is really important.

Another theme that appeared in many talks was the combination of algorithms and humans. I remember this being said repeatedly in the panel on predicting the future. I think that there’s a great opportunity in figuring out how to make the algorithm and human collaboration work as pleasantly and efficiently as possible.

There were two talks that at least touched on building data science teams, and on Twitter it seemed that LinkedIn was viewed as having one of the best data science teams in the industry. Not to take anything away from the great job that the LinkedIn folks are doing, or the importance of helping people find good jobs, but I hope that in a few years, we are looking up to data science teams from healthcare, energy, and education.

It amused me to see tweets and have discussions on the power of Python as a tool in this space. With libraries like numpy, scipy, nltk, and scikits.learn, along with an interactive interpreter loop, Python is well suited for data science/big data tasks. It’s interesting to note that tools like R and Incanter have similar properties.

There were two areas that I am particularly interested in, and which I felt were somewhat underrepresented: the issue of doing analysis in low latency / “realtime” scenarios, and the notion of “personal analytics” (analytics around a single person’s data). I hope that we’ll see more on these topics in the future.

The talks

As is the case nowadays, the proceedings from the conference are available online in the form of slide decks, and in some cases video. Material will probably continue to show up over the course of the next week or so. Below are some of the talks I found noteworthy.

Day 1

I spent the tutorial day in the Executive Summit, looking for interesting problems or approaches that companies are taking with their data efforts. There were two talks that stood out to me. The first was Bob Page’s talk Building the Data Driven Organization, which was really about eBay. Bob shared from eBay’s experience over the last 10 years. Probably the most interesting thing he described was an internal social-network-like tool, which allowed people to discover and then bookmark analytics reports from other people.

Marilyn and Terence Craig presented Retail: Lessons Learned from the First Data-Driven Business and Future Directions, which was exactly how it sounded. It’s conventional wisdom among Internet people that retail as we know it is dead. I came away from this talk being impressed by the problems that retail logistics presents, and by how retail’s problems are starting to look like Internet problems. Or is that vice versa?

Day 2

The conference proper started with the usual slew of keynotes. I’ve been to enough O’Reilly conferences to know that some proportion of the keynotes are given in exchange for sponsorships, but some of the keynotes were egregiously commercial. The Microsoft keynote included a promotional video, and the EnterpriseDB keynote on Day 3 was a bald faced sales pitch. I understand that the sponsors want to get value for the money they paid (I helped sponsor several conferences during my time at Sun). The sponsors should look at the Twitter chatter during their keynotes to realize that these advertising keynotes hurt them far more than they help them. Before Strata, I didn’t really know anything about EnterpriseDB except that they had something to do with Postgres. Now all I know is that they wasted a bunch of my time during a keynote spot.

Day 2 was a little bit light on memorable talks. I went to Generating Dynamic Social Networks from Large Scale Unstructured Data which was in the vendor presentation track. Although I didn’t learn much about the actual techniques and technologies that were used, I did at least gain some appreciation for the issues involved. The panel Real World Applications Panel: Machine Learning and Decision Support only had two panelists. Jonathan Seidman and Robert Lancaster from Orbitz described how they use learning for sort optimization, intelligent caching, and personalization/segmentation, and Alasdair Allan from the University of Exeter described the use of learning and multiagent systems to control networks of telescopes at observatories around the world. The telescope control left me with a vaguely SkyNet-ish feeling. Matthew Russell has written a book called Mining the Social Web. I grabbed his code off of GitHub and it looked interesting, so I dropped into his talk Unleashing Twitter Data for Fun and Insight. He’s also written 21 Recipes for Mining Twitter, and the code for that is on GitHub as well.

Day 3

Day 3 produced a reprieve on the keynote front. Despite the aforementioned horrible EnterpriseDB keynote, there were 3 very good talks. LinkedIn’s keynote on Innovating Data Teams was good. They presented some data science on the Strata attendees and described how they recruited and organized their data team. They did launch a product, LinkedIn Skills, but it was done in such a way as to show off the data science relevant aspects of the product.

Scott Yara from EMC did a keynote called Your Data Rules the World. This is how a sponsor keynote should be done. No EMC products were promoted, and Scott did a great job of demonstrating a future filled with data, right down to still and video footage of him being stopped for a traffic violation. The keynote provoked you to really think about where all this is heading, and what some of the big issues are going to be. I know that EMC makes storage and other products. But more than that, I know that they employ Product Management people who have been thinking deeply about a future that is swimming with data.

The final keynote was titled Can Big Data Fix Healthcare?. Carol McCall has been working on data oriented healthcare solutions for quite some time now, and her talk was inspirational and gave me some hope that improvements can happen.

Day 3 was the day of the Where’s the Money in Big Data? panel, where a bunch of venture capitalists talked about how they see the market and where it might be headed. It was also the day of two really good sessions. In Present Tense: The Challenges and Trade-offs in Building a Web-scale Real-time Analytics System, Ben Black described Fast-IP’s journey to build a web-scale real-time analytics system. It was an honest story of attempts and failures as well as the technical lessons that they learned after each attempt. This was the most detailed technical talk I attended, although terms like distributed lower dimensional cuboid and word-aligned bitmap index were tossed around without being covered in detail. It’s worth noting that Fast-IP’s system and Twitter’s Analytics system, Rainbird, are both based, to varying degrees, on Cassandra.
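Since the bitmap index got name-dropped without explanation, here is a small Python sketch of the plain, uncompressed idea, as I understand it. This is not Fast-IP’s design. The `BitmapIndex` class and the example columns are hypothetical. Word-aligned schemes go further by compressing long runs of identical bits in units that line up with machine words, and I’ve left that part out entirely.

```python
from typing import Dict, Hashable, List


class BitmapIndex:
    """One bitmap per distinct value; bit i is set when row i holds that value.

    Python integers serve as arbitrarily long bitmaps. Compression
    (the "word-aligned" part) is deliberately omitted from this sketch.
    """

    def __init__(self) -> None:
        self.bitmaps: Dict[Hashable, int] = {}
        self.row_count = 0

    def add(self, value: Hashable) -> None:
        """Append a row with the given value."""
        self.bitmaps[value] = self.bitmaps.get(value, 0) | (1 << self.row_count)
        self.row_count += 1

    def bitmap(self, value: Hashable) -> int:
        """Bitmap of rows equal to value (0 if the value never occurs)."""
        return self.bitmaps.get(value, 0)

    def rows(self, bitmap: int) -> List[int]:
        """Decode a bitmap back into a list of row numbers."""
        return [i for i in range(self.row_count) if (bitmap >> i) & 1]


# Hypothetical example: index two columns of the same four-row table.
country = BitmapIndex()
status = BitmapIndex()
for c, s in [("us", 200), ("de", 404), ("us", 404), ("us", 200)]:
    country.add(c)
    status.add(s)

# "country == us AND status == 200" becomes a single bitwise AND.
matches = country.rows(country.bitmap("us") & status.bitmap(200))
```

The appeal for analytics is that filters across several columns turn into bitwise operations, which are cheap even over very large numbers of rows.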

I ended up spending an extra night in San Jose so that I could stay for Predicting the Future: Anticipating the World with Data, which was in the last session block of the conference. I think that it was worth it. This was a panel format, but each panelist was well prepared. Recorded Future is building a search engine that uses the past to predict the future. They didn’t give out much of their secret sauce, but they did say that they have built a temporally based index as opposed to a keyword based one. Unfortunately their system is domain specific, with finance and geopolitics being the initial domains. Palantir Technologies is trying to predict terrorist attacks. In the abstract, this means predicting in the face of an adaptive adversary, and in contexts like this, the key is to stop thinking in terms of machine learning and start thinking in terms of game theory. It seems like there’s a pile of interesting stuff in that last statement. Finally, Rion Snow from Twitter took us through a number of academic papers where people have successfully made predictions about box office revenue, the stock market, and the flu, just from analyzing information available via Twitter. I had seen almost all of the papers before, but it was nice to feel that I hadn’t missed any of the important results.

Talks I missed but had twitter buzz

You can’t go to every talk at a conference (nor should you, probably), but here are some talks that I missed, but which had a lot of buzz on Twitter. MAD Skills: A Magnetic, Agile and Deep Approach to Scalable Analytics – the hotness of this talk seemed related more to the DataWrangler tool (for cleansing data) than the MAD library (scalable analytics engine running inside Postgres) itself. Big Data, Lean Startup: Data Science on a Shoestring seemed like it had a lot of good common sense about running a startup, in addition to knowing how to do data science without overkill. Joseph Turian’s New Developments in Large Data Techniques looked like a great talk. His slides are available online, as well as the papers that he referenced. It seemed like the demos were the topic of excitement in Data Journalism: Applied Interfaces, given jointly by folks from ReadWriteWeb, The Guardian, and The New York Times. Rainbird is Twitter’s analytics system, which was described in Real-time Analytics at Twitter. Notable news on that one is that Twitter will be open sourcing Rainbird once the requisite version of Cassandra is released.

Evening activities

There were events both evenings of the show, which made for very long days. On Day 1 there was a showcase of various startup companies, and on Day 2, there was a “science fair”. In all honesty, the experience was pretty much the same both nights. Walk your way around some tables/pedestals, and talk to people who are working on stuff that you might think is cool. The highlights for me were:

Links

Here is a bunch of miscellaneous interesting links from the conference:

Tweet Mining

Finally, no conference on data should be without its own Twitter exhaust. So I’ll leave you with some analysis and visualizations done on the tweets from Strata.

Update: Thanks to bear for a typo correction.

Blogaversary 2011

I’m not that good at remembering my Blogaversary; it’s been two years since I remembered last. You can thank the OmniGroup’s wonderful OmniFocus for reminding me in time this year. Almost everything that I wrote describing my 6 year blogaversary is still true today. In fact, I’m doing more traveling than I was when I wrote that. In the past much of my travel has been for conferences, but last year, I did a lot of traveling for other meetings. I’m expecting that I’ll be at fewer conferences this year than last year. I’ve started using Simon Willison’s excellent Lanyrd to manage my conference tracking. My list for this year will give you some hints about some of the stuff that I am looking at. One thing that is different since I’ve been at Disney is that I am seeing lots of interesting stuff, but much of it is covered by Non Disclosure Agreements. Needless to say, I don’t write about any of that.

Here’s to another year of blogging, tweeting, and whatever else is coming down the path.

2010 in Photography

Once again it is time for a summary of the year in photos. For 2010, I decided that I was going to try and do “The Daily Shoot” every day. On the whole this was a good experience for me. The variety of subjects for the assignments helped to take me out of the zone of things that I would normally shoot, both in terms of subject matter and style. The variety of subject matter has really helped my “situational awareness”. I notice a lot more things in my surroundings, and I’ve noticed that it is easier for me to find subjects for the assignments, particularly when I am out and about. There were a number of assignments that focused on particular styles or techniques in photography. In principle I’ve known how to shoot these things, but because I have my preferred style to shoot, I’ve never actually done so. These assignments were particularly good, because I was forced to take the theory and put it into practice.

Back in April I picked up a Panasonic GF-1, and from then on, I did every assignment with that camera and the 20mm f/1.7 lens. I’ve mostly shot zoom lenses, and I wanted to try shooting only with a prime lens, to get a more intuitive grasp of the 50mm (20mm on a Micro 4/3 camera is close to 50mm on a full frame DSLR) field of view, and to force myself to compose by moving the camera as opposed to zooming all the time.

I did find some drawbacks to the experience. Shooting every day can be arduous at times. There were days when the combination of time commitments and subjects left me casting about for a picture at 9 or 10 in the evening. There were definitely days where I put up a photo that was just barely acceptable in my eyes, which rankled me both on the day, and unconsciously thereafter.

Duncan and I have spent some time talking about the whole experience of the Dailyshoot. I think that it’s the kind of thing that everyone ought to attempt. For 2011, I’ll be keeping an eye on the assignments, but I’m going to be a lot more relaxed about it.   

Here are some of the better photos from the year (the entire set is here). Also mixed in are some dance photos from this year’s dance events.

January

Dailyshoot 52

February

Dailyshoot 102

March

Dailyshoot 116

April

Dailyshoot 160

May

Dailyshoot 179

June

Dailyshoot 215

Bainbridge Ballet Recital 2010

Bainbridge Ballet’s end of year recital

July

Dailyshoot 236

August

Dailyshoot 265

September

Dailyshoot 293

October

Dailyshoot 322

November

Dailyshoot 373

December

Dailyshoot 388

OPG Nutcracker 2010

The Olympic Performance Group’s 2010 Nutcracker.

Google Chrome Update

On Tuesday I attended Google’s Chrome update event in San Francisco. There were three topics on the agenda: Chrome, the Chrome Web Store, and ChromeOS. I’m not going to try to go over all the specifics of each topic. It’s a pointless exercise when Engadget, PC Magazine, etc are also at the event and live blogging/tweeting. I’m just going to give some perspectives that I haven’t seen in the reporting thus far.

Chrome

If you are using a Chrome beta or dev channel build, none of the features announced would be new to you. The only exception is the Crankshaft technology that was added to V8. The claim is that Crankshaft can boost V8 performance by up to 50%, using techniques which sound reminiscent of the HotSpot compiler for Java. Unsurprisingly, the V8 team includes veterans of the HotSpot team. Improving JavaScript performance is good, and in this case it’s even better because V8 is the engine inside Node.js, so in theory Node should get some improvements on long running JavaScript programs on the server. I’m pretty sure that there is some performance headroom left in Crankshaft, so I’d expect to see more improvements in the months ahead.

The Chrome team has the velocity lead in the browser wars. It seems like every time I turn around Chrome is getting better along a number of dimensions. I also have to say that I love the Chrome videos and comic books.

Chrome Web Store

So Chrome has an app store, but the apps are websites. If you accept Google’s stats, there are 120M Chrome users worldwide, many of them outside the US, and all of them are potential customers of the Chrome Web Store, giving it a reach comparable to or beyond existing mobile app stores. The thing that we’ve learned about app stores is that they fill up with junk fast. So while the purpose of the Web Store is to solve the app discovery problem (which I agree is a real problem for normal people), we know that down that path lie dragons.

The other question that I have is will people pay to use apps which are just plain web apps? Developers, especially content developers, are looking for ways to make money from their work, and the Chrome Web Store gives them a channel. The question is, will people pay?

ChromeOS

The idea behind ChromeOS is simple. Browser as operating system. Applications are web applications. Technically there are some interesting ideas.   

The boot loader is in ROM and uses crypto to ensure that only verified images can be booted (the CR-48 has a jailbreak switch to get around this, but real hardware probably won’t). It’s the right thing to do, and Google can do it because they are launching a new platform. Is it a differentiator? Maybe if you are a CIO or a geek, but to the average person this won’t mean much.
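For the curious, the shape of a verified boot check can be sketched in a few lines of Python. This is a heavy simplification and not ChromeOS’s actual scheme. Real verified boot checks public-key signatures against a key held in read-only storage, whereas this sketch substitutes a plain SHA-256 digest comparison so that it runs with only the standard library. The function names and the `developer_switch` flag are made up for illustration.

```python
import hashlib
import hmac


def digest(image: bytes) -> str:
    """SHA-256 digest of a boot image, as a hex string."""
    return hashlib.sha256(image).hexdigest()


def verify_and_boot(image: bytes, trusted_digest: str,
                    developer_switch: bool = False) -> str:
    """Boot only if the image matches the digest baked into read-only storage.

    developer_switch stands in for a physical jailbreak switch:
    when it is on, verification is skipped.
    """
    if developer_switch:
        return "booted (unverified, developer mode)"
    # compare_digest avoids leaking information through comparison timing
    if hmac.compare_digest(digest(image), trusted_digest):
        return "booted (verified)"
    return "refused: image does not match trusted digest"
```

The important property is that the trusted value lives somewhere the running system cannot rewrite, so a tampered image cannot vouch for itself.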

Synchronization is built in. You can unbox a ChromeOS device, enter your Google login credentials and have everything synced up with your Google stuff. Of course, if you haven’t drunk the Google ecosystem Kool-Aid, then this won’t help you very much. It’s still interesting because it shows what a totally internet dependent device might be like. Whatever one might say, Android isn’t that, iOS isn’t that, and Windows, OS X, and Linux aren’t that. When I worked at Sun, I had access to Sun Rays, but the Sun Ray experience was nowhere near as good as what I saw yesterday.

There’s also some pragmatism there. Google is working with Citrix on an HTML5 version of Citrix’s receiver, which would allow access to Enterprise Applications. There are already HTML VNC clients and so forth. The Google presenter said that they have had an unexpectedly large amount of interest from CIOs. Actually, that’s what led to the Citrix partnership.

Google is piloting ChromeOS on an actual device, dubbed CR-48 (Chromium isotope 48). CR-48 is not for sale, and it’s not final production hardware. It’s a beta testing platform for ChromeOS. Apparently Inventec (ah, brings back my Newton days) has made 60,000 devices. Some of those are in use by Googlers, and Google is going to make them available to qualified early adopters via a pilot program. The most interesting parts of the specs are 8 hours of battery life, 8 days of standby time, and a built in Verizon 3G modem with a basic amount of data and a buy-what-you-need plan for overages.

Hindsight

At the end of the presentation, Google CEO Eric Schmidt came out to make some remarks. That alone is interesting, because getting Schmidt there signals that this is a serious effort. I was more interested in the substance of his remarks. Schmidt acknowledged that in many ways, ChromeOS is not a new idea, harking back (at least) to the days of the Sun/Oracle Network Computer in the late 90s. In computing, timing matters a huge amount. The Network Computer idea has been around for a while, Schmidt claimed, but it’s only now that we have all of the technology pieces needed to bring it to fruition, the last of the pieces being a version of the web platform that is powerful enough to be a decent application platform. It’s going to be interesting to see whether all the pieces truly have arrived, or whether we need a few more technology cycles.

Web 2.0 Summit

This year I was able to go to the Web 2.0 Summit. Web 2.0 is billed as an executive conference, and it lives up to its billing. There is much more focus on business than technology, even though the web is technology through and through.

The World

The web is a global place, but for Americans, at least this American, it is easy to forget that. Wim Elfrink from Cisco did a great job discussing how internet technologies are changing society all over the world. I also enjoyed John Battelle’s interview with Baidu CEO, Robin Li. There is a lot of interesting stuff happening outside the United States, and it is only a matter of time before some of that starts working its way into American internet culture.

Inspiration

Mary Meeker is famous for being an information firehose, and she did not disappoint. Her 15 minute session contained more information than many of the longer talks and interviews. I wish that she had been given double the time, or an interview after her talk. Fortunately her talk and slides are available online.

Schuyler Erle did an Ignite presentation titled How Crowdsourcing Changed Disaster Relief Forever, which was about how OpenStreetMap was able to help with the Haiti disaster relief effort, and provide a level of help and service heretofore unseen. It’s good to see technology making a real difference in the world.

Vinod Khosla gave a very inspiring talk about innovation. The core idea was that you have to ignore what conventional wisdom says is impossible, improbable or unlikely. Market research studies and focus groups won’t lead to breakthrough innovations.

The session which resonated the most with me was the Point of Control session on Education, with Davis Guggenheim (director of Waiting for Superman), Ted Mitchell, and Diana Rhoten. Long time readers will know that our kids have been home schooled (although as they are getting older, we are transitioning them into more conventional settings), so perhaps it’s no surprise that the topic would engage me strongly. One of my biggest reasons for homeschooling was that almost all modern education, whether public or private, is based on industrialized schooling – preparing kids to live in a lock-step command and control world. Homeschooling allows kids to learn what they need to learn at their own pace, whether that pace is “fast” or “slow”. One of the panelists, I think it was Ted Mitchell, described their goal as “distributed customized direct to student personalized learning”. That’s something that all students could use.

Just Business

Ron Conway’s Crystal Ball session was a chance to see some new companies, and was a refreshing change from some of the very large companies that dominated the Summit. The problem with the large public companies is that their CEOs have had tons of media training and are very good at keeping on message, which makes them pretty boring.

The Point of Control session on Finance got pretty lively. I thought that it was valuable to get two different VC perspectives on the market today, and on particular companies. One of the best sections was the part where Fred Wilson took John Doerr to task over Google’s recent record on innovation.

I’m a Facebook user but I’m not a rabid Facebook fan. Julie and I saw “The Social Network” when it came out in theaters, so I was curious to see Mark Zuckerberg speak in person. He did much better than I expected him to. While there wasn’t much in the way of new content, at least Zuckerberg demonstrated that he can do an interview the way that a big company CEO should.

Postscript

I found the content at Web 2.0 to be pretty uneven. Since this was my first year, I don’t have a lot to compare it to. I will note that the last time I went to a high-end O’Reilly conference (ETech, circa 2006), I had a similar problem with content not quite matching expectations. For Web 2.0 this year, there turned out to be a simple predictor for the quality of a session. If John Heilemann was doing an interview, more likely than not it would be a good one.

NewTeeVee 2010

I’ve been doing a lot of traveling in November, including some conferences. Here’s some information from NewTeeVee.

I dropped into NewTeeVee because I’m doing a lot with video and television these days, but I’m not really from that world. NewTeeVee is targeted at that space where the Internet and television overlap. As a result the conference feels kind of weird when you are used to going to conferences filled with open source developers and programmers of all kinds. There was very little talk about technology, at least in a form that would be recognizable to internet people. Quite a number of the presentations involved celebrities of one form or another, which is unsurprising, and I found it interesting to hear their takes on the future of television, and of entertainment as a whole. One of the most interesting sessions in this vein was with the showrunners of Lost and Heroes, two shows which have been very successful at combining broadcast television with the internet. Despite their pioneering efforts and their success, it was discouraging to hear them talking about how hard it would be to replicate the new media and old media combinations of their shows.

The closest that we got to technology in a form that I recognized was a talk by Samsung, which was really about their efforts to evangelize developers to write applications for Samsung connected TVs. Samsung has its own application platform, and I found myself wondering whether or not they would be able to get enough developer attention. I’d much prefer to see TVs adopt Open Web based technologies for their application platforms.

I came away from the conference feeling like a visitor to a country not my own, with a better sense of the culture, but still feeling very “other”.   

Strange Loop 2010

Last week I was in Saint Louis for Strange Loop 2010. This was the second year of Strange Loop, which is a by hackers for hackers conference. I’m used to this sort of conference when it’s organized by a single open source community – I’d put ApacheCon, PyCon, and CouchCamp in this category. At the same time, Strange Loop’s content was very diverse, and had some very high quality speakers. It’s sort of like a cross between ApacheCon and OSCON. One difference is that there isn’t a community that’s putting on Strange Loop, and the fun community feel of ApacheCon or PyCon is missing.   

One of the reasons that I was interested in attending Strange Loop was Hilary Mason’s talk on data science / machine learning. This is an area that I am starting to delve into, and I did study a little machine learning right around the time that it was starting to shift away from traditional AI and more towards the statistical approach that characterizes it now. Hilary is the chief scientist at bit.ly, and as it turns out, a Brown alumna as well. Her talk was a good introduction to the current state of machine learning for people who didn’t have any background. She talked about some of the kinds of questions that they’ve been able to answer at bit.ly using machine learning techniques. Justin Bozonier used Twitter to ask Hilary if she would be willing to sit down with interested people and do some data hacking, so I skipped the session block (which was painful because I missed Nathan Marz’s session on Cascalog, which was getting rave reviews). We ended up doing some simple stuff around the tweets about #strangeloop. Justin has a good summary of what happened, complete with code, and Hilary posted the resulting visualization on her blog. It was definitely useful to sit and work together as a group and get snippets of insight into how Hilary was approaching the problem.

Another area that I am looking at is changes in web application architecture due to the changing role of Javascript on both the client and the server. I went to Kyle Simpson’s talk on Strange UI architecture, as well as Ryan Dahl’s talk on node.js. Kyle has built BikechainJS, another wrapper around V8, like Node.js. There’s a lot of interest around server side javascript – the next step is to think about how to repartition the responsibilities of web applications in a world where clients are much more capable, and where some code could run on either the client or the server.

Guy Steele gave a great talk, and the number of people who can give such a talk is decreasing by the day. As a prelude to talking about abstractions for parallel programming, Guy walked us through an IBM 1130 program that was written on a single punch card. He had to reverse engineer the source code from the card, which was complicated by the fact that he used self modifying code as well as some clever value punning in order to get the most out of the machine. The thrust of his comments on parallel programming was that the accumulator style of programming which pervades imperative programs is bad when it comes to exploiting parallelism. Instead, he emphasized being able to find algebraic properties such as associativity or commutativity which would allow parallelism to be exploited via the map/reduce style of programming pioneered decades ago in the functional programming community, and popularized by systems like Hadoop. Guy was proposing that mapreduce be the paradigm for regular programming problems, not just “big data” problems. For me, the most interesting comment that Guy made was about Haskell. He said that if he had known what he knows now when he started on Fortress, he would have started with Haskell and pushed it 1/10 of the way to FORTRAN instead of starting with FORTRAN and pushing it 9/10 of the way to Haskell.
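
To make the accumulator versus map/reduce contrast concrete, here is a small Python sketch of my own (not code from Guy's talk). The function names accumulate_sum and chunked_reduce are hypothetical, and the thread pool is only there to show that the chunks are independent; in CPython it will not actually speed up arithmetic. The first function threads a single running total through every step, so each step has to wait for the previous one. The second relies only on the operator being associative, which lets each chunk be reduced separately and the partial results combined afterwards.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def accumulate_sum(values):
    """Accumulator style: every step depends on the previous total."""
    total = 0
    for v in values:
        total = total + v
    return total


def chunked_reduce(op, identity, values, chunks=4):
    """Reduce values with an associative op by combining independent chunks.

    Chunk order is preserved, so op needs to be associative but not
    commutative. identity must be a neutral element for op.
    """
    values = list(values)
    size = max(1, -(-len(values) // chunks))  # ceiling division
    parts = [values[i:i + size] for i in range(0, len(values), size)]
    # Each chunk is reduced on its own; no chunk needs another's result.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda part: reduce(op, part, identity), parts))
    # Combine the partial results in their original order.
    return reduce(op, partials, identity)


if __name__ == "__main__":
    print(accumulate_sum(range(1, 101)))
    print(chunked_reduce(add, 0, range(1, 101)))
    # String concatenation is associative but not commutative.
    print(chunked_reduce(add, "", list("strange loop"), chunks=3))
```

The string example is the interesting one: because concatenation is associative, regrouping the work into chunks still gives back the original string, even though swapping the order of the pieces would not.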

I’m not generally a fan of panel sessions, because the vast majority of them don’t really live up to their promise. Ted Neward did a really good job of moderating the panel on “Future of Programming Languages”. At the end of the panel, Ted asked the panelists which languages they thought people should be learning in order to get new ideas. The list included Io (Bruce Tate), Rebol (Douglas Crockford), Forth and Factor (Alex Payne), Scheme and Assembler (Josh Bloch), and Clojure (Guy Steele). Guy’s comments on Clojure rippled across Twitter, mutating in the process, and causing some griping amongst Scala adherents. The panel appears to have done its job in encouraging controversy.

Also in the Clojure vein, I attended Brian Marick’s talk “Outside in TDD in Clojure”. Marick has written midje, a testing framework that is more amenable to the bottom-up style of programming that is facilitated by REPLs. It’s an interesting approach relying on functions that provide a simple way to specify placeholders for functions that haven’t been completed yet. This also serves as a leverage point for the Emacs support that he has developed.

Doug Crockford delivered the closing keynote. I’ve heard him speak before, mostly on Javascript. His talk wasn’t about Javascript at all, but it was very engaging and entertaining. If you have the chance to see him speak in that kind of setting, you should definitely do it.

A few words on logistics. The conference was spread out across three locations. I feared the worst when I heard this, but it turned out to be fine – OSCON in San Jose was much more inconvenient. The bigger logistical issue was WiFi. None of the three venues was really prepared for the internet requirements of the Strange Loop attendees. WiFi problems are not a surprise at a conference, but the higher quality conferences do distinguish themselves on their WiFi support.

All in all, I think that Strange Loop was definitely worthwhile. The computing world is becoming “multicultural”, and it’s good to see a conference that recognizes that fact.