Monthly Archive for October, 2010

Strange Loop 2010

Last week I was in Saint Louis for Strange Loop 2010. This was the second year of Strange Loop, which is a by hackers for hackers conference. I’m used to this sort of conference when it’s organized by a single open source community – I’d put ApacheCon, PyCon, and CouchCamp in this category. At the same time, Strange Loop’s content was very diverse, and had some very high quality speakers. It’s sort of like a cross between ApacheCon and OSCON. One difference is that there isn’t a community that’s putting on Strange Loop, and the fun community feel of ApacheCon or PyCon is missing.   

One of the reasons that I was interested in attending Strange Loop was Hilary Mason’s talk on data science / machine learning. This is an area that I am starting to delve into, and I did study a little machine learning right around the time that it was starting to shift away from traditional AI and more towards the statistical approach that characterizes it now. Hilary is the chief scientist at bit.ly, and as it turns out, a Brown alumnae as well. Her talk was a good introduction to the current state of machine learning for people who didn’t have any background. She talked about some of the kinds of questions that they’ve been able to answer at bit.ly using machine learning techniques. Justin Bozonier used Twitter to ask Hilary if she would be wiling to sit down with interested people and do some data hacking, so I skipped the session block (which was painful because I missed Nathan Marz’s session on Cascalog, which was getting rave reviews). We ended up doing some simple stuff around the tweets about #strangeloop. Justin has a good summary of what happened, complete with code, and Hilary posted the resulting visualization on her blog. It was definitely useful to sit and work together as a group and get snippets of insight into how Hilary was approaching the problem.

Another area that I am looking at is changes in web application architecture due to the changing role of Javascript on both the client and the server. I went to Kyle Simpson’s talk on Strange UI architecture, as well as Ryan Dahl’s talk on node.js. Kyle has built BikechainJS, another wrapper around V8, like Node.js. There’s a lot of interest around server side javascript – the next step is to think about how to repartition the responsibilities of web applications in a world where clients are much more capable, and where some code could run on either the client or the server.

Guy Steele gave a great talk, and the number of people who can give such talk is decreasing by the day. As a prelude to talking about abstractions for parallel programming, Guy walked us through an IBM 1130 program that was written on a single punch card. He had to reverse engineer the source code from the card, which was complicated by the fact that he used self modifying code as well as some clever value punning in order to get the most out of the machine. The thrust of his comments on parallel programming was that the accumulator style of programming which pervades imperative programs is bad when it comes to exploiting parallelism. Instead, he emphasized being able to find algebraic properties such as associativity or commutativity which would allow parallelism to be exploited via the map/reduce style of programming pioneered decades ago in the functional programming community, and popularized by systems like Hadoop. Guy was proposing that mapreduce be the paradigm for regular programming problems, not just “big data” problems. For me, the most interesting comment that Guy made was about Haskell. He said that if he know what he knew now when he had started on Fortress, he would have started with Haskell and pushed it 1/10 of the way to FORTRAN instead of starting with FORTRAN and pushing it 9/10 of the way to Haskell.

I’m not generally a fan of panel sessions, because the vast majority of them don’t really live up to their promise. Ted Neward did a really good job of moderating the panel on “Future of Programming Languages”. At the end of the panel, Ted asked the panelists which languages they though people should be learning in order to get new ideas. The list included Io (Bruce Tate), Rebol (Douglas Crockford), Forth and Factor (Alex Payne), Scheme and Assembler (Josh Bloch), and Clojure (Guy Steele). Guy’s comments on Clojure rippled across Twitter, mutating in the process, and causing some griping amongst Scala adherents. The panel appears to have done it’s job in encouraging controversy.

Also in the Clojure vein, I attended Brian Marick’s talk “Outside in TDD in Clojure“. Marick has written midje, a testing framework that is more amenable to the a bottom up style of programming that is facilitated by REPL’s. It’s an interesting approach relying on functions that provide a simple way to specify placeholders for functions that haven’t been completed yet. This also serves as a leverage point for the Emacs support that he has developed.

Doug Crockford delivered the closing keynote. I’ve heard him speak before, mostly on Javascript. His talk wasn’t about Javascript at all, but it was very engaging and entertaining. If you have the chance to see him speak in that kind of setting, you should definitely do it.

A few words on logistics. The conference was spread out across three locations. I feared the worst when I heard this, but it turned out to be fine – OSCON in San Jose was much more inconvenient. The bigger logistical issue was WiFi. None of the three venues was really prepared for the internet requirements of the Strange Loop attendees. WiFi problems are not a surprise at a conference, but the higher quality conferences do distinguish themselves on their WiFi support.

All in all, I think that Strange Loop was definitely worthwhile. The computing world is becoming “multicultural”, and it’s good to see a conference that recognizes that fact.

Haskell Workshop and CUFP 2010

It has been many years since I attended an ACM conference, and even more years since I attended the Lisp and Functional Programming Conference, which has evolved into the International Conference on Functional Programming (ICFP). ICFP was in the United States this year, and I’ve wanted to drop in for quite some time. There are many ideas pioneered by the functional programming community, and as much as possible I like to go to the original sources of ideas. ICFP is a long conference with many attached events, and it turns out that the best use of my time was to drop in for the Haskell workshop at the tail end of the conference, and the Commercial Users of Functional Programming (CUFP) conference.

Haskell Workshop

I’ve been around long enough to remember when Haskell first came out, and despite my stint as a database programming languages grad student, I’ve never had the chance to really give Haskell the attention that I feel it deserves. 20 year since its appearance Haskell is still barely on the radar. At the same time, I heard some very interesting talks at the workshop. Things like the Hoopl library for implementing dataflow optimzations in compilers, and the Orc DSL for concurrent scripting. The Haskell systems hackers have made great progress and doing some great work. Bryan O’ Sullivan described his work on improving GHC’s ability to handle lots of long lived open network connections. Given the recent burst of interest in event based programming models, such as Node.js, this is an interesting result. Simon Marlow presented a redesign of the Evaluation Strategies mechanism that GHC uses to control parallelism. Many of the talks that I heard have ideas that are applicable to problems that exist in modern systems. I just wish that I could see a path the involved using Haskell itself to solve those problems instead of the ideas migrating into another language/system.

Surgecon

Unbeknownst to me, my friend Theo Schlossnagle ran Surge, a conference on scalability, in Baltimore, and it overlapped the parts of ICFP that I attended. Surge seems to have flown pretty low under the radar. Google doesn’t return many relevant results for it, and the best information (other than talking to Surge attendees) I’ve been able to find on Surge is on Lanyrd. Theo told me that he was counting on this year’s attendees to be his PR for next year. I didn’t attend, but based on the tweets and dinner conversations, it sounds like it was great. I had dinner/beers with some Apache folks who were in town for Surge, as well as some Surge attendees like Bryan Cantrill. The “systems guys” gave me a good ribbing about being at a conference for “irrelevant languages”, and I had a really good conversation with Bryan about Node.js, cloud computing, and the Oracle acquisition (ok, that part wasn’t so good). Node.js is on a lot of people’s minds at the moment, and it was good to hear Bryan’s perspective on it. It was an interesting sidebar to the immersion in functional programming. I do think that in the medium term there are some interesting connections between Node and FP, but that ‘s probably an entire post of its own.

CUFP

There was a lot of F# related content at CUFP, and I think that Microsoft deserves kudos for the work that they are doing. I think it’s pretty clear that shipping F# in the box with Visual Studio 2010 is not a huge money maker for Microsoft at this point, and I’m impressed with their willingness to take a long term view of the future of programming. Unfortunately I’m not a Windows ecosystem person, so as attractive as F# and Visual Studio are, I doubt that I’ll be playing with this anytime soon.

Marius Eriksen‘s talk on Scala at Twitter was interesting because of the way that he described the conceptualization of Rockdove operations as folds, taking clear advantage of the benefits offered by a functional style. He also had some thought provoking comments about giving applications access to the behavior of the garbage collector. There are some interesting possibilities if you start to give developers control of the behavior of various parts of the runtime system.

Michael Fogus talked about his company’s experience using Scala. His talk was pretty entertaining, and there were some interesting comparisons between Scala features that they thought would be useful and Scala features that actually turned out to be useful. My only issue with his talk was the size of the sample, which isn’t something that he could do anything about. This was also true of the talk by the Intel compiler folks.

I’ve seen a number of talks on the Microsoft Reactive Extensions, mostly with respect to JavaScript. I continue to believe that RxJS could be a great help to Javascript programmers, particularly as things like Node.js take hold. Matt Podwysocki’s Node.js file server example shows how.

Warren Harris from Metaweb talked about his use of monads, arrows, and OCaml to build a more efficient query processor for Freebase’s MQL query language. This was a really interesting talk, because query optimization was the topic of my graduate school research, and at the time the connections between query languages and functional programming were a relatively new topic.

Final thoughts

It doesn’t take much to fan the flames of functional love in me. There are lots of smart people working on beautiful and interesting solutions. I wish that I could see a better path for those ideas to make it into mainstream practice.