
Why I finally believe in hashtags

I’ve been using Twitter for a while now, but I’ve never really used hashtags much. I’ve never been much for doing the stuff it takes to get a highly promoted blog or twitter stream. I figure that if my content is worthwhile, that should be enough. At PyCon I found the compelling hashtag use case for me.

There were a lot of people using hashtags in their PyCon tweets, and Jacob Kaplan-Moss showed me Twitterfall, which made it easy to keep track of uses of the tag. That made it *much* easier to find the virtual twitter stream for PyCon. This was also true at Lang.NET, the DSL DevCon, and the MySQL conference. This week(end) I’ll be using hashtags to track the progress of JSConf.   From now on I’ll always use hashtags when I’m at a conference or event.

One reason that it’s taken me so long to get the hashtag thing is that I use Twitter primarily via rich desktop (or iPhone) clients. Until recently I wasn’t using clients that could do searching. I had tried TweetDeck, and it never stayed with me. When Nambu came along, I was pretty enthusiastic because it was a native TweetDeck. Unfortunately, I had crashing problems with it at Lang.NET (since fixed, I think), and I put it aside when I realized that Syrinx 2.0 had searches. While Syrinx doesn’t save searches across restarts, its memory use is tolerable enough to leave it running all the time, so it’s not a big problem, and I am hopeful that MRR will include saved searches in a future version. Commenters: yes, I tried Tweetie for Mac, and I didn’t like it. I love Tweetie for iPhone, though. Go figure.

DSLDevCon 2009

I’ve been having trouble coming up with a good summary of the DSL (Domain Specific Language) DevCon. That’s partly because there was a lot of information to absorb between Lang.NET and the DevCon. Even more so, I’m finding it hard to distill what I saw, what I didn’t see, what I wanted to see, and what I think we need to see next. That’s odd because I’ve accepted the notion that DSL’s should be a part of the programmer’s toolbox ever since I sat through the “metalinguistic abstraction” section of Sussman and Abelson’s MIT class in 1984.

Reporting

I’m going to call out four talks that really stood out for me. There were more than just these four, but it was either these four or all of them, and all of them is too much work.

  • Guillaume LaForge’s talk on Groovy DSL’s was important because he not only showed how to build DSL’s using Groovy, but he’s actually working with real customers, like Mutual of Omaha, who are using those DSL’s in production.   

  • I was happy to hear Markus Voelter’s talk Textual DSL’s and Code Generation with Eclipse Tools because a lot of the noise that I’ve heard on the DSL front has been coming from the Ruby and .NET side of the world. One thing that got my attention at the DevCon was the importance of tooling, so it was good to see that there are some tooling efforts in the Java space. It’s too bad that no one from JetBrains was there to present on MPS.

  • Brad Cross and Ted Neward did a talk entitled “Functional vs. Dynamic DSLs: The Smackdown”. I came away from this talk wanting more, and not in a good way. Brad and Ted really needed about 2 hours in order to give all the relevant background a chance to settle in. During the talk they presented a set of things which differentiated the functional programming and dynamic language styles of creating “Internal” (I really dislike the Internal/External terms) DSLs. Unfortunately, there wasn’t enough time to really dig in and explore the meat of what they said. I think that a deep addressing of the points that they made would be a very important contribution to the DSL topic. Maybe we’ll get to see a series of blog posts, developerWorks articles, or even an academic paper of some kind.

  • I view Intentional Software as one of those grand computer science projects. Having worked on Chandler, I have an appreciation for the perils of large, grand efforts. This is the first time that I had a chance to see a presentation by anyone from Intentional Software, and it is just as well that it was a demo of their just shipped product. I took note when the Intentional Software project was started back in 2002, but I’ve not heard a lot about their progress since then. What we saw was a demonstration of a production version of their “Domain Workbench” which is a system for allowing domain experts and programmers to work together to build a system which domain experts can then use to write software. Instead of writing programs, the programmers write the generator which takes the domain language (which can be visual) and then generates code. The system represents the domain information in a way that allows multiple, editable, “projections” (views). The demonstrations that we saw included an actuarial workbench, complete with mathematical notation, and an electronics workbench, expressed as circuit diagrams. If you are interested, your best bet is to watch the video when the videos are posted.

    I am pretty impressed with what I saw, but there are lots of questions. How many domains can this actually work for? How hard is it to write generators? What’s the business model for domain workbenches? It seems pretty clear to me that for the domains and organizations where this can work, this approach is going to have a pretty sizable impact. Perhaps not this year, but within the next 5 years. I have to hand it to the Intentional Software guys. Their presentation was pretty low key, and they are going out of their way to not hype their stuff. They plan to work with a small number of customers to gradually prove out their approach. In an area which is highly susceptible to hype, it was refreshing to see people trying to keep expectations to a reasonable level.

The DevCon (and Lang.NET) were also my chance to meet two people whom I’ve followed for some time from afar: Ted Neward and Larry O’Brien. Ted is well known and I’ve been following his blog for some time. He’s local to the Puget Sound area, and it’s probably just bad timing that we never met before this week. Larry O’Brien has been a commenter on my blog, as well as a responder on Twitter. I’ve appreciated his blog as well as the columns that he’s written over the years. It was great fun to run to the back of the room after each talk and see what the Twitter cabal (which included Larry) had to say about the material we had just seen.

Analysis

I think that DSL’s are inevitable. It’s remarkable to me how prescient Abelson and Sussman were when they defined three categories of abstraction: control abstraction, data abstraction, and metalinguistic abstraction. If you look at some of the recent frenzies in languages, you’ll see that we are mostly improving the ability of various languages to perform various kinds of abstraction. These concepts are not new, but they are appearing in languages which are approachable by today’s practitioners. Object oriented programming? Data abstraction. Closures? Control Abstraction. Pattern Matching/Algebraic datatypes? Data and control abstraction. DSLs and the capabilities needed to enable them? Metalinguistic abstraction.
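To make the three categories a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the names `Money`, `twice`, `OPS`, and `evaluate` are made up for this example and do not come from Abelson and Sussman or from any talk at the DevCon.

```python
from dataclasses import dataclass
import operator


# Data abstraction: callers work with Money through its interface
# rather than poking at a bare integer.
@dataclass(frozen=True)
class Money:
    cents: int

    def plus(self, other: "Money") -> "Money":
        return Money(self.cents + other.cents)


# Control abstraction: a closure captures "apply this function twice"
# as a reusable pattern, independent of what the function does.
def twice(fn):
    def run(value):
        return fn(fn(value))
    return run


# Metalinguistic abstraction: a tiny prefix-expression language
# with its own evaluator, written in the host language.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Evaluate numbers or nested tuples such as ("+", 1, ("*", 2, 3))."""
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left), evaluate(right))
```

The last part is the one that matters for DSL’s: once `evaluate` exists, the tuples are programs in a new (very small) language, and the host language is just the implementation vehicle.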

Language as an abstraction is very powerful, and requires support from the underlying language as well as the tools. These two topics (as well as specific examples of domain specific languages) were the focus of the DevCon. The audience makeup appeared to be mostly language and compiler geeks. There were a few people (mostly consultants as far as I could tell) who write business applications, but this group was pretty small. This is important because most of the DSL’s presented were aimed at very computer science kinds of domains. If DSL’s are to have a broader impact, then it would be great to see more business people at events like this.

One thing which was not addressed at all was the process end of this. In order to build DSL’s for non computer domains, there has to be a collaboration between developers and domain experts. The Intentional Software guys recognize this via some “groupware” to facilitate this process. However, tooling alone is not enough to bridge this gap. I hope that we’ll be hearing reports on the process of collaboration between developers and domain experts as more and more people build DSL’s.

This is an interesting space, from a technical point of view. There is lots of cool language design and compiler stuff, some of my favorite topics.   On the business end, it seems like there are some decent sized opportunities here, and that tooling is going to play a very large role — language support for DSL’s will be important, but may be overshadowed by the importance of good tools.

Update: the videos are now available

Best PyCon Evar

I probably should have chosen a different title for this post, because at the rate things are going for PyCon, I’ll just have to use the same title again for the next few years. This year, PyCon happened during the same week as ApacheCon EU (the 10th anniversary of the ASF), and EclipseCon. I have a slight bit of regret that I wasn’t at ApacheCon for the 10 year anniversary, but I’m planning to be at the 10th anniversary celebration at ApacheCon US in Oakland, in November. That roughly corresponds to the time when I first got involved with Apache and open source, so it will be pretty meaningful. Beyond that, it was hands down for PyCon, my favorite conference. Even if the PyCon organizers hadn’t invited me to speak on a topic of my choosing, there are just so many things to love about PyCon.

The Talks

PyCon 2009

Despite a very active and fun hallway track, I did go to a number of talks.   

I went to Adam Christian and Mikeal Rogers’ talk on Windmill mostly for moral support. We worked together at OSAF, and I like Windmill, and it’s really good to see Windmill picking up steam in the Python and other communities. If you are looking for a web testing framework, particularly one that is strong at AJAX applications, you owe it to yourself to look at Windmill.

There were a few tools talks that I attended. I use IPython, so I was curious to see how Reinteract: a better way to interact with Python, would improve on IPython. I like the Visicalc/TkSolver like worksheet that allows you to change values in a Python interpreter history and have values propagate forward. I’d love to see all these REPL tools come together in an integrated way. We might finally get back to the functionality of the Lisp Machine REPLs someday. I also attended How AlterWay releases web applications using “zc.buildout” since Jacob Kaplan-Moss warned me that the zc.buildout documentation was sorely lacking. Even that talk wasn’t enough to get me going, but the sprints produced some great new documentation for buildout. I’m looking forward to digging into that.

Some talks dealt directly with topics that are relevant to work, particularly now that the dynamic languages folks at Sun are now a part of the Cloud Computing division. These talks included:

  • Twisted AMQP and Thrift: Bridging Messaging and RPC for building scalable distributed applications – Twisted bridges to AMQP and Thrift.

  • Concurrency and Distributed Computing with Python Today – Jesse Noller did a great job surveying the various offerings available in Python today. There’s a lot of stuff there, but I think that there’s still quite some way to go yet. That’s not picking on Python, that’s just my general view of this space.

  • Drop ACID and think about data – Bob Ippolito did a really nice survey of the various non-relational/non-transactional data storage options out there. Bob actually tried many of these, so the survey is useful for weeding out systems that aren’t really ready for prime time. A must view if you haven’t been paying attention to this space.

  • Pinax: a platform for rapidly developing websites – I’ve been following Pinax via Twitter for some time now, and James Tauber and I were involved at the beginning of the Apache XML project almost 10 years ago. Despite all that, we’d never actually met in person until this week. James had a tough job with his talk. Pinax is very new, so he could either talk to the people who didn’t know what Pinax is, or he could talk to the people who wanted to know where things were. James knew this was going to be a problem and said so in his talk. And it was, at least for me. Fortunately, I managed to sit down with James at the sprints and get my questions answered. Zed Shaw recently wrote a (very positive) review of Django. That’s interesting since Zed was a hard core Rails guy. It’s also interesting because he called out Django’s emphasis on modularity and Pinax as an example of that modularity. My questions about Pinax were mostly about what (if anything) Pinax has done to build on the modularity provided by Django. At the moment, the various Pinax components cooperate mostly via conventions. Things are still early in Pinax, and I wasn’t surprised to hear this. James did say that some conventions were close to getting codified/documented/supported by the framework, which is what I am really interested in. In some ways, the data representation and modularity problems are similar to the kinds of problems that we were trying to solve for Chandler. Pinax is in the social application domain and Chandler is in the PIM domain, so while there are some similarities there are also differences. I’ll definitely be sticking my nose a bit deeper into the Pinax checkout that’s been sitting on my hard disk.

The most entertaining talk that I attended was Ian Bicking’s Topics of Interest. Ian took the invitation to speak on something of interest quite literally which created an air of mystery. In the end, Ian prepared some slides (some of which were quite thoughtful and introspective), used an instance of the new Google Moderator to queue up some audience questions, and created an IRC backchannel which he kept on the screen during his talk. The result has to be watched (and the video is already up) to be understood. It was quite hilarious, with the exception of some unpleasant commentary after someone in IRC asked “why aren’t there more women at PyCon”. The resulting IRC conversation only serves as an explanation for why. Many people felt this way, and discussion of this spilled out into Twitter, and I hope that perhaps we can change things for the better.

I gave my talk, Challenges and Opportunities for Python, and got a pretty good reception. I had a number of hallway and other conversations with people based on the content. I think that I was successful in giving people a perspective on the dynamic language world as a whole, on Python’s place in it, and some things that we might be able to do in order to grow. You can watch the video and make your own assessment, and decide if there are actions worth taking.

This year the conference is benefitting from a great new website (built in Django), and you’ll find the slides and video for each talk on the links. The video team is doing a great job of cranking out the video, so all of them should be up soon, or you can go to pycon.blip.tv to see them all together. Here are some talks that I am going to be checking out:

The Lightning Talks

PyCon 2009

I put the lightning talks in a separate category from the talks because they are a phenomenon at PyCon. This year there were two lightning talk sessions each day, one at the beginning of the day and one at the end. That’s 6 sessions of lightning talks! Jacob Kaplan-Moss only allowed signups for the next session, and it was truly first come, first served (without last year’s arrangement with the sponsors). There were a number of really good lightning talks. There really isn’t a good record of what got presented except perhaps on Twitter. A search for #pycon should get most of it.

Update: the lightning talks were also video’ed and will be posted on pycon.blip.tv

The Sprints

The PyCon sprints remain a phenomenon. While I don’t think quite as many people stayed this year as last year, there were still a lot of people — enough to fill the basement conference rooms at the Crowne Plaza hotel, and enough to need one of the ballrooms to serve lunch and dinner in. Once again, I hung at the Jython sprint, and wandered in and out of the Django and Pinax sprints. During the two days of sprints that I stayed for, I observed the folks working on ctypes for Jython actually crashing the JVM. SQLAlchemy started to really run on Jython and so did Twisted. Four days of hacking with the core developers of a project generally tends to produce results. So does spending time to bring new people from the community into your project.

I reported a bug in Django as I tried to get buildout set up to do Django on MySQL. I’m talking about Python and MySQL at the MySQL conference in a few weeks, so I was working on my example code. Turns out that MySQLdb doesn’t build cleanly on the Mac. The trunk version almost builds cleanly, so I used that, but that version chokes on something in Django. Before I discovered that, I had done some gymnastics involving a git-svn clone of MySQLdb, a push of that to github, and a git recipe for buildout. I never quite got the git/buildout part working and I decided that it was overkill, and that’s when I finally discovered that the trunk didn’t work with Django.

Of course, the sprints are also a time to catch up with/meet people in the community. It’s a time when there are friendly rivalries, joking, and alcohol. One of the momentous occasions during 2008 was that Django got a pony.

The exuberant Django people decided to bring the pony to PyCon…

PyCon 2009

Guido decided that he wanted the pony…

PyCon 2009

This all made for great fun and entertainment, which then spilled over into the sprints as a three way Python Core/Django/Pinax feud, which led to things like this and this. This is hard core fun, people.

Overall Conference Commentary

The organizers estimated the attendance for this year’s PyCon at around 900 people. That’s a slight decline from last year, but the economic situation is much much worse than it was last year. I think that a 10% decline is a huge success, and a testament to the growth of interest in Python and its surrounding ecosystem.

From an organizational point of view, PyCon is continuing its tradition of being a mostly volunteer organized conference. In this respect it is a tour de force, at least in the space of open source conferences. PyCon is using a production company to assist, just as ApacheCon is, but the on site footprint of that company is much smaller than the on site footprint of the company for ApacheCon. Moreover, the number of volunteers helping with things is just enormous. Session chairs, runners to escort speakers from the green room to their sessions, a web site builder, lightning talk coordinator, open spaces coordinator, greeters at the conference desk, photographers, and I’m sure there are a bunch more people whose roles I didn’t even get to hear about. Absent a fancy lighted stage display for keynotes, production value wise, I feel that PyCon is operating at the same level of quality as any of the O’Reilly conferences. The program was excellent – tutorials, keynotes, invited talks, regular talks, open spaces, and lightning talks.

PyCon 2009

With PyCon, the Python community is getting way more mileage out of its face to face time than any other open source community. The combination of lightning talks, open space, and sprints creates a powerful feedback loop within the conference proper, which then extends into the sprint days. This dynamic has evolved over the years as PyCon attendees have come to understand the role of these vehicles. Here’s how it works:

PyCon 2009

The lightning talks allow anyone, regardless of stature, influence, or reputation to get in front of the entire conference. People now recognize that some of the most interesting, surprising, and entertaining moments of PyCon take place during the lightning talks. It’s a measure of the influence of the lightning talks that even the 8AM morning lightning talk sessions were well attended. At other conferences the morning sessions are reserved for keynote presentations by paying sponsors. I usually skip these because the content value is low. But I definitely got up to make sure that I hit those 8AM lightning talks. If you’ve gotten in front of the community with a lightning talk, you can extend your reach by scheduling an open space session.

PyCon 2009

Above is a shot of the open space board for Saturday. Note that the time slots go from 10AM to 10PM. There were a few prank type sessions, but for the most part, that board really is full all day long with 10 rooms available during each one hour time slot. Consider that there were 4 ballrooms for the talks, and that the talks went from 10:20AM till 5PM. There was way more air time in the open space sessions, and people certainly made use of it. This is why PyCon is a working conference – it’s not only about transfer of information, real work gets done there.

PyCon 2009

The only tricky thing with open space is that it would be great to have electronic access to the contents of the open space board during the conference. That would help make the open spaces a first class citizen in the minds of attendees. This is an interesting problem, because part of the value of the open space is the physical board, so turning it all electronic wouldn’t be a good idea. I wonder if Kaliya Hamlin has any experience with this sort of thing.

Used well, the open space sessions are great for organizing your little (or big) slice of the world wide Python community. They are also great as a prelude to a sprint once the conference has finished. And as I’ve already mentioned, the sprints are a great time to reinforce a project’s community as well as move it forward.

PyCon 2009

All of this notwithstanding, the PyCon organizers are not resting on their laurels. They keep on looking for ways to improve the conference. The buckets you see above are an example of this. Instead of paper or electronic surveys, attendees were asked to vote for talks by taking a red chip and tossing it into a bucket on their way out the door. Green for good, yellow for ‘meh’, and red for bad. This is way less effort than the surveys, and I observed a decent number of people putting in their chips. Doug Napoleone has more on the origins of this system, as well as a pointer to the raw data on the results.

Twitter is now in the mainstream at PyCon. Guido mentioned Twitter during his keynote, and used it to ask questions during the conference. One of James Tauber’s first slides told people which hashtag to use when covering his talk. I’d guess that I got at least 20 new followers each day of PyCon, and I think that I might even be trained to use hashtags now. #pycon was in the top 10 on Twitter during the days of the conference. The takeaway is that if you are going to a conference and you are not on Twitter, you are missing out. The corollary is that if you are a conference, and you aren’t making use of Twitter, you need to pay attention. Ian Skerrett has an interesting post on how they used Twitter during EclipseCon. One thing that was missing was a video display of the search for #pycon. I know from talking with Doug Napoleone that he has some wonderful ideas for taking all the social networking stuff to the next level. I’m really looking forward to seeing that next year.

Photography

I’ve been to a lot of conferences over the last few years, always with a camera in hand. At each conference I shoot less and less. There are now lots of people swarming around with cameras, and I feel a bit burned out with shots of people speaking from the front of a room, rows of white male attendees listening to a talk, and the rest of the usual conference shots. The same thing happened with me and liveblogging conferences. Also, it’s hard to do the hallway track and do decent photography. Last year, the PyCon organizers asked me to take some official pictures, which I was happy to do. This year they didn’t (which was fine by me), but I had planned to bring the camera anyway, because PyCon is PyCon, and photographing there is one way that I try to give back to the community.

It turns out that the organizers were way more organized about photography this year. They actually had someone to coordinate the photography for the conference. Steven Wilcox had a last minute emergency and couldn’t make it. I found out about all of this just a half an hour before I left for the airport. Steven had planned to do headshots of Pythonistas, and was planning to get studio lighting equipment and so on. All of that was now up in the air. Since I had done a bunch of headshots of ASF people at ApacheCon, I tossed some Strobist lighting gear into my suitcase, just in case. By the time I landed in O’Hare, Erich Heine had stepped up to replace Steven, and I joined the “Python Paparazzi” or “pyparazzi”, along with Erich, Jason Samsa, Dan Ryder, and Stéphane Jolicoeur-Fidelia.

PyCon 2009

Since PyCon was in Chicago last year, I was familiar with the Crowne Plaza Hotel, which is a decent hotel, but nothing to write home about. This year the conference proper moved to the Hyatt Regency down the street. PyCon has a tradition of trying to keep costs low in order to keep the conference accessible to the community, so I was expecting something like the Crowne Plaza. I couldn’t have been more wrong. The Hyatt is a photographer’s paradise. There are lots of interesting colors, textures, and some areas with beautiful overhead natural light. If you were going to photograph a wedding, you would die for settings like these for the bridal portraits.

PyCon 2009

This tiled inset in the wall turned into the backdrop for James Tauber’s headshot.

James Tauber

It doesn’t have to be strobe(ist) to be a good headshot!

PyCon 2009

This orange lit panel behind a bench seat turned into the backdrop for Jim Baker’s headshot.

Jim Baker

In addition to the pyparazzi, there were plenty of other cameras floating around the conference. Andy Smith decided to do a photographic project called the “Beards of Python”. When this set was announced on Flickr, it caused some Twitter buzzing amongst some of the female attendees of the conference. One thing about photographers is that we (or at least I) are always willing to take some interesting photos. So when the Twitter buzzing reached me, I offered to photograph any interested Geek Girls. James Duncan Davidson and I have discussed the value of trying to photograph female attendees at technology conferences. Since our photographs are often used for advertising, this can be a way of helping women feel more comfortable about attending: knowing that there will be other women there can be a help. So not only did I get to shoot more pictures of interesting people, I hope that in some small way this will contribute to making PyCon friendlier to women.

Catherine Devlin

This is Catherine Devlin, a contributor to sqlpython. Go read her post “Five minutes at PyCon change everything” for an actual example of the lightning talk/open space/sprint scenario that I described above.

The entire set of Pythonista headshots, as well as the rest of my conference coverage are up on Flickr. Who knows what we’ll come up with for next year in Atlanta…

Travel

Regular readers will know that a trip to PyCon traditionally involves some kind of travel mishap. This year was pretty minor compared to previous years.   United lost my luggage for the flight from Seattle to O’Hare, despite the fact that I arrived 2.5 hours early, and checked in at the “Premier” checkin line. I got my bag the next day, so it wasn’t really that bad. Maybe next year will be the PyCon with no travel glitches.


Refactoring in the Functional Programming world

I’m an Emacs guy. I was first exposed to Emacs back in 1984 on a VAX running BSD. This was prior to GNU Emacs, so the Emacs that I saw was James Gosling’s Emacs. At the time, I was working on a compiler for a functional programming language called SUPER, which was evaluated using combinator graph reduction.

For many years, and across many languages, including Scheme, C, C++, Perl, TCL, and Java, Emacs was my tool of choice. My hands had the muscle memory for the keystrokes, and over those years I accumulated a file full of Emacs-Lisp customizations for Emacs (by this time, mostly GNU Emacs). When Eclipse started to support refactoring I started using Eclipse as my primary tool for editing Java programs. Refactoring is an example of the kind of high leverage features that I want in my programming tool set.

A few days ago I found some gems buried in a thread on the Scala mailing list. Dave Griffith has been accumulating a list of refactorings for Scala. Here’s his complete list:

Curry Method (split a parameter list, and the arg lists of all callers).

Uncurry Method (merge split parameter list, including merging the arg lists of callers. If method is called with partial args, either complain or automatically create a helper method which represents the partial application, and replace partial calls with it.)

Extract Trait (including searching for other classes which can have the same trait extracted. Tricky with super calls, but not impossible)

Split Trait (splits trait into two traits (putting in self-types if needed), change all extending classes to extend both traits)

Extract Extractor (select a pattern, automatically create an extractor)

Extract Closure (similar to extract method, but creating a function object)

Introduce by-name parameter

Extract type definition (obvious)

Merge nested for-comprehensions into single for-comprehension (and converse)

Split guard from for-comprehension into nested if (and converse)

Convert for-comprehension into map/filter/flatmap chain (and converse)

Wrap parameter as Option (converting null checks, etc.)

Convert isInstanceOf/asInstanceOf pair to match

Replace case clause with if body to guarded case clause(s)

I was particularly interested in those refactorings related to functional/higher-order programming and pattern matching. Between the surge of interest in Scala, F# and Haskell, it looks like there’s room for some more work in refactoring.
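For readers who don’t write Scala, here is a rough Python analogue of three of the refactorings above: Curry Method, converting a comprehension into a map/filter chain, and merging nested loops into a single comprehension. This is a sketch under my own assumptions, not Dave’s definitions: the function names are hypothetical, and Python’s list comprehensions are only an approximation of Scala’s for-comprehensions. The point is that each “after” form behaves exactly like its “before” form, which is what makes it a refactoring.

```python
# Curry Method, before: a single parameter list.
def scale_uncurried(factor, value):
    return factor * value


# Curry Method, after: the parameter list is split, so a caller can
# supply the factor now and the value later.
def scale_curried(factor):
    def apply(value):
        return factor * value
    return apply


# Convert comprehension into map/filter chain, before:
def even_squares_comprehension(numbers):
    return [n * n for n in numbers if n % 2 == 0]


# ...and after:
def even_squares_chain(numbers):
    evens = filter(lambda n: n % 2 == 0, numbers)
    return list(map(lambda n: n * n, evens))


# Merge nested comprehensions, before: explicit nested loops.
def pairs_nested(xs, ys):
    result = []
    for x in xs:
        for y in ys:
            result.append((x, y))
    return result


# ...and after: one comprehension with two generators.
def pairs_merged(xs, ys):
    return [(x, y) for x in xs for y in ys]
```

A refactoring tool has to do the harder half of the job as well: rewriting every caller of `scale_uncurried(f, v)` into `scale_curried(f)(v)`, which is exactly the part that is tedious and error prone by hand.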

The First Annual JVM Language Summit

You know that a conference is good when you go home with a list of stuff that you never heard of but now need to go follow up on. The JVM Language Summit was exceptional in this regard. Sun provided a location and a few of the speakers, but most of the speakers at the Summit were not Sun employees, although there were a few Sun alumni amongst the speaker ranks. The topics that were discussed went all the way from type theory (including the usual greekified type proofs), typical language design stuff, VM design, all the way down to discussions of how high allocation rates can cause hot data to get flushed out of caches on the bare metal. Slides for all the talks are available on the wiki for the conference, and some of the talks will have video at either InfoQ or YouTube. Here are some of my favorites from the three days.

Clojure

I’d been aware of Clojure prior to the summit and had looked at the page on Clojure’s use of persistent data structures, so I thought that I had some idea of what was interesting about Clojure. I was wrong. Rich Hickey’s 30 minute presentation on Clojure had a really large amount of information per unit time. By the end of the time I was really interested in Clojure, and I was able to find out a bit more about it by going to an open space session and by being at the same table as Rich during dinner one night. As an old Lisp guy, my usual reaction to Lisps on the JVM or CLR is why? They don’t typically fit in that well with the host VM, and there are great implementations of Common Lisp that can compile to very efficient machine code. I was looking forward to Arc, but that has turned out to be very disappointing. Clojure has taken a very practical approach to the Lisp parts of the language. It fits in very nicely with the JVM, is able to call Java code easily and has the potential to achieve very good performance on the JVM. Also, Rich has made a number of design decisions which improve the syntax of Clojure (he showed a short program in both Python and Clojure, and they occupy the same amount of vertical space and have roughly the same visual density). He’s also generalized many operations that would have worked on lists to work on sequences, which really means any Java sequence type. Like many Lisps, you can supply data type hints, and the compiler will use those to make the program more efficient. There is a nice library of collection operations, which look comparable to (or better than) Python or Ruby’s facilities for collection types. There are some other really interesting data structures in the libraries, like bit partitioned hash tries and zippers.

Beyond the Lispish stuff in Clojure, there are several interesting features in Clojure related to the problem of concurrency. In Clojure things are immutable by default, which is a huge benefit – a benefit shared by functional languages, and quasi functional languages like Erlang. Beyond that, Clojure supports persistent data structures as a way of managing state in a concurrency friendly manner. The idea is that “updating” a persistent data structure yields a new version while leaving the old version intact, so both the version before and the version after the update remain available. This means that updates don’t impede readers of the old version, and are not blocked by readers of the old version.
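
To make the “both versions survive” idea concrete, here is a minimal sketch in Python of a persistent singly linked list. This is not how Clojure implements its collections (those are built on tries); the names Cell, cons, and to_list are mine and purely illustrative. The point is only that an “update” builds a new version that shares structure with the old one.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """An immutable list cell; None stands for the empty list."""
    head: object
    tail: Optional["Cell"]


def cons(value, lst):
    """Return a new list with value in front; lst itself is untouched."""
    return Cell(value, lst)


def to_list(lst):
    """Copy a persistent list into a plain Python list for display."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


old = cons(2, cons(3, None))
new = cons(1, old)  # the "update": old is still valid and unchanged
```

After this runs, old still reads as 2, 3 while new reads as 1, 2, 3, and new.tail is the very same object as old. A reader holding old never notices the update, which is why no locking is needed between readers and the writer.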

Lastly, Clojure provides an interesting mechanism for utilizing Software Transactional Memory. Normally STM systems make all accesses to memory transactional. This makes the STM a bottleneck, and makes it much more likely that the performance of the STM system will be the limiting factor in a concurrent system. Clojure requires you to make uses of the STM explicit via its Ref structure. This yields a potentially much more controlled usage of the STM, which could help prevent the STM from being a bottleneck.
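
As a rough illustration of why explicit references help, here is a sketch in Python of a single reference cell whose updates are optimistic and version-checked. It is not Clojure’s STM (which coordinates transactions across many Refs); the Ref class and its deref and alter methods below are hypothetical stand-ins. Only values wrapped in a Ref pay the coordination cost, and everything else is ordinary data.

```python
import threading


class Ref:
    """A hypothetical mutable cell that may only change through alter()."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()

    def deref(self):
        """Read the current value."""
        return self._value

    def alter(self, fn):
        """Optimistically apply fn; retry if another writer committed first."""
        while True:
            with self._lock:
                seen_version, seen_value = self._version, self._value
            new_value = fn(seen_value)  # computed outside the lock
            with self._lock:
                if self._version == seen_version:
                    self._value = new_value
                    self._version += 1
                    return new_value
            # Someone else committed in between: loop and try again.


counter = Ref(0)


def bump_many(times):
    for _ in range(times):
        counter.alter(lambda n: n + 1)


workers = [threading.Thread(target=bump_many, args=(1000,)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Every increment is eventually committed, so the counter ends at 4000 no matter how the threads interleave. The function passed to alter may run more than once, which is why it should be free of side effects.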

My original impression of Clojure was that it was still in the very early stages, but it seems to be a bit further along than that. I was surprised by the size of the community, and by other parts of the ecosystem, like the tool support. There are several Emacs modes, integration with SLIME, and even a NetBeans plugin for Clojure.

I will definitely be giving Clojure a closer look, and I am not alone. There was a lot of energy in the room during and after Rich’s talk, and there was a burst of Twitter traffic during the talk. It’s pretty interesting to see the number of language geeks on Twitter.

DaVinci Machine

If you’ve been following John Rose’s blog and Charlie Nutter’s recent writings on invokedynamic, you wouldn’t be very surprised by the content of John’s presentation on the DaVinci Machine Project. This is a highly important piece of work for non-Java languages on the JVM, so it was good to hear John tell a more complete version of what he’s been up to. It was also my chance to meet John in person. We somehow missed each other at JavaOne, so it was good to put a face to the name, and have some in person contact. John and Brian Goetz did a great job of organizing the summit, and John was always trying to find out what kind of features would be useful to JVM implementors. JSR-292 can’t happen soon enough.

Fortress

David Chase talked about the work that folks at Sun Labs have been doing on Fortress. I never really paid much attention to Fortress, since they are really aiming at the scientific, high performance computing space, and that’s kind of outside of my interests. The Fortress guys are doing some interesting explorations as far as concurrency is concerned. In fact, David referred to Fortress as “infested with parallelism”. My todo item from the Fortress talk has to do with the work-stealing model that they have for concurrency. Apparently this work is based on a data structure known as an ABP queue, so I’ll be tracking down the paper on that one.
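
For anyone else who hasn’t met work stealing, here is a sketch of the general shape of a work-stealing deque in Python. It is an assumption-laden toy: there is no synchronization at all, whereas the whole point of the real ABP structure is that it is non-blocking, and the class and method names are mine. It only shows which end each party works from.

```python
from collections import deque


class WorkStealingDeque:
    """Toy, unsynchronized model of a per-worker task deque."""

    def __init__(self):
        self._items = deque()

    def push(self, task):
        """Owner adds new work at the bottom."""
        self._items.append(task)

    def pop(self):
        """Owner takes the newest task from the bottom, or None if empty."""
        return self._items.pop() if self._items else None

    def steal(self):
        """An idle worker takes the oldest task from the top, or None."""
        return self._items.popleft() if self._items else None


mine = WorkStealingDeque()
for task in ("a", "b", "c"):
    mine.push(task)

stolen = mine.steal()  # a thief gets the oldest task
local = mine.pop()     # the owner keeps working on the newest one
```

The owner and the thieves touch opposite ends, so they only contend when the deque is nearly empty. Here stolen is "a" and local is "c", leaving "b" behind.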

JRuby

I’ve heard Charlie Nutter talk about JRuby several times, and have talked with him a little about JRuby. Even so, his talk on JRuby was really interesting, because he was able to go full out for an hour to a very sophisticated audience. I know from talking to some of the Jython guys, that there were a few aha moments for them, even though they’ve been to the same talks that I’ve been to.

Dynalang MOP

Attila Szegedi described his proposal for a MOP (metaobject protocol) for dynamic languages. Once you start hosting a bunch of languages on the JVM (or any VM), then people start to ask if they can call code written in language A (say Clojure) from language B (say Python). The tough part is that the code in language A may have compiled to Java bytecodes in a way that doesn’t really resemble Java code, and you can end up in a situation where B can call A but does it by grabbing things which are really artifacts of A’s implementation. Of course, A’s implementors will continue to improve A, and in the process of doing so, might change the details. You can see what the problem is going to be. Attila’s MOP would go a long way towards helping here. I hope that people will give it a serious look.

Gradual Typing

Jython committer Jim Baker has been after me about the work that Jeremy Siek (University of Colorado at Boulder) has been doing on adding types to Python. His system is called gradual typing and allows a programmer to selectively add type annotations to a program. It’s a cool idea in principle, and I hope that it will end up being cool when it finally gets implemented all the way. I have to admit that the first time that I saw an annotated program, I had a violent reaction. There were a ton of angle brackets due to the type annotations. Jeremy and his students are working on ways to reduce the amount of notation that is needed. I hope that they’ll be successful; in Python at least, it’s going to be key to whether people will adopt it or not.
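
To give a feel for “selectively add type annotations”, here is a small sketch using Python 3 function annotations. This is not Jeremy’s notation or his system; the checked decorator is a hypothetical helper of mine that checks, at call time, only the parameters that carry an annotation and leaves the rest fully dynamic.

```python
import functools
import inspect


def checked(fn):
    """Check annotated parameters at call time; unannotated ones stay dynamic."""
    sig = inspect.signature(fn)

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # no annotation: anything goes
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return fn(*args, **kwargs)

    return wrapper


@checked
def scale(amount: int, factor):
    """amount is typed; factor is deliberately left untyped."""
    return amount * factor
```

With this in place, scale(3, 2) and scale(3, "ab") both go through, because factor is unannotated, while scale("3", 2) raises a TypeError. The sketch assumes every annotation is a plain class, since it relies on isinstance.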

Fundamentalist FP

I’ve been an admirer of Erik Meijer’s work for some time, so I was glad to be able to hear him speak in person. There was another talk on LINQ, so Erik didn’t talk about that. Instead he talked about what he called “Fundamentalist Functional Programming”, which is really just the functional programming that the old school functional programming people have always talked about. I think that Erik is concerned about the amount of lifting of functional programming ideas and idioms, without a full understanding of the essence of functional programming. His presentation style is very entertaining. The major thrust of his argument was that for the past 50 years of computing, we have been abstracting, but abstracting over the wrong things. He asserted that the thing that we really need to do is to abstract over evaluation order. Given the coming many/multicore world, this is understandable, but I don’t think that I agree that all the other lessons that we’ve learned about abstraction are invalid. He provided the simplest explanation of monads that I have ever heard or read, as well as showing how to handle things like object creation and process creation monadically. In the end, though, his talk reduced to the essentials of lazy pure functional programming.

Bytecodes for fast JVM’s

Cliff Click asked that JVM language implementors send him an implementation of a particular program written in their language. Cliff then ran those programs (in their respective languages) on HotSpot and on the Azul JVM. His talk was a report of his findings as to what was keeping various languages from getting the best results on the JVM. He said that he wasn’t trying to compare the merits of one language versus another, but more to give the implementors insight into what was up with their code. I found this talk to be tremendously interesting because Cliff really knows the guts of HotSpot and because he was able to be very specific about what was causing problems for the various languages.

Parrot

I’ve known Allison Randal for several years now, mostly via her organizing of the FLOSSFoundations meetings that happen every year at OSCON. In all that time, we’ve never really sat down and talked about her work on Parrot, and it’s been several years since I heard a talk on Parrot. I give John and Brian a lot of credit for inviting Allison to come and talk about Parrot. The architecture of Parrot is very unlike either the JVM or the CLR. They started with very different assumptions and goals, which unsurprisingly led to a different design. As far as I can tell, Parrot is looking reasonable on the performance front, will be able to use the C libraries of Python, Ruby, PHP, etc, without much hassle, and will have a good story for interoperability between hosted languages. Control flow is modeled using continuations, which means that continuations are really cheap to create. Allison also talked about a different method of doing call site caching – Parrot does the caching in the Parrot class object, not in little caches strewn all over the call sites. This makes it easy to invalidate the cache when the class hierarchy changes, for example. I’m still trying to digest all of what I heard, as well as the conversation that several of us had with Allison after her talk.
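
Here is how I picture the class-level cache, as a sketch in Python. None of this is Parrot code; ClassObject, find_method, and add_method are hypothetical names, and I am guessing at the invalidation policy. The point is that when lookups are cached on the class, a change to the hierarchy has exactly one place (per class) to clear.

```python
class ClassObject:
    """Hypothetical class object that owns its own method-lookup cache."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.methods = {}
        self.subclasses = []
        self._cache = {}
        if parent is not None:
            parent.subclasses.append(self)

    def find_method(self, selector):
        """Look up a method, consulting this class's cache first."""
        if selector in self._cache:
            return self._cache[selector]
        cls = self
        while cls is not None:
            if selector in cls.methods:
                self._cache[selector] = cls.methods[selector]
                return cls.methods[selector]
            cls = cls.parent
        raise AttributeError(selector)

    def add_method(self, selector, fn):
        """Define or redefine a method, then drop any stale cached lookups."""
        self.methods[selector] = fn
        self._invalidate()

    def _invalidate(self):
        """Clear this class's cache and those of all of its descendants."""
        self._cache.clear()
        for sub in self.subclasses:
            sub._invalidate()


animal = ClassObject("Animal")
dog = ClassObject("Dog", parent=animal)
animal.add_method("speak", lambda: "...")
first = dog.find_method("speak")()   # walks up to Animal, then caches on Dog
dog.add_method("speak", lambda: "Woof")
second = dog.find_method("speak")()  # cache was cleared, so the override wins
```

With per-call-site caches, redefining speak would mean finding and flushing every call site that had cached the old answer. Here the class clears its own cache and tells its subclasses to do the same.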

The Parrot team has been lying low and working away on Parrot, and they are definitely making progress. Allison showed some very preliminary benchmarks of the incomplete Python, Ruby, and PHP implementations on top of Parrot versus the C based versions. She told me afterwards that the project has reached the point where they are working to time based milestones, and that they are hoping to do a 1.0 release early in 2009. Chalk up another to-do.

Random Thoughts

There aren’t any pictures to go with this report because I was not motivated to take any. There were several people snapping away quite frequently during the conference, and I didn’t want to add the slaps from the D3 to the cacophony and the flash light show.

It seems clear to me that many folks share some of the same problems, and I hope that one result of the summit will be that people will start to work together when it makes sense. I know that the Jython and JRuby folks are working in that direction, and it seems likely to me that there will be some collaboration around the dynalang MOP as well. There was a lot of good energy in the room: people were very respectful and curious about other people’s work.

I think that the only regret that I had was that this was the first annual JVM Language Summit. Imagine where we’d be if this had been the fifth…

Update: finished the sentences about persistent data structures in Clojure

News sweep

I’ve managed to go the entire month of August without a post, due to a combination of travel, family activities, and vacations. So here’s a sweep of some of the things that I would have covered during that time.

1.0’s

The Chandler Project – Chandler has gone 1.0, so if you were put off by the version number, you can take it out for a spin. There are some good posts on the Chandler blog that describe how people are using it.

Django – Just today, the Django project had its 1.0 release. This is pretty important because there were a lot of changes in the subversion trunk that weren’t in the packaged builds. That’s all been done away with now. I expect that this will lead to even more Python webapps.

Tools

DTrace – DTrace is 5 years old today, and Bryan Cantrill has a good war story from that time. It’s amazing to me that something as good as DTrace can be around for 5 years, and still be relatively unknown. If you are on Solaris, OpenSolaris, or Mac OS X, go check it out.

Ubiquity – Ubiquity is like Quicksilver integrated into Firefox. It’s emphasizing the natural language aspects of that kind of interface. There’s also pretty good documentation on how to build additional commands, which is really important. There are extensions for Quicksilver, but there aren’t a lot of them. There are already a lot of third party Ubiquity commands. I really wish that Ubiquity could talk to other applications besides Firefox, but there are pretty nasty security problems down that path. Some of the commands are very Google oriented, like the mail and calendar, which makes it less useful for people like me who are still using desktop applications. In any event, I think that this is worth watching carefully. One unintended side effect might be additional pressure for page/application authors to embed machine-readable content (yes, that’s you, microformats, at least in part) into more pages. We’ll see.

Chrome – There’s a lot of buzz about Google’s Chrome browser. Since it doesn’t run on the Mac, I don’t have much to say. I’m not about to install Windows or fire up VMWare just to run a browser. One day the Mac port will be done, and then I’ll have a look. I am encouraged that the development team is doing a real Mac native experience.

Dynamic Language Runtimes

It’s been exciting to watch the progress in JavaScript runtime engines over the last few weeks. First there was Mozilla’s TraceMonkey, a tracing based JIT which delivered some very impressive speedups, despite the fact that it still cannot deal with recursion. As part of Google Chrome, a team led by StrongTalk/HotSpot lead Lars Bak has done a JavaScript JIT called V8, which is also turning out some very impressive numbers. And of course, the SquirrelFish engine for WebKit was turning in pretty good numbers a few months back. This is great progress for JavaScript, though it’s less so for the web because of the variety of deployed browsers. It’s exciting to watch the various JavaScript runtimes leapfrogging each other. It gives me the sense that JavaScript is really making some serious moves on the performance front. Of course, none of these folks are comparing their execution times to C or C++. I’d like to see those comparisons as well. It’s also great that all three of these engines are open-source, so that implementors of other languages can evaluate the internals of these VM’s. I’d love to see this kind of leapfrogging in the Python and Ruby communities.

Cameras

I’m not as interested in the camera body arms race as I once was. The Canon 50D is an upgrade of the 40D, but I’m not really sure that more pixels is better. The telltale feature on the camera is the autofocus system, which hasn’t been given much of an upgrade. That signals to me that the 5DMkII will not be the all out upgrade that many are hoping for, but what do I know? The Nikon D90 sounds cool if you want to shoot video. I have enough problems with still pictures.

I am interested in the Nikon P6000 point and shoot, but I am seriously annoyed by the NRW proprietary RAW file format for the camera. Everything about the camera seems awesome, especially the ability to do off camera flash, both iTTL and manual. The RAW thing is going to be the determiner for me. I won’t buy one unless there is Lightroom/Adobe Camera Raw support for the camera. OS X native support wouldn’t be bad either. As a new Nikon owner, I am unimpressed by the NRW decision.

Travel

September is a heavy travel month for me. I will be in Birmingham, UK for PyCon UK, from Sept 12-14, and I’ll be at the JVM Language Summit from Sept 24-26. As always, stop by and say hello if you will be at one of these events.

OSCON 2008

Another OSCON has come and gone, and as usual, I am exhausted in the aftermath. I’ve developed a love-hate relationship with OSCON over the years. The diversity of the OSCON community is one of the huge pluses of the conference. I got involved in open source via Apache, and OSCON was where I really started to get more of a sense of the open source community as a whole. That’s led to friendships with people doing all sorts of open source stuff, which makes the conference a natural place to reconnect with many of those folks. Which leads to the primary downside of OSCON, which is that there is just no way to keep up with, never mind see, all the people that you’d like to see. Combine that with the sheer scale of the event, and you have a recipe for burnout. This year is no exception, which is why this post is delayed by a few days.

Languages

It’s fitting to start a review of OSCON with programming languages, since OSCON began as a Perl conference. There are still lots of Perl hackers running around, and by the distribution of the program (the Python track was 1 day shorter than the Perl, PHP, and Ruby tracks), it seems that Perl is not going anywhere anytime soon. I think that we are going to need to drum up some more Python talks for OSCON next year. Then again, with PyCon topping 1000 people this year, maybe all the Python folks are going there. It certainly is cheaper than going to OSCON. Despite all of this, I saw lots of people that I knew from the Python community, as well as plenty of people who had affixed a yellow Python ribbon to their badge. The ribbons are a nice way of helping people find their tribe at a big show like OSCON – a lower tech version of what the Pathable folks are doing.

I spent a lot of time nosing around various concurrency oriented sessions. I attended Steven Parkes’ tutorial on Actors, which was pretty well attended. Steven has implemented a version of Actors as a set of Ruby and Python libraries. During the tutorial I was able to meet Debasish Ghosh, who has a great blog and Twitterstream on high-level languages, and concurrency topics in general. I also took in a BOF on Actors, which had some really interesting conversation. There were a lot of Erlang folks in the room for that one, which made the discussion pretty interesting.

Databases

There was lots of non-traditional database stuff happening at OSCON this year. I am one of the mentors for the CouchDB project at Apache, and I was finally able to meet my first CouchDB committer, Jan Lehnardt, at the show. Jan gave a nice high level overview talk on CouchDB, which was well attended, and I was interested to see Brian Aker of MySQL/Drizzle in the audience and among the throng of questioners after the talk.

I also went to a talk on Prophet, which is a peer to peer database that is being done by some of the folks that brought us SVK. I’m not sure that I quite recovered from my initial reaction to that revelation, but Jan was sitting next to me during the entire talk, and was saying something about stealing some ideas from the Prophet guys. In open source we call that standing on the shoulders of giants, or something like that.

“Memes”

The XMPP folks had a three day summit during the conference, which I gather was well attended. There was a decent amount of XMPP buzz floating around in the hallways, so I expect the blogosphere to be full of XMPPness during the next week or so. I’ve done a bunch of blogging on XMPP in the past, and while things have improved, they haven’t improved to the point where XMPP is taking over the world. Things like Twitter are definitely helping, but there is still a long way to go before XMPP achieves world domination. But we can hope. And at least XMPP makes a great advertisement for Erlang.

Along with XMPP, we had the microblogging meme. I made heavy use of Twitter throughout the week, and it definitely played a useful part in making connections with people. Well, except for the times when it was down. I was able to spend a little time with Leah Culver, the founder of Pownce, which has the virtue of being written in Python, and of having a very nice API for dealing with the service. It’s interesting to get additional perspectives on a problem, and since I had already talked some with the Twitter guys, it definitely helped to hear Leah describe Pownce’s take on the problem(s) and solutions. O’Reilly was not to be outdone, and did some very active boostering for identi.ca. I’ve got very mixed feelings on identi.ca. On the one hand, I should love identi.ca, because it’s open source. On the other hand, it’s written in PHP, which means I won’t be touching the code, and more importantly, my network is not there. Actually, it was kind of annoying to have to explain to lots of identi.ca zealots that it’s the network that’s the value, not the software, or ironically, the quality of the service. Still, if another microblogging service can convince my network to move, and remain up, and even deliver some new functionality, I would definitely switch. I think I could probably write another post about “microblogging”, but I’ll refrain for now.

Theo Schlossnagle gave an amazing presentation called “Full-stack introspection crash course”, which is code for “let me show you some amazing stuff that you’ll only be able to do with DTrace”. This was a brilliant choice of title on Theo’s part, because it didn’t scare away all of those people whose preconceptions about DTrace or Sun would prevent them from coming to such a talk. Instead, Theo played to a very full room, and I would say that about one-third of the audience actually uttered the phrase “Oh My God” out loud at some point during the presentation. This was certainly true for the two gentlemen sitting directly to my right and directly behind me. I later heard from people at the Sun booth that a bunch of people came to the booth having heard about DTrace (I assume at Theo’s talk), asking for whatever CD’s they needed in order to be able to use it. Theo clearly understands how to communicate about DTrace. We at Sun need to learn that lesson.

Open Source

Of course, you can’t have a conference on open source without meta stuff about open source itself. I was fortunate to attend the morning session of Microsoft’s Participate08 event, which was an interesting case study led by Karim Lakhani from the Harvard Business School. The case was on threadless.com and involved a lot of issues which are very relevant to injecting corporate involvement into an existing community based organization. I’ve been following Karim’s work over the years (he studied under Eric von Hippel, whose work I am also fond of), so I was happy for the chance to meet him and participate in an activity with him. I also met Siobhan O’Mahony, who is also doing great work studying open source communities. I’m not sure what direct value Microsoft got out of sponsoring Participate, other than being able to say that they did an event around OSCON, but I know that I definitely appreciated the chance to interact with a bunch of people.

Microsoft was all over the news by the end of OSCON, having announced that they would become a Platinum sponsor of the Apache Software Foundation. This was not a complete surprise to me: Justin Erenkrantz, the current ASF president, told me what was happening the night before at a party. I think that this is an interesting step for Microsoft, and it’s definitely a step in the right direction. However, as one questioner pointed out, Microsoft has a long history of incendiary rhetoric towards the open source community, and that’s going to mean that just about everything happens in steps. I do find it interesting that one of the reasons that the ASF has taken donations is to build up a legal defense fund against what we regarded as inevitable legal attacks. It’s somehow ironic to think of Microsoft’s $100,000 going into that pool. I think that the next interesting milestone in Microsoft’s relationship with the ASF will be when the first Microsoft sponsored project shows up at the front door of the Apache Incubator.

I also contributed to the metaness with a talk titled “Open Source Community Antipatterns” (slides are now available on the O’Reilly slide page). The talk was decently attended, but I suspect that the all-star antipatterns panel immediately following my talk drew off some of the audience that might have come to my talk. The people track expanded a great deal this year, which I think is a good thing.

Photography

I always have photographic memories associated with OSCON. I got my first digital SLR right before OSCON 2005, and I’ve shot a bit at each OSCON, and even won the OSCON photo contest one year. This year I found myself shooting less. There were too many other things that I needed to do, and between knowing that Duncan was covering stuff and some artistic blockage, I lacked both time and motivation to crank out the shots.

Duncan has been a great friend and photographic mentor, and I always look forward to catching up with him during OSCON. This time was no exception. We did a bunch of stuff together, ranging from hanging out, having a wide angle shootout (well he was wide) to Duncan putting one of his cards into my D3 and giving the pixels a once over. Probably the most fun thing that we did was an impromptu photoshoot. Duncan was shooting headshots of the OSCON staff for a thank you slide for the closing keynotes. Only problem was that he needed one of himself, so he drafted me. With the safe shot in hand, we spent a few more minutes doing something a little edgier and fun.

Fin

That’s it for another OSCON. I hope we’ll be back in Portland again next year.

DTrace on Linux?

I’ve been meaning to write a post about DTrace, and Tim Bray’s tweet finally got me moving. It looks like some people are trying to make DTrace a topic for this year’s Linux Kernel Summit. I hope they succeed. I also hope that those folks pushing for user level tracing have their voices heard. I was amused to read one of the messages which claimed that DTrace is:

DTrace is more a piece of sun marketing coolaid which they use to beat us up at every opportunity.

My experience at Sun thus far is that people generally don’t really appreciate the benefits of DTrace. It stems from a view that I also saw in the LKS threads, which is that DTrace (and tools like Systemtap) is a tool for system administrators, because it reports on activity on the kernel. That’s not how I look at it. DTrace is a tool for dealing with full system stack problems, which initially manifest themselves as operating system level problems. The fact that DTrace can trace user land code as well as kernel code is what makes it so important, especially to people building and running web applications. Because of all the moving parts in a complicated web application (think relational database, memcached or other caching layers, programming language runtime, etc), it can be hard to debug a web application that has gone awry in production. Worse, sometimes the problems only appear in production. Tools which cut across several layers of the system are very important, and DTrace provides this capability, if all the layers have probes installed. When a web application goes wrong in production, you see it at the operating system level – high usage of various system resources. That’s where you start looking, but you will probably end up somewhere else (unless you are ace at exercising kernel bugs). Perhaps a bad SQL query or perhaps a bad piece of code in part of the application. A tool that can help connect the dots between operating system level resource problems and application level code is a vital tool. That’s where the value is.

One of the cooler features of DTrace is that you can register a user level stack helper (a ustack helper), which can translate the stack in a provider specific manner. One cool example of this is the ustack helper that John Levon wrote for Python, which annotates the stack with source level information about the Python file(s) being traced. On an appropriately probed system, this would mean that you could trace the Python code of a Django application, memcached, and your relational database (PostgreSQL and soon MySQL). That would be very handy.

I’d love to see DTrace on Linux, because I have it on OS X and it’s in OpenSolaris and FreeBSD, but I’d also be happy to see SystemTap get to the point where it could do the same job.

Thoughts on MagLev – VM’s for everybody!

One of the most visible presentations from last week’s RailsConf was Avi Bryant’s demonstration of MagLev, which is a Ruby VM that is based on Gemstone’s S/64 VM for Smalltalk. This caused a stir because the micro benchmark performance of MagLev looks really good, because S/64 has been out in production for a while, and because it appears to have some really interesting features (an OODB, shared VM’s, etc). MagLev is a reminder that the world of production quality, high-performance virtual machines is bigger than many of us remember at times.

I believe that over the next few years we will see a flourishing of virtual machines, as well as languages atop existing virtual machines. Take for example Reia, a Ruby/Pythonesque experiment atop Erlang’s BEAM VM. As we return to a multi language world, we will also necessarily return to a multiple implementation world. Before Java, there were many languages and many implementations of those languages. You could argue that there were probably too many, and I think that’s probably true. I would argue that we need to enter a new period of language and runtime experimentation. A big driver, but not the only driver, for this is the approaching multi-core world. When you don’t know how to solve something, more attempts at solutions are better.