Ted Leung on the air: Open Source, Java, Python, and ...
Sun, 04 Jan 2004
Hookable RSS aggregators
As a result of Brent's posting yesterday about my gzip aggregator list, I got some mail from the makers of Awasu, to let me know that they have gzip support (which I verified in my Apache logs). Always interested in the latest in RSS aggregation (even if it does run on Windows), I went over to the Awasu page, and discovered that Awasu allows plugins to intercept the RSS processing pipeline and perform their own processing. So, is this the first AOP-like RSS aggregator (interception and all -- AOP gurus need not remind me that I'm abusing the terminology) or is it the Emacs of aggregators (due to the hooks)?
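To make the "hooks" idea concrete, here is a minimal sketch of a hookable pipeline in Python. It is not Awasu's plugin API, which I haven't looked at in detail; the `Pipeline` class, the dict-based item shape, and both example plugins are made up purely for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical item representation: a plain dict with "title" and "link" keys.
Item = Dict[str, str]
Hook = Callable[[List[Item]], List[Item]]


class Pipeline:
    """Toy aggregator pipeline that lets plugins intercept parsed items."""

    def __init__(self) -> None:
        self._hooks: List[Hook] = []

    def register(self, hook: Hook) -> None:
        # Hooks run in registration order.
        self._hooks.append(hook)

    def process(self, items: List[Item]) -> List[Item]:
        # Each hook receives the previous hook's output and returns a new list.
        for hook in self._hooks:
            items = hook(items)
        return items


def drop_untitled(items: List[Item]) -> List[Item]:
    # Example plugin: filter out items that have no title.
    return [item for item in items if item.get("title")]


def uppercase_titles(items: List[Item]) -> List[Item]:
    # Example plugin: rewrite each title without mutating the input dicts.
    return [{**item, "title": item["title"].upper()} for item in items]
```

Registering `drop_untitled` and then `uppercase_titles` and calling `process()` runs the items through both, in that order. Whether you call that interception or just a list of callbacks is the terminology abuse I mentioned.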
[23:04] | [computers/internet/microcontent] | 4 Comments
RSS Feed cleanups
Brent Simmons posted tonight about Gzip compression of RSS feeds. There are a few things I'd like to point out.

If you are using Apache, mod_gzip will compress dynamically generated content as well -- my RSS feed is dynamically generated, and it's being compressed by mod_gzip just fine -- you can control compression on a MIME type, response header, or URI pattern basis, among other things.

I'm glad that Brent posted because, as he pointed out, I'm keeping a list, and I did bug him about it, even before I had a Mac. But in truth I feel a little guilty about it and here's why. After Brent implemented Gzip encoding, he mailed me to get checked off on the list and asked for feeds to test on. He said that he couldn't find that many gzipped feeds to test with. At the time, I thought he was mistaken. Well, Brent also implemented a statistics window in NetNewsWire that shows how many requests NNW made for a feed, and how many 304 and Gzip responses it got. So I've been looking at the statistics, and it's not looking good out there, folks. As far as gzipped feeds go, about 10% of the feeds in my NNW (about 900) are gzipped. That's a lot worse than I expected. I understand that this can be tough -- the easiest way to implement gzipping is to do what Brent suggested, shove it off to Apache. That means that people who are being hosted somewhere need to know enough Apache config to turn gzip on. Not likely. Or have enlightened hosting admins that automatically turn it on, but that doesn't appear to be the case. So blogging software vendors could help a lot by turning gzip support on in the software.
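For blogging software that wants to do this itself rather than lean on Apache, the core of it is small. The following is a rough sketch using Python's standard gzip module, not code from any particular blogging tool. The function names, the `application/rss+xml` content type, and the simplified `Accept-Encoding` parsing are my own assumptions.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    # Simplified check: ignores q-values, so "gzip;q=0" still counts as a yes.
    tokens = [part.split(";")[0].strip().lower() for part in accept_encoding.split(",")]
    return "gzip" in tokens


def encode_feed(body: bytes, accept_encoding: str = ""):
    """Return (headers, payload) for a feed response, gzipped when the client allows it."""
    # The content type is an assumption; use whatever your feed already sends.
    headers = {"Content-Type": "application/rss+xml", "Vary": "Accept-Encoding"}
    if accepts_gzip(accept_encoding):
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    else:
        payload = body
    headers["Content-Length"] = str(len(payload))
    return headers, payload
```

Feed XML is repetitive, so it tends to compress well. The `Vary` header is there so that caches don't hand a gzipped copy to a client that never asked for one.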

What's even more depressing is that for HTTP conditional get, the figure is only about 33% of feeds. And this is something that the blogging software folks should do. We are doing it in pyblosxom.
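For anyone wondering what the server side of conditional get involves, here is a minimal sketch. It is not the pyblosxom implementation; the function names and the SHA-1 based ETag are assumptions of mine, and the comparisons are deliberately naive (exact string matches, no handling of `*`, weak validators, or lists of ETags).

```python
import hashlib


def make_etag(body: bytes) -> str:
    # ETag derived from the content; any stable fingerprint would do.
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def respond(body: bytes, last_modified: str, request_headers: dict):
    """Return (status, headers, payload) for a feed request.

    request_headers is assumed to be a plain dict keyed by the canonical
    header names used below.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Last-Modified": last_modified}
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since when both are sent.
        unchanged = if_none_match.strip() == etag
    else:
        # Exact string comparison instead of real HTTP date parsing.
        unchanged = if_modified_since == last_modified
    if unchanged:
        # Nothing new: send the validators back with an empty body.
        return 304, headers, b""
    return 200, headers, body
```

The point is that an aggregator polling an unchanged feed gets back a 304 and a handful of headers instead of the whole feed.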

If you are using NetNewsWire, you can see the truth for yourself. Just go to the Window menu and select "Show Bandwidth Statistics" (you have to do this after you've pulled your feeds, though). If you are using some other RSS reader, well, you're on your own.
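If you'd rather check a single feed by hand, something along these lines would do it. This is a rough sketch using Python's urllib, and it has nothing to do with how NetNewsWire gathers its statistics; `classify` and `probe` are names I made up. It fetches the feed twice: once advertising gzip, and once more sending back whatever validators the server returned.

```python
import urllib.error
import urllib.request


def classify(first_headers: dict, second_status: int) -> dict:
    """Summarize what a feed server supports, based on two fetches."""
    encoding = first_headers.get("Content-Encoding", "").lower()
    has_validator = "ETag" in first_headers or "Last-Modified" in first_headers
    return {
        "gzip": "gzip" in encoding,
        "conditional_get": has_validator and second_status == 304,
    }


def probe(url: str) -> dict:
    # First fetch: advertise gzip and record the headers we care about.
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        first = {
            name: response.headers[name]
            for name in ("Content-Encoding", "ETag", "Last-Modified")
            if response.headers[name] is not None
        }

    # Second fetch: send the validators back and hope for a 304.
    conditional = {"Accept-Encoding": "gzip"}
    if "ETag" in first:
        conditional["If-None-Match"] = first["ETag"]
    if "Last-Modified" in first:
        conditional["If-Modified-Since"] = first["Last-Modified"]
    try:
        with urllib.request.urlopen(urllib.request.Request(url, headers=conditional)) as response:
            second_status = response.status
    except urllib.error.HTTPError as error:
        # urllib reports 304 Not Modified as an HTTPError.
        second_status = error.code
    return classify(first, second_status)
```

Calling `probe()` with your feed's URL returns a small dict with a `gzip` flag and a `conditional_get` flag. Any error status on the second fetch is simply counted as "no conditional get".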

I was thinking of publishing a list of feeds that don't do either gzip or HTTP conditional get, but it would be too long. If you are interested, I've (with the help of some of the other NNW beta testers) written an AppleScript that exports the bandwidth statistics as an XML file. The script is available here and the output on my NNW feeds is here (there's no gzip info because there's no AppleScript property for it yet). Please only download the feed statistics data if you are really interested. It's about 8600 lines of XML.

So be a good citizen and fix your feed. You'll save bandwidth, and your readers will save both bandwidth and download time.

[00:27] | [computers/internet/weblogs] | 2 Comments



About

Ted Leung FOAF Explorer

I work at the Open Source Applications Foundation (OSAF).
The opinions expressed here are entirely my own, not those of my employer.

This work is licensed under a Creative Commons License.
