Ted Leung on the air
Ted Leung on the air: Open Source, Java, Python, and ...
Wed, 23 Jun 2004
pylucene is a separate project
Today OSAF is breaking out pylucene as a separate project. Here is the announcement. We covered this today during our IRC office hours. For those of you who missed the session, here is the IRC log for today's discussion.

The pylucene project page has the usual information including a Subversion repository, mailing list info, and a bugzilla component. We're experimenting with the use of Subversion as a version control system.

A lot of people have asked for pylucene as a separate project. I hope that this will be useful to folks out there in the broader Python community. I also hope that some folks will get involved in helping to improve pylucene. There are a bunch of things that need doing; a distutils-based install is high on my list.

From my perspective, this is a good case study in how to leverage other projects. We were interested in full-text indexing from Python. We had a few options: Managing Gigabytes, Lupy (a pure Python port of Lucene), CLucene (a C++ port of Lucene), and the original Java implementation of Lucene. Comparing Managing Gigabytes with the Lucene-related libraries, most of the Lucene ports are being actively maintained, while the Managing Gigabytes software hasn't been updated since 1999. That leaves you with a Lucene variant. Lupy is much slower than either Lucene or CLucene, so that rules out Lupy. Only Lucene and CLucene are left.

Coming from Python, it would seem that CLucene is the obvious choice since it's written in C++. The problem is that it's a port, so you are always trailing Lucene. Further, the real search engine expertise lies with Doug Cutting, who is working on Lucene. So if you have a search-related problem in CLucene, it might have to go all the way back to Lucene anyway.

The biggest problem with Lucene is that it's written in Java, and the last thing we want in Chandler is to have people also install Java. Enter gcj. Thanks to the hard work of the gcj team, you can compile Lucene into native code, which can then be wrapped with SWIG (just as CLucene would need to be for Python). As both gcj and gcc get better, pylucene will benefit (of course, CLucene would also benefit from improvements to gcc).

[18:02] | [computers/open_source/osaf]

There's also the train of thought that Lucene is becoming the de facto open source search engine API (source).

So pylucene developers will be able to utilize the expertise of existing lucene developers.
Posted by Darryl at Thu Jun 24 05:38:05 2004

I work at the Open Source Applications Foundation (OSAF).
The opinions expressed here are entirely my own, not those of my employer.

Creative Commons License
This work is licensed under a Creative Commons License.
