PyChinko/ A Native Python Rule Engine.notes

Thursday, March 24, 2005

TITLE OF PAPER: PyChinko: A Native Python Rule Engine
URL OF PRESENTATION: _URL_of_powerpoint_presentation_
PRESENTED BY: Yarden Katz
REPRESENTING: _name_of_the_company_they_represent_

CONFERENCE: PyCon 2005
DATE: March 24, 2005
LOCATION: Grand Ballroom, GWU
--------------------------------------------------------------------------

REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:
{If you've contributed, add your name, e-mail & URL at the bottom}

Goals: Semantic Web rule engine
Limited resource environment
Beat CWM

Some common applications for rule systems:
    processing web content
    automating software compilation
    foundation for expert systems

Basic syntax of rules:
    A left hand side (LHS) - conditions that must hold for the rule to file
    A right hand side (RHS) - set of actions taken when the rule is fired.


Examples of rule syntaxes.

How to process rules efficiently:
    Charles Forgy's Rete algorithm
    Builds a discrimination network (decision tree) for channelling facts only to relevant rules
    Trades memory for speed
    Especially efficient when there are many different rules.


Detailed explanation of Rete algorithm

Digression on the Semantic Web


  R
ecall orignal motivation: rule engine copatible with SW languages and more specifically, faster than a slow but popular Python rule engine for the SW called CWM
  
  SW Goal: give useful semantics to otherwise me
aningless (to a computer) documents: allow for intelligent software agents to do some things that would (now) require a human eye and brain
  In short, bringing KR/AI techniques to the web (and leaving the hype at home!)
  
  Giving a story about looking up images of flowers and even tho
ugh the text search is good for documents it would not work for an image that has a flow but other text information and the SW would allow for meta information that could be mined by rule engines that would grok the meta data and find the image.
  
  There is a lot of hype and what they are trying to do is bring rules and standards to the open source world to allow for quicker acceptance.
  
Digression (cont'd): Semantic Web & Rues

  Core language is RDF (), a W3C standard; serializable into triples of the form (subject predicate object)

  Knowledge represe
ntation language called OWL () layered on top of RDF
  Example: FOAF is an OWL ontology. Large instance data (RDF documents) for FOAF generated by LiveJournal, among others.
  
Taste of SW languages

   RDF describes ressources (anything w/URI, essentiallY)
   Assertion: N
oam Chomsky is the author of Manufacturing Consent, published 1988
   One way to think of it: there exists something in the world,
   
   <image of the book here>
   
   <example of a RDF description of the assertions above>  

  or in N3 (which would you prefer writing?):
  
  @prefix dc: <http://purl.org
  
  <http://www.chomsky.info/ManuCons>
    dc:creator "NOam Chomsky";
    dc:title "Manufacturing Consent";
    dc:date "1988"
    
    <image of the graph that repres
ents that>
    
CWM (closed-world machine) is a python rule engine for SW that uses naive forward chaining algorithm
Compared with our Rete(on 1200 facts KB w/two matcha ll rule):
  2400 inferred facts (double original)
  Pychinko time: 0m1.609s
  CWM time: 0m28.691s
  Increase to 70000 initial facts, our time: 1.729s CWM time: 14m58.707s(!)
  This is one of Rete's worst cases
  
Appli
cations of rules (one the web)

Social networking: Processing FOAF documents. foaf is a semantic web vocabulary for forming social networks. consider a web application performing social network analysis on large, spidered foaf documents. an intelligent fil
tering mechanism to avoid 'bland' documents (perhaps autogenerated by a different application) unfit for analysis is needed. a filtering mechanism of this kind might follow foaf:   

Trust: policies on the web; permission/restrictions, I grant access to X only to senior employees - otheres can only view database Y and Z" or "O
nly known FOAF friends can view my pcitrue gallery" are natrually captured in the form of rules. The rules are easily expressed in Pychinko, while the terms to which the rules refer are not

Future work (in progress, actually)

  Generalizing PYchinko (native Python syntax in addition to N3, N-ary paterns)
  Moving Pychinko to the Nokia. On the phoen, resource are limited, and the hardwarelimitations prevent one from using heavy-duty reasoners such as those writeen in Java. Can use Pychinko to
process rules on the phone.

The existing code is limited to at most a few thousand rules.  For larger data sets there needs to be an extension to support secondary storage.

Quest
ions

Q. so what do you do in the face of conflicting data
A. the rules are purely symbolic, we have no way of saying they conflict semantically.  our rules systems are not able to handle true conflicts

Q. what does a rule look like in Pychinko
A. he pointed to one of the earlier slides showing that rules are queries

Q. what is the complexity of the engine
A. linearly


In case anyone is interested... since it looks like it didn't make any of the slides... PyChinko uses rdflib... see http://rdflib.net/

--------------------------------------------------------------------------
REFERENCES: {as documents / sites are referenced add them below}

http://www.mindswap.org/~katz/pychinko/


--------------------------------------------------------------------------
QUOTES:



--------------------------------------------------------------------------
CONTRIBUTORS: {add your name, e-mail address and URL below}
Ted Leung <twl@sauria.com> <http://www.sauria.com/blog>


--------------------------------------------------------------------------
E-MAIL BOUNCEBACK: {add your e-mail address separated by commas }



--------------------------------------------------------------------------
NOTES ON / KEY TO THIS TEMPLATE:
A headline (like a field in a database) will be CAPITALISED
    This differentiates from the text that follows
A variable that you can change will be surrounded by _underscores_
    Spaces in variables are also replaced with under_scores
    This allows people to select the whole variable with a simple double-click
A tool-tip is lower case and surrounded by {curly brackets / parentheses}
    These supply helpful contextual information.

--------------------------------------------------------------------------
Copyright shared between all the participants unless otherwise stated...