TITLE OF PAPER: Extending Python with Pyrex
URL OF PRESENTATION: _URL_of_powerpoint_presentation_
PRESENTED BY: Paul Prescod
REPRESENTING: _name_of_the_company_they_represent_
CONFERENCE: PyCon 2004
DATE: 20040325
LOCATION: Ballroom
--------------------------------------------------------------------------
REAL-TIME NOTES / ANNOTATIONS OF THE PAPER:
{If you've contributed, add your name, e-mail & URL at the bottom}

Opened with a quick poll: how many people have been to two or more
conferences, and of those, how many have heard a talk about how to speed
up Python by up to 10x?

Apologetic groveling

Experiments are important
    Doing an experiment is better than doing nothing
    Most Python users complain about performance but do nothing at all
    Many experimenters have other projects that are very successful

High Level Overview
    Pyrex compiles a Python-like language to C
    This gives two advantages over Python:
        Easy access to C types
        Closer to C performance than Python
    It gives two advantages over C:
        Easy access to Python types
        Close to Python ease and flexibility
    Created by Greg Ewing
    He told a story: his wife thought Pyrex was the name of cookware and
    that he was finally getting interested in cooking. When he answered
    that it was a merger of two great languages, she replied: "it's Python
    and Rexx"
    Pyrex lets you use the features that Python is good at *and* keep the
    speed advantages of C

My Advice
    You will be a much more proficient Pyrex programmer if you learn both
    C and Python.
    ...but you could probably get by with cargo cult techniques...

Pyrex compared to ...
    CXX and Boost.Python
        Pyrex has no explicit C++ support
        But it also doesn't depend on C++ syntax
    SWIG
        Because Pyrex is Python-specific, by default Pyrex APIs are very
        "Pythonic"
        The bridge code runs at C speed, not Python speed
        SWIG can only bridge; it cannot be used to write the program itself
    Python2C, Starkiller and other Python compilers
        Not yet production quality
    On the other hand, Pyrex has a big flaw compared to most other wrappers:
        There isn't yet a tool to convert C headers to Pyrex declarations.
        You need to re-declare structs, typedefs, function definitions etc.
        ("just a SMOP")
        And there's that C++ issue

Simple Pyrex function
    def hello_world():
        print "Hello world"
    As you can see, it uses a quirky syntax ....

Manual compilation
    pyrexc hello_world.pyx
    ls hello_world*
        hello_world.c  hello_world.pyx
    wc hello_world.pyx
        3 5 41 hello_world.pyx
    wc hello_world.c
        183 579 5791 hello_world.c

Using Distutils
    from distutils.core import setup
    from distutils.extension import Extension
    from Pyrex.Distutils import build_ext

Using Pyximport
    Compiles in the background; once enabled, it does dependency checking
        import pyximport; pyximport.enable()
    More in the style of Python: "it should just work"
    Downside is that distutils will spew a lot of warnings during the
    import, interleaved with your own text

Generated Code
    Aside from dozens of lines of boilerplate, the example shows conversion
    of a C string to a Python string with error checking (lots of error
    checking)

Things to note
    Pyrex handles:
        checking error return codes from C functions
        type conversions
        reference counting
    Bear in mind that this talk focuses on how Pyrex is /different/ from
    Python... but usually it is /very similar/
    You can use Pyrex as basic Python to compile your Python modules into
    C code

Adding static type checking
    def hello_world(char *message):
        print message
    Generated code:
        static char *__pyx_argnames[] = {"message",0};
        PyArg_ParseTupleAndKeywords(__pyx_args, __pyx_kwds, "s",
            __pyx_argnames, &__pyx_v_message)
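What the "s" format code in the generated call buys you can be modeled in plain Python: the argument is checked before the function body runs, so a wrong type fails with a TypeError instead of reaching C code. A rough sketch in modern Python 3 follows. The name hello_world_checked and the error text are illustrative assumptions, not Pyrex output.

```python
def hello_world_checked(message):
    # Hypothetical stand-in for the generated wrapper: reject anything
    # that is not a string before the function body runs.
    if not isinstance(message, str):
        raise TypeError("expected a string, got %s" % type(message).__name__)
    return "message: %s" % message


print(hello_world_checked("Hello world"))
try:
    hello_world_checked(42)
except TypeError as exc:
    print("rejected:", exc)
```

In Pyrex the same check comes for free from the `char *message` declaration; nothing like the isinstance test has to be written by hand.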

cdef functions
    The Pyrex programmer can also generate a function that has a C calling
    convention rather than a Python one:
        cdef extern int strlen()
        cdef int get_len (char *message):
            return strlen(message)
        def hello_world ...
    Generated code:
        static int __pyx_f_11hello_world_get_len(char (*__pyx_v_message)) {
            int __pyx_r;
            /* "/private/tmp/hello_world.pyx": 7 */
            __pyx_r = ...

Calling the generated code
    Pyrex knows that it has to convert the Python string to a C string,
    with the proper error handling, and then converts the return value to
    a Python integer

Notes about cdef functions
    In either "regular" or "cdef" Pyrex functions you can mix and match
    Python types and function calls. Very few restrictions
    "cdef" functions cannot be called directly from Python
    Regular Python functions are a pain to call directly from C (need to
    convert everything to PyObjects)
    Calling between regular functions is slower than calling between cdef
    functions

Pyrex variable declarations
    Basically like C, but prefixed with the word "cdef"
        cdef int x, y
        cdef float z
        cdef char *s
    [Q: (Koenig) What does "cdef char *s, p" mean?
     A: I don't know; probably "broken" the way C is.]

Python objects
    By default, variables are of type "Python object", with all of the
    dynamicity that implies. They can also be explicitly typed as "object".
        cdef object o
        o.foo() + o ** o.bar()
    The full dynamic nature of Python is available by default; the static
    typing is optional

Pyrex does runtime casting
        cdef conversions(int x, object y):
            x, y = y, x
            return x,y
        # (note: a C function that returns a Python object)
        print conversions(1,2)
        print conversions(1, "2")  # causes TypeError
    Type coercions are checked at runtime; Pyrex does not try to *prove*
    your program is statically correct

Pointers
    Similar to C (including pointers to pointers etc), except there is no
    "*" deref operator: use [0] instead.
        cdef double_deref(int **y):
            return y[0][0]
        cdef int x
    Use array dereferencing instead of the "*" deref operator
    Casts: use angle bracket notation (y = <type>x)
        cdef int x,y
        x = 1000

Importing types and functions
    Declare a header for Pyrex to include and re-declare the relevant
    symbols in it
        cdef extern from "stdio.h":
            ctypedef struct FILE
            FILE *fopen(char *filename, char *mode)
    If stdio is needed in many places, you can put the above code into an
    inclusion module
    (could imagine a translator that understands C declarations directly;
    no one's implemented that yet)

Structures
        cdef struct mystruct:
            int a
            float b
    Pyrex is more regular than C: use "." for fields of structs and of
    pointers to structs
        cdef mystruct *m
        m.a = 5
    Paul wondered about C's "." vs "->" and why C needs 2 operators --
    Andrew Koenig promised to explain after the talk.

Other complex types
        cdef union u:
            char *str
            int *x
        cdef enum colors: ...

Partial structure redeclaration
    Don't have to re-declare all struct members: just the ones you care
    about.
        cdef extern from "stdio.h":
            ctypedef struct FILE:
                int _blksize

Typedefs
    Typedef works basically as in C

NULL
    There is a reserved word "NULL". It is not the same as 0 or None.
        0 is an integer
        None is an object type
        NULL is for pointer types.

Memory management
    Python objects are reference counted and garbage collected a la Python
    C types use manual C memory management
    Pyrex programmers don't need to think about refcounts
    Except... if they call Python/C API functions that steal or lend
    references. If you bypass Pyrex's handling of the C API and call
    functions with funny ref counting semantics, you have to add inc refs
    and dec refs yourself to ensure they balance

Exception handling
    Pyrex adds exception handling to C
    Even cdef C functions can throw exceptions
    If the return type of the function is object, this works just like
    Python

This won't work
    Remember to think about types!
        cdef object divide(int x, int y):
            return x/y
        print divide(2,0)
    Division by 0 is not a Python error for C types: no ZeroDivisionError
    is raised!

What about C return types
    But what about when the return type is a C type?
        cdef int divide(int x, int y):
            if y == 0:
                raise ZeroDivisionError
            else:
                return x/y
        print divide(2,0)
    Pyrex prints a message saying it ignored the 'raise', because it can't
    return NULL as an int

Except clause
    Functions can have an "except" clause. It declares that a particular
    return value indicates that an exception has occurred
    Showed an example that signals an exception only if the result is < 0

This also won't work
        cdef extern FILE *fopen(char *filename, char *mode) except NULL
    Standard C functions don't throw Python exceptions (they don't call
    PyErr_SetString)
    Except clauses are only useful for functions that know about Python
    (described an "exceptional" case: a C function calls a Python function
    without knowing it, and the Python function raises an exception; in
    this case it is OK for the C function to be declared "except NULL" and
    the right thing will happen)

Conditional exception codes
    Sometimes any number is a valid return code
    An "except?" clause tells Pyrex to /check/ whether an exception was
    thrown.
    Pyrex can check whether an exception was thrown no matter what the
    return value. (very inefficient!)

Integer for-loops
    In order to get around the famous performance problems with range()
    (name lookup etc.), Pyrex has an integer loop syntax:
        for i from 0 <= i < n:
            ...
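The integer loop above visits the same values as Python's range(n), but the counter stays a plain integer that is compared and incremented directly. A sketch of the equivalent control flow in modern Python 3 follows. The helper name sum_below is hypothetical, and the while loop only mirrors the shape of the C loop one would expect; it is not actual Pyrex output.

```python
def sum_below(n):
    # Same iteration as the Pyrex loop "for i from 0 <= i < n":
    # start at the lower bound, stop before the upper bound.
    total = 0
    i = 0
    while i < n:
        total = total + i
        i = i + 1
    return total


# The counter takes exactly the values that range(n) would produce.
assert sum_below(5) == sum(range(5))
print(sum_below(5))
```

The point of the Pyrex syntax is that no range object is built and no name lookup happens per iteration.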
    [some discussion of whether this was a good idea which I couldn't
    quite hear -- rsf]

Extension Types
    An extension type is just like a Python class except that:
        it has a more compact representation (more compact even than
        __slots__ instances)
        it can directly contain C types (which __slots__ instances cannot)
        it is a first-class type in the type system

Defining extension types
        cdef class Shrubbery:
            cdef int width, height
            def __init__(self, w, h):
                self.width = w
                self.height = h
            def describe(self):
                print "this shrubbery is", self.width, .....

Using Shrubbery from Python
        x = Shrubbery(1, 2)
        x.describe()
        print x.width  # exception -- not accessible from Python
    If you do want to access the C attribute from Python, use the public
    keyword (also the readonly keyword)

Properties in Pyrex
        cdef class Spam:
            property cheese:
                def __get__(self): ...
                def __set__(self, ...): ...
    Extension types can be treated as simply Python objects, but they also
    exist in the Pyrex type system:
        def widen_shrubbery(Shrubbery sh, extra_width):
            sh.width = sh.width + extra_width
    Member accesses go directly to the C structure; they will not do a
    dictionary lookup.

Careful
    Extension types have __special__ methods just as Python types do, but
    they are sometimes subtly different
    Read the Pyrex docs for more information.

Pyrex lacks some features
    Function and class definitions cannot occur in function definitions
    "import *" is banished ["good riddance"]
    No generators [you can of course call Python generators from Pyrex
    code]
    No globals() and locals() functions

Other missing features (to be corrected eventually)
    Functions cannot even be nested in conditionals
    In-place operators ("+=", "-=") are not allowed
    No list comprehensions
    No explicit support for Unicode
    New division syntax

Non-technical "features"
    Pyrex needs a more active community.
    It could benefit from more of a shared-load development model.
    Someone should do C++ support!
    Someone should hack for optimization!
    Someone should integrate with Jython/JNI and IronPython/CLI
    Someone should port the Python stdlib.
    ... and so forth.
    Pyrex needs more marketing

Q: Does he have any stdlib items in mind that should be converted?
A: Maybe the new statistics module, or even the more run-of-the-mill
   modules that do a lot of string manipulation, inner-loop routines or
   numeric-heavy routines

One of the "takehome" items is that maybe the development model for Pyrex
should be more "open source" instead of being driven through a single
person

Discussion
    What is Pyrex's future?
    Should some of Python core be coded in Pyrex?

Q: What platforms are supported and what are the dependencies?
A: Depends on Python and a C compiler (gcc, VS)

Q: For numerical things, is there a notion of how Pyrex can handle infix
   or other complex expressions?
A: It's not a feature that's implemented, but he doesn't feel that it goes
   against the architecture. C types are mapped; a complex type could be
   mapped to a structure in a similar manner

Q: Good examples of Pyrex code?
A: URLs coming in the next talk; a lot of Python code is legal Pyrex code.

Q: What do you think about the inline C hack to get around the things that
   Pyrex is lacking?
A: PyPy has an inline C code hack that allows for insertion of raw C code.
   He is mostly in favor of it; he would be more in favor if the first
   example didn't happen to be an insertion of a goto! He tends to be a
   permissive type and pushes a lot of boundaries, but he would discourage
   it as something that should only be done if really, really needed

Q: Pyrex doesn't already know the C header, and that's an annoying part of
   Pyrex. Is there a project to do this?
A: He thinks some people have done little prototypes, but without Greg
   saying "this is the approach I bless", people are not going to move to
   the next step. He mentioned that with a move to the open source dev
   model something like that could happen faster. Mentioned gcc-xml as one
   current project along a similar route.

Q from Paul: What are people's impressions of Pyrex -- any con opinions?
   (no takers)
A: One of the main cons is that people say Pyrex is a *different* language
   from Python: why do I need that, why not just write in Python and have
   it converted to C? In his opinion there are a couple of issues with
   doing that. Sometimes the coder knows more about what the code is doing
   than the Python interpreter does; by giving hints you can tell Pyrex
   more than a static type analyzer can work out, and a static type
   analyzer cannot bridge Python to C code at all. The other thing is that
   Pyrex is here today, it's "good enough" and it works today, which is
   more in line with the Zen of Python. Some people reject Python because
   it's "clearly" not the perfect language. Pyrex is here today and it
   works, and he would rather move ahead with Pyrex and progress to more
   advanced enhancements to Python that may remove the need for Pyrex

Q: How hard is debugging Python when Pyrex is involved?
A: (he asked for a clarification: finding bugs or using a debugger?
   answer: debugger) He thinks the fact that you can inspect the C code
   helps to understand the bug, so in that sense it is not that difficult.
   If you are dependent on a visual debugger you will be out of luck,
   because you will be tracing through C code instead of Python code. He
   described some bugs that he had to trace into Pyrex itself as quite
   painful, because he had to go into the raw Pyrex code. Pyrex does get
   rid of the need to code ref counting and the other issues that normally
   make extension coding obtuse
--------------------------------------------------------------------------
REFERENCES: {as documents / sites are referenced add them below}

pyrex: http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/
gccxml: http://www.gccxml.org/HTML/Index.html
--------------------------------------------------------------------------
QUOTES:

"pyrex ... oh, you're finally taking an interest in cooking!"
--------------------------------------------------------------------------
CONTRIBUTORS: {add your name, e-mail address and URL below}

Ted Leung, twl@osafoundation.org, http://www.sauria.com/blog
Bob Kuehne, rpk@blue-newt.com, http://www.blue-newt.com
Russell Finn, rsf@sprucehill.com, http://www.sprucehill.com/rsf/blog/ [future]
--------------------------------------------------------------------------
Copyright shared between all the participants unless otherwise stated...