From Fedora Project Wiki

Revision as of 16:37, 24 August 2010

Some speed optimization ideas for yum

  • use Cython to compile one or more of the .py files to .c code and compile them into DSOs
  • use PyPy, an alternative implementation of Python; this would require building out a full PyPy stack. Last time I looked at the generated .c code, I wasn't comfortable debugging the result (I didn't feel that debugging a crash in the result would be feasible at 3am)
  • use Unladen Swallow for Python 2: would require porting the Unladen Swallow 2.6 stack to 2.7, and a separate Python stack
  • use Unladen Swallow for Python 3: wait until it gets merged (in Python 3.3); port yum to Python 3

Using Cython seems to be the least invasive approach.

I've looked at the generated code and it seems debuggable; I'd be able to debug issues arising.

Example of generated code

See depsolve.html (http://dmalcolm.fedorapeople.org/python-packaging/depsolve.html). You can see the generated .c code by clicking on the yellow-colored .py code. This was generated using the "-a" option to Cython. Note that this was generated using a development copy of Cython, which has support for some lambdas.

Note in particular that the generated C code has comments throughout indicating which line of .py code (in context) it corresponds to: this is important for my own comfort level, in feeling that this is supportable.

Notes on Cython

In theory, compiling with Cython avoids both bytecode dispatch and stack manipulation, and should give us better CPU branch prediction. The result should also be more directly amenable to further optimization work: C-level profiling tools such as oprofile would indicate specifically where we're spending time in the .py code.
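To make "bytecode dispatch and stack manipulation" concrete, here is a minimal sketch using the standard library's dis module. The function total_size is a hypothetical stand-in for a hot loop, not code taken from depsolve.py, and the exact instruction names vary between CPython versions.

```python
import dis


def total_size(packages):
    # Hypothetical stand-in for a hot loop in depsolve.py.
    total = 0
    for pkg in packages:
        total += len(pkg)
    return total


# Each instruction listed here is handled by one pass through the
# interpreter's dispatch loop, and most of them push or pop values on
# the evaluation stack. This per-instruction overhead is what compiled
# C code is meant to avoid.
instructions = list(dis.get_instructions(total_size))
opnames = [ins.opname for ins in instructions]

for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)
```

Running this under the regular interpreter prints one line per bytecode instruction; the loop body alone accounts for several of them on every iteration.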

From upstream, on one simple example: "Simply compiling this in Cython merely gives a 35% speedup. This is better than nothing, but adding some static types can make a much larger difference."

Using Cython "bakes in" some values for builtins: calls to the builtin "len" are turned directly into calls to PyObject_Length, rather than looking up the current value of __builtins__.len on every call and calling that. This is a semantic difference from regular Python, and it rules out some monkey-patching, but I think it's a reasonable optimization.
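As a rough illustration of the semantic difference, the sketch below shows what the regular interpreter does when the builtin is monkey-patched. The names count_items and call_with_patched_len are hypothetical, and the code uses the Python 3 spelling of the module (builtins; it is __builtin__ on Python 2).

```python
import builtins  # named __builtin__ on Python 2


def count_items(items):
    # The interpreter resolves the name "len" at call time:
    # module globals first, then the builtins namespace.
    return len(items)


def call_with_patched_len(func, arg):
    """Call func(arg) while builtins.len is temporarily replaced."""
    original_len = builtins.len
    builtins.len = lambda obj: -1  # hypothetical monkey-patch
    try:
        return func(arg)
    finally:
        builtins.len = original_len  # always restore the real builtin


normal_result = count_items([1, 2, 3])
patched_result = call_with_patched_len(count_items, [1, 2, 3])
restored_result = count_items([1, 2, 3])
print(normal_result, patched_result, restored_result)
```

Under the regular interpreter the patched call sees the replacement. If count_items lived in a Cython-compiled module, the behaviour described above suggests the call would go straight to PyObject_Length and the patch would be ignored; I have not verified that against a specific Cython release.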

TODO:

  • measure the impact of using a Cython .c build of depsolve.py
    • try building with Cython
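
One possible way to approach the measurement is a small timing harness built on the standard timeit module. This is only a sketch: best_time and fake_depsolve are hypothetical names, and the workload is a placeholder rather than a real yum depsolve run.

```python
import timeit


def best_time(func, repeat=5, number=100):
    """Return the best observed time per call, in seconds."""
    # timeit.repeat returns one total per run; the minimum is the
    # run least disturbed by other activity on the machine.
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def fake_depsolve():
    # Hypothetical placeholder workload; a real measurement would
    # call into depsolve.py with a representative transaction.
    return sorted(range(1000), key=lambda n: -n)


baseline = best_time(fake_depsolve)
print("best time per call: %.6f s" % baseline)
```

The same harness could be run twice, once importing the pure-Python module and once importing the Cython-built one, and the two figures compared.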