I came across a document with lots of tips for efficient Python programming.
I started last week by setting up a page on the wiki to write down and publish my thoughts about how to implement code generation with array expressions. The intention was to flesh out some details and I also hoped to get some feedback. It turned out to be a really good investment as I got both suggestions about implementation and questions that helped me clarify my thoughts. (Thanks to Aaron, Andy and Mark!)
I spent a few days reading (GiNaC, distutils, numpy, fwrap, f2py, swig) and thinking before I sat down to code on Thursday. By then I had reached the conclusion that I would need to implement Indexed objects. These will represent arrays in the Sympy expressions before code generation. Indexed expressions, e.g. a matrix-vector product denoted as A(i,j)x(j), contain information that is not explicit in the corresponding abstract notation Ax. In order to generate the code unambiguously, all relevant information must be explicit.
Having a specialized Sympy object that corresponds to arrays in the code means that the user gets access to all the available expression manipulation functionality before the code is generated. The downside is that you need to work with indexed expressions instead of abstract symbols. However, translating a non-indexed expression to the corresponding indexed expression is just a call to Expr.subs, so it can be done just in time, right before code generation.
Over the weekend I managed to implement array argument handling in the Fortran code printer and in FCodeGen, but I have not yet tested whether the generated code compiles. This week I hope to get the Fortran to compile and also to implement everything in the C code generator. Maybe I can also refactor a little and move language agnostic code to the CodeGen super class.
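To make the substitution step concrete, here is a minimal sketch. It assumes the IndexedBase and Idx classes as they exist in current Sympy releases, which may differ from the first implementation described in this post. The symbol names and the dimensions n and m are chosen only for illustration.

```python
from sympy import Idx, IndexedBase, symbols

# Abstract form: plain symbols that carry no index information.
A, x = symbols("A x")
abstract = A * x

# Indexed form: the indices (and their ranges) are explicit.
# n and m are illustrative dimension symbols, not taken from the post.
n, m = symbols("n m", integer=True)
i = Idx("i", n)  # row index
j = Idx("j", m)  # column index, repeated in both factors
A_ij = IndexedBase("A")[i, j]
x_j = IndexedBase("x")[j]

# Just before code generation, one call to subs does the translation.
indexed = abstract.subs({A: A_ij, x: x_j})
print(indexed)
```

The abstract expression can be manipulated freely up to this point; only the final substitution introduces the indices that a code generator needs.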
[oy@ubuntulaptop sympy (fortran_codegen3)]$ git shortlog --since=1.weeks --author=Øyvind
Øyvind Jensen (21):
Fix issue 1920: SymTuple doesn't rebuild itself
Moved SymTuple to sympy/core/symtuple.py + imported in core/__init__.py
First implementation of Indexed and Idx
indexed.py: Added example with matrix-vector product to module docstring
move function wrap_fortran() to FCodePrinter._wrap_fortran()
Implemented optional free-form source format for Fortran code printer
Added tests for free-form fortran code + fix bug in comment line wrapping
Created codegenerator FCodeGen for generating compilable fortran code
Fixed double space after '=' in generated free form code
fix fortran line wrapping for human=False
FCodeGen should should use printer object, not fcode
Added loop functionality to FCodeGen
FCodePrinter: implemented code indentation for fortran free-form
Fixes failing test due to AttributeError in matplotlib version 0.91.2 (Is
Implemented Fortran declarations of array input arguments
Added property `dimensions' to Indexed
Implemented OutputArgument and InOutArgument in FCodeGen
Move loop generation into FCodePrinter + add declaration of local variabl
Fix leading blank in fortran codegen
Added test for FCodeGen with array arguments (matrix-vector product)
Added debug helper function for string comparison
I am in the planning phase of how to implement array arguments in the generated code. I have set up a page on the wiki where I am thinking aloud about the design. If you have any specific thoughts about code generation with Sympy, please feel free to make suggestions, add comments or even correct the ideas on the wiki.
I have now implemented and uploaded to github an improvement to the Fortran code printer which provides an option for free source form, as opposed to fixed source form. Relying on the free source form option, I also put together a working Fortran code generator.
The free-form option necessitated a few changes in the existing FCodePrinter. The most substantial change is that the line breaking code, which used to be a standalone function, is now a method of the printer instance. This was necessary to make the line breaking algorithm aware of the printer settings, in particular the new ‘source_format’ setting.
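As a small illustration of the two source forms, the sketch below uses the fcode function and its source_format setting as they are available in current Sympy releases. The branch described in this post may have exposed the option differently, so treat the call signature as an assumption.

```python
from sympy import cos, fcode, sin, symbols

x = symbols("x")
expr = sin(x)**2 + cos(x)

# Default is fixed source form: the statement starts in column 7.
fixed = fcode(expr, assign_to="y")

# Free source form: no reserved leading columns.
free = fcode(expr, assign_to="y", source_format="free")

print(repr(fixed))
print(repr(free))
```

Printing with repr makes the leading blanks of the fixed-form line visible.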
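The idea of a wrapping method that consults the printer settings can be sketched without Sympy at all. The class below is a hypothetical stand-in, not the actual FCodePrinter code: the class name, the wrap method and the width parameter are all invented for illustration. It relies only on two facts about Fortran: fixed form marks a continuation with a character in column 6, and free form marks it with a trailing `&` on the line being continued.

```python
class SketchFortranPrinter:
    """Hypothetical printer whose line wrapping depends on its settings."""

    def __init__(self, source_format="fixed"):
        if source_format not in ("fixed", "free"):
            raise ValueError("unknown source format: %r" % source_format)
        self.source_format = source_format

    def wrap(self, statement, width=72):
        """Break a statement at blanks so that no line exceeds `width`.

        A single token longer than `width` is not split (sketch limitation).
        """
        if self.source_format == "fixed":
            # Statement starts in column 7; '@' in column 6 continues a line.
            first, cont, trailer = " " * 6, " " * 5 + "@ ", ""
        else:
            # Free form: a trailing '&' continues the line.
            first, cont, trailer = "", " " * 6, " &"

        lines, prefix, words = [], first, []
        for word in statement.split():
            trial = prefix + " ".join(words + [word])
            # Reserve room for the continuation marker, if there is one.
            if words and len(trial) + len(trailer) > width:
                lines.append(prefix + " ".join(words) + trailer)
                prefix, words = cont, [word]
            else:
                words.append(word)
        lines.append(prefix + " ".join(words))
        return lines


statement = "y = " + " + ".join("a%d*x%d" % (k, k) for k in range(30))
for fmt in ("fixed", "free"):
    print("\n".join(SketchFortranPrinter(fmt).wrap(statement)))
```

Because wrap is a method, it can read self.source_format directly, which is the reason a standalone function was no longer enough.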
I used the existing CCodeGen class as a template for my implementation of the Fortran code generator. So with respect to functionality, it should be on par with the C code generator that I described earlier. Real life testing remains of course, and the C code generator is implemented with an explicit goal of following the ANSI C standard. I know that the Fortran generator works with Intel’s ifort compiler, but I have yet to check the standard compliance of the generated Fortran code.
Provided you have ifort in your $PATH, and have checked out the branch fortran_codegen from github, the compilation and correct execution of fortran code can be tested with the command:
During the next week, I hope that I will be able to implement array functionality for both the C and Fortran code generators.
I’ve been studying the codegen utility a bit more. According to ‘git blame’, the module is mainly written by Toon Verstraelen. I’d like to give him credit, because it is a very nice piece of code, and it seems like the perfect starting point of my project.
I especially like the straightforward, object oriented structure. There are classes to represent routines, data types, arguments and return values. The actual code generators are also organized in a class hierarchy, and the abstract and language specific elements seem to be cleanly separated. The clear structure makes the code very readable, and it appears easily extensible.
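To give a feeling for this kind of structure, here is a small sketch of my own. It is not the codegen module's actual code: every class name, attribute and type mapping below is hypothetical and chosen only to show how language agnostic and language specific parts can be kept apart.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Argument:
    name: str
    datatype: str = "float"  # language-neutral type tag (hypothetical)


@dataclass
class Routine:
    name: str
    arguments: List[Argument]
    result_type: str = "float"


class SketchCodeGen:
    """Language agnostic part: iterates over routines, delegates details."""

    type_names = {}

    def signature(self, routine):
        raise NotImplementedError

    def dump(self, routines):
        return "\n".join(self.signature(r) for r in routines)


class SketchCGen(SketchCodeGen):
    """C specific part: type names and prototype syntax."""

    type_names = {"float": "double", "int": "int"}

    def signature(self, routine):
        args = ", ".join(
            "%s %s" % (self.type_names[a.datatype], a.name)
            for a in routine.arguments
        )
        result = self.type_names[routine.result_type]
        return "%s %s(%s);" % (result, routine.name, args)


class SketchFGen(SketchCodeGen):
    """Fortran specific part: type names and function statement syntax."""

    type_names = {"float": "REAL*8", "int": "INTEGER*4"}

    def signature(self, routine):
        args = ", ".join(a.name for a in routine.arguments)
        result = self.type_names[routine.result_type]
        return "%s function %s(%s)" % (result, routine.name, args)


routine = Routine("f", [Argument("x"), Argument("y")])
print(SketchCGen().dump([routine]))
print(SketchFGen().dump([routine]))
```

Adding another target language in this layout means adding one subclass, while the Routine and Argument descriptions stay untouched.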
The current status of the utility can be summarized as:
- The utility can create compilable functions written in C.
- As far as I can see, it is restricted to scalar arguments and a single scalar return value.
- There are tests that will compile and run binaries created from several elementary functions and also some more complicated expressions.
For each of these points, there is a corresponding improvement that I need to implement in order to prepare codegen for the demands of the quantum physics framework:
- Implement a Fortran code generator. This is not essential, but it will certainly be useful. Implementing for two programming languages from the start will help me keep the language specific code separate from the language agnostic code.
- Extend the functionality to allow multidimensional arguments and return values. I do not know yet the specifics of the quantum module Matt is working on, but I’m pretty sure that numerical Quantum Mechanics will need to involve arrays in one way or another.
- Write tests for all new functionality.
Looking forward to a quantum pythonical adventure!
During the summer of 2010, I will work with my mentor Andy Terrel on a GSOC project to improve the automatic code generation features of Sympy. In this blog I will write about the progress, starting out with a short presentation of my project and my personal motivation.
The aim of my project this summer is to enable Sympy to generate directly compilable code that implements equations relevant to Computational Physics and Chemistry. This would be very useful, because it is increasingly the case that the precision of scientific calculations is limited not by the available computing power, but rather by the huge amount of work needed to manually derive and code equations.
There are already some important pieces of code generation functionality in Sympy, and I intend to build upon them as much as possible. There are code printers for Fortran and C that will turn math into code, and there is also a codegen utility that aims to provide features similar to the goals of my project. I am not sure how much the codegen utility can do right now, but according to comments in the code, there is still a lot to do. I will have to study it in more detail, but at least I know that my project is not already implemented 😉 .
That’s it for now. I hope you’ll come back to read about my progress next week!