In my previous post I demonstrated array operations based on the new functionality I have implemented during my GSOC project. In this post I will discuss another feature that is essential for array calculations in a computer algebra system: The initialization of numeric arrays *based on symbolic math expressions*. After all, it is the symbolic stuff we are really interested in, isn’t it? I will demonstrate how the automatic compilation and wrapping can be used to setup Numpy arrays with fast Hydrogen wave functions.

Again, it will be interesting to see how we compare with the array initialization features of numpy. One lesson I learned while preparing my previous post is that the timing results depend heavily on the compiler. To avoid the scaling issues we observed with the matrix-matrix product last time, I will do all examples on a computer with ifort 11.0. The Numpy installation on that computer is version 1.1.0. The warning applies again: I report the timing results I get on my computer, but this is not a rigorous benchmark. Also, I may not be using Numpy in the optimal way. If you know how I could make it run faster, I’d love hear about it.

### Introductory example: linspace

First, lets try to create our own compiled linspace function that we can compare with numpy.linspace. The Numpy function creates an array of evenly spaced numbers, like this:

In [1]: import numpy as np In [2]: np.linspace(0, 1, 5) Out[2]: array([ 0. , 0.25, 0.5 , 0.75, 1. ])

The same functionality can be created by autowrapping a Sympy expression that maps the array index to an interval on the real axis:

In [3]: from sympy.utilities.autowrap import autowrap In [4]: a, b = symbols('a b') In [5]: i = Idx('i', m) In [6]: linearmap = a + (i - 1)*(a - b)/(1 - m) In [7]: x = IndexedBase('x') In [8]: linspace = autowrap(Eq(x[i], linearmap)) In [9]: linspace(0, 1, 5) Out[9]: [ 0. 0.25 0.5 0.75 1. ]

So, it worked. The array has been initialized according to the linear map defined by the Sympy expression at In[6]. In order to map the index to the interval correctly one must remember that the index of a Fortran array starts at 1, while C arrays start at 0. This must be be taken into account when the initialization expression is defined, so a slightly different expression must be used with the C backend.

Now, you may be wondering how I could know the order of the arguments in the wrapped linspace function, and why it was so conveniently compatible with the interface for numpy.linspace. The answer, is that unless autowrap gets the optional keyword argument ‘args’, the arguments are sorted according to their string representation. I carefully chose the symbols in the linear map so that my linspace would automatically get the same argument sequence as Numpy’s.

In [14]: %timeit np.linspace(0, 5, 100000) 1000 loops, best of 3: 916 µs per loop In [15]: %timeit linspace(0, 5, 100000) 1000 loops, best of 3: 381 µs per loop

Sweet! Evenly spaced numbers are of course very useful, but let us also try something a bit more juicy.

### Universal hydrogen functions

One of the really useful concepts in Numpy is the “universal function” or ufunc. To put it short, ufuncs operate on arrays by applying a scalar function elementwise. It is actually more than that, as numpy ufuncs are required to support type casting, broadcasting and more, but we will ignore that and focus on the following quote from the Numpy docs:

That is, a ufunc is a “vectorized” wrapper for a function that takes a fixed number of scalar inputs and produces a fixed number of scalar outputs.

Can we make such functions with the autowrap framework? As you can probably guess, the answer is positive, so the question is rather: can we beat Numpy on speed? Let’s find out.

Very recently, Sympy got functionality to generate radial electronic wave functions for the Hydrogen atom. That is, quantum mechanical solutions to the attractive Coulomb potential. These are well known, and long since tabulated functions, that are routinely used in quantum mechanics. Wouldn’t it be nice to have the Hydrogen wave functions available as super fast ufunc-like binary functions?

The failsafe way to build a fast binary function that works element-wise on an array, is to construct a symbolic Lambda function that contains the initialization expression. The Lambda instance should then be attached to a regular Sympy Function. This is done with a call to implemented_function(‘f’, Lambda(…)) as in the following:

In [2]: from sympy.utilities.lambdify import implemented_function In [3]: from sympy.physics.hydrogen import R_nl In [4]: a, r = symbols('a r') In [5]: psi_nl = implemented_function('psi_nl', Lambda([a, r], R_nl(1, 0, a, r))) In [6]: psi_nl(a, r) Out[6]: psi_nl(a, r) In [7]: psi_nl._imp_(a, r) Out[7]: -r ____ -- / 1 a 2* / -- *e / 3 \/ a

As you would guess from Out[6], psi_nl is just a regular Sympy Function, but with an additional attribute, _imp_, that works as displayed in In[7]. The implemented_function trick landed in Sympy just a few weeks ago. It was intended for numeric implementations, but in my autowrap3 branch, I (ab)use it to store a symbolic Lambda instead.

In [8]: x = IndexedBase('x') In [9]: y = IndexedBase('y') In [10]: i = Idx('i', m) In [11]: hydrogen = autowrap(Eq(y[i], psi_nl(a, x[i])), tempdir='/tmp/hydro')

The argument ‘tempdir’ tells autowrap to compile the code in a specific directory, and leave the files intact when finished. Checking the Fortran source code reveals that the wave function is calculated like this:

do i = 1, m y(i) = 2*sqrt(a**(-3))*exp(-x(i)/a) end do

The implemented function has been printed as an *inline* expression, avoiding a costly function call in the loop. (If you wonder why I didn’t just print the expression directly, the explanation is here.) Now that we have the functions available, this is how they look:

The parameters and are quantum numbers related to the electron’s energy and orbital angular momentum respectively.

The autowrapped Hydrogen functions can also be compared with numpy equivalents, as all waves are expressed in terms of functions that are universal in numpy. We use Sympy’s ‘lambdify’ to create a Python lambda function that calls the relevant ufuncs from numpy’s namespace:

In [31]: from sympy.utilities.lambdify import lambdify In [32]: psi_lambda = lambdify([a, r], R_nl(1, 0, a, r), 'numpy') In [33]: grid = np.linspace(0,3,100) In [34]: np.linalg.norm(psi_lambda(0.5, grid) - hydrogen(0.5, grid)) Out[34]: 2.21416777433e-15

But there are many solutions to the Hydrogen atom, so let’s study them a bit more systematically. The following is the output from a loop over the energy quantum number . I used only the s-wave functions , corresponding to an electron that moves straight through the nucleus and out on the other side before turning back again to repeat the cycle. For each , I calculated:

- %timeit for application of lambda function with Numpy ufuncs
- %timeit for application of the compiled and wrapped Sympy function
- the absolute difference between the solutions:

Here is the output:

n = 1 Numpy: 100000 loops, best of 3: 16.8 µs per loop Sympy: 1000000 loops, best of 3: 1.92 µs per loop difference: 1.07708980623e-14 n = 2 Numpy: 10000 loops, best of 3: 30 µs per loop Sympy: 1000000 loops, best of 3: 2 µs per loop difference: 5.13651621237e-15 n = 3 Numpy: 10000 loops, best of 3: 48.1 µs per loop Sympy: 100000 loops, best of 3: 3.37 µs per loop difference: 2.28896762655e-15 n = 4 Numpy: 10000 loops, best of 3: 70.9 µs per loop Sympy: 100000 loops, best of 3: 2.7 µs per loop difference: 6.18114500556e-15 n = 5 Numpy: 10000 loops, best of 3: 116 µs per loop Sympy: 100000 loops, best of 3: 4.61 µs per loop difference: 5.68967616077e-15 n = 6 Numpy: 10000 loops, best of 3: 142 µs per loop Sympy: 100000 loops, best of 3: 5.02 µs per loop difference: 1.21523884705e-14 n = 7 Numpy: 10000 loops, best of 3: 183 µs per loop Sympy: 100000 loops, best of 3: 5.95 µs per loop difference: 1.04406500806e-14

Awesome! For we beat Numpy by a factor of 30. But the really interesting thing here is the *scaling* with expression complexity. Look at this:

For higher the wave function expressions increase in complexity, and so does the execution time of both Numpy and Sympy. However, while the autowrapped Sympy expression needs 3 times longer for than for , the execution time for a lambda of native Numpy ufuncs increases tenfold!

This last figure shows the linear relation between execution times of the compiled Sympy expressions and native Numpy lambda function. The plot supports the idea that the it is the inherent complexity of the mathematical expression that determines the execution time. By extrapolation, we can expect that for an expression that would take Sympy 9 times longer, Numpy would be hit by a factor of 100.

The main reason for the speedup, is that while Numpy calculates the wave function by applying a series of ufuncs, the Sympy way is to create a new “ufunc” and apply it once. It would be very cool to try this in an iterative solution scheme. With a super fast ufunc in a single python loop, one could do complex and cpu-intensive calculations with respectable performance from within isympy. That would be awesome!