# Scipy: Speeding up calculation of a 2D complex integral

You can gain a factor of about 10 in speed by using Cython, see below:

```
In [87]: %timeit cythonmodule.doit(lam=lam, y0=y0, zxp=zxp, z=z, k=k, ra=ra)
1 loops, best of 3: 501 ms per loop
In [85]: %timeit doit()
1 loops, best of 3: 4.97 s per loop
```

This is probably not enough, and the bad news is that this is probably quite close (maybe factor of 2 at most) to everything-in-C/Fortran speed --- if using the same algorithm for adaptive integration. (scipy.integrate.quad itself is already in Fortran.)

To get further, you'd need to consider different ways to do the integration. This requires some thinking --- can't offer much from the top of my head now.

Alternatively, you can reduce the tolerance up to which the integral is evaluated.

# Do in Python # # >>> import pyximport; pyximport.install(reload_support=True) # >>> import cythonmodule cimport numpy as np cimport cython cdef extern from "complex.h": double complex csqrt(double complex z) nogil double complex cexp(double complex z) nogil double creal(double complex z) nogil double cimag(double complex z) nogil from libc.math cimport sqrt from scipy.integrate import dblquad cdef class Params: cdef public double lam, y0, k, zxp, z, ra def __init__(self, lam, y0, k, zxp, z, ra): self.lam = lam self.y0 = y0 self.k = k self.zxp = zxp self.z = z self.ra = ra @cython.cdivision(True) def integrand_real(double x, double y, Params p): R1 = sqrt(x**2 + (y-p.y0)**2 + p.z**2) R2 = sqrt(x**2 + y**2 + p.zxp**2) return creal(cexp(1j*p.k*(R1-R2)) * (-1j*p.z/p.lam/R2/R1**2) * (1+1j/p.k/R1)) @cython.cdivision(True) def integrand_imag(double x, double y, Params p): R1 = sqrt(x**2 + (y-p.y0)**2 + p.z**2) R2 = sqrt(x**2 + y**2 + p.zxp**2) return cimag(cexp(1j*p.k*(R1-R2)) * (-1j*p.z/p.lam/R2/R1**2) * (1+1j/p.k/R1)) def ymax(double x, Params p): return sqrt(p.ra**2 + x**2) def doit(lam, y0, k, zxp, z, ra): p = Params(lam=lam, y0=y0, k=k, zxp=zxp, z=z, ra=ra) rr, err = dblquad(integrand_real, -ra, ra, lambda x: -ymax(x, p), lambda x: ymax(x, p), args=(p,)) ri, err = dblquad(integrand_imag, -ra, ra, lambda x: -ymax(x, p), lambda x: ymax(x, p), args=(p,)) return rr + 1j*ri

Have you considered multiprocessing (multithreading)? It seems that you don't have a need to do a final integration (over the whole set) so simple parallel processing might be the answer. Even if you did have to integrate, you can wait for running threads to finish computation before doing the final integration. That is, you can block the main thread until all workers have completed.

http://docs.python.org/2/library/multiprocessing.html