Find time shift between two similar waveforms

For real-valued signals, this function is probably more efficient than one based on the full complex FFT. It uses rfft and zero-pads both inputs to a power of 2 large enough to ensure linear (i.e. non-circular) correlation:

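import numpy as np

def rfft_xcorr(x, y):
    # Length of the full linear cross-correlation
    M = len(x) + len(y) - 1
    # Round up to a power of 2 so the circular result cannot wrap around
    N = 2 ** int(np.ceil(np.log2(M)))
    X = np.fft.rfft(x, N)
    Y = np.fft.rfft(y, N)
    cxy = np.fft.irfft(X * np.conj(Y))
    # Keep lags 0..len(x)-1, then lags -(len(y)-1)..-1; drop the padding
    cxy = np.hstack((cxy[:len(x)], cxy[N-len(y)+1:]))
    return cxy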

The return value is length M = len(x) + len(y) - 1 (hacked together with hstack to remove the extra zeros from rounding up to a power of 2). The non-negative lags are cxy[0], cxy[1], ..., cxy[len(x)-1], while the negative lags are cxy[-1], cxy[-2], ..., cxy[-len(y)+1].
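One way to sanity-check that layout is to sort the output by lag, so it runs from -(len(y)-1) up to len(x)-1, and compare it with np.correlate in 'full' mode. The sketch below repeats rfft_xcorr so that it runs on its own; the helper name lag_axis and the test arrays are made up for illustration.

```python
import numpy as np

def rfft_xcorr(x, y):
    # Same function as above, repeated so this snippet is self-contained
    M = len(x) + len(y) - 1
    N = 2 ** int(np.ceil(np.log2(M)))
    cxy = np.fft.irfft(np.fft.rfft(x, N) * np.conj(np.fft.rfft(y, N)))
    return np.hstack((cxy[:len(x)], cxy[N - len(y) + 1:]))

def lag_axis(x, y):
    # Hypothetical helper: the lag each element of rfft_xcorr(x, y) refers to
    return np.hstack((np.arange(len(x)), np.arange(-(len(y) - 1), 0)))

x = np.array([2.0, -3.0, 9.0, 1.5, 3.0, 4.5, 6.0, 7.5])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

cxy = rfft_xcorr(x, y)
lags = lag_axis(x, y)

# Sort by lag so the result lines up with np.correlate(x, y, mode="full")
order = np.argsort(lags)
direct = np.correlate(x, y, mode="full")
agrees = np.allclose(cxy[order], direct)
peak_lag = int(lags[np.argmax(cxy)])
print(agrees, peak_lag)
```

Reading the peak off lags rather than off the raw index avoids the separate negative-lag branch.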

To match a reference signal, I'd compute rfft_xcorr(x, ref) and look for the peak. For example:

def match(x, ref):
    cxy = rfft_xcorr(x, ref)
    index = np.argmax(cxy)
    if index < len(x):
        return index
    else: # negative lag
        return index - len(cxy)   

In [1]: ref = np.array([1,2,3,4,5])
In [2]: x = np.hstack(([2,-3,9], 1.5 * ref, [0,3,8]))
In [3]: match(x, ref)
Out[3]: 3
In [4]: x = np.hstack((1.5 * ref, [0,3,8], [2,-3,-9]))
In [5]: match(x, ref)
Out[5]: 0
In [6]: x = np.hstack((1.5 * ref[1:], [0,3,8], [2,-3,-9,1]))
In [7]: match(x, ref)
Out[7]: -1

It's not a robust way to match signals, but it is quick and easy.


If one signal is a time-shifted copy of the other, you will see a peak in their cross-correlation. Since computing the correlation directly is expensive for long signals, it is better to use the FFT. So, something like this should work:

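import numpy as np

af = np.fft.fft(a)
bf = np.fft.fft(b)
# Circular cross-correlation (a and b must have the same length)
c = np.fft.ifft(af * np.conj(bf))

# Index of the correlation peak, in the range 0..len(a)-1
time_shift = np.argmax(np.abs(c))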
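Because this correlation is circular, the argmax is an index between 0 and N-1, and indices past the midpoint really stand for negative shifts. A minimal sketch of unwrapping it into a signed shift, assuming equal-length inputs and NumPy's np.fft; the function name circular_shift and the test signal are hypothetical:

```python
import numpy as np

def circular_shift(a, b):
    """Signed circular lag k that maximizes |sum_n a[n + k] * conj(b[n])|."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    c = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b)))
    k = int(np.argmax(np.abs(c)))
    # Indices above n/2 wrap around to negative lags
    return k - n if k > n // 2 else k

ref = np.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0], dtype=float)
delayed = np.roll(ref, 3)    # ref delayed by 3 samples
advanced = np.roll(ref, -3)  # ref advanced by 3 samples

print(circular_shift(delayed, ref))   # 3
print(circular_shift(advanced, ref))  # -3
```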

SciPy provides a correlation function, signal.correlate, which works fine for small inputs and is also what you want for non-circular correlation, where the signal does not wrap around. Note that with mode='full', the array returned by signal.correlate has length len(a) + len(b) - 1, so zero lag sits at index (signal size - 1 = 20) and the value from argmax is offset by that amount from what you seem to expect.

from scipy import signal, fftpack
import numpy
a = numpy.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = numpy.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])
numpy.argmax(signal.correlate(a, b))  # -> 16
numpy.argmax(signal.correlate(b, a))  # -> 24

The two different values correspond to whether the shift is in a or b.
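To turn those indices into signed lags, subtract the zero-lag index, which is the length of the second argument minus one. The sketch below uses numpy.correlate in place of signal.correlate so that it depends only on NumPy (the two agree for real 1-D inputs in 'full' mode); the names lag_ab and lag_ba are introduced here for illustration.

```python
import numpy

a = numpy.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0])
b = numpy.array([0, 0, 0, 0, 0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0])

# In 'full' mode, index len(second) - 1 corresponds to zero lag
lag_ab = int(numpy.argmax(numpy.correlate(a, b, mode="full"))) - (len(b) - 1)
lag_ba = int(numpy.argmax(numpy.correlate(b, a, mode="full"))) - (len(a) - 1)

print(lag_ab)  # -4
print(lag_ba)  # 4
```

The two results have the same magnitude and opposite signs: b is a delayed by 4 samples, which is the same as a being b advanced by 4.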

If you want circular correlation, or the signals are large, you can use the convolution theorem (via the Fourier transform), with the caveat that correlation is very similar, but not identical, to convolution.

A = fftpack.fft(a)
B = fftpack.fft(b)
Ar = A.conjugate()
Br = B.conjugate()
numpy.argmax(numpy.abs(fftpack.ifft(Ar*B)))  # -> 4
numpy.argmax(numpy.abs(fftpack.ifft(A*Br)))  # -> 17

Again, the two values correspond to whether you're interpreting the result as a shift in a or a shift in b.

The conjugation is needed because convolution flips one of the functions, whereas correlation does not. You can undo the flip either by reversing one of the signals before taking the FFT, or by taking the FFT of the signal and then conjugating it. For a real signal the reversal has to be circular (index 0 stays in place), so the following is true: Ar = A.conjugate() = fft(numpy.roll(a[::-1], 1))
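As a numerical check on how conjugation relates to time reversal, the sketch below compares the conjugated spectrum of a real signal with the FFT of its circularly reversed copy. It uses numpy.fft rather than fftpack so it depends only on NumPy, and the names a_rev and matches are introduced here for illustration.

```python
import numpy

a = numpy.array([0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4, 3, 2, 1, 0, 0, 0, 0, 0],
                dtype=float)

A = numpy.fft.fft(a)

# Circular time reversal: a_rev[m] = a[(-m) % N], so index 0 stays in place
a_rev = numpy.roll(a[::-1], 1)

# For real input, conjugating the spectrum equals transforming a_rev
matches = numpy.allclose(A.conjugate(), numpy.fft.fft(a_rev))
print(matches)
```

A plain a[::-1] without the roll differs from this by a one-sample circular shift, which shows up in the spectrum as a linear phase factor.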