Savitzky-Golay Filter to smooth noisy data

The following code will filter noisy data…

SGKernel[left_?NonNegative, right_?NonNegative, degree_?NonNegative, derivative_? NonNegative] :=
Module[{i, j, k, l, matrix, vector},
    matrix = Table[  (* matrix is symmetric *)
        l = i + j;
        If[l == 0,
            left + right + 1,
        (*Else*)
            Sum[k^l, {k, -left, right}]
        ],
        {i, 0, degree},
        {j, 0, degree}
    ];
    vector = LinearSolve[
        matrix, 
        MapAt[1&, Table[0, {degree+1}], derivative+1]
    ];
    (* vector = Inverse[matrix][[derivative + 1]]; *)
    Table[
        vector.Table[If[i == 0, 1, k^i], {i, 0, degree}],
        {k, -left, right}
    ]
] /; derivative <= degree <= left+right


SGSmooth[list_?VectorQ, window_, degree_, derivative_:0]:=
    Module[{pairs},
        pairs = MapThread[List, {Range[Length[list]], list}];
        Map[Last, SGSmooth[pairs, window, degree, derivative]]
    ] 


SGSmooth[list_?MatrixQ, window_, degree_, derivative_:0]:=
    Module[{kernel, list1, list2, margin, space, smoothData},

        (* determine a symmetric margin at the ends of the raw dataset.
        The window width is split in half to make a symmetric window 
        around a data point of interest *)
        margin = Floor[window/2];

        (* take only the 1st column of data in the list to be smoothed (the
        independant Values) and extract the data from the list by removing
        half the window width 'i.e., margin' from the ends of the list *)
        list1 = Take[Map[First, list], {margin + 1, Length[list] - margin}];

        (* take only the 2nd column of data in the list to be smoothed 
        (the dependent Values) and Map them into list2 *)
        list2 = Map[Last, list];

        (* get the kernel coefficients for the left and right margins, the 
           degree, and the requested derivative *)
        kernel = SGKernel[margin, margin, degree, derivative];

        (* correlation of the kernel with the list of dependent values *)
        list2 = ListCorrelate[kernel, list2];

        (* Data _should_ be equally spaced, but... calculate spacing anyway by getting
           the minimum of all the differences in the truncated list1, remove the first 
           and last points of list1 *)
        space = Min[Drop[list1, 1] - Drop[list1, -1]];

        (* condition the dependant values based on spacing and the derivative *)
        list2 = list2*(derivative!/space^derivative);

        (* recombine the correlated (x-y) data pairs (that is list1 and list2), 
        put these values back together again to form the smooth data list *)
        smoothData=Transpose[{list1, list2}]


    ] /; derivative <= degree <= 2*Floor[window/2] && $VersionNumber >= 4.0

I did not apply this to your data, but you can do that later. This example is applied to noisy random data.

Using a noisy sine data function…

dataFunction[x_] := Sin[x] + Random[Real, {-π, π}];

Build a table of noisy tabular data from $0$ to $2\pi$…

dataTable = Table[{x, dataFunction[x]}, {x, 0, 2 π, .01}];

Animate the smoothing operations. Notice the smoothed dataset shrinks with increasing 'window width'. This is an artifact of the ListCorrelate function used in the SGSmooth function. ListCorrelate uses an end buffered dataset.

NOTE: The red line is the filtered data set…

 Manipulate[
 If[showRawData,
  Show[
   ListPlot[dataTable, PlotRange -> {{0, 2 π}, {-5.0, 5.0}}],
   ListPlot[
    {
     SGSmooth[dataTable, win, order]
     },
    PlotRange -> {{0, 2 π}, {-5.0, 5.0}},
    PlotStyle -> {{Red, Thick}, {Green, Thick}},
    Joined -> True]
   ], (* ELSE just plot smooth data *)
  ListPlot[
   {
    SGSmooth[dataTable, win, order]
    },
   PlotRange -> {{0, 2 π}, {-5.0, 5.0}},
   PlotStyle -> {{Red, Thick}, {Green, Thick}},
   Joined -> True]
  ],

 {{win, 100, "window width"}, 2, 300, 1, 
  Appearance -> "Labeled"}, {{order, 1, "order of polynomial"}, 1, 9, 
  1, Appearance -> "Labeled"},
 {{showRawData, True, "Raw Data"}, {True, False}},
 SaveDefinitions -> True
 ]

This will create a Manipulate similar to the following:

demo

Hope this helps!


I'm just posting this to record for posterity something I posted in the chatroom a not-so-long time ago. As I noted there, the following routine will only do smoothing; I had a more general routine for generating the differentiation coefficients, but I still have not been able to find it. (For the more general, but less compact version, see below.) As with Virgil's method (the one in Alexey's answer), this is based on Gorry's procedure (though I have traced the spirit of the algorithm going as far back as Hildebrand's book):

GramP[k_Integer, m_Integer, t_Integer] :=
     (-1)^k HypergeometricPFQ[{-k, k + 1, -t - m}, {1, -2 m}, 1]

SavitzkyGolay[n_Integer, m_Integer, t_Integer] := 
        Table[Sum[(Binomial[2 m, k]/Binomial[2 m + k + 1, k + 1])
                  GramP[k, m, i] GramP[k, m, t] (1 + k/(k + 1)), {k, 0, n}, 
                  Method -> "Procedural"], {i, -m, m}]
SavitzkyGolay[n_Integer, m_Integer] := Table[SavitzkyGolay[n, m, t], {t, -m, m}]

Usage is pretty straightforward: n is the order of the polynomial smoothing; 2 m + 1 is the window size, and t tells how much to shift the window.


Added 12/17/2015

Here is a faster routine for evaluating the Gram polynomial, using some undocumented functionality:

GramP[k_Integer, m_Integer, t_Integer] :=
      (-1)^k Internal`DCHypergeometricPFQ[k, {-k, k + 1, -m - t}, {1, -2 m}, 1]

I managed to finally recover the general SG routine I once wrote through the kind assistance of a friend. To share my joy, I now release this to you:

Options[SavitzkyGolay] = {Derivative -> 0, WorkingPrecision -> Infinity};

SavitzkyGolay[n_Integer?Positive, m_Integer?Positive, t_Integer,
              OptionsPattern[]] /; 1 < n < 2 m + 1 && -m <= t <= m := 
Module[{o = OptionValue[Derivative], c, s, h, p, q, u, v, w},
       u = UnitVector[o + 1, 1]; v = ConstantArray[0, o + 1]; 
       c = 1/(2 m + 1); s = Join[{Boole[o == 0] c},
       Table[h = 0;
             {p, q} = {2 (2 k - 1), (k - 1) (2 m + k)}/(k (2 m - k + 1));
             Do[w = u[[j]]; (* evaluate Gram polynomial and derivatives *)
                u[[j]] = p (t w + (j - 1) h) - q v[[j]];
                v[[j]] = h = w,
                {j, Min[k, o] + 1}];
             c *= (2 m - k + 1) (1 + 1/k)/(2 m + k + 1);
             c (1 + k/(k + 1)) u[[o + 1]],
             {k, n}]];
       Table[h = s[[n]] + 2 (2 n - 1) (p = s[[n + 1]]) j/(n (2 m - n + 1));
             Do[q = p; p = h; (* Clenshaw's recurrence *)
                h = s[[k]] + 2 (2 k - 1) p j/(k (2 m - k + 1)) -
                    k (2 m + k + 1) q/((k + 1) (2 m - k)),
                {k, n - 1, 1, -1}];
                N[h, OptionValue[WorkingPrecision]], {j, -m, m}] // 
    Developer`ToPackedArray];

SavitzkyGolay[n_Integer?Positive, m_Integer?Positive, opts___?OptionQ] /; 
  1 < n < 2 m + 1 := 
 Developer`ToPackedArray[Table[SavitzkyGolay[n, m, t, opts], {t, -m, m}]]

As advertised, it uses no matrices, and instead uses the recurrence relation of the Gram polynomial. If need be, the guts of the routine can be embedded within a Compile[].


Added 12/17/2015

Altho SavitzkyGolayMatrix[] is now built-in in version 10, it is only limited to producing the "central" coefficients, as opposed to the routine SavitzkyGolay[] in this answer that can also generate coefficients for the left and right ends.

SavitzkyGolayMatrix[{2}, 3, 1, WorkingPrecision -> ∞]
   {1/12, -2/3, 0, 2/3, -1/12}

SavitzkyGolay[3, 2, 0 (* central *), Derivative -> 1]
   {1/12, -2/3, 0, 2/3, -1/12}

In general, the result of SavitzkyGolayMatrix[] is built from appropriate outer products of coefficient lists.

SavitzkyGolayMatrix[{3, 4}, {2, 3}, WorkingPrecision -> ∞] ===
Outer[Times, SavitzkyGolay[2, 3, 0], SavitzkyGolay[3, 4, 0]]
   True

Several years ago in related MathGroups thread Virgil P. Stokes suggested:

A few years back I wrote a Mathematica notebook that shows how one can obtain the SG smoother from Gram polynomials. The code is not very elegant; but, it is a rather general implementation that should be easy to understand. Contact me if you are interested and I will be glad to forward the notebook to you.

I contacted him and received the notebook. I find his implementation of the Savitzky-Golay filter quite stable and working pretty well. Here I publish it with his permission:

Clear[m, i]; (* m, i are global variables !! *)
Clear[GramPolys, LSCoeffs, SGSmooth];

GramPolys[mm_, nmax_] := 
  Module[{k, m = mm},  (* equations (1a), (1b) *)
      (* Define recursive equation for Gram polynomials as a function of m,i for degrees 0,1,...,nmax *)
      p[m, 0, i] = 1; 
      p[m, -1, i] = 0;    
      p[m_, k_, i_] := 
    p[m, k, i] = 2*(2*k - 1)/(k*(2*m - k + 1))*i*p[m, k - 1, i] - 
                                         (k - 
         1)*(2*m + k)/(k*(2*m - k + 1))*p[m, k - 2, i];

      (* Return coefficients for degrees 0,1,...,nmax in a list *) 
      Table[p[mm, k, i] // FullSimplify, {k, 0, nmax}]    
   ]; 

LSCoeffs[m_, n_, d_] := 
  Module[{k, j, sum, clist, polynomial, cclist},   
   polynomial = GramPolys[m, n];
   clist = {};
   Do[(* points in each sliding window *)
    sum = 0;
    Do[ (* degree loop *)
         num = (2 k + 1) FactorialPower[2 m, k];
         den = FactorialPower[2 m + k + 1, k + 1];
         t1 = polynomial[[k + 1]] /. {i -> j};
         t2 = polynomial[[k + 1]]; 
         sum = sum + (num/den)*t1*t2 // FullSimplify;
         (*Print["k,polynomial[[k+1]]: ",k,", ",polynomial[[k+1]]];*)
     , {k, 0, n}];
       clist = Append[clist, sum];
    , {j, -m, m}];
   Table[D[clist, {i, d}] /. {i -> j}, {j, -m, m}]
   ];

SGSmooth[cc_, data_] := Module[{m, y, datal, datar, k, kk, n, yy},

   n = Length[data];
   m = (Length[cc] - 1)/2;

   (* Left end  --- first 2*m+1 points used *)
   datal = Take[data, 2*m + 1];
   (* Smooth first m points (1,2,...,m-1,m) *)
   kk = 0;
   Table[(kk = kk + 1; 
     y[k] = ListConvolve[Reverse[cc[[kk]]], datal][[1]]), {k, -m, -1}];

   (* Smooth central points (m+1,m+2,...n-m-1) *)
   y[0] = ListConvolve[Reverse[cc[[m + 1]]], data];

   (* Right end --- last 2*m+1 points used *)
   datar = Take[data, {n - (2*m + 1) + 1, n}];
   (* Smooth last m points (n-m,n-m+1,...,n) *)
   kk = m + 1;
   Table[(kk = kk + 1; 
     y[k] = ListConvolve[Reverse[cc[[kk]]], datar][[1]]), {k, 1, m}];

   (* And now we concatenate the front-end, central, and back-
   end estimated data values *)
   yy = Join[Table[y[k], {k, -m, -1}], y[0], Table[y[k], {k, 1, m}]]
   ];
Usage: SGOutput = SGSmooth[LSCoeffs[m,n,d], data]
Inputs: 
   m   =  half-width of smoothing window; i.e., 2m+1 points in smoothing kernel
   n   =  degree of LS polynomial (n < 2m+1)
   d   =  order of derivative (d =0, smoother; d = 1, 1st derivative; ...)
   data = list of uniformly sampled (spaced) data values to be smoothed (length(data) >=2m+1)  
Outputs:
   SGOutput =  list of smoothed data values