Detecting and fixing outliers in a GPS trajectory

Algorithm I use.

  1. Calculate Euclidean minimum spanning tree of points:

enter image description here

  1. Find 2 points most far apart from each other on this network

enter image description here

  1. Find shortest route between them:

enter image description here

As one can see it might cut corner on a sharp turn.

I have ArcGIS python implementation of above algorithm, it uses networkx module. Let me know if this is of interest and I'll update my answer with the script

UPDATE:

# Connects points to make polyline. Makes 1 line at a time
# Tool assumes that 1st layer in Table of Conternt is TARGET polyline feature class,
# second layer in TOC is SOURCE point fc.
# If no selection found in SOURCE layer, works on entire dataset

import arcpy, traceback, os, sys
import itertools as itt
from math import sqrt
sys.path.append(r'C:\Users\felix_pertziger\AppData\Roaming\Python\Python27\site-packages')
import networkx as nx
from networkx import dijkstra_path_length

try:
    def showPyMessage():
        arcpy.AddMessage(str(time.ctime()) + " - " + message)
    def CheckLayerLine(infc):
        d=arcpy.Describe(infc)
        theType=d.shapeType
        if theType!="Polyline":
            arcpy.AddWarning("\nTool designed to work with polylines as TARGET!")
            raise NameError, "Wrong input\n"
        return d
    def CheckLayerPoint(infc):
        d=arcpy.Describe(infc)
        theType=d.shapeType
        if theType!="Point":
            arcpy.AddWarning("\nTool designed to work with points as SOURCE!")
            raise NameError, "Wrong input\n"
        return d
    mxd = arcpy.mapping.MapDocument("CURRENT")
    layers = arcpy.mapping.ListLayers(mxd)
    if len(layers)<=1:
        arcpy.AddWarning("\nNot enough layers in the view!")
        raise NameError, "Wrong input\n"
    destLR, sourceLR=layers[0],layers[1]
    a = CheckLayerPoint(sourceLR);d = CheckLayerLine(destLR)

#  copy all points to manageable list
    g=arcpy.Geometry()
    geometryList=arcpy.CopyFeatures_management(sourceLR,g)
    nPoints=len(geometryList)
    arcpy.AddMessage('Computing minimum spanning tree')
    list2connect=[p.firstPoint for p in geometryList]
#  create network    
    p=list(itt.combinations(range(nPoints), 2))
    arcpy.SetProgressor("step", "", 0, len(p),1)
    G=nx.Graph()
    for f,t in p:
        p1=list2connect[f]
        p2=list2connect[t]
        dX=p2.X-p1.X;dY=p2.Y-p1.Y
        lenV=sqrt(dX*dX+dY*dY)
        G.add_edge(f,t,weight=lenV)
        arcpy.SetProgressorPosition()
    arcpy.AddMessage(len(G.edges()))
    mst=nx.minimum_spanning_tree(G)
    del G

#  find remotest pair
    arcpy.AddMessage(len(mst.edges()))
    length0=nx.all_pairs_dijkstra_path_length(mst)
    lMax=0
    for f,t in p:
        lCur=length0[f][t]
        if lCur>lMax:
            lMax=lCur
            best=(f,t)
    gL=nx.dijkstra_path(mst,best[0],best[1])
    del mst
    nPoints=len(gL)
    ordArray=arcpy.Array()
    for i in gL: ordArray.add(list2connect[i])

#  append line to TARGET
    curT = arcpy.da.InsertCursor(destLR,"SHAPE@")
    curT.insertRow((arcpy.Polyline(ordArray),))
    arcpy.RefreshActiveView()
    del curT

except:
    message = "\n*** PYTHON ERRORS *** "; showPyMessage()
    message = "Python Traceback Info: " + traceback.format_tb(sys.exc_info()[2])[0]; showPyMessage()
    message = "Python Error Info: " +  str(sys.exc_type)+ ": " + str(sys.exc_value) + "\n"; showPyMessage()            

One idea is to create a script that lists the angles (and maybe the length also) of every segment of your path. Now you can compare the values of every segment with its direct neighbours (and possibly the second neighbours also to increase accuracy) and select all those points where the values exceed a given threashold-value. Finally simply delete the points from your path.


Also worth looking at is the Median-5 method.

Each x (or y) coordinate is set to the median of the 5 x (or y) values around it in sequence (i.e. itself, the two previous values and the two subsequent values).

e.g. x3 = median(x1,x2,x3,x4,x5) y3 = median(y1,y2,y3,y4,y5) etc.

Method is quick and is also easy to use on streaming data.