Why is SQL Server's STDistance Very Slightly Different Than The Vincenty Formula? (Same Up To ~0.0001km)

SQL Server's STDistance function does not compute geodesic distances. Its algorithm essentially generates straight lines in 3D, densifies them and maps them on the ellipsoid in a manner described more specifically as such:

[...] we have chosen to parameterize the ellipsoid with a mapping from a three dimensional space (with the exception of the origin) which we call the space of directions. The mapping takes any nonzero vector (, , ) to the unique point (, , ) on the ellipsoid on which (, , ) is an outward pointing surface normal.

(From Geometric algorithms on an ellipsoid earth model, by M. Kallay)

According to Kallay, the computed edges are actually equivalent to great elliptic arcs, and the distances it yields are indeed much closer to great ellipse distances, and definitely aren't geodesics by the nature of the algorithm. Detail of this algorithm can be found in Geometric algorithms on an ellipsoid earth model, by M. Kallay. It seems that the Microsoft SQL Server documentation recognizes that the values aren't quite equivalent to geodesics:

Note

STDistance() returns the shortest LineString between two geography types. This is a close approximate to the geodesic distance. The deviation of STDistance() on common earth models from the exact geodesic distance is no more than .25%. This avoids confusion over the subtle differences between length and distance in geodesic types.

That last enigmatic sentence refers to the fact that because SQL Server's geographic features' edges are being treated with the same mapping scheme I explained earlier, developers decided to apply the same algorithm to the STDistance function so that it always yielded the same value as an edge's length value. I don't know why they use such an algorithm, and don't treat their edges as geodesics, but this related answer, by Charles Karney, author of several reknown papers about geodesics, suggests that it might be due to a lack of knowledge on the subject.

For your example, according to Charles Karney's GeographicLib geodesic distance calculator, the distance should be 373073.02837349, and matches GeoPy and PostGIS's values. Tools using Vincenty's formula also yield a value close to this (373073.028 rounded to the nearest millimeter). Since it is a relatively short distance, the difference from STDistance is rather small (24 mm), but for longer distances, and depending on where you are on the ellipsoid, the difference can be much larger. For distances around 10,000 km, the difference can be several meters, and for near-antipodal points, several kilometers.

Take this example below, computed with Charles Karney's algorithm on GeographicLib, versus SQL Server's result (I did not use Vincenty's here because it is ill-suited for near antipodal points):

Start point : 0°N, 90°E
End point :   0°N, 90.1 W

SQL Server result :

DECLARE @g geography;  
DECLARE @h geography;  
SET @g = geography::STGeomFromText('POINT(90 0)', 4326);  
SET @h = geography::STGeomFromText('POINT(-90.1 0)', 4326);  
SELECT @g.STDistance(@h);

------------
20026376.2765101

GeographicLib result : 20003008.421509

Notice the 23.3 km difference! SQL Server's computing scheme takes the edge along the Equator, but the true shortest geodesic actually passes in a polar region since it is shorter this way.

So in short, do not use SQL Server's STDistance as a reference for geodesic distances, since it is not what this algorithm calculates.


Without having access to the source code of SQL Server it may be impossible to say. As a comparison, PostGIS ST_Distance query from the geography example of https://postgis.net/docs/ST_Distance.html modified to use your test points gives

SELECT ST_Distance(gg1, gg2) As spheroid_dist, ST_Distance(gg1, gg2, false) As sphere_dist
FROM (SELECT
    'SRID=4326;POINT(-119.02138 35.22229)'::geography as gg1,
    'SRID=4326;POINT(-121.0777 38.148205)'::geography as gg2
    ) As foo  ;

373073.02837349 373433.04866826

PostGIS is open source and you can check from the source code how the result is computed. Because the results from PostGIS (with the spheroid alternative) and geopy are equal to the last digit they may be using the same code.