Batch geocode with multiple services?

I work at SmartyStreets and I've been working on a similar comparison for over two years now. As it turns out, even the almighty Google gets some addresses wrong. I've compared Google to ten other geocoding services.
Here's the comparison:

https://docs.google.com/spreadsheet/ccc?key=0AidEWya_p6XFdGw1RmZ6TjB1ajZxVk81d2pISDMzVUE&usp=sharing

Now, certainly it would be nice to paint my company to look better than the others, but I opted for brutal honesty instead of spin.

The results (a small set, but it was enough for me) show that overall, the Google geocoding service was significantly better in almost every instance. (In the spreadsheet, green fields mark the best result for an address and red fields mark the worst.) Google's results were off by a mere 28 ft on average.
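
For anyone reproducing this kind of comparison, one way to put a number on the error is the great-circle distance between each geocoded point and a trusted reference point, averaged per service. The following is only a minimal sketch under that assumption; it is not necessarily how the spreadsheet figures were computed, and the names `haversine_feet` and `mean_error_feet` are made up for illustration.

```python
import math

# Mean Earth radius (6,371,000 m) converted to feet.
EARTH_RADIUS_FT = 6371000 / 0.3048


def haversine_feet(lat1, lng1, lat2, lng2):
    """Great-circle distance in feet between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


def mean_error_feet(reference, results):
    """Average distance in feet between paired reference and geocoded points.

    Both arguments are equal-length lists of (lat, lng) tuples.
    """
    distances = [haversine_feet(r[0], r[1], g[0], g[1])
                 for r, g in zip(reference, results)]
    return sum(distances) / len(distances)
```

Calling `mean_error_feet(reference_points, service_points)` once per service gives one average per service that can be compared directly.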

(Update: among the 10 addresses I checked, Yahoo actually had the closest average geocoding. Wow!)

Keep in mind that I have also tested addresses where some of the other services were actually closer than Google Maps. These "other" services rely on US Census Bureau TIGER data, which is generally only updated every 10 years with the census, while the Google Maps data is updated constantly.

I would recommend picking either Google or Bing, based on your preference, and sticking with it. I'm happy to give you more details if you need them.


You might take a look at geopy, a Python library for several of the popular geocoding services. You can get some results to see the differences and maybe go from there:

>>> from geopy import geocoders

>>> p = "1600 Pennsylvania Ave, Washington DC"
>>> g = geocoders.GoogleV3()
>>> us = geocoders.GeocoderDotUS()

>>> place, (lat, lng) = g.geocode(p)
>>> print("%s: %.5f, %.5f" % (place, lat, lng))
1600 Pennsylvania Ave Northwest, President's Park, Washington, DC 20500, USA: 38.89710, -77.03654

>>> place, (lat, lng) = us.geocode(p)
>>> print("%s: %.5f, %.5f" % (place, lat, lng))
1600 Pennsylvania Ave NW, Washington, DC 20502: 38.89875, -77.03768
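
To run a whole list of addresses through several services, the same idea can be wrapped in a loop. This is a rough sketch rather than a finished tool: `batch_geocode` and `from_geopy` are hypothetical helper names, and `from_geopy` assumes a geopy version whose `geocode()` returns an object with `latitude` and `longitude` attributes (or `None` when nothing matches), which differs from the tuple unpacking used above.

```python
def from_geopy(geocoder):
    """Adapt a geopy-style geocoder object to a plain address -> (lat, lng) callable.

    Assumes geocoder.geocode(address) returns an object exposing
    .latitude and .longitude, or None when the address is not found.
    """
    def lookup(address):
        location = geocoder.geocode(address)
        if location is None:
            return None
        return (location.latitude, location.longitude)
    return lookup


def batch_geocode(addresses, services):
    """Run every address through every service.

    `services` maps a label to a callable that takes an address string and
    returns (lat, lng) or None. Returns {address: {label: (lat, lng) or None}}.
    """
    results = {}
    for address in addresses:
        row = {}
        for name, geocode in services.items():
            try:
                row[name] = geocode(address)
            except Exception:
                # One failing service should not abort the whole batch.
                row[name] = None
        results[address] = row
    return results
```

With real geocoder objects this would be called as something like `batch_geocode(addresses, {"google": from_geopy(g), "other": from_geopy(us)})`. Most services enforce rate limits and usage terms, so a real batch job would also need throttling between requests.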