What algorithm should I use for wifi geolocation?

These papers may be a useful starting point:

  • RADAR: an in-building RF-based user location and tracking system
  • WLAN location determination via clustering and probability distributions

It sounds like you don't know the locations of the signal sources very well, so you first need to estimate them and then, given those estimates, triangulate your position.

If you want some accuracy and realism, consider adopting a likelihood model for the signal strengths, finding the maximum likelihood, and making a gridded map of the location probability computed from the maximum likelihood estimates. The global maximum on the grid identifies the best estimate of the location and the contours (relative to the maximum) give confidence sets for that location.
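As a rough illustration of such a gridded map, here is a minimal Python sketch. It assumes the source positions and strengths have already been estimated, and it assumes inverse-square attenuation with independent normal errors, which is only one possible likelihood model. The function names, the sample sources, the grid spacing, and the cutoff of 2 log-likelihood units are all made up for the example.

```python
import math

def expected_strength(a, source, y, t_min=0.5):
    """Assumed inverse-square attenuation, clamped below the distance t_min."""
    t = max(math.dist(source, y), t_min)
    return a / t ** 2

def log_likelihood_map(readings, sources, s, xs, ys):
    """Return {(x, y): log likelihood} over the grid, additive constants dropped.

    readings[i] is the observed strength of sources[i] = (a, (sx, sy)).
    """
    grid = {}
    for gx in xs:
        for gy in ys:
            ll = 0.0
            for z, (a, pos) in zip(readings, sources):
                r = z - expected_strength(a, pos, (gx, gy))
                ll -= r * r / (2 * s * s)
            grid[(gx, gy)] = ll
    return grid

# Hypothetical, already-estimated sources: (strength a, position).
sources = [(100.0, (0.0, 0.0)), (100.0, (10.0, 0.0)), (100.0, (0.0, 10.0))]
true_pos = (4.0, 3.0)
# Noise-free synthetic readings, used only to check the sketch.
readings = [expected_strength(a, p, true_pos) for a, p in sources]

xs = ys = [i * 0.5 for i in range(21)]  # grid from 0.0 to 10.0 in steps of 0.5
grid = log_likelihood_map(readings, sources, s=0.5, xs=xs, ys=ys)

best = max(grid, key=grid.get)  # global maximum on the grid
# Cells within 2 log-likelihood units of the maximum: an uncalibrated
# stand-in for a contour relative to the maximum.
region = [p for p, ll in grid.items() if ll >= grid[best] - 2.0]
```

With real, noisy readings the region would be larger, and the cutoff would have to be chosen to match the coverage you want.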

A general likelihood model is obtained by positing a formula for the signal attenuation and allowing for error. You won't get very far with a completely general formula (with an angle- and location-dependent attenuation function), so you'll have to simplify. For instance, you might consider a "universal" attenuation function, call it f, so that if the source strength at a WiFi location x equals a then the expected strength at another location y is given by

E[z(y; x)] = a f(|y - x|),

where z(y; x) denotes the strength reading at y for the source at x.

For example, you might consider inverse-square attenuation for which f(t) = 1/t^2 provided the distance t is greater than some small threshold. As another simplification, you might take the strength reading z(y;x) at location y for the source at x to differ from the expected value by a normally-distributed error; assume all errors are independent; and assume they all have the same standard deviation (s). The contribution to the log likelihood of a strength reading z then becomes

L(y,x) = -[(z(y;x) - a f(|y-x|))^2 / s^2 + ln(s^2)]/2.
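For concreteness, here is a small Python sketch of one such term and of the sum over readings. It uses the normal log density up to an additive constant, that is, -r^2/(2 s^2) - ln(s) for a residual r. The inverse-square f with threshold t_min and all function names are assumptions for illustration.

```python
import math

def f(t, t_min=0.5):
    """Assumed inverse-square attenuation, clamped below the distance t_min."""
    return 1.0 / max(t, t_min) ** 2

def log_lik_term(z, a, x, y, s):
    """Log likelihood of one reading z at location y from a source at x
    with strength a, dropping the constant -ln(2*pi)/2."""
    resid = z - a * f(math.dist(x, y))
    return -(resid ** 2 / s ** 2 + math.log(s ** 2)) / 2.0

def total_log_lik(observations, s):
    """Sum the terms over (z, a, x, y) tuples, one per reading/source pair."""
    return sum(log_lik_term(z, a, x, y, s) for z, a, x, y in observations)

# A reading of 5.0 at the origin from a source of strength 100 at (3, 4):
# the expected strength is 100 / 5^2 = 4, so the residual is 1.
term = log_lik_term(5.0, 100.0, (3.0, 4.0), (0.0, 0.0), s=2.0)
```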

The log likelihood to be maximized is the double sum of L(y,x) over all locations y and all sources x. It is a function of the unknown locations, the unknown source intensities, and the unknown standard deviation of the errors. It's straightforward to find the optimal standard deviation and optimal source intensities (take partial derivatives, set those to zero, and solve), but for realistic attenuation functions f you have a non-linear problem for finding the locations. However, in your example it involves only 13 parameters so you should be able to dump it into, say, a multivariate Newton-Raphson optimizer and quickly get a good answer. (The statistics literature is full of methods to solve these kinds of equations.)
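To make the "set the partial derivatives to zero" step concrete: for a single source at a candidate position x, with readings z_i at known locations y_i and f_i = f(|y_i - x|), the optimal intensity is a = sum(z_i f_i) / sum(f_i^2) and the optimal variance is s^2 = (residual sum of squares) / n. Substituting these back, maximizing the likelihood over x amounts to minimizing the residual sum of squares. The sketch below does this for one source and, to stay short, swaps the Newton-Raphson optimizer for a coarse search over candidate positions. The function names, the synthetic data, and the inverse-square f are assumptions.

```python
import math

def f(t, t_min=0.5):
    """Assumed inverse-square attenuation, clamped below the distance t_min."""
    return 1.0 / max(t, t_min) ** 2

def profile_fit(x, obs):
    """For a candidate source position x, return (a_hat, rss).

    obs is a list of (y, z): known measurement location and reading there.
    """
    fs = [f(math.dist(x, y)) for y, _ in obs]
    zs = [z for _, z in obs]
    a_hat = sum(z * fi for z, fi in zip(zs, fs)) / sum(fi * fi for fi in fs)
    rss = sum((z - a_hat * fi) ** 2 for z, fi in zip(zs, fs))
    return a_hat, rss

def fit_source(obs, candidates):
    """Pick the candidate position with the smallest residual sum of squares."""
    best = min(candidates, key=lambda x: profile_fit(x, obs)[1])
    a_hat, rss = profile_fit(best, obs)
    s_hat = math.sqrt(rss / len(obs))
    return best, a_hat, s_hat

# Noise-free synthetic readings from a hypothetical source at (3, 4), a = 100.
true_x, true_a = (3.0, 4.0), 100.0
locations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 2.0)]
obs = [(y, true_a * f(math.dist(true_x, y))) for y in locations]

candidates = [(float(i), float(j)) for i in range(11) for j in range(11)]
best, a_hat, s_hat = fit_source(obs, candidates)
```

In practice the best candidate from a search like this could serve as the starting point for a proper nonlinear optimizer, and with several sources the shared s would be computed from the residuals of all sources pooled together.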

If you additionally assume the second device has proportionally greater sensitivity than the data-collection device, it will make little difference in the model I have proposed (because the signal strengths enter multiplicatively). In fact, if you let the errors scale with intensity (so they have standard deviation a s rather than s) the difference between devices should be inconsequential.

In order to keep this simple I have skipped over some statistical niceties, such as the fact that this is a multivariate prediction interval problem, not a confidence interval problem. If the amount of error is not great (i.e., s is small), the difference will not be of much consequence.