How to remove duplicates from a set of machine-precision 2D points?

One possible solution is to use Split, which respects Internal`$SameQTolerance:

r = RandomReal[1, {10000000, 2}];
r = RandomChoice[r, 10000000];

Split[Sort[r]][[All, 1]]; // AbsoluteTiming

(* ==> {12.756712, Null} *)

This is about 3 times slower than Union:

Union[r]; // AbsoluteTiming

(* ==> {4.306363, Null} *)
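Timings aside, the two approaches may not treat nearly equal points the same way, so it can be worth comparing them on a tiny input first. A minimal sketch, where pts is a hypothetical test set containing two points that differ only in the last bit:

```mathematica
a = 1.;
b = a + $MachineEpsilon;  (* differs from a only in the last binary digit *)
pts = {{a, a}, {b, b}, {2., 2.}};

(* Split compares neighbors with SameQ, which is subject to Internal`$SameQTolerance *)
Split[Sort[pts]][[All, 1]]

(* Union on the same input, for comparison *)
Union[pts]
```

If the two results differ in length, the methods disagree about which points count as duplicates.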

How about a compiled solution? I think this solves the transitivity issues with Split.

deleteDuplicatesC = 
  Compile[{{v, _Real, 2}}, 
   (* v must be sorted so that duplicates are adjacent *)
   (* output[[i]] is 1 if row i is kept, 0 if it equals row i - 1 *)
   Block[{i, len = Length[v], output = Table[1, {i, len}]},
    Do[If[Compile`GetElement[v, i] == Compile`GetElement[v, i - 1], 
      output[[i]] = 0], {i, 2, len}];
    output], 
   RuntimeOptions -> {"Speed", "CompareWithTolerance" -> True}, 
   CompilationTarget -> "C"];

r = RandomReal[1, {10000000, 2}];
r = RandomChoice[r, 10000000];

selected = 
   With[{sorted = Sort[r]}, 
    Pick[sorted, deleteDuplicatesC[sorted], 1]]; // AbsoluteTiming

(* {4.033231, Null} *)

Looks like it works:

a = 1.;
b = a + $MachineEpsilon;
test = {{a, a}, {b, b}, {b, a}, {a, b}};
Pick[test, deleteDuplicatesC[test], 1]

(* {{1., 1.}} *)
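The transitivity issue mentioned above comes from the fact that equality with a tolerance is not an equivalence relation: a chain of points can be pairwise "equal" to their neighbors while the endpoints are not. A small sketch to probe this, assuming the default Internal`$EqualTolerance; the names eps and chain are only illustrative:

```mathematica
eps = $MachineEpsilon;

(* five values, each 100 units in the last place above the previous one *)
chain = 1. + 100 eps Range[0, 4];

(* compare each value with its immediate neighbor *)
Equal @@@ Partition[chain, 2, 1]

(* compare the two ends of the chain, which are 400 units apart *)
First[chain] == Last[chain]
```

With a tolerance of seven binary digits one would expect the neighbor comparisons to succeed and the endpoint comparison to fail, but it is worth running this rather than taking it on trust. Any method that compares only adjacent elements will merge such a chain into a single point.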

Here's another compiled implementation. It's only very slightly faster than s0rce's code, because this version makes only one array access (rather than two) per loop iteration. It also uses an Internal`Bag, which may reduce memory consumption when there are many duplicates.

It must be said that this is still not as fast as Union, but at least it respects Internal`$EqualTolerance while being faster than Split. The value of Internal`$EqualTolerance is hard-coded into the bytecode at compile time, so the function must be recompiled if a different tolerance is required.

compiledUnion = Compile[{{r, _Real, 2}},
   Block[{
     sorted = Sort[r], output,
     seen
    },
    (* start the bag with the first point; seen holds the last point kept *)
    output = Internal`Bag[seen = First[sorted], 1];
    Do[
     (* keep a point only if it differs from the last point kept *)
     If[i != seen, Internal`StuffBag[output, seen = i, 1]],
     {i, sorted}
    ];
    (* the bag is flat, so restore the row structure *)
    Partition[Internal`BagPart[output, All], Length[seen]]
   ],
   RuntimeOptions -> {"Speed", "CompareWithTolerance" -> True},
   CompilationTarget -> "C"
  ];

Test case, with the default value of Internal`$EqualTolerance == 7 Log[2]/Log[10]:

compiledUnion[{
  {1., 1., 1.}, {1. + $MachineEpsilon, 1., 1.}, {1., 1. + $MachineEpsilon, 1.},
  {2., 1., 1.}
 }]
(* -> {{1., 1., 1.}, {2., 1., 1.}} *)
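Because the tolerance is fixed when the function is compiled, one way to get a different tolerance is to run Compile inside a Block that temporarily sets Internal`$EqualTolerance. The following is only a sketch built on that observation; unequalWithTolerance and loose are hypothetical names, and a scalar comparison stands in for the full function to keep it short:

```mathematica
(* hypothetical helper: compile a tolerant comparison while tol is in effect *)
unequalWithTolerance[tol_] :=
  Block[{Internal`$EqualTolerance = tol},
   Compile[{{x, _Real}, {y, _Real}},
    x != y,
    RuntimeOptions -> {"CompareWithTolerance" -> True}]];

(* tolerance given in decimal digits; here roughly 20 binary digits *)
loose = unequalWithTolerance[Log10[2.^20]];

(* compare two values 1000 units in the last place apart *)
loose[1., 1. + 1000 $MachineEpsilon]
```

The same wrapping could be applied to the definition of compiledUnion. Checking the result against a comparison compiled with the default tolerance would confirm whether the setting was picked up.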

Performance comparison, in ascending order of run-time:

r = RandomReal[1, {10000000, 2}];
r = RandomChoice[r, 10000000];

DeleteDuplicates[r]; (* 1.813 seconds; incorrect result (no tolerance) *)
DeleteDuplicates[r, Equal]; (* same timing and result; Equal is special-cased incorrectly *)
Sort[r]; (* 4.297 seconds; base case for Union-like approaches *)
Union[r]; (* 4.609 seconds; incorrect result (no tolerance) *)

compiledUnion[r]; (* 5.875 seconds *)
With[{sorted = Sort[r]}, Pick[sorted, deleteDuplicatesC[sorted], 1]]; (* 6.109 seconds *)

Split[Sort[r]][[All, 1]]; (* 11.953 seconds; unpacks + additional memory overhead *)
DeleteDuplicates[r, Equal[##] &]; (* 294.8 days; unpacks *)