Bad Float Counts? (Length[xs] != Total@Counts[xs])

Update:

Counts[data] seems to be equivalent to AssociationThread @@ Transpose@Tally[data]. The OP's problem arises because in constructing an Association, keys are checked for uniqueness and later entries with a duplicate key replace earlier entries. (Simple example: Association[{1. -> 1, 1. -> 2}].) Uniqueness is determined by MatchQ, I believe, which has problems discussed below in the original answer. The problem with SameQ not strictly being an equivalence relation due nontransitivity is still an issue. This update principally clarifies the role of forming an association: It discards entries of the Tally with duplicate keys, which results in an undercount.

Original answer:

Working with floating-point numbers is tricky. I'd say the most important, common issue is that rounding errors lead to different but close numbers that users wish would be treated the same. Introductory programming courses teach that comparing floats should be done with something like Abs[x - y] < $MyTolerance. In Mathematica similar (but relative) tolerances are built into SameQ and Equal, which are controlled by the the internal system parameters Internal`$SameQTolerance and Internal`$EqualTolerance respectively (see also this; this question has similar issues as the OP). Perhaps less well known is that MatchQ has a small tolerance like SameQ but is slightly more restrictive. The most important difference is that MatchQ is transitive but SameQ is not.

These functions play various roles in pattern-matching and comparing numbers, and their issues affect functions like Counts[] when applied to floating-point data. When constructing classes from data, as in Counts[], some reflection should lead one to think that using a comparison that is an equivalence relation, and therefore transitive, would be desirable. And if not transitive, it should be at least "locally transitive" on the actual data being used. By "locally transitive," I mean that the relation is transitive when restricted to the data set, even if it is not transitive on all floating-point numbers.

This seems to be the problem with the OP's example: SameQ is not transitive on xs. I say "seems" because I cannot check the internal workings of Counts[]. It's possible that SameQ is used to construct the keys and MatchQ to tally the counts. It seems to me that Counts[] uses SameQ and CountsBy[] uses MatchQ to construct the keys, however the counting is done. Since SameQ is not transitive, the order that the data is processed can make a difference. Nevertheless, the problem can be fixed by making SameQ locally transitive on xs.

The reason this comes up in this example is that in constructing the data xs, the rounding drift amounts to 2 bits (2 ulp), which is one too many for SameQ to be locally transitive.

Here is the best fix (that fixes SameQ -- Chip Hurst points out that the tolerance can be as small as 0.55, which is close to the value of Log10[4.] = 0.60.. that would be predicted by the observed rounding error):

Block[{Internal`$SameQTolerance = Internal`$EqualTolerance}, 
 Total@Counts[xs]]
(*  100  *)

Another fix is to use MatchQ via CountsBy[]:

Total@CountsBy[xs, # &]
(*  100  *)

The CountsBy[] association has several keys for equal numbers, but it does have the correct total. The first solution seems better because it has one key for each cluster of equal numbers. (It likely that in some applications one would like distinct floating-point numbers to have distinct entries; then something like CountsBy[xs, ToString@*FullForm] would do the trick.)

Appendix

Some pictures showing the properties of SameQ and MatchQ on consecutive machine-precision floats:

Block[{x1},
 x1 = Table[1 + n*$MachineEpsilon, {n, 0, 5}];
 {Outer[Boole@*SameQ, x1, x1] // MatrixPlot[#, PlotLabel -> SameQ] &,
   Outer[Boole@*MatchQ, x1, x1] // MatrixPlot[#, PlotLabel -> MatchQ] &} //
     GraphicsRow
 ]

enter image description here


Total@Counts@Chop@xs  

100

CountsBy[xs, N]

enter image description here

CountsBy[xs, N] // Total

100

Maybe the precision caused the problem during counting.

keys = Keys@Counts[xs];
res = xs /. Thread[keys -> ConstantArray[0, Length@keys]]//FullForm

enter image description here