Why do NaN values make min and max sensitive to order?

Is min sensitive to input order?

Yes.

https://docs.python.org/3/library/functions.html#min

"If multiple items are minimal, the function returns the first one encountered."

The documentation does not specify exactly how "minimal" is defined when items don't have a consistent order, but min most likely loops over the elements and uses the < operator to check whether each new element is smaller than the smallest item found so far.

To confirm this hypothesis we can read the source code (search for builtin_min and min_max in https://github.com/python/cpython/blob/c96d00e88ead8f99bb6aa1357928ac4545d9287c/Python/bltinmodule.c ). It's slightly confusing because the implementations of min and max are combined and the variable names seem to assume a max function, but it's not too hard to follow.

It does indeed loop through the elements in order, performing each comparison with a call to PyObject_RichCompareBool with an "opid" of Py_LT, which is the C API equivalent of the Python < operator.
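As a rough illustration of that loop, here is a pure-Python sketch. The name sketch_min is made up for this example, and the code only mirrors the behaviour described above (keep the first item, replace it only when a later item is strictly "less than" it); it is not the actual C implementation and ignores the key and default arguments.

```python
def sketch_min(iterable):
    """Simplified analogue of the loop described above (hypothetical helper)."""
    it = iter(iterable)
    try:
        smallest = next(it)  # the first item starts out as the running minimum
    except StopIteration:
        raise ValueError("sketch_min() arg is an empty sequence") from None
    for item in it:
        # Replace the running minimum only when the new item is strictly smaller.
        if item < smallest:
            smallest = item
    return smallest
```

Because the test is a strict <, ties keep the earlier item, which matches the documented "returns the first one encountered" behaviour.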

Ordering comparisons between NaN and a number always return false, so in a list containing numbers and NaNs, a NaN in the first position will be returned as the minimum, since no number is "less than" it. On the other hand, a NaN that is not in the first position is effectively skipped, since it is not "less than" any number.
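A short demonstration of this order sensitivity using the built-in min and max (the variable names are just for illustration):

```python
import math

nan = float("nan")

nan_first = min([nan, 1.0, 2.0])   # no number is "less than" the leading NaN
nan_later = min([1.0, nan, 2.0])   # the NaN is never "less than" 1.0, so it is skipped

max_nan_first = max([nan, 1.0, 2.0])   # same effect for max: nothing is "greater than" NaN
max_nan_later = max([1.0, nan, 2.0])

print(math.isnan(nan_first))      # True
print(nan_later)                  # 1.0
print(math.isnan(max_nan_first))  # True
print(max_nan_later)              # 2.0
```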


Yes, nan breaks proper ordering, because every ordering or equality comparison involving it returns False. A lot of things with nan are inconsistent:

In [2]: 3.0 < float('nan')
Out[2]: False

In [3]: float('nan') < 3.0
Out[3]: False

In [4]: float('nan') == 3.0
Out[4]: False

min and max can only give you consistent results if you are working with a well-defined ordering, which numeric types do not have once nan is possible.
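One possible way to get back to a well-defined ordering is to drop the NaNs before calling min. This is only a sketch: min_ignoring_nan is a hypothetical helper, it assumes every item is a real number that math.isnan accepts, and it raises ValueError (from min) when nothing but NaNs remain.

```python
import math

def min_ignoring_nan(values):
    """Hypothetical helper: filter out NaNs, then take the ordinary min."""
    cleaned = [v for v in values if not math.isnan(v)]
    return min(cleaned)  # raises ValueError if cleaned is empty

nan = float("nan")
# The result no longer depends on where the NaN sits in the input.
print(min_ignoring_nan([nan, 3.0, 1.0]))  # 1.0
print(min_ignoring_nan([3.0, nan, 1.0]))  # 1.0
print(min_ignoring_nan([3.0, 1.0, nan]))  # 1.0
```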

Tags: Python, Numpy, Nan