Why do we write lo+(hi-lo)/2 in binary search?

Suppose you are searching a 4000000000-element array and using a 32-bit unsigned int as the index type.

Suppose also that the first step showed that the searched element, if present, must be in the top half. Now lo is 2000000000 and hi is 4000000000.

hi + lo overflows: the intended value 6000000000 does not fit in 32 bits, so the sum wraps around and actually produces 6000000000-2³² = 1705032704. As a result, (hi + lo) / 2 is 852516352, a small value. It is not even between lo and hi!

From then on the search will be wrong (it will probably conclude that the element is absent even though it is there).
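The arithmetic above can be checked with a small sketch. The function name naive_mid is hypothetical, and std::uint32_t stands in for the 32-bit unsigned index type from the example; the cast only makes the 32-bit wraparound explicit on any platform.

```cpp
#include <cstdint>

// Naive midpoint, shown only to illustrate the problem.
// The sum hi + lo is reduced modulo 2^32 before the division,
// so large operands give a result outside the range [lo, hi].
std::uint32_t naive_mid(std::uint32_t lo, std::uint32_t hi) {
    // The cast makes the truncation to 32 bits explicit.
    std::uint32_t sum = static_cast<std::uint32_t>(hi + lo);
    return sum / 2;
}
```

With lo = 2000000000 and hi = 4000000000, naive_mid returns 852516352, which is below lo.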

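By contrast, even with the extreme values in this example, lo + (hi - lo) / 2 always computes an index halfway between hi and lo, as intended by the algorithm. As long as lo is not greater than hi, hi - lo is at most hi, and the final result is at most hi, so no intermediate value overflows.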
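A minimal sketch of the safe form, under the assumption that lo <= hi holds on entry. The name safe_mid is hypothetical and std::uint32_t again stands in for the 32-bit unsigned index type.

```cpp
#include <cstdint>

// Overflow-free midpoint. Assumes lo <= hi.
// hi - lo cannot wrap under that assumption, and
// lo + (hi - lo) / 2 never exceeds hi.
std::uint32_t safe_mid(std::uint32_t lo, std::uint32_t hi) {
    return lo + (hi - lo) / 2;
}
```

With lo = 2000000000 and hi = 4000000000, safe_mid returns 3000000000, which lies halfway between the two.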

Mathematically speaking, they are equivalent.

In computer terms, mid=(hi+lo)/2 has fewer operations, but mid=lo+(hi-lo)/2 is preferred to avoid overflow.

Say the item you are searching for is near the end of the array. Then hi+lo is nearly 2*size. Since size can be almost as large as the maximum value of your index type, 2*size, and thus hi+lo, can overflow.
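For context, here is one way the safe midpoint might sit inside a complete binary search. This is a sketch, not a definitive implementation: the function name find_index, the half-open range [lo, hi), and the convention of returning a.size() for "not found" are all assumptions made for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Binary search over a vector sorted in ascending order.
// Returns the index of target, or a.size() if target is absent.
std::size_t find_index(const std::vector<int>& a, int target) {
    std::size_t lo = 0;
    std::size_t hi = a.size();  // search the half-open range [lo, hi)
    while (lo < hi) {
        // Safe midpoint: hi - lo cannot wrap because lo < hi here.
        std::size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < target) {
            lo = mid + 1;       // target can only be to the right of mid
        } else if (target < a[mid]) {
            hi = mid;           // target can only be to the left of mid
        } else {
            return mid;         // found
        }
    }
    return a.size();            // not found
}
```

For the sorted values 1, 3, 5, 7, 9, searching for 7 returns index 3, and searching for 4 returns 5 (the size), meaning the value is absent.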
