Lower bound for finding second largest element

The (tight) lower bound is $n + \lceil \lg n \rceil - 2$ (where $\lg n$ means $\log_2 n$).

I'll prove tightness first: that this can be achieved (apparently the idea is due to Lewis Carroll!). First find the maximum using a "tennis tournament" structure: first compare the $n$ elements in pairs, then compare the $n/2$ "winners" in pairs, and so on. (Unpaired elements get a bye to the next round.) Since every element except the winner loses exactly once, this takes $n-1$ comparisons. But now note that the second largest element must be one which lost to the winner, as it couldn't have been defeated by any other element. So you need to find the maximum among all the (up to) $\lceil \lg n \rceil$ elements that were defeated by the winner, and finding this maximum can be done in $\lceil \lg n \rceil - 1$.

We can prove this is a lower bound as well. Let the number of elements that lost to the maximum be $m$.

  • Firstly, you need to find the maximum, since one cannot be sure some element is the second maximum without knowing which element is the maximum. Further, for each element that lost to the maximum ($m$ of them), this comparison was useless in determining whether it was or not the second maximum. In other words, all elements other than the maximum must lose at least once, and all but one ($m-1$) of the elements that directly lost to the maximum must lose at least once more, no matter in which order the comparisons were made. So we need at least $n-1 + m-1 = n + m - 2$ comparisons.
  • To state the same thing differently: $n-2$ of the elements must be found less than the second-largest element — comparisons with the largest element do not help here — plus $m$ of them must lose directly to the maximum by definition, so we need at least $n - 2 + m = n + m - 2$ comparisons.

This proves the $n + \lceil \lg n\rceil - 2$ lower bound if we show that $m \ge \lceil \lg n \rceil$, i.e., an adversary can make sure you always have at least $\lceil \lg n\rceil$ elements that lost to the winner. This is proved here or (with a bug pointed out by @noedne below) here; an argument goes as follows: the adversary keeps for each element a “weight”, which is either $0$ if the element has ever lost a comparison (and is therefore known not to be the maximum element) or else the number of elements known to be less than or equal to it. Whenever a query's result is already “known”, the adversary gives that answer, else it declares the one which has a larger weight the winner (breaking ties arbitrarily). The maximum is determined when some weight grows to $n$, and with each comparison any weight can only double, so the number of comparisons needs to be at least $\lceil \lg n\rceil$.


The definitive answer for any $k$ (not just the second largest) is discussed in these notes. In short, if $V(n,k)$ is the number of comparisons needed to determine the kth largest element in a set of size $n$, then $$ V(n,k) \ge n + R(n,k) - 2\sqrt{R(n,k)} $$ where $$ R(n,k) = \log \binom{n}{k} - \log (n-k+1) + 3$$


An observation: it will always require at least n-1. Draw a graph. If we test two nodes, then we draw an arrow from the smaller to the larger. We know a is greater than b if and only if there is a path from a to b along the arrows. We will always need at least n-1 arrows to connect the graph and hence n-1 comparisons to find a lower bound. It is easy enough to see that this isn't the minimal upper bound though - we will only know the second highest element if our arrows form a line (ie. we can go from a "start" node to an "end" node by passing through each node and each arrow).

Another fact - there are algorithms for finding the kth largest number in guaranteed linear time. That could give you some information, but there could always be a special algorithm for finding the second largest element.

I don't have a solution for the problem yet. But one thing we can try is breaking the problem into two groups and combining them. Suppose we have two groups, A and B, where we know both the largest and second largest elements, but no knowledge of any relationships between the groups. Then it takes two arrows to find the second highest element. First we compare the highest elements in A and B. Without loss of generality assume A has the highest element. Then we would compare B with the highest element in A. Showing that the groups can't be combined with a single comparison is trivial. This gives us a recursion relation for the time taken to solve by combining groups which have been solved separately: $f(n)=min f(k)+f(n-k)+2$ where $ 0 < k < n $.