Get the second largest number in a list in linear time

You could use the heapq module:

>>> el = [20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7]
>>> import heapq
>>> heapq.nlargest(2, el)
[90.8, 74]

And go from there...

You could always use sorted

>>> sorted(numbers)[-2]
74

Since @OscarLopez and I have different opinions on what the second largest means, I'll post the code according to my interpretation and in line with the first algorithm provided by the questioner.

def second_largest(numbers):
    count = 0
    m1 = m2 = float('-inf')
    for x in numbers:
        count += 1
        if x > m2:
            if x >= m1:
                m1, m2 = x, m1            
            else:
                m2 = x
    return m2 if count >= 2 else None

(Note: Negative infinity is used here instead of None since None has different sorting behavior in Python 2 and 3 – see Python - Find second smallest number; a check for the number of elements in numbers makes sure that negative infinity won't be returned when the actual answer is undefined.)

If the maximum occurs multiple times, it may be the second largest as well. Another thing about this approach is that it works correctly if there are less than two elements; then there is no second largest.

Running the same tests:

second_largest([20,67,3,2.6,7,74,2.8,90.8,52.8,4,3,2,5,7])
=> 74
second_largest([1,1,1,1,1,2])
=> 1
second_largest([2,2,2,2,2,1])
=> 2
second_largest([10,7,10])
=> 10
second_largest([1,1,1,1,1,1])
=> 1
second_largest([1])
=> None
second_largest([])
=> None

Update

I restructured the conditionals to drastically improve performance; almost by a 100% in my testing on random numbers. The reason for this is that in the original version, the elif was always evaluated in the likely event that the next number is not the largest in the list. In other words, for practically every number in the list, two comparisons were made, whereas one comparison mostly suffices – if the number is not larger than the second largest, it's not larger than the largest either.

Try the solution below, it's O(n) and it will store and return the second greatest number in the second variable. UPDATE: I've adjusted the code to work with Python 3, because now arithmetic comparisons against None are invalid.

Notice that if all elements in numbers are equal, or if numbers is empty or if it contains a single element, the variable second will end up with a value of None - this is correct, as in those cases there isn't a "second greatest" element.

Beware: this finds the "second maximum" value, if there's more than one value that is "first maximum", they will all be treated as the same maximum - in my definition, in a list such as this: [10, 7, 10] the correct answer is 7.

def second_largest(numbers):
    minimum = float('-inf')
    first, second = minimum, minimum
    for n in numbers:
        if n > first:
            first, second = n, first
        elif first > n > second:
            second = n
    return second if second != minimum else None

Here are some tests:

second_largest([20, 67, 3, 2.6, 7, 74, 2.8, 90.8, 52.8, 4, 3, 2, 5, 7])
=> 74
second_largest([1, 1, 1, 1, 1, 2])
=> 1
second_largest([2, 2, 2, 2, 2, 1])
=> 1
second_largest([10, 7, 10])
=> 7
second_largest( [1, 3, 10, 16])
=> 10
second_largest([1, 1, 1, 1, 1, 1])
=> None
second_largest([1])
=> None
second_largest([])
=> None

Get the second largest number in a list in linear time

Tags:

Python

Performance

Related

Recent Posts