Julia's most efficient way to choose longest array in array of arrays?

@Colin has provided a compact, convenient answer. However, if speed matters (op asked for most efficient way) this should be close to optimum

function findlongest(A)
    idx = 0
    len = 0
    @inbounds for i in 1:length(A)
        l = length(A[i])
        l > len && (idx = i; len=l)
    end
    return A[idx]
end

Note that this implementation would (presumably) be a really bad idea in Python :)

Quick benchmark:

julia> using BenchmarkTools

julia> A = [[1,2], [1,2,3,4,5,6], [1,2,3]]
3-element Array{Array{Int64,1},1}:
 [1, 2]            
 [1, 2, 3, 4, 5, 6]
 [1, 2, 3] 

julia> @btime findlongest(A);
  26.880 ns (0 allocations: 0 bytes)

julia> @btime A[indmax(length.(A))];
  9.813 μs (25 allocations: 1.14 KiB)

That's a ~365 times speedup for this example.

EDIT: Better benchmark (suggested in the comments)

julia> @btime findlongest($A);
  9.813 ns (0 allocations: 0 bytes)

julia> @btime $A[indmax(length.($A))];
  41.813 ns (1 allocation: 112 bytes)

The $ signs avoid setup allocations and times. Speedup ~4.

Quick explanation

  • for loops are fast in Julia, so why not use them
  • avoid allocation (length.(A) allocates a new array of integers)
  • a && b is shortcut for "if a then b"
  • @inbounds avoids bound checks for A[i]

UPDATE: For v1+ you'll need to replace indmax in this answer with argmax.

EDIT: Note, it is also worth checking out the other answer by @crstnbr

Consider the following example code:

julia> A = [[1,2], [1,2,3,4,5,6], [1,2,3]]
3-element Array{Array{Int64,1},1}:
 [1, 2]            
 [1, 2, 3, 4, 5, 6]
 [1, 2, 3]         

julia> length(A)
3

julia> length.(A)
3-element Array{Int64,1}:
 2
 6
 3

julia> indmax(length.(A))
2

julia> A[indmax(length.(A))]
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6

The first call to length gets the length of the outer vector in A, which is not what we want. In the second call, I use the broadcasting operator . so that I instead get the length of each of the inner vectors. In the indmax line, I'm finding the index of largest value in length.(A), ie the index of the longest inner vector. If you instead want to return the longest inner vector, you can just index into A using the result of the indmax line.

Tags:

Arrays

Max

Julia