Julia version of R's Match?

It sounds like you're looking for indexin (as search fodder: Matlab calls this ismember). It differs very slightly from R's match: it returns a vector whose i-th element is the last index at which v1[i] appears in v2, whereas match returns the first.

julia> v1 = [8,6,7,11]; v2 = -10:10;
       idxs = indexin(v1, v2)
4-element Array{Int64,1}:
 19
 17
 18
  0

It returns zero for the index of an element in v1 that does not appear in v2. So you can "reconstruct" the parts of v1 that are in v2 simply by indexing by the nonzero indices:

julia> v2[idxs[idxs .> 0]]
3-element Array{Int64,1}:
 8
 6
 7

If you look at the implementation, you'll see that it uses a dictionary to store and look up the indices. This means it makes only one pass over each of v1 and v2, rather than searching through v2 for every element of v1, so the work grows roughly with length(v1) + length(v2) instead of length(v1) * length(v2). It should be much more efficient in almost all cases.

If it's important to match R's behavior and return the first index, we can crib off the base implementation and just build the dictionary backwards so the lower indices overwrite the higher ones:

function firstindexin(a::AbstractArray, b::AbstractArray)
    bdict = Dict{eltype(b), Int}()
    for i=length(b):-1:1
        bdict[b[i]] = i
    end
    [get(bdict, i, 0) for i in a]
end

julia> firstindexin([1,2,3,4], [1,1,2,2,3,3])
4-element Array{Int64,1}:
 1
 3
 5
 0

julia> indexin([1,2,3,4], [1,1,2,2,3,3])
4-element Array{Int64,1}:
 2
 4
 6
 0
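The outputs above come from an older Julia release. On Julia 1.0 and later, indexin itself returns the first index and uses nothing (rather than 0) for elements that are not found, so the helper may not be needed there. A minimal sketch, assuming Julia 1.x; the variable names rmatch and found are only illustrative:

```julia
# Assumes Julia 1.x, where indexin returns the first matching index
# and `nothing` for elements of v1 that do not occur in v2.
v1 = [1, 2, 3, 4]
v2 = [1, 1, 2, 2, 3, 3]

idxs = indexin(v1, v2)              # [1, 3, 5, nothing]

# Replace `nothing` with 0 to get the same shape of result as firstindexin.
rmatch = something.(idxs, 0)        # [1, 3, 5, 0]

# Recover the elements of v1 that occur in v2 by dropping the misses first.
found = v2[[i for i in idxs if i !== nothing]]   # [1, 2, 3]
```

Using 0 as the "no match" marker is a choice made here to mirror the older output; keeping nothing is arguably safer, since it cannot be mistaken for a valid index.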

I don't think this exists out of the box, but as @Khashaa's comment (and Tim Holy's answer to the other question) points out, you should be able to come up with your own definition fairly quickly. A first attempt:

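function matched(v1::Array, v2::Array)
  # Integer result vector; a bare zeros(n) would produce Float64 indices
  result = zeros(Int, length(v1))
  for i = 1:length(v1)
    # findfirst(A, v) is the pre-1.0 form: first index of v in A, or 0 if absent
    result[i] = findfirst(v2, v1[i])
  end
  return result
end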

(Note that I called the function matched because match is already defined in Base for string matching; if you wanted to extend it, you would have to import Base.match first.) If you care about performance, you could certainly make this faster by applying some of the tricks from the performance section of the Julia docs.
If I understand correctly, this function should do what you're looking for. Try it with, e.g.:

v1 = [rand(1:10) for i = 1:100]
v2 = [rand(1:10) for i = 1:100]
matched(v1, v2)
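
The findfirst(v2, x) call form used above was dropped in Julia 1.0, where findfirst takes a predicate and returns nothing when there is no match. A rough equivalent for Julia 1.x might look like the following; the name matched_v1 is hypothetical, and returning 0 for misses is an assumption made to keep the original behaviour:

```julia
# Sketch for Julia 1.x; `matched_v1` is a hypothetical name, not a Base function.
function matched_v1(v1::AbstractVector, v2::AbstractVector)
    out = zeros(Int, length(v1))
    for (i, x) in enumerate(v1)
        # findfirst with a predicate returns `nothing` if x is not in v2
        idx = findfirst(isequal(x), v2)
        out[i] = idx === nothing ? 0 : idx
    end
    return out
end

matched_v1([8, 6, 7, 11], collect(-10:10))   # returns [19, 17, 18, 0]
```

This still scans v2 once per element of v1, so for long vectors a dictionary-based lookup is likely the better choice.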
