Max Distance between 2 points in a data set and identifying the points

All the other answers here take O(N^2) time and space. That's terrible.

Instead, recognize that the two farthest points in a dataset lie on the set's convex hull. Since hulls can be computed in O(N log N) time in 3D this forms an efficient prefilter. In my testing on a dataset with 16,000,000 points the convex hull contained only 420 points.

The farthest points on the hull can be found in O(H log H) time, though this is harder to implement in Python. So we can instead at that point fall back to the O(N^2) cdist solutions.

import numpy as np
from scipy.spatial import ConvexHull
from scipy.spatial.distance import cdist

N = 16000000

# Find a convex hull in O(N log N)
points = np.random.rand(N, 3)   # N random points in 3-D

# Returned 420 points in testing
hull = ConvexHull(points)

# Extract the points forming the hull
hullpoints = points[hull.vertices,:]

# Naive way of finding the best pair in O(H^2) time if H is number of points on
# hull
hdist = cdist(hullpoints, hullpoints, metric='euclidean')

# Get the farthest apart points
bestpair = np.unravel_index(hdist.argmax(), hdist.shape)

#Print them
print([hullpoints[bestpair[0]],hullpoints[bestpair[1]]])

This will give you pairs of indexes of the points in A that are furthest apart as temp_b. Note that it will include both directions such as (8, 222) and (222, 8) in the list. I will leave it to you remove them if you want.

import numpy as np
import random as rd
from scipy.spatial.distance import pdist
from scipy.spatial.distance import squareform

A = np.array([np.array([rd.randint(-5,5) for x in range(3)]) for y in range(500)])
D=pdist(A)
D=squareform(D)
temp = np.where(D == D.max())
temp_b = zip(temp[0],temp[1])

If you have scipy, you have exact equivalent for most of matlab core functions :

from numpy import random, nanmax, argmax, unravel_index
from scipy.spatial.distance import pdist, squareform

A = random.randint(-5,5, (500,3))
D = pdist(A)
D = squareform(D);
N, [I_row, I_col] = nanmax(D), unravel_index( argmax(D), D.shape )

You can also get it in pure python using itertools :

from itertools import combinations
from random import randint

A = [[randint(-5,5) for coord in range(3)] for point in range(500)]

def square_distance(x,y): return sum([(xi-yi)**2 for xi, yi in zip(x,y)])    

max_square_distance = 0
for pair in combinations(A,2):
    if square_distance(*pair) > max_square_distance:
        max_square_distance = square_distance(*pair)
        max_pair = pair