Java algorithm to find the intersection between intervals

The intersection of two intervals [s1, s2] and [t1, t2] is empty if and only if:

    t2 < s1 or s2 < t1

So to check whether two intervals intersect, you only need to perform the test above.
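
As a minimal sketch of that test in Java (the class and method names here are my own, and I am assuming closed intervals with integer endpoints):

```java
final class IntervalOverlap {
    /** Returns true when the closed intervals [s1, s2] and [t1, t2] share at least one point. */
    static boolean intersects(int s1, int s2, int t1, int t2) {
        // The intersection is empty iff t2 < s1 or s2 < t1, so negate that condition.
        return !(t2 < s1 || s2 < t1);
    }
}
```

Intervals that only touch at an endpoint, such as [5, 10] and [10, 12], count as intersecting under this test.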

Also, once you know that s2 < t1, there is no point in continuing further down the list that t1 came from: the list is sorted, so every later interval starts even further to the right and can never intersect. That means you should move on to the next list.

Naive pseudo-algorithm:

   given [s1, s2]
   for each list [t1, t2, ... t(n)] in search_lists
        for each interval [t(x), t(x+1)] from [t1, t2, ... t(n)] (x goes from 1 to n-1)
           if t(x+1) < s1
              continue
           if s2 < t(x)
              break
           saveInterval()
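
A rough Java version of the inner loop, for one sorted list, might look like this. The names are hypothetical, the array is 0-based (so x runs from 0 to n-2), and I am assuming each result is stored as a two-element {start, end} array in place of saveInterval():

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveIntervalSearch {
    /**
     * Scans sorted boundary points t, where consecutive points form the
     * intervals [t[x], t[x+1]], and collects those that intersect [s1, s2].
     */
    static List<int[]> intersecting(int s1, int s2, int[] t) {
        List<int[]> result = new ArrayList<>();
        for (int x = 0; x + 1 < t.length; x++) {
            if (t[x + 1] < s1) {
                continue; // this interval ends before [s1, s2] starts
            }
            if (s2 < t[x]) {
                break;    // sorted input: every later interval starts even further right
            }
            result.add(new int[] {t[x], t[x + 1]});
        }
        return result;
    }
}
```

Call it once per list in search_lists to get the outer loop.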

This can be improved quite a bit by really using the fact that [t1, t2, ..., t(n)] is sorted.

First, note that [s1, s2] intersects [t(x), t(x+1)] iff t(x+1) >= s1 and s2 >= t(x).

However, because the list is sorted:

if t(x) >= s1 then for every h>0      `t(x+h) >= s1`

and also

if s2 >= t(x) then for every h>0  `s2 >= t(x-h)`

So if we find the smallest i such that t(i+1) >= s1, then every interval from [t(i), t(i+1)] onwards meets the first condition of intersection, i.e. [t(i+1), t(i+2)], [t(i+2), t(i+3)], ...

And if we find the largest j such that s2 >= t(j-1), then every interval from [t(j-1), t(j)] backwards meets the second condition, i.e. [t(j-2), t(j-1)], [t(j-3), t(j-2)], ...

The intervals from [t(i), t(i+1)] through [t(j-1), t(j)] meet both criteria, and no others do.

So the final algorithm is:

given [s1, s2]
for each list [t1, t2, ... t(n)] in search_lists
    find the smallest i such that t(i+1) >= s1
    find the biggest j (at most n) such that s2 >= t(j-1)

    if j > i then all the intervals between `{t(i)... t(j)}` intersect with [s1, s2]
    otherwise there is no intersection.

Since {t1, t2, t3, ..., t(n)} is sorted, we can use binary search to find the indices i and j efficiently.
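
Here is one way the binary-search version could be sketched in Java. This is only an illustration under my own assumptions: the class and helper names are made up, the array is 0-based, and instead of the i and j above it returns the inclusive range of interval indices x, where interval x is [t[x], t[x+1]]:

```java
final class SortedIntervalSearch {
    /**
     * For sorted boundary points t, returns {first, last}, the inclusive range
     * of interval indices that intersect [s1, s2], or an empty array if none do.
     */
    static int[] intersectingRange(int s1, int s2, int[] t) {
        // First interval whose right end t[x+1] is >= s1.
        int first = Math.max(lowerBound(t, s1) - 1, 0);
        // Last interval whose left end t[x] is <= s2, clamped to the last real interval.
        int last = Math.min(upperBound(t, s2) - 1, t.length - 2);
        return first <= last ? new int[] {first, last} : new int[0];
    }

    /** Index of the first element >= key, or a.length if there is none. */
    private static int lowerBound(int[] a, int key) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    /** Index of the first element > key, or a.length if there is none. */
    private static int upperBound(int[] a, int key) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] <= key) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

Each list then costs two binary searches, O(log n), plus the time to report the matching intervals, rather than a full scan.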

EDIT2:

The intersection of [s1,s2] and [t1, t2] is:
[max(s1, t1), min(s2,t2)]

The sizes are: L1 = s2-s1, L2 = t2-t1, L3 = min(s2,t2) - max(s1,t1)

The score you are looking for is L3 / min(L2, L1), a score between 0 and 1:

(min(s2,t2) - max(s1,t1)) / ( min(s2-s1, t2-t1) )

The cost of calculating this is 3 tests, 3 subtractions and one floating-point division. But I am assuming the intervals are valid and the intersection exists; otherwise more tests are needed (s2 > s1, t2 > t1 and min(s2,t2) > max(s1,t1)). The final test is the iff condition for intersection from the discussion above, with a strict inequality so that the overlap has nonzero length.
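
A small Java sketch of the score, including the extra validity tests. The names are hypothetical, and returning 0 when the intervals do not overlap is my own assumption about what you would want in that case:

```java
final class OverlapScore {
    /**
     * Overlap length divided by the length of the shorter interval.
     * Requires s1 < s2 and t1 < t2; returns 0 when there is no overlap.
     */
    static double score(double s1, double s2, double t1, double t2) {
        if (!(s1 < s2 && t1 < t2)) {
            throw new IllegalArgumentException("intervals must have positive length");
        }
        double overlap = Math.min(s2, t2) - Math.max(s1, t1); // L3
        if (overlap <= 0) {
            return 0.0; // assumption: disjoint or merely touching intervals score 0
        }
        return overlap / Math.min(s2 - s1, t2 - t1); // L3 / min(L1, L2)
    }
}
```

For example, [5, 10] against [3, 6] overlaps on [5, 6], so the score is 1 / min(5, 3), one third.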


First and foremost, your data structure is confusing. If you're trying to talk about discrete intervals of time, structure your data that way, for instance as an int[][] where each inner array always has length 2, so your t1 becomes:

int[][] t1 = {{3,6}, {6,9}, {9,10}};

Using the right structure will probably help you simplify your algorithm and make it easier to work with.
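
If you are stuck with the flat array as input, converting it is only a few lines. A sketch (the class and method names are made up for illustration):

```java
final class IntervalArrays {
    /** Turns sorted boundary points {a, b, c, ...} into the pairs {{a, b}, {b, c}, ...}. */
    static int[][] toPairs(int[] points) {
        if (points.length < 2) {
            return new int[0][]; // fewer than two points: no intervals
        }
        int[][] pairs = new int[points.length - 1][];
        for (int i = 0; i + 1 < points.length; i++) {
            pairs[i] = new int[] {points[i], points[i + 1]};
        }
        return pairs;
    }
}
```

Calling toPairs(new int[] {3, 6, 9, 10}) gives the same {{3,6}, {6,9}, {9,10}} structure shown above.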


Better than properly structured arrays, however, would be to use a dedicated type to represent these intervals, such that you could pass around List<Interval> objects and do some sort of contains check on them. But don't reinvent the wheel. The awesome Guava library provides a robust Range class that you can use. Even better though, it also provides RangeSet and RangeMap classes, which let you easily do the things you're talking about. See also their Ranges Explained article which covers the basics.

Note that you could pretty easily transform your current design into Range objects internally, if you can't redesign the array structure externally.

Having tried at one point to build my own IntervalSet class, let me tell you that it's a tricky problem to get right, and you'll save yourself a lot of headaches by using their well-designed and highly tested range utilities.

Here's the way that I would do what you're describing with Guava - notice that we avoid even needing to think about the math involved - Range does the right thing for us:

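import java.util.Set;

import com.google.common.base.Preconditions;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Range;

/**
 * Given a Range and a group of other Ranges, identify the set of ranges in
 * the group which overlap with the first range.  Note this returns a Set<Range>
 * not a RangeSet, because we don't want to collapse connected ranges together.
 */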
public static <T extends Comparable<?>> Set<Range<T>>
        getIntersectingRanges(Range<T> intersects, Iterable<Range<T>> ranges) {
    ImmutableSet.Builder<Range<T>> builder = ImmutableSet.builder();
    for(Range<T> r : ranges) {
        if(r.isConnected(intersects) && !r.intersection(intersects).isEmpty()) {
            builder.add(r);
        }
    }
    return builder.build();
}

/**
 * Given a 2-length array representing a closed integer range, and an array of
 * discrete instances (each pair of which therefore represents a closed range)
 * return the set of ranges overlapping the first range.
 * Example: the instances array [1,2,3,4] maps to the ranges [1,2],[2,3],[3,4].
 */
public static Set<Range<Integer>> getIntersectingContinuousRanges(int[] intersects,
        int[] instances) {
    Preconditions.checkArgument(intersects.length == 2);
    Preconditions.checkArgument(instances.length >= 2);
    ImmutableList.Builder<Range<Integer>> builder = ImmutableList.builder();
    for(int i = 0; i < instances.length-1; i++) {
        builder.add(Range.closed(instances[i], instances[i+1]));
    }
    return getIntersectingRanges(Range.closed(intersects[0], intersects[1]),
                                 builder.build());
}

Using your examples:

public static void main(String[] args)
{
    int[] interval = {5,10};
    int[] t1 = {3,6,9,10};
    int[] t2 = {2,4,5,6,10};

    System.out.println(getIntersectingContinuousRanges(interval, t1));
    System.out.println(getIntersectingContinuousRanges(interval, t2));
}

The above prints out:

[[3‥6], [6‥9], [9‥10]]
[[4‥5], [5‥6], [6‥10]]
