Finding longest overlapping ranges
I suggest you iterate your ranges only once, but keep in memory the current range being expanded, like so:
def overlaps(r1, r2):
    # Both ranges are (start, end) pairs; r1 must not start after r2.
    assert r1[0] <= r2[0], "Assume ranges sorted by first coordinate"
    return (r2[0] <= r1[0] <= r2[1]) or (r1[0] <= r2[0] <= r1[1])

ranges = [(1, 50), (45, 47), (49, 70), (75, 85), (84, 88), (87, 92)]

def fuse_ranges(ranges):
    if not ranges:
        return []
    output_ranges = []
    curr_r = list(ranges[0])
    curr_overlap = False  # Is the current range already overlapping?

    # Assuming it is sorted by starting coordinate.
    for r in ranges[1:]:
        if overlaps(curr_r, r):
            curr_overlap = True
            curr_r[1] = max(curr_r[1], r[1])  # Extend the end of the current range.
        else:
            if curr_overlap:
                output_ranges.append(curr_r)
                curr_overlap = False
            curr_r = list(r)
    if curr_overlap:
        output_ranges.append(curr_r)

    return output_ranges

if __name__ == '__main__':
    print(fuse_ranges(sorted(ranges, key=lambda r: r[0])))
[[1, 70], [75, 92]]
Not sure my solution can be much less verbose than yours though...
I think you can sort your input by the start of the ranges, then iterate through them. Each item is either merged into the current range (if its start is no greater than the end of the current range), or we yield the current range and begin accumulating a new one:
def overlaps(ranges):
    ranges = sorted(ranges)  # If our inputs are guaranteed sorted, we can skip this
    it = iter(ranges)
    try:
        curr_start, curr_stop = next(it)
        # overlaps = False  # If we want to exclude output ranges not produced by overlapping input ranges
    except StopIteration:
        return
    for start, stop in it:
        if curr_start <= start <= curr_stop:  # Assumes intervals are closed
            curr_stop = max(curr_stop, stop)
            # overlaps = True
        else:
            # if overlaps:
            yield curr_start, curr_stop
            curr_start, curr_stop = start, stop
            # overlaps = False
    # if overlaps:
    yield curr_start, curr_stop

print(list(overlaps([(1, 50), (49, 70), (75, 85), (84, 88), (87, 92)])))
# [(1, 70), (75, 92)]
print(list(overlaps([(20, 30), (5, 10), (1, 7), (12, 21)])))
# [(1, 10), (12, 30)]
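For comparison, here is a minimal sketch of the variant hinted at by the commented-out lines above, which only yields ranges that absorbed at least one other input range. The name merged_overlaps_only and the sample input are my own, not part of the answer, and intervals are still assumed to be closed.

```python
def merged_overlaps_only(ranges):
    """Yield merged closed ranges, skipping inputs that overlap nothing."""
    it = iter(sorted(ranges))
    try:
        curr_start, curr_stop = next(it)
    except StopIteration:
        return
    merged = False  # Has the current range absorbed at least one other range?
    for start, stop in it:
        if start <= curr_stop:  # Sorted input already guarantees curr_start <= start
            curr_stop = max(curr_stop, stop)
            merged = True
        else:
            if merged:
                yield curr_start, curr_stop
            curr_start, curr_stop = start, stop
            merged = False
    if merged:
        yield curr_start, curr_stop


# (72, 73) overlaps nothing, so it is dropped from the output.
print(list(merged_overlaps_only([(1, 50), (49, 70), (72, 73), (75, 85), (84, 88)])))
# [(1, 70), (75, 88)]
```

Whether isolated ranges should be kept or dropped depends on what the question is after, so treat the flag as a switch rather than a fix.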
Could be done using functools.reduce:
from functools import reduce

ranges = [(1, 50), (45, 47), (49, 70), (75, 85), (84, 88), (87, 92)]

# acc is the list of merged ranges so far; el is the next input range.
reducer = (
    lambda acc, el: acc[:-1:] + [(min(*acc[-1], *el), max(*acc[-1], *el))]
    if acc[-1][1] > el[0]
    else acc + [el]
)
print(reduce(reducer, ranges[1::], [ranges[0]]))
[(1, 70), (75, 92)]
Hard to put into words, but it uses reduce to go through the ranges. If the last tuple in the accumulator and the next range overlap (if acc[-1][1] > el[0]), it creates a new range from the (min, max) of both and replaces the last tuple with this combined range (acc[:-1:] + [(min, max)]); otherwise it simply appends the new range to the end (acc + [el]).
Edit: After reviewing other answers, updated to take min/max of the two ranges compared instead of just first and last
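If the reduce step is hard to follow, one way to see it is to print every intermediate accumulator. This is only a sketch: it rewrites the lambda as a named function with the same logic and swaps reduce for itertools.accumulate, which needs Python 3.8+ for the initial argument.

```python
from itertools import accumulate

ranges = [(1, 50), (45, 47), (49, 70), (75, 85), (84, 88), (87, 92)]


def reducer(acc, el):
    # Same logic as the lambda: merge el into the last range if they overlap.
    if acc[-1][1] > el[0]:
        return acc[:-1] + [(min(*acc[-1], *el), max(*acc[-1], *el))]
    return acc + [el]


# accumulate yields the initial value and then the accumulator after each input.
steps = list(accumulate(ranges[1:], reducer, initial=[ranges[0]]))
for step in steps:
    print(step)
```

The last printed line is the same list that reduce returns. Note that the strict > means ranges that merely touch, such as (1, 5) and (5, 8), are not merged here.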
You can use zip to group all the start values and all the end values of the ranges into two lists. If a start value is lower than the previous end value, there is an overlap, so remove that start value and that end value. Two integer indexes track the position in the low and high lists; the low index is always one higher than the high index.
# Split the numbers into the low and high part of each range
# and set the index position for each of them
ranges = [(1, 50), (49, 70), (75, 85), (84, 88), (87, 92)]
low, high = [list(nums) for nums in zip(*ranges)]
l, h = 1, 0

# Iterate over the ranges and remove when there is an overlap;
# if there is no overlap, move the pointers
while l < len(low) and h < len(high):
    if low[l] < high[h]:
        del low[l]
        del high[h]
    else:
        l += 1
        h += 1

# Zip the low and high back into ranges
new_ranges = list(zip(low, high))
print(new_ranges)
[(1, 70), (75, 92)]
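One caveat worth checking: deleting high[h] assumes the later range always ends further right, so a range nested inside another, such as (45, 47) inside (1, 50), gives [(1, 47), (49, 70), (75, 92)] with the loop above. Below is a hedged sketch of a variant that keeps the larger end instead. The name merge_sorted is hypothetical, and the input is assumed to be sorted by start.

```python
def merge_sorted(ranges):
    """Merge overlapping ranges; assumes ranges are sorted by start value."""
    if not ranges:
        return []
    low, high = [list(nums) for nums in zip(*ranges)]
    l, h = 1, 0  # l is always h + 1
    while l < len(low):
        if low[l] < high[h]:
            # Overlap: carry the larger end into the slot that survives the delete.
            high[h + 1] = max(high[h], high[h + 1])
            del low[l]
            del high[h]
        else:
            l += 1
            h += 1
    return list(zip(low, high))


print(merge_sorted([(1, 50), (45, 47), (49, 70), (75, 85), (84, 88), (87, 92)]))
# [(1, 70), (75, 92)]
```

For inputs without nested ranges it produces the same result as the original loop.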