Interview: Remove Loop in linked list - Java

Solution 1 - courtesy of Career Cup and "Cracking the Coding Interview" book:

public static LinkedListNode findStartOfLoop(LinkedListNode head) {
    LinkedListNode n1 = head;
    LinkedListNode n2 = head; 

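    // Find the meeting point using the Tortoise and Hare algorithm
    // (Floyd's cycle detection algorithm).
    // n2 must be checked for null too: on a loop-free list with an even
    // number of nodes, n2.next.next lands on null.
    while (n2 != null && n2.next != null) { 
        n1 = n1.next; 
        n2 = n2.next.next; 
        if (n1 == n2) { 
            break; 
        }
    }

    // Error check - n2 ran off the end of the list, so there is no
    // meeting point, and therefore no loop
    if (n2 == null || n2.next == null) {
        return null;
    }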

    /* Move n1 to Head. Keep n2 at Meeting Point. Each is k steps
     * from the Loop Start. If they move at the same pace, they must
     * meet at Loop Start. */
    n1 = head; 
    while (n1 != n2) { 
        n1 = n1.next; 
        n2 = n2.next; 
    }
    // Now n2 points to the start of the loop.
    return n2;
}

The explanation for this solution is straight from the book:

If we move two pointers, one with speed 1 and another with speed 2, they will end up meeting if the linked list has a loop. Why? Think about two cars driving on a track; the faster car will always pass the slower one!

The tricky part here is finding the start of the loop. Imagine, as an analogy, two people racing around a track, one running twice as fast as the other. If they start off at the same place, when will they next meet? They will next meet at the start of the next lap.

Now, let’s suppose Fast Runner had a head start of k steps on an n-step lap. When will they next meet? They will meet k steps before the start of the next lap. (Why? Fast Runner would have made k + 2(n - k) steps, including its head start, and Slow Runner would have made n - k steps. Both will be k steps before the start of the loop.)

Now, going back to the problem, when Fast Runner (n2) and Slow Runner (n1) are moving around our circular linked list, n2 will have a head start on the loop when n1 enters. Specifically, it will have a head start of k, where k is the number of nodes before the loop (or k mod n, if k is larger than the loop length n; the argument is unchanged, because k more steps from the meeting point still land on the loop start). Since n2 has a head start of k nodes, n1 and n2 will meet k nodes before the start of the loop.

So, we now know the following:

  1. Head is k nodes from LoopStart (by definition)
  2. MeetingPoint for n1 and n2 is k nodes from LoopStart (as shown above)

Thus, if we move n1 back to Head and keep n2 at MeetingPoint, and move them both at the same pace, they will meet at LoopStart.
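
To check the two facts above on concrete lists, here is a minimal sketch. The names MeetingPointDemo, Node, build, meetingPoint and advance are made up for this illustration and are not from the book. It builds a list with k nodes before a loop of n nodes, finds the meeting point, and lets you walk k steps from either Head or MeetingPoint:

```java
// Illustrative sketch only; all names here are hypothetical.
final class MeetingPointDemo {
    static final class Node {
        final int id;
        Node next;
        Node(int id) { this.id = id; }
    }

    /** Builds k nodes leading into a loop of n nodes (n >= 1); returns {head, loopStart}. */
    static Node[] build(int k, int n) {
        Node[] nodes = new Node[k + n];
        for (int i = 0; i < k + n; i++) {
            nodes[i] = new Node(i);
        }
        for (int i = 0; i + 1 < k + n; i++) {
            nodes[i].next = nodes[i + 1];
        }
        nodes[k + n - 1].next = nodes[k]; // close the loop
        return new Node[] { nodes[0], nodes[k] };
    }

    /** Returns the node where the speed-1 and speed-2 pointers first coincide. Assumes a loop exists. */
    static Node meetingPoint(Node head) {
        Node slow = head;
        Node fast = head;
        do {
            slow = slow.next;
            fast = fast.next.next;
        } while (slow != fast);
        return slow;
    }

    /** Walks the given number of steps forward from a node. */
    static Node advance(Node from, int steps) {
        Node n = from;
        for (int i = 0; i < steps; i++) {
            n = n.next;
        }
        return n;
    }
}
```

For example, with k = 3 and n = 5 the pointers meet at node 5, and three more steps from there (6, 7, 3) land on node 3, which is also where three steps from the head end up.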

Solution 2 - courtesy of me :)

// needs: import java.util.IdentityHashMap; import java.util.Map;
public static LinkedListNode findHeadOfLoop(LinkedListNode head) {

    int indexer = 0;
    Map<LinkedListNode, Integer> map = new IdentityHashMap<LinkedListNode, Integer>();
    map.put(head, indexer);
    indexer++;

    // start walking along the list while putting each node in the map
    // if we come to a node that is already in the map, 
    // then that node is the start of the cycle 
    LinkedListNode curr = head;
    while (curr != null) {

        if (map.containsKey(curr.next)) {
            curr = curr.next;
            break;
        }
        curr = curr.next;
        map.put(curr, indexer);
        indexer++;
    }
    return curr;
}
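
Since the question is about removing the loop, here is a variant sketch of the same idea that also breaks the cycle in the same pass. This is not the code above: it swaps the map for an identity-based set (the stored index is never read), and the names LoopRemover, Node and removeLoop are hypothetical. It uses O(n) extra memory, like Solution 2.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Illustrative sketch only; class and method names are hypothetical.
final class LoopRemover {
    static final class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    /**
     * Breaks the loop, if there is one, by nulling the last node's next reference.
     * Returns the node where the loop started, or null if the list was already linear.
     */
    static Node removeLoop(Node head) {
        // Identity-based set: nodes are compared with ==, not equals().
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Node prev = null;
        for (Node curr = head; curr != null; prev = curr, curr = curr.next) {
            if (!seen.add(curr)) {
                // curr was visited before, so it is the loop start and
                // prev is the last node of the loop.
                prev.next = null;
                return curr;
            }
        }
        return null; // ran off the end: no loop
    }
}
```

Calling removeLoop(head) a second time on the same list should return null, since the first call already made the list linear.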

I hope this helps.
Hristo


There are two parts to this problem:

  1. Detect if there is a loop in the list
  2. Identify the start of the loop

Once you know where the loop starts, it's easy to identify the last element in the list: it is the element inside the loop whose next reference points back to the start of the loop. It is then trivial to set the next pointer/reference of this element to null to correct the cyclic linked list. (This is not the same as a circular linked list, where the last element points back to the first; a circular list is a specific instance of a cyclic list.)

  1. Floyd's cycle detection algorithm, also called the tortoise and hare algorithm as it involves using two pointers/references that move at different speeds, is one way of detecting the cycle. If there is a cycle, the two pointers (say p1 and p2) will end up pointing to the same element after a finite number of steps. Interestingly, it can be proved that walking forward from the element at which they meet by the distance from the head of the list to the start of the loop lands exactly on the start of the loop. That is, if the linear part of the list has k elements, the two pointers will meet inside the loop of length m at a point m-k from the start of the loop (taking k modulo m if the linear part is longer than the loop), or k elements to the 'end' of the loop (of course, it's a loop so it has no 'end' - it's just the 'start' once again). And that gives us a way to find the start of the loop:

  2. Once a cycle has been detected, let p2 remain pointing to the element where the loop for the step above terminated but reset p1 so that it's pointing back to the head of the list. Now, move each pointer one element at a time. Since p2 began inside the loop, it will continue looping. After k steps (equal to the distance of the start of the loop from the head of the list), p1 and p2 will meet again. This will give you a reference to the start of the loop.

  3. It is now easy to set p1 (or p2) to point to the element starting the loop and traverse the loop until p1's next reference points back to the starting element. At this point p1 is referencing the 'last' element of the list and its next pointer can be set to null.


Here's some quick and dirty Java code assuming a linked list of Nodes where a Node has a next reference. This could be optimized but it should give you the basic idea:

Node slow, fast, start;
fast = slow = head;

//PART I - Detect if a loop exists
while (true)
{
    // fast will always fall off the end of the list if it is linear
    if (fast == null || fast.next == null)
    {
        // no loop
        return;
    }

    fast = fast.next.next; // move 2 nodes at a time
    slow = slow.next; // move 1 node at a time

    // compare only after moving: both references start at head,
    // so comparing first would report a loop immediately
    if (fast == slow)
    {
        // detected a loop
        break;
    }
}

//PART II - Identify the node that is the start of the loop
fast = head; //reset one of the references to head of list

//advance both one node at a time until they meet at the start of the loop
//(they are already equal if the loop starts at head)
while(fast != slow) 
{
    fast = fast.next;
    slow = slow.next;
}

start = fast;

//PART III - Eliminate the loop by setting the 'next' pointer 
//of the last element to null
fast = start;
while(fast.next != start)
{
    fast = fast.next;
}

fast.next = null; //break the loop

This explanation might help explain the why behind Part II:

Assume the length of the cycle is M, and the length of the rest of the linked list is L. What is the position in the cycle when t1 and t2 first meet?

Define the first node in the cycle as position 0; following the links we have positions 1, 2, ..., up to M-1. (When we walk in the cycle, our current position is (walk_length) mod M.) Suppose t1 and t2 first meet at position p, with t1 moving at speed v and t2 at speed 2v. Their travel times are the same, so (L+k1*M+p)/v = (L+k2*M+p)/2v for some non-negative integers k1 and k2 (the number of full laps each has completed). Multiplying out gives 2(L+k1*M+p) = L+k2*M+p, i.e. L+p = (k2-2*k1)*M, so L+p is a multiple of M.

So if t1 starts from p and t2 starts from head, and both move at the same speed, then after L steps t2 reaches position 0 while t1 is at position (p+L) mod M = 0. They are guaranteed to meet at position 0, the first node of the cycle. QED.
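
If you'd rather see the numbers than trust the algebra, here is a small sketch that works purely on positions, with no linked list involved. The class and method names are hypothetical. It steps both pointers until they first coincide inside the cycle and returns the position p, so you can check that L+p is a multiple of M:

```java
// Illustrative sketch only; names are hypothetical.
final class CycleArithmeticCheck {
    /**
     * Returns the cycle position p (0..M-1) where the speed-1 and speed-2
     * pointers first meet, for a tail of length L and a cycle of length M >= 1.
     */
    static int firstMeetingPosition(int L, int M) {
        // t counts steps taken by the slow pointer; the fast pointer has taken 2t.
        // After s >= L steps from head, a pointer is at cycle position (s - L) mod M.
        for (int t = 1; ; t++) {
            if (t >= L && (2 * t - L) % M == (t - L) % M) {
                return (t - L) % M;
            }
        }
    }
}
```

For L = 3 and M = 5 this returns p = 2, and L+p = 5 is indeed a multiple of M.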

More references:

  • http://www.quora.com/How-does-Floyds-cycle-finding-algorithm-work
  • Explain how finding cycle start node in cycle linked list work?
  • Proof of detecting the start of cycle in linked list
  • Hristo's answer to this question on this page also quotes an explanation from an interview book