Manacher's algorithm (algorithm to find longest palindrome substring in linear time)

I agree that the logic isn't quite right in the explanation of the link. I give some details below.

Manacher's algorithm fills in a table P[i] which contains how far the palindrome centered at i extends. If P[5]=3, then three characters on either side of position five are part of the palindrome. The algorithm takes advantage of the fact that if you've found a long palindrome, you can fill in values of P on the right side of the palindrome quickly by looking at the values of P on the left side, since they should mostly be the same.
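To make the meaning of P concrete, here is a minimal brute-force sketch in Python. It is not Manacher's algorithm (it is quadratic); it only computes the table that Manacher's algorithm fills in faster. It assumes the example string babcbabcbaccba with a '#' placed between and around the letters, which is my reading of the preprocessing behind the indices used in this answer; the function name palindrome_radii is made up for illustration.

```python
def palindrome_radii(t):
    """Brute-force P: for each center i, count matching characters on each side."""
    p = []
    for i in range(len(t)):
        k = 0
        # Grow the palindrome around i while both neighbours exist and match.
        while i - k - 1 >= 0 and i + k + 1 < len(t) and t[i - k - 1] == t[i + k + 1]:
            k += 1
        p.append(k)
    return p


s = "babcbabcbaccba"
t = "#" + "#".join(s) + "#"   # assumed preprocessing: '#' around every letter
P = palindrome_radii(t)
print(P[11], P[7], P[9])      # prints: 9 7 1
```

With this convention, P[11]=9 gives R=11+9=20, which matches the values C=11 and R=20 used in the states discussed in this answer.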

I'll start by explaining the case you were talking about, and then I'll expand this answer as needed.

R indicates the index of the right side of the palindrome centered at C. Here is the state at the place you indicated:

C=11
R=20
i=15
i'=7
P[i']=7
R-i=5

and the logic is like this:

if P[i']<=R-i:  // not true here, since 7 > 5
else: // P[i] is at least R-i = 5, but may be greater

The pseudo-code in the link indicates that P[i] should be greater than or equal to P[i'] if the test fails, but I believe it should be greater than or equal to R-i, and the explanation backs that up.

Since P[i'] is greater than R-i, the palindrome centered at i' extends past the left edge of the palindrome centered at C. We know that P[i] is at least R-i, because the symmetry about C still holds up to that point, but we have to search explicitly beyond that.

If P[i'] had been no greater than R-i, then the largest palindrome centered at i' would lie within the largest palindrome centered at C, so we would have known that P[i] couldn't be any larger than P[i']. If it were, we would have a contradiction: we would be able to extend the palindrome centered at i beyond P[i'], but then, by symmetry, we would also be able to extend the palindrome centered at i', which was already supposed to be as large as possible.

This case is illustrated by an earlier state:

C=11
R=20
i=13
i'=9
P[i']=1
R-i=7

In this case, P[i']<=R-i. Since we are still 7 characters away from the edge of the palindrome centered at C, we know that at least 7 characters around i are the same as the 7 characters around i'. Since there was only a one-character palindrome around i', there is a one-character palindrome around i as well.

j_random_hacker noticed that the logic should be more like this:

if P[i']<R-i then
  P[i]=P[i']
else if P[i']>R-i then
  P[i]=R-i
else P[i]=R-i + expansion

If P[i'] < R-i, then we know that P[i]==P[i'], since we're still inside the palindrome centered at C.

If P[i'] > R-i, then we know that P[i]==R-i, because otherwise the palindrome centered at C would have extended past R.

So the expansion is really only necessary in the special case where P[i']==R-i, where we don't know whether the palindrome centered at i may be longer.
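The three cases can be sketched as a small Python helper. This is only an illustration: the name mirror_guess and the choice to pass P as a dictionary are my own, and it assumes i < R so that the mirror position i' = 2*C - i lies inside the palindrome centered at C.

```python
def mirror_guess(P, C, R, i):
    """Return (initial value for P[i], whether explicit expansion is still needed)."""
    mirror = 2 * C - i   # i' in the text
    room = R - i         # distance from i to the right edge R
    if P[mirror] < room:
        return P[mirror], False   # mirror palindrome fits strictly inside: copy it
    if P[mirror] > room:
        return room, False        # mirror palindrome sticks out: capped at R-i
    return room, True             # touches the edge exactly: must expand past R


# Only the mirror entries quoted in the two states above are filled in.
P = {7: 7, 9: 1}
print(mirror_guess(P, C=11, R=20, i=15))   # (5, False)
print(mirror_guess(P, C=11, R=20, i=13))   # (1, False)
```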

This is handled in the actual code by setting P[i]=min(P[i'],R-i) and then always expanding. This way of doing it doesn't increase the time complexity, because if no expansion is necessary, the time taken to do the expansion is constant.
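For reference, here is a minimal Python sketch of that "take the minimum, then always expand" approach. It is not the code from the link; the '#' preprocessing and all names (manacher, t, p, c, r) are my own assumptions.

```python
def manacher(s):
    """Return the longest palindromic substring of s (illustrative sketch)."""
    t = "#" + "#".join(s) + "#"   # assumed preprocessing: '#' around every letter
    n = len(t)
    p = [0] * n                   # p[i]: how far the palindrome centered at i extends
    c = r = 0                     # center and right edge of the rightmost palindrome
    for i in range(n):
        if i < r:
            # Start from the mirror value, capped at the distance to the edge.
            p[i] = min(p[2 * c - i], r - i)
        # Always try to expand; when no expansion is possible the loop
        # fails on its first comparison, so it costs constant time.
        while (i - p[i] - 1 >= 0 and i + p[i] + 1 < n
               and t[i - p[i] - 1] == t[i + p[i] + 1]):
            p[i] += 1
        if i + p[i] > r:
            c, r = i, i + p[i]
    best = max(range(n), key=p.__getitem__)
    start = (best - p[best]) // 2  # map the position in t back to s
    return s[start:start + p[best]]


print(manacher("babcbabcbaccba"))  # abcbabcba
```

Each successful comparison in the while loop pushes r further right, and r never moves left, which is why the total work stays linear.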


I have found one of the best explanations so far at the following link:

http://tarokuriyama.com/projects/palindrome2.php

It also has a visualization for the same string example (babcbabcbaccba) used at the first link mentioned in the question.

Apart from this link, I also found the code at

http://algs4.cs.princeton.edu/53substring/Manacher.java.html

I hope it will be helpful to others trying hard to understand the crux of this algorithm.


The algorithm on this site seems understandable up to a certain point: http://www.akalin.cx/longest-palindrome-linear-time

The best way to understand this particular approach is to try solving the problem on paper and to spot the tricks that let you avoid checking for a palindrome at every possible center.

First ask yourself: when you find a palindrome of a given length, let's say 5, can't you just jump to the end of this palindrome as the next step (skipping 4 letters and 4 mid-letters)?

If you try to create a palindrome of length 8 and then place another palindrome of length > 8 whose center is in the right half of the first palindrome, you will notice something funny. Try it out: the palindrome of length 8 in WOWILIKEEKIL is LIKE + EKIL = 8. In most cases you would be able to write down the place between the two E's as a center and the number 8 as the length, and then jump past the last L to look for the center of a bigger palindrome.

This approach is not correct, because the center of a bigger palindrome can be inside EKIL, and you would miss it if you jumped past the last L.

After you find LIKE+EKIL you place 8 in the array that these algorithms use, and it looks like this:

[0,1,0,3,0,1,0,1,0,3,0,1,0,1,0,1,8]

for

[#,W,#,O,#,W,#,I,#,L,#,I,#,K,#,E,#]

The trick is that you already know that, most probably, the next 7 (8-1) numbers after 8 will be the same as on the left side, so the next step is to automatically copy the 7 numbers to the left of 8 over to the right of 8 (mirrored), keeping in mind that they are not yet final. The array would look like this:

[0,1,0,3,0,1,0,1,0,3,0,1,0,1,0,1,8,1,0,1,0,1,0,3] (we are at 8)

for

[#,W,#,O,#,W,#,I,#,L,#,I,#,K,#,E,#,E,#,K,#,I,#,L]
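The mirroring step can be sketched in a few lines of Python. The variable names are made up for illustration, and the starting values are simply the array shown above for "#W#O#W#I#L#I#K#E#".

```python
# Values known so far, taken from the array above; the last entry (8)
# belongs to the center between the two E's.
known = [0, 1, 0, 3, 0, 1, 0, 1, 0, 3, 0, 1, 0, 1, 0, 1, 8]

center = len(known) - 1   # index 16
length = known[center]    # 8

# Position center + k mirrors position center - k, for k = 1 .. length - 1.
provisional = known + [known[center - k] for k in range(1, length)]
print(provisional)
```

The copied entries are only starting guesses; any of them may still have to be corrected when the loop reaches that position.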

Let's construct an example where such a jump would break our current solution and see what we can notice.

WOWILIKEEKIL - let's try to make a bigger palindrome with its center somewhere within EKIL. But it's not possible - we would need to change the word EKIL to something that contains a palindrome. What? OOOOOh - that's the trick. The only way to have a bigger palindrome with its center in the right half of our current palindrome is for a palindrome to already exist in the right (and left) half of the current palindrome.

Let's try to build one based on WOWILIKEEKIL. We would need to change EKIL to, for example, EKIK, with I as the center of the bigger palindrome - remember to change LIKE to KIKE as well. The first letters of our tricky palindrome will be:

WOWIKIKEEKIK

As said before, let the last I be the center of a palindrome bigger than KIKEEKIK:

WOWIKIKEEKIKEEKIKIW

Let's build the array up to our old palindrome and find out how to leverage the additional info.

for

[_ W _ O _ W _ I _ K _ I _ K _ E _ E _ K _ I _ K _ E _ E _ K _ I _ K _ I _ W ]

it will be [0,1,0,3,0,1,0,1,0,3,0,3,0,1,0,1,8]

We know that the next I (the 3rd one) will be the center of the longest palindrome, but let's forget about that for a bit. Let's copy the numbers in the array from the left of 8 to the right (7 numbers, mirrored):

[0,1,0,3,0,1,0,1,0,3,0,3,0,1,0,1,8,1,0,1,0,3,0,3]

In our loop we are between the E's, at the number 8. What is special about the I (the future middle of the biggest palindrome) that prevents us from jumping right to K (the last letter of the currently biggest palindrome)? The special thing is that its copied value exceeds the current size of the array ... how? If you move 3 spaces to the right of its 3, you are out of the array. That means it can be the middle of the biggest palindrome, and the furthest you can jump is to this letter I.

Sorry for the length of this answer - I wanted to explain the algorithm and can assure you - @OmnipotentEntity was right - I understand it even better after explaining it to you :)
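That "out of the array" test can be sketched in Python as follows. The names are hypothetical, '#' stands in for the '_' separators used above, and the starting values are the array computed above for the left part of WOWIKIKEEKIK.

```python
word = "WOWIKIKEEKIKEEKIKIW"
t = "#" + "#".join(word) + "#"   # '#' plays the role of '_' in the text

# Values up to the center between the two E's (index 16, value 8).
known = [0, 1, 0, 3, 0, 1, 0, 1, 0, 3, 0, 3, 0, 1, 0, 1, 8]
center = len(known) - 1                # 16
right_edge = center + known[center]    # 24: first index past the 24-entry array

# Mirrored (provisional) values for the positions to the right of the center.
copied = {center + k: known[center - k] for k in range(1, known[center])}

# The first position whose copied value reaches the edge is as far as we may jump.
next_center = next(i for i in sorted(copied) if i + copied[i] >= right_edge)
print(next_center, t[next_center])     # 21 I
```

Positions whose copied value stays strictly inside the edge cannot be the center of anything longer than what was copied, which is why they can be skipped.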