How do I check if a string is entirely made of the same substring?

Perhaps the fastest algorithmic approach is building a Z-function in linear time:

The Z-function for this string is an array of length n where the i-th element is equal to the greatest number of characters starting from the position i that coincide with the first characters of s.

In other words, z[i] is the length of the longest common prefix between s and the suffix of s starting at i.

C++ implementation for reference:

vector<int> z_function(string s) {
    int n = (int) s.length();
    vector<int> z(n);
    for (int i = 1, l = 0, r = 0; i < n; ++i) {
        if (i <= r)
            z[i] = min (r - i + 1, z[i - l]);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]])
            ++z[i];
        if (i + z[i] - 1 > r)
            l = i, r = i + z[i] - 1;
    }
    return z;
}

JavaScript implementation
Added optimizations - building a half of z-array and early exit

function z_function(s) {
  var n = s.length;
  var z = Array(n).fill(0);
  var i, l, r;
  //for our task we need only a half of z-array
  for (i = 1, l = 0, r = 0; i <= n/2; ++i) {
    if (i <= r)
      z[i] = Math.min(r - i + 1, z[i - l]);
    while (i + z[i] < n && s[z[i]] == s[i + z[i]])
      ++z[i];

      //we can check condition and return here
     if (z[i] + i === n && n % i === 0) return true;
    
    if (i + z[i] - 1 > r)
      l = i, r = i + z[i] - 1;
  }
  return false; 
  //return z.some((zi, i) => (i + zi) === n && n % i === 0);
}
console.log(z_function("abacabacabac"));
console.log(z_function("abcab"));

Then you need to check indexes i that divide n. If you find such i that i+z[i]=n then the string s can be compressed to the length i and you can return true.

For example, for

string s= 'abacabacabac'  with length n=12`

z-array is

(0, 0, 1, 0, 8, 0, 1, 0, 4, 0, 1, 0)

and we can find that for

i=4
i+z[i] = 4 + 8 = 12 = n
and
n % i = 12 % 4 = 0`

so s might be represented as substring of length 4 repeated three times.

There’s a nifty little theorem about strings like these.

A string consists of the same pattern repeated multiple times if and only if the string is a nontrivial rotation of itself.

Here, a rotation means deleting some number of characters from the front of the string and moving them to the back. For example, the string hello could be rotated to form any of these strings:

hello (the trivial rotation)
elloh 
llohe 
lohel 
ohell

To see why this works, first, assume that a string consists of k repeated copies of a string w. Then deleting the first copy of the repeated pattern (w) from the front of the string and tacking it onto the back will give back the same string. The reverse direction is a bit trickier to prove, but the idea is that if you rotate a string and get back what you started with, you can apply that rotation repeatedly to tile the string with multiple copies of the same pattern (that pattern being the string you needed to move to the end to do the rotation).

Now the question is how to check whether this is the case. For that, there’s another beautiful theorem we can use:

If x and y are strings of the same length, then x is a rotation of y if and only if x is a substring of yy.

As an example, we can see that lohel is a rotation of hello as follows:

hellohello
   ^^^^^

In our case, we know that every string x will always be a substring of xx (it’ll appear twice, once at each copy of x). So basically we just need to check if our string x is a substring of xx without allowing it to match at the first or halfway character. Here’s a one-liner for that:

function check(str) {
    return (str + str).indexOf(str, 1) !== str.length;
}

Assuming indexOf is implemented using a fast string matching algorithm, this will run in time O(n), where n is the length of the input string.

Hope this helps!

You can do it by a capturing group and backreference. Just check it's the repetition of the first captured value.

function check(str) {
  return /^(.+)\1+$/.test(str)
}

console.log(check('aa')) //true
console.log(check('aaa')) //true
console.log(check('abcabcabc')) //true
console.log(check('aba')) //false
console.log(check('ababa')) //false

In the above RegExp:

^ and $ stands for start and end anchors to predict the position.
(.+) captures any pattern and captures the value(except \n).
\1 is backreference of first captured value and \1+ would check for repetition of captured value.

Regex explanation here

For RegExp debugging use: https://regex101.com/r/pqlAuP/1/debugger

Performance : https://jsperf.com/reegx-and-loop/13

How do I check if a string is entirely made of the same substring?

Tags:

Javascript

Algorithm

String

Related

Recent Posts