Find the substring without using a recursive function

We can see that the string represented by x(k) grows exponentially in length with increasing k:

len(x(1)) == 3
len(x(k)) == len(x(k-1)) * 2 + 3

So:

len(x(k)) == 3 * (2**k - 1)

For k equal to 100, this amounts to a length of more than 10^30 (about 3.8 × 10^30). That's more characters than there are atoms in a human body!
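As a quick sanity check of the closed form and of the size for k = 100, here is a small sketch. The function names are made up for illustration; only the recurrence and the formula come from the text above.

```python
def xlen_recurrence(k):
    """Length of x(k) via len(x(k)) == len(x(k-1)) * 2 + 3, with len(x(1)) == 3."""
    length = 3
    for _ in range(k - 1):
        length = length * 2 + 3
    return length


def xlen_closed(k):
    """Length of x(k) via the closed form 3 * (2**k - 1)."""
    return 3 * (2**k - 1)


# The two definitions agree for every k tried here.
for k in (1, 2, 3, 10, 100):
    assert xlen_recurrence(k) == xlen_closed(k)

print(xlen_closed(100) > 10**30)  # True
```

Python integers have arbitrary precision, so the length itself is cheap to compute even though the string is far too large to build.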

Since the parameters s and t select only a tiny slice of that, you should not need to produce the whole string. You can still use recursion, but pass the s and t range along to each call, expressed relative to the substring that call would generate. When that range falls entirely outside the substring, the call can return immediately without recursing deeper, which saves a lot of time and (string) space.

Here is how you could do it:

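def getslice(k, s, t):
    """Return x(k)[s:t] without building the whole of x(k)."""
    def recur(xsize, s, t):
        # s and t are relative to the current block of length xsize and may
        # lie outside it; stop as soon as the slice misses the block entirely.
        if xsize == 0 or s >= xsize or t <= 0:
            return ""
        smaller = (xsize - 3) // 2  # length of each nested x(k-1)
        return ( ("1" if s <= 0 else "")                    # leading "1" at index 0
               + recur(smaller, s-1, t-1)                   # first x(k-1)
               + ("2" if s <= smaller+1 < t else "")        # middle "2"
               + recur(smaller, s-smaller-2, t-smaller-2)   # second x(k-1)
               + ("3" if t >= xsize else "") )              # trailing "3"
    return recur(3 * (2**k - 1), s, t)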

This doesn't cache any x(k) results, but in my tests it was fast enough.
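If you want to run a similar test yourself, one option is to compare against a brute-force build for small k. This is only a sketch: build_x and check_slicer are names I made up, and it assumes x(0) is the empty string and x(k) == "1" + x(k-1) + "2" + x(k-1) + "3", which matches the lengths given above.

```python
def build_x(k):
    """Brute-force x(k). Assumes x(0) == "" and
    x(k) == "1" + x(k-1) + "2" + x(k-1) + "3". Only usable for small k."""
    x = ""
    for _ in range(k):
        x = "1" + x + "2" + x + "3"
    return x


def check_slicer(slicer, max_k=6):
    """Compare slicer(k, s, t) with build_x(k)[s:t] for all 0 <= s <= t <= len(x(k))."""
    for k in range(1, max_k + 1):
        full = build_x(k)
        for s in range(len(full) + 1):
            for t in range(s, len(full) + 1):
                got = slicer(k, s, t)
                assert got == full[s:t], (k, s, t, got, full[s:t])
    return True
```

Calling check_slicer(getslice) exercises every slice of x(1) through x(6). The first mismatch raises an AssertionError that carries k, s, t and both strings.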


This is an interesting problem. I'm not sure whether I'll have time to write the code, but here's an outline of how you can solve it. Note: see the better answer from trincot.

As discussed in the comments, you cannot generate the actual string: you will quickly run out of memory as k grows. But you can easily compute the length of that string.

First some notation:

f(k) : The generated string.
n(k) : The length of f(k).
nk1  : n(k-1), which is used several times in the table below.

For discussion purposes, we can divide the string into the following regions. The start/end values use standard Python slice numbering:

Region | Start         | End           | Len | Substring | Ex: k = 2
--------------------------------------------------------------------
A      | 0             | 1             | 1   | 1         | 0:1  1
B      | 1             | 1 + nk1       | nk1 | f(k-1)    | 1:4  123
C      | 1 + nk1       | 2 + nk1       | 1   | 2         | 4:5  2
D      | 2 + nk1       | 2 + nk1 + nk1 | nk1 | f(k-1)    | 5:8  123
E      | 2 + nk1 + nk1 | 3 + nk1 + nk1 | 1   | 3         | 8:9  3
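To make the table concrete, here is a rough sketch that computes the region bounds for a given k. The helper names n and regions are mine, and n(k) uses the closed form 3 * (2**k - 1) from the other answer. The k = 2 string is just the table's example column concatenated.

```python
def n(k):
    """Length of f(k), using the closed form 3 * (2**k - 1)."""
    return 3 * (2**k - 1)


def regions(k):
    """Map region name -> (start, end) slice bounds within f(k), for k >= 1."""
    nk1 = n(k - 1)
    return {
        "A": (0, 1),
        "B": (1, 1 + nk1),
        "C": (1 + nk1, 2 + nk1),
        "D": (2 + nk1, 2 + 2 * nk1),
        "E": (2 + 2 * nk1, 3 + 2 * nk1),
    }


# f(2), spelled out region by region from the table's example column.
f2 = "1" + "123" + "2" + "123" + "3"
for name, (start, end) in regions(2).items():
    print(name, f"{start}:{end}", f2[start:end])
```

For k = 2 the printed bounds and substrings match the last column of the table, and the end of region E equals n(k).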

Given k, s, and t we need to figure out which region of the string is relevant. Take a small example:

k=2, s=6, and t=8.

The substring defined by 6:8 does not require the full f(k). We only need
region D, so we can turn our attention to f(k-1).

To make the shift from k=2 to k=1, we need to adjust s and t: specifically,
we need to subtract the total length of regions A + B + C. For k=2, that
length is 5 (1 + nk1 + 1).

Now we are dealing with: k=1, s=1, and t=3.

Repeat as needed.

Whenever k gets small enough, we stop this nonsense and actually generate the string so we can grab the needed substring directly.

It's possible that some values of s and t could cross region boundaries. In that case, divide the problem into two subparts (one for each region needed). But the general idea is the same.
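For what it's worth, here is one way the outline could be turned into code without recursive calls. Treat it as a sketch and not a tested solution for every input. The names n and substring are made up, n(k) uses the closed form 3 * (2**k - 1) from the other answer, and f(0) is assumed to be the empty string. Instead of generating the string once k gets small, it keeps splitting down to single characters, with an explicit stack holding the pending sub-slices.

```python
def n(k):
    """Length of f(k), using the closed form 3 * (2**k - 1)."""
    return 3 * (2**k - 1)


def substring(k, s, t):
    """Return f(k)[s:t] without building f(k) and without recursive calls."""
    s, t = max(s, 0), min(t, n(k))
    out = []
    # Each stack item is either a literal character or a pending
    # (level, lo, hi) slice, with lo and hi relative to f(level).
    stack = [(k, s, t)]
    while stack:
        item = stack.pop()
        if isinstance(item, str):
            out.append(item)
            continue
        level, lo, hi = item
        if level == 0 or lo >= hi:
            continue
        nk1 = n(level - 1)
        parts = []
        if lo <= 0 < hi:                      # region A: the leading "1"
            parts.append("1")
        b_lo, b_hi = max(lo - 1, 0), min(hi - 1, nk1)
        if b_lo < b_hi:                       # region B: first f(level-1)
            parts.append((level - 1, b_lo, b_hi))
        if lo <= 1 + nk1 < hi:                # region C: the middle "2"
            parts.append("2")
        d_lo, d_hi = max(lo - 2 - nk1, 0), min(hi - 2 - nk1, nk1)
        if d_lo < d_hi:                       # region D: second f(level-1)
            parts.append((level - 1, d_lo, d_hi))
        if lo <= 2 + 2 * nk1 < hi:            # region E: the trailing "3"
            parts.append("3")
        # Push in reverse so the leftmost part is popped first.
        stack.extend(reversed(parts))
    return "".join(out)


print(substring(2, 6, 8))  # 23
```

The small example from above (k=2, s=6, t=8) touches only region D, gets shifted to k=1, s=1, t=3, and comes out as "23". A slice that crosses region boundaries just pushes more than one item onto the stack.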