Probability that a sequence repeats itself

I can't give an exact answer, but an interesting connection and some numerical results.

Some dabbling with short examples shows that the events for different integers $m$ are nearly but not exactly independent. For instance, some sequences that don't repeat at $m=3$ repeat at $m=2$ but not at $m=1$, whereas for sequences that do repeat at $m=3$ repetition at $m=2$ implies repetition at $m=1$. However, the effect from the dependence appears quite small, so

$$ \prod_{m=1}^\infty(1-k^{-m}) $$

should be a good approximation to the probability that there is no repetition. Incidentally, if $k$ is a prime power, this is the probability that a large random square matrix over $\mathbb F_k$ is invertible; see Probability that a random binary matrix is invertible. In that answer, I computed the product for $k=2$ to be approximately $0.288788$, which would make the probability for a binary sequence to exhibit repetition approximately $0.711212$.

Here are numerical results for $k=2$ for repetitions up to $m=17$. Each row contains the number of strings of length of $2m$ that repeat, the number of strings that don't repeat, the proportion $p_m$ of strings that repeat, and a value $\alpha_m=-\log_2(1-(1-p_m)/(1-p_{m-1}))$ that would be $m$ if repetition at $m$ were independent of previous repetitions (here's the code):

$$ \begin{array}{rrrlr} m&\text{repeating}&\text{non-repeating}&\text{proportion}&\alpha_m\\\hline 1 & 2 & 2 & 0.5 & 1.00000\\ 2 & 10 & 6 & 0.625 & 2.00000\\ 3 & 44 & 20 & 0.6875 & 2.58496\\ 4 & 182 & 74 & 0.7109375 & 3.73697\\ 5 & 738 & 286 & 0.720703125 & 4.88753\\ 6 & 2972 & 1124 & 0.7255859375 & 5.83794\\ 7 & 11924 & 4460 & 0.7277832031 & 6.96450\\ 8 & 47768 & 17768 & 0.7288818359 & 7.95290\\ 9 & 191214 & 70930 & 0.7294235229 & 8.96725\\ 10 & 765136 & 283440 & 0.7296905518 & 9.98483\\ 11 & 3061104 & 1133200 & 0.7298240662 & 10.98340\\ 12 & 12245530 & 4531686 & 0.7298904657 & 11.99044\\ 13 & 48984342 & 18124522 & 0.7299235761 & 12.99397\\ 14 & 195941804 & 72493652 & 0.7299401015 & 13.99640\\ 15 & 783776080 & 289965744 & 0.7299483567 & 14.99761\\ 16 & 3135122038 & 1159845258 & 0.7299524820 & 15.99838\\ 17 & 12540523572 & 4639345612 & 0.7299545438 & 16.99901\\ \end{array} $$

The agreement with $0.711212$ is OK but not marvelous. As suggested by the $m=3$ example, the dependence slightly increases the probability of repetition because a repetition implies correlations between the conditions of repetition at lower values of $m$; but this effect is strongest at $m=3$ and becomes negligible at higher values of $m$, where $\alpha_m\approx m$ shows that almost exactly the expected proportion of sequences repeat for the first time. Thus we can get an accurate estimate of the limit probability from $p\approx p_{17}+(p_{17}-p_{16})\approx0.7299566056$, which should be accurate to eight or nine digits.

I checked OEIS for the sequences of both the repeating and the non-repeating counts; no hit.

P.S.: Note that as t.b. pointed out in a comment under the answer linked to above, by the pentagonal number theorem the above product is given by

$$ \begin{align} \prod_{m=1}^\infty(1-k^{-m})&=\sum_{j=-\infty}^\infty(-1)^jk^{-j(3j-1)/2)}\\&=1-k^{-1}-k^{-2}+k^{-5}+k^{-7}-k^{-12}-k^{-15}+k^{-22}+k^{-26}-\dotso\;, \end{align} $$

which leads to an interesting pattern of digits in its representation in base $k$:

$$ \begin{align} \prod_{m=1}^\infty(1-2^{-m})&=0.01001001111011100000010000111111110111110000000000100\ldots_2\;, \\ \prod_{m=1}^\infty(1-10^{-m})&=0.89001009999899900000010000999999998999990000000000100\ldots_{10}\;. \end{align} $$