Painting $n$ balls from $2n$ balls red, and guessing which ball is red, game

Here is a sketch of an argument by my colleague Jim Roche of an $\Omega(\log n/\log \log n)$ lower bound. The basic idea is that Lucy chooses randomly between two strategies, one of which puts all the red balls early (thereby forcing Alice to open some number of the early boxes), and the other of which puts a lot of the red balls late (thereby forcing Alice to open some number of the late boxes).

Specifically, with probability 1/2, Lucy distributes the $n$ red balls randomly among the first $n(1 + 1/\log_2 n)$ boxes. With probability 1/2, she distributes $n/\log_2 n$ red balls randomly among the first $n(1 + 1/\log_2 n)$ boxes, and puts the remaining red balls at the very end. We now give a lower bound on how many boxes Alice expects to open under any randomized strategy that succeeds with probability at least $1-1/n$.
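To make the two placements concrete, here is a minimal Python sketch of Lucy's mixed strategy. The function name `lucy_placement`, the rounding of $n/\log_2 n$ to an integer, and the representation of boxes as a list of booleans are assumptions made for illustration; they are not part of the original argument.

```python
import math
import random

def lucy_placement(n, rng):
    """Return a list of 2n booleans (True = red) drawn from Lucy's mixed strategy.

    Assumes n >= 2 so that log2(n) is positive.
    """
    extra = round(n / math.log2(n))   # integer stand-in for n / log2(n)
    prefix = n + extra                # the "first n(1 + 1/log2 n)" boxes
    boxes = [False] * (2 * n)
    if rng.random() < 0.5:
        early = n                     # strategy 1: every red ball is early
    else:
        early = extra                 # strategy 2: only n/log2 n red balls are early
    for i in rng.sample(range(prefix), early):
        boxes[i] = True
    # Strategy 2 only: the remaining n - early red balls fill the very last boxes.
    for i in range(2 * n - (n - early), 2 * n):
        boxes[i] = True
    return boxes
```

For example, `lucy_placement(1024, random.Random(0))` returns 2048 boxes of which exactly 1024 are red, with either all of them or only about 102 of them among the first 1126 boxes.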

The argument that follows is slightly imprecise because it treats the probabilities that the boxes contain red balls as independent, whereas they are actually dependent since Lucy is constrained to paint exactly $n$ balls red. However, the error terms are negligible, and the argument is clearer if we ignore the dependencies.

For any constant $C$, define $P_C$ to be the probability that Alice chooses at most $(C\ln n)/\ln\log_2 n$ of the first $n(1+1/\log_2 n)$ boxes. With probability 1/2 Lucy uses her first strategy, under which each of these boxes holds a white ball with probability about $1/\log_2 n$, so we must have $${1\over n} \ge \Pr\{\hbox{Alice fails}\} \ge {P_C\over 2} \left({1\over\log_2 n} \right)^{(C\ln n)/\ln\log_2 n}. $$ Since $(1/\log_2 n)^{(C\ln n)/\ln\log_2 n} = \exp(-C\ln n) = n^{-C}$, this gives $$P_C \le {2\over n}\exp (C\ln n) = 2n^{C-1}. $$ In particular, for $C=1/2$, the probability that Alice chooses at most $(\ln n)/(2\ln \log_2 n)$ of the first $n(1+1/\log_2 n)$ boxes is at most $2/\sqrt n$. Therefore, with probability approaching 1 as $n\to\infty$, Alice must examine at least $(\ln n)/(2 \ln \log_2 n)$ of the first $n(1+1/\log_2 n)$ boxes.
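The bound on $P_C$ rests on the power of $1/\log_2 n$ collapsing to $n^{-C}$. The following Python sketch checks this numerically; the function names `failure_factor` and `p_c_upper_bound` are hypothetical, and the code only evaluates the formulas above (it assumes $n > 2$ so that $\ln\log_2 n$ is positive).

```python
import math

def failure_factor(n, C):
    """(1/log2 n) raised to the power (C ln n) / ln log2 n."""
    k = C * math.log(n) / math.log(math.log2(n))  # threshold number of early boxes
    return (1 / math.log2(n)) ** k

def p_c_upper_bound(n, C):
    """Solve 1/n >= (P_C / 2) * failure_factor for P_C."""
    return (2 / n) / failure_factor(n, C)

n, C = 10**6, 0.5
print(failure_factor(n, C), n ** (-C))           # the two values should agree
print(p_c_upper_bound(n, C), 2 / math.sqrt(n))   # both equal 2 n^(C-1) for C = 1/2
```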

But remember that with probability 1/2, Lucy places only $n/\log_2 n$ red balls among the first $n(1+1/\log_2 n)$ boxes. If she does so, then the probability that Alice's first $(\ln n)/(2 \ln \log_2 n)$ looks uncover a red ball is at most, by the union bound, $$\left({\ln n\over 2\ln\log_2 n}\right)\left({1\over \log_2 n}\right),$$ which approaches zero as $n\to\infty$. Therefore, with probability at least $1/2 -\epsilon$, Alice must examine at least $(1/2 - \epsilon)(\ln n)/\ln \log_2 n$ boxes. This establishes the $\Omega(\log n/\log \log n)$ bound.
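The union bound above simplifies to $(\ln 2)/(2\ln\log_2 n)$, so it tends to zero only very slowly. A small Python sketch, with the hypothetical helper `union_bound`, evaluates it for a few values of $n$ (again assuming $n > 2$):

```python
import math

def union_bound(n):
    """(ln n / (2 ln log2 n)) * (1 / log2 n), the bound on finding a red ball early."""
    looks = math.log(n) / (2 * math.log(math.log2(n)))  # number of early looks
    return looks / math.log2(n)

for exponent in (16, 64, 256, 1024):
    n = 2 ** exponent
    print(exponent, union_bound(n))
```

For $n = 2^{16}$ the number of looks is exactly 2 and the bound is $2/16 = 1/8$, which shows how large $n$ must be before the asymptotic statement takes effect.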


Here is an $\Omega(\log n)$ bound for $X$ when Alice follows a very restricted strategy. I think it illustrates both why this might be the right bound and why the problem is difficult.

Suppose Alice decides independently for each ball $i$, with probability $p_i$, whether to take it (provided she has not already found a red ball). We may assume that $p_i\le p_j$ whenever $i\le j$, since otherwise Lucy could swap the $i$th and $j$th balls and increase $X$. Let $t$ be the largest $i$ for which $p_i< \frac{\log n}{100n}$. Lucy puts the red balls into the first $\min(t,n)$ bins and the last $n-\min(t,n)$ bins. It is very likely that Alice finds no red ball during the first $t$ steps. If $t> n$, then Alice's chance of failure is more than $1/n$. If $t\le n$, then Alice will (whp) examine $\Omega(\log n)$ white balls in the next $n$ steps.
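The two cases can be checked numerically at the threshold value $p = \frac{\log n}{100n}$. The Python sketch below assumes the logarithm is natural (the text does not fix the base) and uses the hypothetical names `skip_all_probability` and `expected_white_looks`. It evaluates the chance that Alice skips $n$ consecutive boxes when each is taken with probability $p$, and the expected number of boxes she takes among $n$ consecutive white boxes at that same probability.

```python
import math

def threshold(n):
    """The cutoff probability log(n) / (100 n), with log taken as natural log (an assumption)."""
    return math.log(n) / (100 * n)

def skip_all_probability(n):
    """Chance of taking none of n boxes when each is taken independently with probability p."""
    return (1 - threshold(n)) ** n

def expected_white_looks(n):
    """Expected number of boxes taken among n white boxes, each taken with probability p."""
    return n * threshold(n)

n = 10**6
print(skip_all_probability(n), 1 / n)   # roughly n^(-1/100), far larger than 1/n
print(expected_white_looks(n))          # equals ln(n) / 100
```

Since the $p_i$ in the case $t > n$ are below the threshold, the first printed value is a lower bound on Alice's failure probability there; in the case $t \le n$ the $p_i$ are at least the threshold, so the second value is a lower bound on the expectation.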

Unfortunately this argument only works for this very special version, but maybe some parts of it are useful for the general case too.