A variation of the law of large numbers for random points in a square

Given $n^2$ i.i.d. uniform points in $[0,1]^2$, the goal is to draw a configuration of $cn$ vertical lines and $cn$ horizontal lines such that in each small rectangle there is at most one marked point.

We show below that $c$ must satisfy $c=\Omega(n^{1/3})$ for this to be typically possible.

In fact, $\Theta(n^{4/3})$ lines are necessary and sufficient for such a configuration to exist with substantial probability.

More precisely, denote by $p_n(k)$ the probability that there exists a configuration of $k$ vertical lines and $k$ horizontal lines that separates $n^2$ i.i.d. uniform points $\{x_j\}_{j=1}^{n^2}$ in $[0,1]^2$.

Claim: For suitable constants $0<c_1<c_2<\infty$, we have (omitting integer part symbols):

(a) $\; p_n(c_1 n^{4/3}) \to 0$ as $n \to \infty$, and

(b) $\; p_n(c_2 n^{4/3}) \to 1$ as $n \to \infty$.

This is proved below with $c_1=1/20$ and $c_2=3/2$; no attempt has been made to optimize these constants.

Proof: Consider an auxiliary grid of $L:=n^{4/3}$ uniformly spaced vertical lines and $L$ uniformly spaced horizontal lines in the unit square. This grid defines $L^2$ grid squares of side length $1/L$.

(a) Call a grid square $Q$ nice if it contains exactly two of the $n^2$ given points $\{x_j\}$. Observe that for two distinct grid squares, the events that they are nice are negatively correlated. Call a nice grid square $Q$ good if there is at most one other nice square in its row and at most one other nice square in its column. The probability that a specific grid square $Q$ is nice is $${n^2 \choose 2}L^{-4}(1-L^{-2})^{n^2-2}=(1/2+o(1))L^{-1}.$$
Given that $Q$ is nice, the conditional expectation of the number of nice squares (other than $Q$) in the row of $Q$ is $1/2+o(1)$. Thus, given that $Q$ is nice, Markov's inequality implies that the conditional probability that there are two or more additional nice squares in the row of $Q$ (besides $Q$ itself) is at most $1/4+o(1)$. The same applies to the column of $Q$, and we deduce that $$P(Q \; {\rm is \; good}\; | \; Q \; {\rm is \; nice}) \ge 1/2+o(1) \, ,$$ so $$P(Q \; {\rm is \; good} ) \ge (1/4+o(1))L^{-1} \, .$$

Let $G$ denote the number of good grid squares. Then the mean satisfies $$E(G) \ge (1/4+o(1))L \,.$$ Observe that if we replace one point $x_i$ by $x_i'$ then $G$ will change by at most 5, so McDiarmid's inequality, see [1, Theorem 3.1] or [2], implies that for $n$ large enough, $$P(G \le L/5) \le \exp\Bigl(-\frac{(L/21)^2}{25n^2}\Bigr) \to 0 \quad {\rm as} \; n \to \infty \,,$$ since $L^2/n^2=n^{2/3} \to \infty$. (Alternatively, one could invoke the Efron-Stein inequality or estimate the variance directly to verify this.)

Now suppose that $S$ is a set of vertical and horizontal lines that separates the points $\{x_j\}_{j=1}^{n^2}$. For each good grid square $Q$, a line of $S$ is required to separate the two points $x_i, x_j$ in the square, and each such line can be used for at most two good squares, because a row or column that contains a good square contains at most two nice squares. Thus $|S| \ge G/2$, so $$p_n(L/20) \le P(\exists \; {\rm separating } \; S \; {\rm with } \; |S| \le L/10) \le P(G \le L/5) \to 0 \,.$$
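The counts in part (a) are easy to check empirically. The following is a minimal simulation sketch, not part of the proof: the function name `count_nice_and_good`, the rounding of $L=n^{4/3}$ to an integer, and the seeding are illustrative assumptions. It counts nice squares (exactly two points) and good squares (at most one other nice square in the same row and in the same column).

```python
import random
from collections import Counter


def count_nice_and_good(n, seed=0):
    """Return (L, number of nice squares, number of good squares).

    Assumption: L = n^(4/3) is rounded to the nearest integer.
    """
    rng = random.Random(seed)
    L = round(n ** (4 / 3))

    # Count how many of the n^2 points fall in each grid square (row, col).
    occupancy = Counter()
    for _ in range(n * n):
        x, y = rng.random(), rng.random()
        row = min(int(y * L), L - 1)  # min() guards against float rounding
        col = min(int(x * L), L - 1)
        occupancy[(row, col)] += 1

    # Nice squares contain exactly two points.
    nice = [cell for cell, k in occupancy.items() if k == 2]
    nice_per_row = Counter(row for row, _ in nice)
    nice_per_col = Counter(col for _, col in nice)

    # A nice square is good if its row and its column each hold at most
    # two nice squares in total (itself plus at most one other).
    good = [
        (row, col)
        for row, col in nice
        if nice_per_row[row] <= 2 and nice_per_col[col] <= 2
    ]
    return L, len(nice), len(good)
```

For moderately large $n$ one would expect the number of nice squares to be close to $L/2$ and the number of good squares to be at least about $L/4$, in line with the estimates above; the exact values depend on the seed.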

(b) Denote by $M$ the number of pairs $(i,j)$ such that $1 \le i<j \le n^2$ and $x_i,x_j$ fall in the same grid square. Then $E(M) = {n^2 \choose 2}L^{-2} \le L/2$, and another application of McDiarmid's inequality implies that $P(M \ge L) \to 0$ as $n \to \infty$.
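For the record, the bound on $E(M)$ is just arithmetic with $L=n^{4/3}$: each of the ${n^2 \choose 2}$ pairs lands in a common grid square with probability $L^{-2}$, so

$$E(M) = {n^2 \choose 2}L^{-2} = \frac{n^2(n^2-1)}{2\,n^{8/3}} \le \frac{n^{4/3}}{2} = \frac{L}{2} \,.$$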

Finally, construct a separating set of lines $S$ by combining the $2L$ lines of the auxiliary grid with one separating line for each pair $(i,j)$ counted in $M$ (we can take half of these lines vertical and half horizontal). Since $|S| = 2L+M$, we get $P(|S| \le 3L) \to 1$ as $n \to \infty$, and hence $p_n(3L/2) \to 1$ as well.
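The construction in part (b) can be sketched in code as a sanity check. This is an illustration under stated assumptions rather than anything from the proof itself: the helper names (`sample_points`, `build_separating_lines`, `is_separating`) are hypothetical, only the interior grid lines $i/L$ are used, the extra lines are placed at coordinate midpoints, and they alternate between vertical and horizontal.

```python
import random
from bisect import bisect_right
from collections import defaultdict
from itertools import combinations


def sample_points(n, seed=0):
    """n^2 i.i.d. uniform points in the unit square."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n * n)]


def build_separating_lines(points, L):
    """Grid lines plus one extra line per colliding pair.

    Returns (vertical, horizontal, M), where M is the number of pairs
    of points sharing a grid square.
    """
    grid = [i / L for i in range(1, L)]  # interior grid lines only
    vertical, horizontal = set(grid), set(grid)

    # Group points by grid square, using the same cut positions as the lines.
    cells = defaultdict(list)
    for x, y in points:
        cells[(bisect_right(grid, x), bisect_right(grid, y))].append((x, y))

    M = 0
    for members in cells.values():
        for (x1, y1), (x2, y2) in combinations(members, 2):
            # Alternate directions so about half the extra lines are vertical.
            if M % 2 == 0:
                vertical.add((x1 + x2) / 2)
            else:
                horizontal.add((y1 + y2) / 2)
            M += 1
    return sorted(vertical), sorted(horizontal), M


def is_separating(points, vertical, horizontal):
    """True if no two points share a cell of the line arrangement."""
    seen = set()
    for x, y in points:
        key = (bisect_right(vertical, x), bisect_right(horizontal, y))
        if key in seen:
            return False
        seen.add(key)
    return True
```

With `points = sample_points(n)` and `L = round(n ** (4 / 3))`, the total number of lines returned is at most $2L+M$, and `is_separating` should report that the points are separated.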

[1] McDiarmid, Colin. "Concentration." In Probabilistic methods for algorithmic discrete mathematics, pp. 195-248. Springer, Berlin, Heidelberg, 1998. http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=8B1FFFE4553B63543AFEA0706E686E65?doi=10.1.1.168.5794&rep=rep1&type=pdf

[2] McDiarmid, C. (1989). "On the method of bounded differences". Surveys in Combinatorics. London Math. Soc. Lecture Note Series 141. Cambridge: Cambridge Univ. Press. pp. 148–188. MR 1036755


Interestingly, if we allow the lines to have arbitrary directions, it still requires roughly $n^{4/3}$ lines (up to a log correction) to separate all the points.

https://www.cambridge.org/core/journals/proceedings-of-the-london-mathematical-society/article/economical-covers-with-geometric-applications/486374A93F4351DF26C155F6C3FE35AE


Label your $N$ points as $(x_i,y_{\sigma(i)})$ with $x_1 < \cdots < x_N$ and $y_1 < \cdots < y_N$ ; this defines a uniform random permutation $\sigma \in \mathfrak{S}_N$, and all the information about the problem is encoded in $\sigma$.

Let $C$ be the number of axis-parallel cuts needed to scatter all the points. An easy lower bound is $C \geq L-1$, where $L$ is the length of the longest monotone subsequence of $\sigma$. It is well known that $L \sim 2 \sqrt{N}$ in probability, so if we could show that the above lower bound is typically sharp, this would solve the problem.
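To experiment with this lower bound, the length of the longest monotone subsequence can be computed in $O(N \log N)$ time by patience sorting. Below is a minimal sketch; the function names are my own, and `simulate` is only an illustrative helper for comparing $L$ with $2\sqrt{N}$ on a random permutation.

```python
import math
import random
from bisect import bisect_left


def longest_increasing(seq):
    """Length of the longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[k] = smallest possible last value of an increasing run of length k+1
    for value in seq:
        i = bisect_left(tails, value)
        if i == len(tails):
            tails.append(value)
        else:
            tails[i] = value
    return len(tails)


def longest_monotone(perm):
    """Length of the longest increasing or decreasing subsequence."""
    decreasing = longest_increasing([-v for v in perm])
    return max(longest_increasing(perm), decreasing)


def simulate(N, seed=0):
    """Return (L, 2*sqrt(N)) for a uniform random permutation of size N."""
    rng = random.Random(seed)
    perm = list(range(N))
    rng.shuffle(perm)
    return longest_monotone(perm), 2 * math.sqrt(N)
```

The bound $C \geq L-1$ then gives a quick numerical lower bound on the number of cuts for any sampled point set, since each axis-parallel cut splits a monotone chain between only one pair of consecutive points.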