Probability that product is a perfect square

The main term certainly exceeds $1/n$ because $0$ appears $2n+1$ times . . .

I assume, then, that you meant to choose $(x,y)$ uniformly from $[1,n]^2$, not $[0,n]^2$. But it's still more than $1/n$: you get $n$ solutions from just $x=y$, and another $n - O(n^{1/2})$ from $(x,y)=(a^2,b^2)$, and these have an overlap of only $\lfloor n^{1/2} \rfloor$ so we get a probability of at least $2/n - O(n^{-3/2})$.

In fact the main term is $(C \log n) / n$ where $C = 1/\zeta(2) = 6/\pi^2$. Indeed $x,y$ such that $xy$ is a square are parametrized by $(ma^2,mb^2)$ with $\gcd(a,b)=1$. Given $m \leq n$, there are $\lfloor(n/m)^{1/2}\rfloor$ choices for each of $a$ and $b$, and they are relatively prime with probability about $1/\zeta(2) = C$ as long as $m = o(n)$. Hence the count is asymptotic to $C \sum_{m=1}^n n/m \sim C n \log n$, and the probability is asymptotic to $(C\log n) / n$ as claimed.
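The heuristic count above can be sanity-checked by brute force for small $n$. The following is only a minimal sketch: the function name `count_square_products` is made up for illustration, and the quadratic loop is practical only for $n$ up to a few thousand.

```python
from math import isqrt, log, pi


def count_square_products(n: int) -> int:
    """Count pairs (x, y) in [1, n]^2 whose product is a perfect square."""
    total = 0
    for x in range(1, n + 1):
        for y in range(1, n + 1):
            p = x * y
            r = isqrt(p)
            if r * r == p:
                total += 1
    return total


if __name__ == "__main__":
    C = 6 / pi**2  # 1/zeta(2)
    for n in (10, 100, 1000):
        count = count_square_products(n)
        # Ratio of the exact count to the claimed main term C*n*log(n).
        print(n, count, count / (C * n * log(n)))
```

The ratio tends to $1$ only slowly, since the lower-order term is of size $n$, a factor of just $\log n$ below the main term.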


Using the method of contour integration, one can get

$$P(N)=\frac{\log N}{\zeta(2)N}+\frac{A}{N}+O(N^{-23/22+o(1)}),$$

where $P(N)$ is our probability and $A=\frac{3\gamma}{\zeta(2)}-1-\frac{2\zeta'(2)}{\zeta(2)^2}-\frac{1}{\zeta(2)}$. To prove this formula, note that if

$$N^2P(N)-(N-1)^2P(N-1)=a(N)$$

then $a(N)$ counts the pairs $(x,y) \in [1,N]^2$ with $\max(x,y)=N$ and $xy$ a square. Setting aside the pair $(N,N)$, we see that $a(N)-1$ is the number of pairs of the form $(x,N)$ or $(N,x)$ with $xN=y^2$ for some $y$ and $x<N$. One can easily see that if $D(N)^2$ is the largest square divisor of $N$, then $a(N)-1=2D(N)-2$. This is because $N=MD(N)^2$ with $M$ squarefree, and if $xN=y^2$ then we must have $x=Mz^2$ for some integer $z$. The condition $x<N$ leaves exactly $D(N)-1$ admissible values of $z$, hence there are $2(D(N)-1)$ pairs of the form $(x,N)$ or $(N,x)$ with $xN=y^2$ and $x<N$. So we get

$$a(N)=2D(N)-1.$$
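The identity $a(N)=2D(N)-1$ is easy to confirm by direct enumeration. A minimal sketch, with the hypothetical helper names `largest_square_root_divisor` (for $D$) and `pairs_with_max` (for $a$):

```python
from math import isqrt


def largest_square_root_divisor(n: int) -> int:
    """Return D(n): the largest d such that d*d divides n."""
    for d in range(isqrt(n), 0, -1):
        if n % (d * d) == 0:
            return d
    return 1  # not reached for n >= 1, since d = 1 always works


def pairs_with_max(n: int) -> int:
    """Return a(n): pairs (x, y) with max(x, y) == n and x*y a perfect square."""
    count = 1  # the pair (n, n)
    for x in range(1, n):
        p = x * n
        r = isqrt(p)
        if r * r == p:
            count += 2  # both (x, n) and (n, x)
    return count


if __name__ == "__main__":
    for n in (1, 8, 12, 36, 72):
        print(n, pairs_with_max(n), 2 * largest_square_root_divisor(n) - 1)
```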

Now let us consider the generating Dirichlet series of the function $a(N)$:

$$f(s)=\sum_{n\in \mathbb N} \frac{a(n)}{n^s}=2\sum_{n \in \mathbb N} D(n)/n^s-\zeta(s).$$

The function $D(n)$ is multiplicative and for any prime $p$ we have $D(p^k)=p^{[k/2]}$. Therefore,

$$\sum_{n \in \mathbb N} D(n)n^{-s}=\prod_p (1+p^{-s}+p^{1-2s}+p^{1-3s}+\ldots)=\prod_p(1+p^{-s})(1+p^{1-2s}+p^{2-4s}+\ldots)=\frac{\zeta(s)\zeta(2s-1)}{\zeta(2s)}.$$
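One way to double-check this factorization without any analysis is to compare coefficients. Using the standard identity $\sum_d \varphi(d)d^{-s}=\zeta(s-1)/\zeta(s)$ (not stated in the answer, but classical), the product $\zeta(s)\cdot\zeta(2s-1)/\zeta(2s)$ has $n$-th coefficient $\sum_{d^2 \mid n}\varphi(d)$, which should therefore equal $D(n)$. A small sketch with illustrative function names:

```python
from math import gcd, isqrt


def D(n: int) -> int:
    """Largest d such that d*d divides n."""
    return max(d for d in range(1, isqrt(n) + 1) if n % (d * d) == 0)


def phi(d: int) -> int:
    """Euler's totient, computed naively by counting coprime residues."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)


def series_coefficient(n: int) -> int:
    """n-th Dirichlet coefficient of zeta(s) * zeta(2s-1) / zeta(2s)."""
    return sum(phi(d) for d in range(1, isqrt(n) + 1) if n % (d * d) == 0)


if __name__ == "__main__":
    mismatches = [n for n in range(1, 2001) if D(n) != series_coefficient(n)]
    print("mismatches:", mismatches)
```

The agreement reflects the fact that $d^2 \mid n$ exactly when $d \mid D(n)$, together with $\sum_{d \mid m}\varphi(d)=m$.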

Next, using the truncated version of Perron's formula and the fact that $D(n) \leq \sqrt n$, we obtain for any $b>1$ (and fixed) and any $T$ the following:

$$\sum_{n \leq N} D(n)=\frac{1}{2\pi i}\int_{b-iT}^{b+iT} N^s\frac{\zeta(s)\zeta(2s-1)}{s\zeta(2s)}ds+O\left(\frac{N^{3/2}\log N}{T}\right).$$

Moving the contour of integration to the line $\mathrm{Re}\,s=1/2+\varepsilon$ for any $\varepsilon>0$ and using the estimates $\zeta(s) \ll |s|^{1/6}$, $\zeta(2s-1) \ll |s|^{1/2}$ and $\zeta(2s)^{-1} \ll 1$ on that line, one can prove that

$$\sum_{n \leq N} D(n)=\mathrm{Res}_{s=1}N^s\frac{\zeta(s)\zeta(2s-1)}{s\zeta(2s)}+O(N^\varepsilon(\sqrt N T^{5/6}+NT^{-1/3}+N^{3/2}T^{-1})).$$

Choosing $T=N^{6/11}$, we finally obtain

$$\sum_{n \leq N} D(n)=\mathrm{Res}_{s=1}N^s\frac{\zeta(s)\zeta(2s-1)}{s\zeta(2s)}+O(N^{21/22+\varepsilon}).$$

It remains to compute the residue. For $s \to 1$ we have

$$\frac{N^s}{s}=N+(N\log N-N)(s-1)+O((s-1)^2),$$

$$\zeta(s)=\frac{1}{s-1}+\gamma+O(s-1),$$

$$\zeta(2s-1)=\frac{1}{2(s-1)}+\gamma+O(s-1)$$

and

$$\frac{1}{\zeta(2s)}=\frac{1}{\zeta(2)}-\frac{2\zeta'(2)}{\zeta(2)^2}(s-1)+O((s-1)^2).$$

Multiplying these expansions gives

$$\frac{N}{2\zeta(2)(s-1)^2}+\frac{\frac{N\log N}{2\zeta(2)}+\frac{N(A+1)}{2}}{s-1}+O(1),$$

so

$$\sum_{n \leq N} D(n)=\frac{N\log N}{2\zeta(2)}+N(A+1)/2+O(N^{21/22+o(1)})$$

and

$$N^2P(N)=\sum_{n \leq N} a(n)=2\sum_{n \leq N} D(n)-N=\frac{N\log N}{\zeta (2)}+AN+O(N^{21/22+o(1)})$$

which is equivalent to the stated result.

Analogous estimates for $k$-th powers with $k>2$ will give us something like

$$\frac{2\zeta(k-1)}{\zeta(k)N}-\frac{1}{N}+O(N^{-1-e_k})$$

with $e_k>0$, as the corresponding generating function is $2\frac{\zeta(s)\zeta(ks-1)}{\zeta(ks)}-\zeta(s)$. Also, I think that the error term could be significantly improved under the assumption of the Riemann Hypothesis.


For the case $m=k=2$, in which we seek the number $N(n)$ of pairs $(x,y) \in [1,n]^2$ for which $xy$ is a square, we give an elementary estimate $$ N(n) = Cn \log n + An + O(n^{2/3}), $$ where $C = 1/\zeta(2) = 6/\pi^2$ and $$ A = \frac{3\gamma-1}{\zeta(2)} - \frac{2\zeta'(2)}{\zeta(2)^2} - 1 = 0.1377775\ldots \, . $$ This agrees with the analytic calculation of Asymptotiac K (and is corroborated by numerical computation up to $n = 2^{30}$), and improves the error term (the contour-integral analysis gave an error estimate equivalent to $n^{21/22+o(1)}$ rather than $n^{2/3}$).
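For readers who want to reproduce a numerical comparison of this kind on a small scale, here is a rough sketch. It computes $N(n)$ exactly as $\sum_{k \leq n}(2D(k)-1)$, using the identity $a(k)=2D(k)-1$ from the contour-integral answer, and compares it with $Cn\log n + An$. The function names are illustrative, and the value of $A$ is the truncated decimal quoted above.

```python
from math import isqrt, log, pi


def square_product_count(n: int) -> int:
    """Exact N(n) = sum over k <= n of (2*D(k) - 1), with D found by a sieve."""
    D = [1] * (n + 1)
    for d in range(2, isqrt(n) + 1):
        # d increases, so the last value written at index m is the largest
        # d with d*d dividing m.
        for m in range(d * d, n + 1, d * d):
            D[m] = d
    return 2 * sum(D[1:]) - n


C = 6 / pi**2   # 1/zeta(2)
A = 0.1377775   # truncated value quoted in the text


def estimate(n: int) -> float:
    """The two-term approximation C*n*log(n) + A*n."""
    return C * n * log(n) + A * n


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        exact = square_product_count(n)
        print(n, exact, round(estimate(n), 1), round(exact - estimate(n), 1))
```

The last column is the observed error, which can be compared against $n^{2/3}$.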

Recall that $xy$ is a square iff $(x,y) = (ma^2,mb^2)$ for some positive integers $m,a,b$, and the representation can be made unique by requiring either that $a,b$ be coprime or that $m$ be squarefree, which is the source of the factor $1/\zeta(2)$. Let $M(n)$, then, be the number of triples $(m,a,b)$ of positive integers such that $ma^2\leq n$ and $mb^2\leq n$, without the additional coprimality or squarefree condition. Then Möbius inversion (applied with either condition, $\gcd(a,b)=1$ or $\mu(m)^2 = 1$) yields $$ N(n) = \sum_{d=1}^{\lfloor\sqrt{n}\rfloor} \mu(d) M(\lfloor n/d^2 \rfloor). $$ We show:

Proposition. $M(n) = n \log n + Bn + O(n^{2/3})$, where $B = 3\gamma - 1 - \zeta(2) = -0.913287\ldots$ (and again $\gamma$ is Euler's constant $0.5772156649\ldots$).

Proof: Let $R = \lfloor n^{1/3} \rfloor$. If $m\cdot\max(a,b)^2 \leq n$ then either $m \leq R$ or $\max(a,b) \leq R$. Given $m$, the number of $(a,b)$ pairs is $\lfloor \sqrt{n/m} \rfloor^2 = n/m - O(\sqrt{n/m})$; summing this over $m \leq R$ gives $n H_R - O(n^{2/3})$ where $H_R$ is the harmonic sum $\sum_{m=1}^R 1/m$. In the other direction, each $k \leq R$ occurs $2k-1$ times as $\max(a,b)$ for positive integers $a,b$, each of which accounts for $\lfloor n/k^2 \rfloor$ solutions, for a sum of $$ \sum_{k=1}^R (2k-1) \, (n/k^2 - O(1)) = (2 H_R - \zeta(2)) n - O(n^{2/3}) $$ solutions. For the total count we add this to $n H_R - O(n^{2/3})$, and subtract $R^3 = n - O(n^{2/3})$, which is the number of solutions for which $a,b,m$ are all $\leq R$, finding $(3 H_R - \zeta(2) - 1) n + O(n^{2/3})$. The Proposition then follows from $H_R = \log R + \gamma + O(1/R) = \frac13 \log n + \gamma + O(n^{-1/3})$. $\Box$
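The inclusion-exclusion in the proof is exact before any rounding terms are estimated, so it can be checked against a direct count. The sketch below does so and also prints the approximation $n\log n + Bn$; all function names are illustrative, and $\gamma$ is entered as the decimal quoted above.

```python
from math import isqrt, log, pi


def integer_cube_root(n: int) -> int:
    """Return floor(n ** (1/3)), correcting any floating-point rounding."""
    r = round(n ** (1 / 3))
    while r**3 > n:
        r -= 1
    while (r + 1) ** 3 <= n:
        r += 1
    return r


def M_direct(n: int) -> int:
    """Triples (m, a, b) with m*a^2 <= n and m*b^2 <= n, summed over every m."""
    return sum(isqrt(n // m) ** 2 for m in range(1, n + 1))


def M_split(n: int) -> int:
    """The same count via the three pieces used in the proof."""
    R = integer_cube_root(n)
    small_m = sum(isqrt(n // m) ** 2 for m in range(1, R + 1))
    small_max = sum((2 * k - 1) * (n // (k * k)) for k in range(1, R + 1))
    return small_m + small_max - R**3  # subtract triples counted twice


GAMMA = 0.5772156649          # Euler's constant, as quoted in the text
B = 3 * GAMMA - 1 - pi**2 / 6


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5):
        print(n, M_direct(n), M_split(n), round(n * log(n) + B * n, 1))
```

Note that `M_split` needs only about $n^{1/3}$ terms, which is what makes the method usable for large $n$.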

The estimate for $N(n)$ then follows from $N(n) = \sum_{d=1}^{\lfloor\sqrt{n}\rfloor} \mu(d) M(\lfloor n/d^2 \rfloor)$; the $2\zeta'(2) / \zeta(2)^2$ comes from the second term of $$ \frac{n}{d^2} \log \frac{n}{d^2} = \frac1{d^2} n \log n - 2n \frac{\log d}{d^2} $$ because $\sum_{d=1}^\infty \mu(d) \log d \, / \, d^2$ is the derivative at $s=2$ of $-1/\zeta(s)$.
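The Möbius inversion formula for $N(n)$ can likewise be verified exactly for small $n$. A minimal sketch, with hypothetical helper names and a trial-division Möbius function:

```python
from math import isqrt


def mobius(d: int) -> int:
    """Moebius function mu(d), computed by trial division."""
    result = 1
    p = 2
    while p * p <= d:
        if d % p == 0:
            d //= p
            if d % p == 0:
                return 0  # p^2 divides the original argument
            result = -result
        p += 1
    if d > 1:
        result = -result  # one remaining prime factor
    return result


def M(n: int) -> int:
    """Triples (m, a, b) with m*a^2 <= n and m*b^2 <= n."""
    return sum(isqrt(n // m) ** 2 for m in range(1, n + 1))


def N_by_inversion(n: int) -> int:
    """N(n) = sum over d <= sqrt(n) of mu(d) * M(floor(n / d^2))."""
    return sum(mobius(d) * M(n // (d * d)) for d in range(1, isqrt(n) + 1))


def N_brute(n: int) -> int:
    """Pairs (x, y) in [1, n]^2 with x*y a perfect square, by enumeration."""
    return sum(
        1
        for x in range(1, n + 1)
        for y in range(1, n + 1)
        if isqrt(x * y) ** 2 == x * y
    )


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, N_by_inversion(n), N_brute(n))
```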

One can probably use bounds on exponential sums to further reduce the $O(n^{2/3})$ error, both in worst and average case, as is done for the Dirichlet divisor problem and the Gauss circle problem.