Number of paths on $\mathbb Z^d$

Without loss of generality, let us assume that $y = (0,\dotsc, 0)$ is the origin. We write $x = (x_1, \dotsc, x_d)$.

Again without loss of generality, let us assume that every $x_i$ is non-negative. This is because we may "mirror" all the paths with respect to the $i$-th coordinate hyperplane.

We write $S$ for the sum of all $x_i$.

For parity reasons, we must also assume that $n = 2m + S$ for some non-negative integer $m$; otherwise there is no such path.

With all these assumptions, the number of paths connecting $x$ and $y$, as a function of $m$, is given by: $$f_d(m) = n!\sum_{a_1 + \dotsc + a_d = m}\frac{1}{\prod_{i = 1}^d\left(a_i!(a_i + x_i)!\right)}.$$

Reason: the number $a_i$ is the number of steps going in the negative direction of the $i$-th axis. Then we have to go $a_i + x_i$ steps in the positive direction of the $i$-th axis. The sum of all $a_i$ is half the number of steps that we "wasted", hence equal to $m$; and once the numbers $a_i$ are all fixed, we only have to choose how these steps are arranged among the total of $n$ steps, and there are $\frac{n!}{\prod_{i = 1}^d\left(a_i!(a_i + x_i)!\right)}$ such arrangements.
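For readers who want to check this formula, here is a minimal Python sketch (the helper names are mine, not from the text) comparing the closed sum against brute-force enumeration of all $(2d)^n$ step sequences for small parameters:

```python
from fractions import Fraction
from itertools import product
from math import factorial

def compositions(s, parts):
    # all tuples (a_1, ..., a_parts) of non-negative integers summing to s
    if parts == 1:
        yield (s,)
        return
    for first in range(s + 1):
        for rest in compositions(s - first, parts - 1):
            yield (first,) + rest

def prod_denominator(a, x):
    p = 1
    for ai, xi in zip(a, x):
        p *= factorial(ai) * factorial(ai + xi)
    return p

def f_formula(d, m, x):
    # f_d(m) = n! * sum over a_1+...+a_d = m of 1 / prod_i a_i!(a_i+x_i)!
    n = 2 * m + sum(x)
    total = sum(Fraction(1, prod_denominator(a, x)) for a in compositions(m, d))
    return factorial(n) * total

def f_brute(d, m, x):
    # enumerate every length-n sequence of unit steps; count those ending at x
    n = 2 * m + sum(x)
    steps = [tuple(s if i == j else 0 for j in range(d))
             for i in range(d) for s in (1, -1)]
    target = tuple(x)
    count = 0
    for path in product(steps, repeat=n):
        pos = [0] * d
        for step in path:
            pos = [p + q for p, q in zip(pos, step)]
        if tuple(pos) == target:
            count += 1
    return count
```

For instance, both ways of counting give $9$ paths of length $3$ from the origin to $(1,0)$ in $\mathbb Z^2$.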


Example:

For $d = 1$, the function we have is simply $f_1(m) = \binom n m = \binom {2m + S} m$.

By Stirling, this is asymptotic to $\frac{2^{2m + S}}{\sqrt{\pi m}}$. (We say $g(m)$ is asymptotic to $h(m)$ if $g(m)/h(m)$ tends to $1$ as $m$ tends to infinity.)
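The convergence can be observed numerically; a small Python sketch (`ratio` is my name for it; exact rational arithmetic avoids float overflow for large $m$):

```python
from fractions import Fraction
from math import comb, pi, sqrt

def ratio(m, S=1):
    # binom(2m+S, m) divided by its claimed asymptotic 2^(2m+S)/sqrt(pi*m);
    # Fraction keeps the huge integers exact before the final float conversion
    n = 2 * m + S
    return float(Fraction(comb(n, m), 2 ** n)) * sqrt(pi * m)
```

The ratio approaches $1$ from below, roughly like $1 - \frac{5}{8m}$ for $S = 1$.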



For $d = 2$, it turns out that there is again a closed formula.

We have \begin{eqnarray*} f_2(m) &=& n!\sum_{a_1 + a_2 = m}\frac{1}{a_1!a_2!(a_1 + x_1)!(a_2 + x_2)!}= \binom n m \sum_{a = 0}^m\binom m a \binom{m + S}{a + x_2}\\ &=& \binom n m \sum_{a = 0}^m\binom m a \binom{m + S}{m + x_1 - a} = \binom n m \binom n {m + x_1}, \end{eqnarray*} where the last equality is Vandermonde's identity.
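The closed formula is easy to test numerically (a Python sketch; function names are mine). Each summand on the left is a multinomial coefficient, so integer division is exact:

```python
from math import comb, factorial

def f2_sum(m, x1, x2):
    # left-hand side: n! * sum_{a_1+a_2=m} 1/(a_1! a_2! (a_1+x_1)! (a_2+x_2)!)
    n = 2 * m + x1 + x2
    return sum(factorial(n) // (factorial(a) * factorial(m - a)
                                * factorial(a + x1) * factorial(m - a + x2))
               for a in range(m + 1))

def f2_closed(m, x1, x2):
    # claimed closed form: binom(n, m) * binom(n, m + x1)
    n = 2 * m + x1 + x2
    return comb(n, m) * comb(n, m + x1)
```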

Therefore, by Stirling, $f_2(m)$ is asymptotic to $\frac{2^{2n}}{\pi m}$.


This should give you a general taste of how $f_d$ grows with respect to $m$. More specifically, my feeling is that for every $d$ there exists a number $\alpha_d$ such that $f_d(m)$ grows "like" $\alpha_d^m$, up to some polynomial factor of $m$.

A trivial upper bound is $\alpha_d \leq 4d^2$, since every step has only $2d$ different choices, hence the total number of paths is $(2d)^n$, which is $(4d^2)^m$ up to a constant factor. The two cases above give $\alpha_1 = 4$ and $\alpha_2 = 16$, so a brave guess could be $\alpha_d = 4d^2$.

The idea of proof is to imitate the $d = 2$ case here. But to keep this answer at a reasonable length, I would like to pause the discussion here and leave it to interested readers.


P.S. The problem obviously has an interpretation as random walks on lattices. So maybe there are probabilists who know this or are interested in this.


The best way to approach this is to instead consider a random walk on $\mathbb Z^d$ starting at the origin and ask for the probability that one ends at a specific point $z$ after $n$ steps. The reason for this is that we can define random variables $X_i$ for each step, chosen uniformly from the $2d$ possible steps, and then the ending position is just $X_1+\ldots + X_n$. These probabilities are just the quantities you want divided by $(2d)^n$, but thinking about it in these terms lets us bring in a powerful machine: the central limit theorem.

So, first of all, let's jump straight to the answer and come back to fill in the pesky details. Let $Z_n$ be a random variable given as the position after $n$ steps. The multivariate central limit theorem states that $$\frac{1}{\sqrt{n}}\cdot Z_n \rightarrow N\left(0,\frac{1}{d}\cdot I_d\right)$$ where $N$ is the multivariate normal distribution, $I_d$ is the $d\times d$ identity matrix, and the arrow means convergence in distribution. To unpack that with less jargon: the distribution of $\frac{1}{\sqrt{n}}\cdot Z_n$ tends to that of a point whose coordinates are each drawn independently from a normal distribution with variance $\frac{1}{d}$ - which is the variance of each coordinate of the steps $X_i$.

In particular, unless anything really bad happens, this means that the probability of being at a particular point $z$ after $n$ steps is roughly the probability of choosing, from the normal distribution, a point $p$ whose closest point in $\frac{1}{\sqrt{n}}\mathbb Z^d$ of the correct parity is $\frac{1}{\sqrt{n}}z$. The probability density function of $N\left(0,\frac{1}d\cdot I_d\right)$ evaluated at $\frac{1}{\sqrt{n}}\cdot z$ is, with the product running over all $d$ coordinates, $$\prod_i \frac{\sqrt{d}}{\sqrt{2\pi}} \cdot \exp\left(-\frac{1}2\cdot \left(\frac{\sqrt{d}\cdot z_i}{\sqrt{n}}\right)^2\right) = \left(\frac{\sqrt{d}}{\sqrt{2\pi}}\right)^d\cdot \exp\left(-\frac{d\|z\|^2}{2n}\right).$$ The probability of this point being the closest point is roughly this evaluation of a PDF times the volume of the region in which $\frac{1}{\sqrt{n}}z$ is the closest point of the right parity - this volume being $2\left(\frac{1}{\sqrt{n}}\right)^d$ - which gives the probability of ending at a $z$ of the right parity as roughly $$2\left(\frac{\sqrt{d}}{\sqrt{2\pi n}}\right)^d\cdot \exp\left(-\frac{d\|z\|^2}{2n}\right)$$ If we multiply this probability by the count of $(2d)^n$ equally likely paths of length $n$, we get the following result:

The number of paths of length $n$ starting at the origin and ending at some point $z$ is approximately $$2(2d)^n\left(\frac{\sqrt{d}}{\sqrt{2\pi n}}\right)^d\cdot \exp\left(-\frac{d\|z\|^2}{2n}\right)$$
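To get a feel for the quality of this approximation, one can compare it against exact counts obtained by dynamic programming (a Python sketch under the setup above; the function names are mine):

```python
from collections import defaultdict
from math import exp, pi, sqrt

def exact_counts(d, n):
    # dynamic programming: counts[pos] = number of n-step paths from origin to pos
    steps = [tuple(s if i == j else 0 for j in range(d))
             for i in range(d) for s in (1, -1)]
    counts = {(0,) * d: 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for pos, c in counts.items():
            for st in steps:
                nxt[tuple(p + q for p, q in zip(pos, st))] += c
        counts = dict(nxt)
    return counts

def clt_estimate(d, n, z):
    # 2 * (2d)^n * (sqrt(d)/sqrt(2*pi*n))^d * exp(-d*|z|^2/(2n))
    return (2 * (2 * d) ** n * (sqrt(d) / sqrt(2 * pi * n)) ** d
            * exp(-d * sum(zi * zi for zi in z) / (2 * n)))
```

Already for $d=2$, $n=20$ the estimate is within a few percent of the exact counts near the origin.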

To be a bit more formal, all that we can use "convergence in distribution" to say is that if we choose some (measurable) region $A$ of $\mathbb R^d$ whose boundary has no $d$-volume, then the probability that $\frac{1}{\sqrt{n}}Z_n$ is in $A$ tends to, as $n$ goes to $\infty$, the probability that the same would happen for a normal distribution.

To use this at the origin, you can look at the probability that the ending position is within a ball of radius $\alpha\sqrt{n}$ around the origin and have formally that this probability tends to the probability that the given normal distribution is within $\alpha$ of the origin. I'm struggling to recall a nice way to show this, but the probabilities of landing at any given point in this region are not too different from each other - essentially, the random walk doesn't remember where it started - so the probability of landing at a given point turns out to be really close to the total probability of the region divided by the number of points therein - which is also pretty much the asymptotic given before.


As mentioned by @Shalop in the comments, we can justify the asymptotics of the probabilities of ending up at a given point by resorting to some form of a local central limit theorem, which directly yields such results. For instance, Theorem 3 of this paper can be applied to this process after a suitable change of coordinates to remove the parity issues.

One can also put together various ad-hoc approaches to prove this particular result, by noting that the probability of ending up at a particular point $z$ is decreasing in the absolute value of each coordinate of $z$, or that the ratio of adjacent probabilities can often be interpreted via events like "the probability of crossing some hyperplane given that we ended up at some point" (which are very close to $1$), or that the differences of adjacent probabilities shrink quickly (which can be established via the Fourier methods in my other answer).


Here we give a combinatorial approach in terms of lattice paths for the special case of walks in a $d$-dimensional lattice which start and end at the origin. This problem is usually stated as Pólya's drunkard problem. Here we closely follow example VI.14 from Analytic Combinatorics by P. Flajolet and R. Sedgewick.

Pólya's drunkard problem: In the $d$-dimensional lattice $\mathbb{Z}^d$ of points with integer coordinates, the drunkard performs a random walk starting from the origin with steps in $\{-1,+1\}^d$, each taken with equal likelihood. The probability that the drunkard is back at the origin after $2n$ steps is

\begin{align*} q_n^{(d)}=\left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^d,\tag{1} \end{align*}

since the walk is a product of $d$ independent one-dimensional walks. The probability that $2n$ is the epoch of the first return to the origin is the quantity $p_n^{(d)}$, which is determined implicitly, via the decomposition of loops into primitive loops, by

\begin{align*} \left(1-\sum_{n=1}^\infty p_n^{(d)}z^n\right)^{-1}=\sum_{n=0}^\infty q_n^{(d)}z^n\tag{2} \end{align*}

In a previous section the authors define primitive loops as walks that start and end at the origin, but do not otherwise touch the origin. For $d=2$ the generating function $\mathcal{L}$ of primitive loops is given as \begin{align*} \mathcal{L}(z)=1-\frac{1}{\sum_{n=0}^\infty \binom{2n}{n}^2z^{2n}}=4z^2+20z^4+176z^6+1876z^8+\cdots \end{align*} The coefficients are archived in OEIS as A054474. In particular $[z^{2n}]\mathcal{L}\left(\frac{z}{4}\right)$ is the probability that the random walk first returns to the origin in $2n$ steps.
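These coefficients are easy to reproduce by a power-series inversion (a Python sketch; `primitive_loop_coeffs` is my name for it):

```python
from math import comb

def primitive_loop_coeffs(N):
    # q_k = [z^{2k}] Q(z) = binom(2k,k)^2 for the d=2 walk
    q = [comb(2 * k, k) ** 2 for k in range(N + 1)]
    # invert Q as a power series in z^2: r = 1/Q, via q_0 = 1 and
    # sum_{j=0}^k q_j r_{k-j} = 0 for k >= 1
    r = [1] + [0] * N
    for k in range(1, N + 1):
        r[k] = -sum(q[j] * r[k - j] for j in range(1, k + 1))
    # L(z) = 1 - 1/Q(z): coefficients of z^{2k}, k >= 1
    return [-r[k] for k in range(1, N + 1)]
```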

In terms of the associated ordinary generating functions $P$ and $Q$, the relation (2) reads as $(1-P(z))^{-1}=Q(z)$, implying \begin{align*} P(z)=1-\frac{1}{Q(z)}\tag{3} \end{align*}

The asymptotic analysis of the $q_n$ can be done easily using Stirling's approximation $n!\sim \left(\frac{n}{e}\right)^n\sqrt{2\pi n}$. In the following we give asymptotic expansions for the cases $d=1$, $d=2$ and $d=3$.

Case $d=1$:

This case can be solved directly from (1) and (3) by introducing \begin{align*} \beta(z)=\sum_{n\geq 0}\frac{1}{2^{2n}}\binom{2n}{n}z^n=\frac{1}{\sqrt{1-z}} \end{align*} With $P(z)=1-\sqrt{1-z}$ we obtain \begin{align*} \color{blue}{p_n^{(1)}}=\frac{1}{n2^{2n-1}}\binom{2n-2}{n-1}\color{blue}{\sim\frac{1}{2\sqrt{\pi n^3}}} \end{align*}
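The coefficient extraction can be verified numerically by inverting $Q(z)$ as a power series with exact rational arithmetic (a Python sketch; the function names are mine):

```python
from fractions import Fraction
from math import comb

def p1(n):
    # closed form: p_n^{(1)} = binom(2n-2, n-1) / (n * 2^(2n-1))
    return Fraction(comb(2 * n - 2, n - 1), n * 2 ** (2 * n - 1))

def p1_from_Q(N):
    # Q(z) = sum_n binom(2n,n)/4^n z^n = 1/sqrt(1-z); invert it term by term
    # and read off P(z) = 1 - 1/Q(z)
    q = [Fraction(comb(2 * n, n), 4 ** n) for n in range(N + 1)]
    r = [Fraction(1)] + [Fraction(0)] * N   # r = 1/Q
    for k in range(1, N + 1):
        r[k] = -sum(q[j] * r[k - j] for j in range(1, k + 1))
    return [-r[k] for k in range(1, N + 1)]
```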

The cases $d>1$ can be solved by the Hadamard closure theorem. The Hadamard product of two functions $f(z)$ and $g(z)$ analytic at the origin is defined as their term-by-term product, \begin{align*} f(z)\odot g(z)=\sum_{n\geq 0}f_ng_nz^n\qquad\text{where}\qquad f(z)=\sum_{n\geq 0}f_nz^n,\quad g(z)=\sum_{n\geq 0}g_nz^n \end{align*} The Hadamard closure theorem (VI.11):

  • (i) Assume that $f(z)$ and $g(z)$ are analytic in a $\triangle$-domain, $\triangle(\psi_{0},\eta)$ (see reference, figure VI.14 for more details). Then, the Hadamard product $(f\odot g)(z)$ is analytic in a (possibly smaller) $\triangle$-domain, $\triangle^{\prime}$.

  • (ii) Assume further that \begin{align*} f(z)=O((1-z)^{a})\quad \text{and}\quad g(z)=O((1-z)^b),\qquad z\in\triangle(\psi_0,\eta) \end{align*} Then the Hadamard product $(f\odot g)(z)$ admits in $\triangle^{\prime}$ an expansion given by the following rules (see reference, Theorem VI.11 for more rules):

    • If $a+b+1$ is a non-negative integer, then, with $k=a+b$ and $\mathrm{L}(z)=\log(1-z)^{-1}$,

\begin{align*} (f\odot g)(z)=\sum_{j=0}^k\frac{(-1)^j}{j!}(f\odot g)^{(j)}(1)(1-z)^j+O\left((1-z)^{a+b+1}\mathrm{L}(z)\right). \end{align*}

Case $d=2$: By the Hadamard closure theorem, the function $Q(z)=\beta(z)\odot \beta(z)$ admits a priori a singular expansion at $z=1$ that is composed solely of elements of the form $(1-z)^{\alpha}$ possibly multiplied by integral powers of the logarithmic function $\mathrm{L}(z)=\log(1/(1-z))$.

The authors derive the singular expansion of $P(z)$ at $z=1$ as \begin{align*} P(z)\sim 1-\frac{\pi}{\mathrm{L}(z)}+\frac{\pi^2 K}{\mathrm{L}(z)^2}+\cdots \end{align*} so that, by Theorems VI.2 and VI.3, one has \begin{align*} \color{blue}{p_n^{(2)}}&\color{blue}{=\frac{\pi}{n\log^2 n}-2\pi\frac{\gamma+\pi K}{n\log^3 n}+O\left(\frac{1}{n\log^4 n}\right)}\\ K&=1+\sum_{n=1}^\infty\left(16^{-n}\binom{2n}{n}^2-\frac{1}{\pi n}\right)\\ &\doteq 0.882\,542\,400\,610\,606\,373\,585\,825\,7\ldots \end{align*}
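As a sanity check on the numerical value of $K$ (a Python sketch; `K_partial` and the truncation point are my choices — the terms are $O(1/n^2)$, so truncating at $N$ leaves a tail of size about $\frac{1}{4\pi N}$):

```python
from math import pi

def K_partial(N):
    # K = 1 + sum_{n>=1} (16^{-n} binom(2n,n)^2 - 1/(pi*n))
    total, t = 1.0, 1.0          # t = 16^{-n} * binom(2n,n)^2, starting at n = 0
    for n in range(1, N + 1):
        t *= ((2 * n - 1) / (2 * n)) ** 2   # ratio of consecutive terms
        total += t - 1 / (pi * n)
    return total
```

With $N = 2\cdot 10^5$ the partial sum already agrees with the quoted digits $0.882542\ldots$ to about six decimal places.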

In a similar manner the authors also provide an expansion for the case $d=3$. Omitting some details we find

Case $d=3$: By singularity analysis the last expansion gives \begin{align*} \color{blue}{p_n^{(3)}}&\color{blue}{=\frac{1}{\pi^{3/2}Q(1)^2}\,\frac{1}{n^{3/2}}+O\left(\frac{1}{n^2}\right)}\\ Q(1)&=\frac{\pi}{\Gamma\left(\frac{3}{4}\right)^4}\\ &\doteq 1.393\,203\,929\,685\,676\,859\,184\,246\,3\ldots \end{align*}

The authors close this section with:

Higher dimensions are treated similarly, with logarithmic terms surfacing in asymptotic expansions for all even dimensions.

Note: Some further information regarding Pólya's drunkard problem is given in Pólya's Random Walk Constants.



Some related information regarding the general case is stated in section 5.9 Pólya's Random Walk Constants in Mathematical Constants by S.R. Finch.

From Pólya's Random Walk Constants (section 5.9):

  • Let $U_{d,l,n}$ be the number of $d$-dimensional $n$-step walks that start from the origin and end at a lattice point $l$.

  • Let $V_{d,l,n}$ be the number of $d$-dimensional $n$-step walks that start from the origin and reach the lattice point $l\ne 0$ for the first time at the end (second time if $l=0$).

Then the generating functions

\begin{align*} U_{d,l}(x)=\sum_{n=0}^\infty\frac{U_{d,l,n}}{(2d)^n}x^n,\quad V_{d,l}(x)=\sum_{n=0}^\infty\frac{V_{d,l,n}}{(2d)^n}x^n \end{align*}

satisfy

\begin{align*} V_{d,l}(x)&=\frac{U_{d,l}(x)}{U_{d,0}(x)}\qquad\qquad\quad l\ne 0\\ V_{d,0}(x)&=1-\frac{1}{U_{d,0}(x)} \end{align*}

Cases $d=1,d=2$:

\begin{align*} U_{1,l}(x)&=\sum_{n=0}^\infty\frac{1}{2^n}\binom{n}{\frac{l+n}{2}}x^n\\ U_{2,l}(x)&=\sum_{n=0}^\infty\frac{1}{4^n}\binom{n}{\frac{l_1+l_2+n}{2}}\binom{n}{\frac{l_1-l_2+n}{2}}x^n \end{align*}

where we agree to set the binomial coefficients equal to $0$ if $l+n$ is odd for $d=1$ or $l_1+l_2+n$ is odd for $d=2$.

Case $d=3$:

If $d=3$, then $a_n=U_{3,0,2n}$ satisfies, according to the OEIS sequences A002896, A039699, A049037 and A063888:

\begin{align*} &a_n=\binom{2n}{n}\sum_{k=0}^n\binom{n}{k}^2\binom{2k}{k}=\sum_{k=0}^n\frac{(2n)!(2k)!}{(n-k)!^2k!^4}\\ &\sum_{n=0}^\infty\frac{a_n}{(2n)!}y^{2n}=I_0(2y)^3, \end{align*} where $I_0$ is the modified Bessel function of the first kind of order zero, and the $a_n$ fulfill the recurrence relation \begin{align*} (n+2)^3a_{n+2}-2(2n+3)(10n^2+30n+23)a_{n+1}+36(n+1)(2n+1)(2n+3)a_n=0. \end{align*}
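The two expressions for $a_n$ and the recurrence can be cross-checked directly (a Python sketch; function names are mine — each summand in the second form is a product of binomial coefficients, so integer division is exact):

```python
from math import comb, factorial

def a(n):
    # a_n = binom(2n,n) * sum_k binom(n,k)^2 binom(2k,k)   (A002896)
    return comb(2 * n, n) * sum(comb(n, k) ** 2 * comb(2 * k, k)
                                for k in range(n + 1))

def a_alt(n):
    # equivalent form: sum_k (2n)! (2k)! / ((n-k)!^2 k!^4)
    return sum(factorial(2 * n) * factorial(2 * k)
               // (factorial(n - k) ** 2 * factorial(k) ** 4)
               for k in range(n + 1))
```

The first values are $1, 6, 90, 1860, \ldots$, and the three-term recurrence holds for every $n$ one cares to test.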