Pseudo-Riemannian Manifold of Random Ideas

A blog where I share some of my personal research and random thoughts on maths and physics

Random thoughts on: Random walks in one dimension

A wild Bessel function has appeared!

I have to tell you something: I really like Bessel functions. All of them. Plain or modified, of first or second kind. I am particularly amazed at how often they appear in completely unrelated contexts. For example, they can be used to describe the vibrations of a drum or, as I recently stumbled upon, random walks in one dimension. And that is also the example I want to show you today. To understand the problem, imagine an infinite line divided into equal steps marked by the index n. A ball is placed on the line at some position. At rate k_\mathrm{R} it hops one step to the right, n\to n+1, and at rate k_\mathrm{L} one step to the left, n\to n-1:

The ball at n moves with rate k_\mathrm{R} one step to the right and rate k_\mathrm{L} one step to the left.

The ball therefore has a certain probability p_n(t) of being at position n at time t. We are mainly interested in the time evolution and form of this probability, so how can we describe the dynamics of the system? We can do this using a master equation. To derive it, note that the evolution of p_n(t) boils down to a balance of “outflux” and “influx” of probability at a location n. What are these influxes and outfluxes? First of all, the ball at n can move to n\pm1 at rates k_\mathrm{R} or k_\mathrm{L}, so the outflux from this site contributes -(k_\mathrm{L}+k_\mathrm{R})p_n(t). On the other hand, the ball could move from the locations n\pm1 to n at the respective rates, resulting in a probability influx of k_\mathrm{R}p_{n-1}(t)+k_\mathrm{L}p_{n+1}(t). In total, we find for the rate of change of the probability:

\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}p_n(t)=k_\mathrm{R}p_{n-1}(t)+k_\mathrm{L}p_{n+1}(t)-(k_\mathrm{L}+k_\mathrm{R})p_n(t) ,

which is a combined differential and recurrence equation. To solve this equation, we can use the generating function approach. We take the probability p_n(t) as the coefficients of a power series and write

\displaystyle G(z,t)=\sum_{n\in\mathbb{Z}}p_n(t)z^n.

Inserting this expression into the master equation yields the following differential equation:

\displaystyle \frac{\partial}{\partial t}G(z,t)=k_\mathrm{R}zG(z,t)+k_\mathrm{L}\frac{G(z,t)}{z}-(k_\mathrm{L}+k_\mathrm{R})G(z,t) .

Since the prefactor of G(z,t) on the right-hand side is independent of t, we obtain, together with the initial condition G(z,0)=1 (i.e., the ball starts at n=0),

\displaystyle G(z,t)=\exp\left(\left[k_\mathrm{R}z+\frac{k_\mathrm{L}}{z}-(k_\mathrm{L}+k_\mathrm{R})\right]t\right) .

All that remains is to find the power series coefficients of the above function. This is where the Bessel function joins the game. You see, the modified Bessel function of the first kind satisfies the rather similar-looking generating-function identity

\displaystyle \mathrm{e}^{\frac{a}{2}(z+\frac{1}{z})}=\sum_{n\in\mathbb{Z}}I_n(a)z^n ,

so it is a matter of simple algebra to show that the expansion coefficients of G(z,t) can be written using it as

\displaystyle \exp\left(\left[k_\mathrm{R}z+\frac{k_\mathrm{L}}{z}-(k_\mathrm{L}+k_\mathrm{R})\right]t\right)=\sum_{n\in\mathbb{Z}}\left(\frac{k_\mathrm{R}}{k_\mathrm{L}}\right)^{\frac{n}{2}}I_n(2\sqrt{k_\mathrm{L}k_\mathrm{R}}t)\mathrm{e}^{-(k_\mathrm{L}+k_\mathrm{R})t}z^n .

This, in turn, implies that the probability is given by+

\displaystyle \boxed{p_n(t)=\left(\frac{k_\mathrm{R}}{k_\mathrm{L}}\right)^{\frac{n}{2}}I_n(2\sqrt{k_\mathrm{L}k_\mathrm{R}}t)\mathrm{e}^{-(k_\mathrm{L}+k_\mathrm{R})t}}
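As a quick numerical sanity check, the boxed formula can be compared against scipy.stats.skellam, since (as the footnote below notes) this is the Skellam distribution with means k_\mathrm{R}t and k_\mathrm{L}t. This is a minimal sketch; the rates and time are the example values used for the plot further below:

```python
import numpy as np
from scipy.special import ive   # exponentially scaled I_n: ive(n, x) = I_n(x) * exp(-x)
from scipy.stats import skellam

kR, kL, t = 5.0, 1.0, 8.0       # example rates and time, matching the plot below
n = np.arange(-40, 131)

# Boxed formula; ive is used instead of iv to avoid overflow at large arguments,
# so the prefactor becomes exp(2*sqrt(kL*kR)*t - (kL + kR)*t).
x = 2.0 * np.sqrt(kL * kR) * t
p = (kR / kL) ** (n / 2.0) * ive(n, x) * np.exp(x - (kL + kR) * t)

print(p.sum())   # ≈ 1: the probabilities are normalized

# Skellam distribution with mu1 = kR*t and mu2 = kL*t:
print(np.max(np.abs(p - skellam.pmf(n, mu1=kR * t, mu2=kL * t))))
```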

Using the generating function, we were able to find the microscopic behavior of the random walk and show how it relates to Bessel functions (pretty cool!). In many applications, however, one is more interested in macroscopic properties of the system, such as the mean position or the spread around the mean position (i.e., the variance). We can also use the generating function in these situations. To see how, note that we want to compute the moments of the probability distribution, i.e.,

\displaystyle \mu_m=\sum_{n\in\mathbb{Z}}p_n(t)n^m.

The first moment is the mean position \langle n\rangle=\mu_1, while the variance of the position can be calculated from \mathrm{Var}[n]=\mu_2-\mu_1^2. Since z\frac{\partial}{\partial z}z^n=nz^n, we find

\displaystyle \mu_m=\sum_{n\in\mathbb{Z}}p_n(t)\left(z\frac{\partial}{\partial z}\right)^mz^n\Big|_{z=1}=\left(z\frac{\partial}{\partial z}\right)^mG(z,t)\Big|_{z=1},

and for the mean and variance

\displaystyle \langle n\rangle=(k_\mathrm{R}-k_\mathrm{L})t\qquad\textsf{and}\qquad \mathrm{Var}[n]=(k_\mathrm{L}+k_\mathrm{R})t.
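These two results can be reproduced symbolically; here is a minimal sketch with sympy (the symbol names are mine):

```python
import sympy as sp

z, t, kR, kL = sp.symbols('z t k_R k_L', positive=True)
G = sp.exp((kR * z + kL / z - (kL + kR)) * t)

# mu_m = (z d/dz)^m G(z, t) evaluated at z = 1
mu1 = (z * sp.diff(G, z)).subs(z, 1)
mu2 = (z * sp.diff(z * sp.diff(G, z), z)).subs(z, 1)

mean = sp.simplify(mu1)           # mean: (k_R - k_L) t
var = sp.simplify(mu2 - mu1**2)   # variance: (k_L + k_R) t
print(mean)
print(var)
```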

In the case of a biased random walk, k_\mathrm{R}\neq k_\mathrm{L}, the ball is expected to move on average with a “velocity” k_\mathrm{R}-k_\mathrm{L}, while the uncertainty of the position spreads linearly in time with coefficient k_\mathrm{L}+k_\mathrm{R}. This behavior is not really apparent from the Bessel form of the probability. However, we can derive another form of p_n(t) that makes these properties (and an interesting connection to another process) obvious. To do this, we look at the long-time dynamics of the system, i.e., what happens to p_n(t) for t\to\infty. Our starting point is a very convenient integral representation* of our probability, namely

\displaystyle p_n(t)=\frac{1}{2\pi}\int_{-\pi}^\pi\mathrm{d}x\,\mathrm{e}^{\mathrm{i}xn}\mathrm{e}^{k_\mathrm{L}t(\mathrm{e}^{\mathrm{i}x}-1)+k_\mathrm{R}t(\mathrm{e}^{-\mathrm{i}x}-1)} .

In the limit of large times, we can approximate the second exponent very well by its power series up to second order and extend the integration bounds to infinity#, which gives us

\displaystyle p_n(t)\approx\frac{1}{2\pi}\int_{-\infty}^\infty\mathrm{d}x\,\mathrm{e}^{\mathrm{i}xn}\mathrm{e}^{-\mathrm{i}(k_\mathrm{R}-k_\mathrm{L})tx-\frac{k_\mathrm{L}+k_\mathrm{R}}{2}tx^2}.

This integral is just the Fourier transform of a shifted Gaussian, so we have

\displaystyle p_n(t)\approx\frac{\mathrm{e}^{-\frac{(n-(k_\mathrm{R}-k_\mathrm{L})t)^2}{2(k_\mathrm{L}+k_\mathrm{R})t}}}{\sqrt{2\pi(k_\mathrm{L}+k_\mathrm{R})t}} ,

which, after identifying D=\frac{k_\mathrm{L}+k_\mathrm{R}}{2} and v=k_\mathrm{R}-k_\mathrm{L}, gives the well-known formula for diffusion with drift:

\displaystyle p_n(t)\approx\frac{\mathrm{e}^{-\frac{(n-vt)^2}{4Dt}}}{\sqrt{4\pi Dt}} .

Therefore, we can understand diffusion (with drift) at the microscopic level as a collection of (biased) random walks of the individual particles. And finally, here is a plot of the probability p_n(t) for the rates k_\mathrm{R}=5 and k_\mathrm{L}=1 at time t=8 (red line) compared with the Gaussian approximation (black dashed):

Plot of the probability p_n(t) for the rates k_\mathrm{R}=5 and k_\mathrm{L}=1 at time t=8 (red line) compared with the Gaussian approximation (black dashed).
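The comparison in the plot can be sketched numerically as follows (using scipy's exponentially scaled Bessel function ive to avoid overflow):

```python
import numpy as np
from scipy.special import ive

kR, kL, t = 5.0, 1.0, 8.0            # parameters of the plot above
v, D = kR - kL, (kL + kR) / 2.0      # drift velocity and diffusion constant

n = np.arange(-20, 101)
x = 2.0 * np.sqrt(kL * kR) * t
# exact Bessel form; ive(n, x) = I_n(x) * exp(-x) absorbs part of the prefactor
p_exact = (kR / kL) ** (n / 2.0) * ive(n, x) * np.exp(x - (kL + kR) * t)
p_gauss = np.exp(-(n - v * t) ** 2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

# the two curves already agree closely at t = 8
print(np.max(np.abs(p_exact - p_gauss)))
```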

Now that we know how to deal with single steps, we can further generalize this approach to next-nearest-neighbors steps. The general idea is quite simple: before we only had rates for steps from n to n\pm1, now we have a second set of rates for steps from n to n\pm2. Let’s denote the 1-step rates by k_\mathrm{L}^{(1)} and k_\mathrm{R}^{(1)} and the 2-step rates by k_\mathrm{L}^{(2)} and k_\mathrm{R}^{(2)}, namely

The ball at n moves with rate k_\mathrm{R}^{(i)} i steps to the right and with rate k_\mathrm{L}^{(i)} i steps to the left.

Similar to before, the master equation describing the time evolution of the probability is given by

\displaystyle {\frac{\mathrm{d}}{\mathrm{d}t}p_n(t)=k_\mathrm{R}^{(2)}p_{n-2}(t)+k_\mathrm{R}^{(1)}p_{n-1}(t)+k_\mathrm{L}^{(1)}p_{n+1}(t)+k_\mathrm{L}^{(2)}p_{n+2}(t)-(k_\mathrm{L}^{(2)}+k_\mathrm{L}^{(1)}+k_\mathrm{R}^{(1)}+k_\mathrm{R}^{(2)})p_n(t)\,.}

Using the generating function approach, we find for the function G(z,t):

\displaystyle G(z,t)=\exp\left(\left[k_\mathrm{R}^{(2)}z^2+k_\mathrm{R}^{(1)}z+\frac{k_\mathrm{L}^{(1)}}{z}+\frac{k_\mathrm{L}^{(2)}}{z^2}-(k_\mathrm{L}^{(2)}+k_\mathrm{L}^{(1)}+k_\mathrm{R}^{(1)}+k_\mathrm{R}^{(2)})\right]t\right) .

Again, we need to find the power series coefficients of this function to obtain the probability. Since the generating function factors into

\displaystyle \exp\left(\left[k_\mathrm{R}^{(1)}z+\frac{k_\mathrm{L}^{(1)}}{z}\right]t\right)\exp\left(\left[k_\mathrm{R}^{(2)}z^2+\frac{k_\mathrm{L}^{(2)}}{z^2}\right]t\right)\exp\left(-(k_\mathrm{L}^{(2)}+k_\mathrm{L}^{(1)}+k_\mathrm{R}^{(1)}+k_\mathrm{R}^{(2)})t\right) ,

we can expand the first two exponentials as power series:

\displaystyle \sum_{n\in\mathbb{Z}}\sum_{m\in\mathbb{Z}}\left(\frac{k_\mathrm{R}^{(1)}}{k_\mathrm{L}^{(1)}}\right)^{\frac{n}{2}}\left(\frac{k_\mathrm{R}^{(2)}}{k_\mathrm{L}^{(2)}}\right)^{\frac{m}{2}}I_n\left(2\sqrt{k_\mathrm{L}^{(1)}k_\mathrm{R}^{(1)}}t\right)I_m\left(2\sqrt{k_\mathrm{L}^{(2)}k_\mathrm{R}^{(2)}}t\right)z^{n+2m} .

This expression is just a product of two Laurent series and can be simplified using the Cauchy product formula as follows:

\displaystyle \sum_{n\in\mathbb{Z}}\left[\sum_{m\in\mathbb{Z}}\left(\frac{k_\mathrm{R}^{(1)}}{k_\mathrm{L}^{(1)}}\right)^{\frac{n}{2}-m}\left(\frac{k_\mathrm{R}^{(2)}}{k_\mathrm{L}^{(2)}}\right)^{\frac{m}{2}}I_{n-2m}\left(2\sqrt{k_\mathrm{L}^{(1)}k_\mathrm{R}^{(1)}}t\right)I_m\left(2\sqrt{k_\mathrm{L}^{(2)}k_\mathrm{R}^{(2)}}t\right)\right]z^n .

Therefore, the probability describing the system is given by the expression

\displaystyle p_n(t)=\mathrm{e}^{-(k_\mathrm{L}^{(2)}+k_\mathrm{L}^{(1)}+k_\mathrm{R}^{(1)}+k_\mathrm{R}^{(2)})t}\sum_{m\in\mathbb{Z}}\left(\frac{k_\mathrm{R}^{(1)}}{k_\mathrm{L}^{(1)}}\right)^{\frac{n}{2}-m}\left(\frac{k_\mathrm{R}^{(2)}}{k_\mathrm{L}^{(2)}}\right)^{\frac{m}{2}}I_{n-2m}\left(2\sqrt{k_\mathrm{L}^{(1)}k_\mathrm{R}^{(1)}}t\right)I_m\left(2\sqrt{k_\mathrm{L}^{(2)}k_\mathrm{R}^{(2)}}t\right) ,

where the sum cannot be simplified any further (to my knowledge). As with the one-step random walk, we expect it to behave like a Gaussian in the long-time limit,

\displaystyle p_n(t)\approx\frac{\mathrm{e}^{-\frac{(n-vt)^2}{4Dt}}}{\sqrt{4\pi Dt}} .

With the help of the generating function, we find for the diffusion constant and velocity 2D=4k_\mathrm{L}^{(2)}+k_\mathrm{L}^{(1)}+k_\mathrm{R}^{(1)}+4k_\mathrm{R}^{(2)} and v=2k_\mathrm{R}^{(2)}+k_\mathrm{R}^{(1)}-k_\mathrm{L}^{(1)}-2k_\mathrm{L}^{(2)} respectively.

Plot of the probability p_n(t) for the two-step random walk (red line) compared with the Gaussian approximation (black dashed).
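As a consistency check of the double-Bessel sum, we can evaluate it numerically with a truncated inner sum; the rates and time below are hypothetical example values:

```python
import numpy as np
from scipy.special import ive   # ive(n, x) = I_n(x) * exp(-x), avoids overflow

# hypothetical example rates and time
k1R, k1L, k2R, k2L, t = 2.0, 1.0, 0.5, 0.25, 4.0
x1 = 2.0 * np.sqrt(k1L * k1R) * t
x2 = 2.0 * np.sqrt(k2L * k2R) * t

n = np.arange(-60, 81)
m = np.arange(-40, 41)               # truncated inner sum over m
N, M = np.meshgrid(n, m, indexing='ij')

terms = ((k1R / k1L) ** (N / 2.0 - M) * (k2R / k2L) ** (M / 2.0)
         * ive(N - 2 * M, x1) * ive(M, x2))
p = terms.sum(axis=1) * np.exp(x1 + x2 - (k1L + k1R + k2L + k2R) * t)

v = k1R - k1L + 2.0 * (k2R - k2L)    # drift velocity from the text
print(p.sum())                       # ≈ 1: probabilities are normalized
print((n * p).sum() - v * t)         # ≈ 0: mean position equals v t
```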

In the last section, I want to take a quick look at the most general random walk with arbitrary step sizes. In this case, the random walker moves with rate k_\mathrm{R}^{(i)} i steps to the right and with rate k_\mathrm{L}^{(i)} i steps to the left. Similar to the previous two special cases, the master equation of this arbitrary-step random walk is given by

\displaystyle \frac{\mathrm{d}}{\mathrm{d}t}p_n(t)=\sum_{i=1}^\infty (k_\mathrm{R}^{(i)}p_{n-i}(t)+k_\mathrm{L}^{(i)}p_{n+i}(t))-p_n(t)\sum_{i=1}^\infty (k_\mathrm{R}^{(i)}+k_\mathrm{L}^{(i)}).

We first compute the macroscopic properties of the system, the mean position and the variance of the position. Repeating the moment calculation with the generating function and using mathematical induction on the step size, we find

\displaystyle \langle n\rangle=\sum_{i=1}^\infty i(k_\mathrm{R}^{(i)}-k_\mathrm{L}^{(i)})t\qquad\textsf{and}\qquad \mathrm{Var}[n]=\sum_{i=1}^\infty i^2(k_\mathrm{L}^{(i)}+k_\mathrm{R}^{(i)})t.

From these follow the drift velocity and diffusion constant

\displaystyle v=\sum_{i=1}^\infty i(k_\mathrm{R}^{(i)}-k_\mathrm{L}^{(i)})\qquad\textsf{and}\qquad D=\frac{1}{2}\sum_{i=1}^\infty i^2(k_\mathrm{L}^{(i)}+k_\mathrm{R}^{(i)}).

An interesting observation from these expressions is that unless the rate constants are zero after some maximum step size, the rates must decay faster than 1/i^3 for the long-time diffusion constant to be finite. This means that otherwise, in the long-time limit, similar to the Cauchy distribution, the arbitrary-step random walk has no well-defined second moment (or any higher moment). Figuratively speaking, the random walker occasionally wanders off arbitrarily far from the mean position. On the other hand, in the case of infinitely many rates where the second moment does converge, such as k_\mathrm{R}^{(i)}\propto k_\mathrm{L}^{(i)}\propto1/i^4, even the infinitely many possible step sizes and the occasional large steps are not enough to “smear out” the random walker’s position.
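Before moving on, the moment formulas above can be checked by integrating the master equation numerically for a hypothetical finite set of rates (a sketch; the infinite lattice is truncated to |n|\leq N, which is fine as long as the probability mass stays inside):

```python
import numpy as np
from scipy.integrate import solve_ivp

# hypothetical finite rate set: steps of sizes 1, 2, 3
kR = {1: 1.0, 2: 0.3, 3: 0.1}
kL = {1: 0.8, 2: 0.2, 3: 0.05}
t_end, N = 5.0, 120
n = np.arange(-N, N + 1)

def rhs(t, p):
    dp = -(sum(kR.values()) + sum(kL.values())) * p   # total outflux
    for i, rate in kR.items():
        dp[i:] += rate * p[:-i]      # influx from n - i
    for i, rate in kL.items():
        dp[:-i] += rate * p[i:]      # influx from n + i
    return dp

p0 = np.zeros(2 * N + 1)
p0[N] = 1.0                          # walker starts at n = 0
p = solve_ivp(rhs, (0.0, t_end), p0, rtol=1e-9, atol=1e-12).y[:, -1]

mean_f = sum(i * (kR[i] - kL[i]) for i in kR) * t_end
var_f = sum(i**2 * (kR[i] + kL[i]) for i in kR) * t_end
print((n * p).sum() - mean_f)                       # ≈ 0
print((n**2 * p).sum() - (n * p).sum()**2 - var_f)  # ≈ 0
```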

To elaborate on the last example, take the symmetric rates k_\mathrm{R}^{(i)}= k_\mathrm{L}^{(i)}=k/i^4. The drift velocity and diffusion constant in this case are given by

\displaystyle v=0\qquad\textsf{and}\qquad D=\frac{\pi^2k}{6} .
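These values can also be checked with a small Monte Carlo sketch of the jump process (the rate list is truncated at a hypothetical i_max; since the waiting times are exponential, the number of jumps up to time t is Poisson distributed, and each jump is drawn independently):

```python
import numpy as np

rng = np.random.default_rng(42)
k, i_max, t_end = 1.0, 50, 10.0      # rates truncated at a hypothetical i_max

i = np.arange(1, i_max + 1)
steps = np.concatenate([i, -i])      # possible jumps: +-1, ..., +-i_max
rates = np.concatenate([k / i**4, k / i**4])
total = rates.sum()

def walk():
    # the jump count in [0, t_end] is Poisson(total * t_end);
    # each jump size is drawn with probability proportional to its rate
    n_jumps = rng.poisson(total * t_end)
    return rng.choice(steps, p=rates / total, size=n_jumps).sum()

samples = np.array([walk() for _ in range(20000)])
print(samples.mean())                # ≈ v t = 0
print(samples.var() / (2 * t_end))   # ≈ D = pi^2 k / 6 ≈ 1.64
```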

To calculate the form of the probability, we again consider the generating function, which is now given by

\displaystyle G(z,t)=\mathrm{e}^{-2tk\sum_{j=1}^\infty\frac{1}{j^4}}\prod_{i=1}^\infty\exp\left(\left[z^i+\frac{1}{z^i}\right]tk\right) .

To compute the coefficients of the infinite product, we first consider the case of only two step sizes. From earlier we know that the coefficients are given by

\displaystyle \sum_{m\in\mathbb{Z}}I_{n-2m}(2tk)I_m\left(2\cdot\frac{tk}{2^4}\right) ,

which we can write as a product of integrals using their respective integral representations:

\displaystyle \sum_{m\in\mathbb{Z}}\frac{1}{(2\pi)^2}\int_{-\pi}^\pi\mathrm{d}x\,\int_{-\pi}^\pi\mathrm{d}y\,\mathrm{e}^{\mathrm{i} (n-2m)x}\mathrm{e}^{\mathrm{i} my}\mathrm{e}^{2tk\cos(x)}\mathrm{e}^{2tk\frac{\cos(y)}{2^4}} .

Switching the order of integration and summation, we can identify the well-known Fourier series representation of the Dirac delta (more precisely, of the 2\pi-periodic Dirac comb), \sum_{m\in\mathbb{Z}}\mathrm{e}^{\mathrm{i}m(y-2x)}=2\pi\sum_{l\in\mathbb{Z}}\delta(y-2x+2\pi l), which gives us

\displaystyle \frac{1}{2\pi}\int_{-\pi}^\pi\mathrm{d}x\,\int_{-\pi}^\pi\mathrm{d}y\,\sum_{l\in\mathbb{Z}}\delta(y-2x+2\pi l)\,\mathrm{e}^{\mathrm{i} nx}\mathrm{e}^{2tk\cos(x)}\mathrm{e}^{2tk\frac{\cos(y)}{2^4}}

and, after integrating over y (the shifts by 2\pi are irrelevant because the integrand is 2\pi-periodic in y):

\displaystyle \frac{1}{2\pi}\int_{-\pi}^\pi\mathrm{d}x\,\mathrm{e}^{\mathrm{i} nx}\mathrm{e}^{2tk\bigl(\cos(x)+\frac{\cos(2x)}{2^4}\bigr)} .

We can repeat these steps for larger and larger i (I leave this as an exercise for the reader) and arrive at:

\displaystyle \frac{1}{2\pi}\int_{-\pi}^\pi\mathrm{d}x\,\mathrm{e}^{\mathrm{i} nx}\mathrm{e}^{2tk\bigl(\cos(x)+\frac{\cos(2x)}{2^4}+\frac{\cos(3x)}{3^4}+\dotsm+\frac{\cos(ix)}{i^4}+\dotsm\bigr)} .

The sum in the exponent is a special case of the Clausen function (more precisely, the Glaisher-Clausen function \mathrm{Sl}_4), and the probability can thus be written as

\displaystyle p_n(t)=\frac{\mathrm{e}^{-\frac{2\pi^4}{90}tk}}{2\pi}\int_{-\pi}^\pi\mathrm{d}x\,\mathrm{e}^{\mathrm{i} nx}\mathrm{e}^{2tk\,\text{Sl}_4(x)} .

But we can actually do a little better. First, since the exponential factor is even in x, only the cosine part of \mathrm{e}^{\mathrm{i}nx} survives, and we can write the integral as

\displaystyle p_n(t)=\frac{\mathrm{e}^{-\frac{2\pi^4}{90}tk}}{\pi}\int_0^\pi\mathrm{d}x\,\cos(nx)\mathrm{e}^{2tk\,\text{Sl}_4(x)} .

For the interval 0\leq x\leq\pi, the Glaisher-Clausen function of order 4 has the simple expansion

\displaystyle \text{Sl}_4(x)=\frac{\pi^4}{90}-\frac{\pi^2x^2}{12}+\frac{\pi x^3}{12}-\frac{x^4}{48} ,

thus, the above integral can also be expressed as

\displaystyle p_n(t)=\frac{\mathrm{e}^{-\frac{2\pi^4}{90}tk}}{\pi}\int_0^\pi\mathrm{d}x\,\cos(nx)\mathrm{e}^{2tk\bigl(\frac{\pi^4}{90}-\frac{\pi^2x^2}{12}+\frac{\pi x^3}{12}-\frac{x^4}{48}\bigr)} .

And just to round things off, compare this to the much simpler long-time limit:

\displaystyle p_n(t)\approx\frac{\mathrm{e}^{-\frac{3n^2}{2\pi^2 kt}}}{\sqrt{\frac{2}{3}\pi^3kt}} .

Here is a quick comparison of the integral expression with the Gaussian approximation:

Plot of the probability p_n(t) for the arbitrary-step random walk (red line) compared with the Gaussian approximation (black dashed). (The chosen time is probably a bit too early, as you can see at the peak, but it should be good enough to demonstrate the quality of the approximation.)
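For the curious, here is a minimal sketch of how such a comparison can be computed; k, t, and the truncation of the sum over n are example choices of mine:

```python
import numpy as np
from scipy.integrate import quad

k, t = 1.0, 20.0                     # example values; t chosen fairly large

def sl4(x):
    # polynomial form of Sl_4 on [0, pi] from the text
    return np.pi**4 / 90 - np.pi**2 * x**2 / 12 + np.pi * x**3 / 12 - x**4 / 48

pref = np.exp(-2 * np.pi**4 / 90 * t * k) / np.pi

def p_exact(n):
    # cosine-transform integral representation of p_n(t)
    val, _ = quad(lambda x: np.cos(n * x) * np.exp(2 * t * k * sl4(x)),
                  0.0, np.pi, limit=200)
    return pref * val

D = np.pi**2 * k / 6                 # diffusion constant from the text
n = np.arange(-60, 61)
p_int = np.array([p_exact(m) for m in n])
p_gauss = np.exp(-n**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

print(p_int.sum())                      # ≈ 1 (up to the truncated heavy tails)
print(np.max(np.abs(p_int - p_gauss)))  # small, as in the plot
```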

Whew! What a ride! This has probably been the most laborious post so far, but it’s been great fun to dive back into Bessel functions and also a bit into diffusion and random walks. But that’s it for now.

See you next time, Cheers!


I don’t really have any reading suggestions this time, but anything related to Bessel functions is probably good.

*For a derivation, see my response to this math.stackexchange post.
+This probability distribution is also called the Skellam distribution.
#This is called a saddle-point approximation.
