Properties of Brownian Motion.
Let \(B(t)\) be a fixed Brownian motion. We give below some simple properties that follow directly from the definition of the Brownian Motion.
For any \(t\geq0\), \(B(t)\) is normally distributed with mean \(0\) and variance \(t\). For any \(s,t\geq0\) we have \(\mathbb{E}(B_{s}B_{t})=\min\{s,t\}\).
Proof. From condition (1), we have that \(B_{0}=0\). From condition (2), \(B_{t}-B_{0}=B_{t}\) is normally distributed with mean \(0\) and variance \(t\).
Assume that \(s<t\).
We have:
\[\begin{aligned} \mathbb{E}(B_{s}B_{t}) & =\mathbb{E}\left[B_{s}(B_{t}-B_{s}+B_{s})\right] & \{\text{Write }B_{t}=B_{t}-B_{s}+B_{s}\}\\ & =\mathbb{E}[B_{s}(B_{t}-B_{s})]+\mathbb{E}[B_{s}^{2}] & \{\text{Linearity of expectations}\}\\ & =\mathbb{E}[B_{s}]\mathbb{E}(B_{t}-B_{s})+s & \{B_{s},(B_{t}-B_{s})\text{ are independent}\}\\ & =0\cdot0+s\\ & =s \end{aligned}\]
This closes the proof. ◻
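As a quick numerical sanity check (a sketch, not part of the proof; numpy, the seed, the pair \(s,t\) and the sample size are arbitrary choices), one can estimate \(\mathbb{E}(B_{s}B_{t})\) by simulating \(B_{s}\) and the independent increment \(B_{t}-B_{s}\):

import numpy as np

rng = np.random.default_rng(0)
s, t = 0.7, 1.5
n_paths = 200_000

# Simulate B_s and, independently, the increment B_t - B_s
B_s = rng.normal(0.0, np.sqrt(s), size=n_paths)
B_t = B_s + rng.normal(0.0, np.sqrt(t - s), size=n_paths)

print(np.mean(B_s * B_t))  # should be close to min(s, t) = 0.7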
(Translation Invariance) For fixed \(t_{0}\geq0\), the stochastic process \(\tilde{B}(t)=B(t+t_{0})-B(t_{0})\) is also a Brownian motion.
Proof. Firstly, the stochastic process \(\tilde{B}(t)\) is such that:
(1) \(\tilde{B}(0)=B(t_{0})-B(t_{0})=0\). Hence, it satisfies condition (1).
(2) Let \(s<t\). We have: \(\tilde{B}(t)-\tilde{B}(s)=B(t+t_{0})-B(s+t_{0})\), which is a Gaussian random variable with mean \(0\) and variance \(t-s\). Hence, for \(a\leq b\),
\[\begin{aligned} \mathbb{P}\{a\leq & \tilde{B}(t)-\tilde{B}(s)\leq b\}=\frac{1}{\sqrt{2\pi(t-s)}}\int_{a}^{b}e^{-\frac{x^{2}}{2(t-s)}}dx \end{aligned}\]
Hence, it satisfies condition (2).
(3) To check condition (3) for \(\tilde{B}(t)\), we may assume \(t_{0}>0\). Then, for any \(0\leq t_{1}\leq t_{2}\leq\ldots\leq t_{n}\), we have:
\[0<t_{0}\leq t_{0}+t_{1}\leq t_{0}+t_{2}\leq\ldots\leq t_{0}+t_{n}\]
So, \(B(t_{1}+t_{0})-B(t_{0})\), \(B(t_{2}+t_{0})-B(t_{1}+t_{0})\), \(\ldots\), \(B(t_{k}+t_{0})-B(t_{k-1}+t_{0})\), \(\ldots\), \(B(t_{n}+t_{0})-B(t_{n-1}+t_{0})\) are independent random variables. Consequently, \(\tilde{B}(t)\) satisfies condition (3).
This closes the proof. ◻
The above translation invariance property says that a Brownian motion starts afresh at any moment as a new Brownian motion.
(Scaling Invariance) For any real number \(\lambda>0\), the stochastic process \(\tilde{B}(t)=B(\lambda t)/\sqrt{\lambda}\) is also a Brownian motion.
Proof. The scaled stochastic process \(\tilde{B}(t)\) is such that:
(1) \(\tilde{B}(0)=0\). Hence it satisfies condition (1).
(2) Let \(s<t\). Then, \(\lambda s<\lambda t\). We have:
\[\begin{aligned} \tilde{B}(t)-\tilde{B}(s) & =\frac{1}{\sqrt{\lambda}}(B(\lambda t)-B(\lambda s)) \end{aligned}\]
Now, \(B(\lambda t)-B(\lambda s)\) is a Gaussian random variable with mean \(0\) and variance \(\lambda(t-s)\). We know that if \(X\) is a Gaussian random variable with mean \(0\) and variance \(\sigma^{2}\), then \(X/c\) is Gaussian with mean \(0\) and variance \(\sigma^{2}/c^{2}\). Consequently, \(\frac{B(\lambda t)-B(\lambda s)}{\sqrt{\lambda}}\) is a Gaussian random variable with mean \(0\) and variance \(\lambda(t-s)/\lambda=(t-s)\).
Hence, \(\tilde{B}(t)-\tilde{B}(s)\) is normally distributed with mean \(0\) and variance \(t-s\), and it satisfies condition (2).
(3) To check condition (3) for \(\tilde{B}(t)\), take any \(0\leq t_{1}\leq t_{2}\leq\ldots\leq t_{n}\). We have:
\[0\leq\lambda t_{1}\leq\lambda t_{2}\leq\ldots\leq\lambda t_{n}\]
Consequently, the random variables \(B(\lambda t_{k})-B(\lambda t_{k-1})\), \(k=1,2,3,\ldots,n\) are independent. Hence it follows that \(\frac{1}{\sqrt{\lambda}}[B(\lambda t_{k})-B(\lambda t_{k-1})]\) for \(k=1,2,\ldots,n\) are also independent random variables.
This closes the proof. ◻
It follows from the scaling invariance property that for any \(\lambda>0\) and \(0\leq t_{1}\leq t_{2}\leq\ldots\leq t_{n}\), the random vectors:
\[(B(\lambda t_{1}),B(\lambda t_{2}),\ldots,B(\lambda t_{n}))\quad\text{and}\quad(\sqrt{\lambda}B(t_{1}),\sqrt{\lambda}B(t_{2}),\ldots,\sqrt{\lambda}B(t_{n}))\]
have the same distribution.
The scaling property shows that Brownian motion is self-similar, much like a fractal. To see this, suppose we zoom into a Brownian motion path very close to zero, say on the interval \([0,10^{-6}]\). If the Brownian motion path were smooth and differentiable, the closer we zoom in around the origin, the flatter the function will look. In the limit, we would essentially see a straight line given by the derivative at \(0\). However, what we see with the Brownian motion is very different. The scaling property means that for \(a=10^{-6}\),
\[ \begin{aligned} (B_{10^{-6}t},\,t\in[0,1]) & \stackrel{\text{distrib.}}{=}(10^{-3}B_{t},\,t\in[0,1]) \end{aligned} \]
where \(\stackrel{\text{distrib.}}{=}\) means equality of the distribution of the two processes. In other words, Brownian motion on \([0,10^{-6}]\) looks like a Brownian motion on \([0,1]\), but with its amplitude multiplied by a factor of \(10^{-3}\). In particular, it will remain rugged as we zoom in, unlike a smooth function.
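To make the scaling picture concrete, here is a small simulation sketch (my own illustration, with arbitrary grid and sample sizes): paths generated on \([0,10^{-6}]\) and rescaled in space by \(10^{3}\) have the same second-moment structure as a Brownian motion on \([0,1]\).

import numpy as np

rng = np.random.default_rng(1)
a = 1e-6                        # zoom window [0, a]
n_steps, n_paths = 500, 10_000
dt = a / n_steps

# Brownian paths on [0, a], then rescaled in space by 1/sqrt(a)
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B_small = np.cumsum(increments, axis=1)
B_rescaled = B_small / np.sqrt(a)   # should behave like Brownian motion on [0, 1]

print(np.var(B_rescaled[:, -1]))                # close to 1 (rescaled time 1)
print(np.var(B_rescaled[:, n_steps // 2 - 1]))  # close to 0.5 (rescaled time 1/2)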
(Reflection at time \(s\)) The process \((-B_{t},t\geq0)\) is a Brownian motion. More generally, for any \(s\geq0\), the process \((\tilde{B}(t),t\geq0)\) defined by:
\[\begin{aligned} \tilde{B}(t) & =\begin{cases} B_{t} & \text{if }t\leq s\\ B_{s}-(B_{t}-B_{s}) & \text{if }t>s \end{cases}\label{eq:reflection-property} \end{aligned}\]
is a Brownian motion.
Proof. (a) Consider the process \(\tilde{B}(t)=-B_{t}\), \(t\geq0\).
(1) \(\tilde{B}(0)=0\).
(2) If \(X\) is a Gaussian random variable with mean \(0\) and variance \(t-s\), \(-X\) is also Gaussian with mean \(0\) and variance \(t-s\). Thus, \(\tilde{B}(t)-\tilde{B}(s)=-(B(t)-B(s))\) is also Gaussian with mean \(0\) and variance \((t-s)\). Hence condition (2) is satisfied.
(3) Assume that \(0\leq t_{0}\leq t_{1}\leq\ldots\leq t_{n}\). Then, the random variables \(-(B(t_{k})-B(t_{k-1}))\) are independent for \(k=1,2,3,\ldots,n\). Hence, condition (3) is satisfied.
(b) Consider the process \(\tilde{B}(t)\) as defined in ([eq:reflection-property]).
Fix an \(s\geq0\).
(1) Let \(t=0\). Then, \(t\leq s\). \(\tilde{B}(t)=\tilde{B}(0)=B(0)=0\).
(2) Let \(t_{1}<t_{2}\leq s\). Then, \(\tilde{B}(t_{2})-\tilde{B}(t_{1})=B(t_{2})-B(t_{1})\). This is a Gaussian random variable with mean \(0\) and variance \(t_{2}-t_{1}\).
Let \(t_{1}<s<t_{2}\). Then, \(\tilde{B}(t_{2})-\tilde{B}(t_{1})=B(s)-(B(t_{2})-B(s))-B(t_{1})=(B(s)-B(t_{1}))-(B(t_{2})-B(s))\). Since, \(B(s)-B(t_{1})\) and \(B(t_{2})-B(s)\) are independent Gaussian random variables, any linear combination of these is Gaussian. Moreover, its mean is zero. The variance is given by:
\[\begin{aligned} Var[\tilde{B}(t_{2})-\tilde{B}(t_{1})] & =Var[B(s)-B(t_{1})]+Var[B(t_{2})-B(s)]\\ & =(s-t_{1})+(t_{2}-s)\\ & =t_{2}-t_{1} \end{aligned}\]
Let \(s<t_{1}<t_{2}\). Then, \[\begin{aligned} \tilde{B}(t_{2})-\tilde{B}(t_{1}) & =B_{s}-(B_{t_{2}}-B_{s})-(B_{s}-(B_{t_{1}}-B_{s}))\\ & =\cancel{B_{s}}-(B_{t_{2}}-\cancel{B_{s}})-(\cancel{B_{s}}-(B_{t_{1}}-\cancel{B_{s}}))\\ & =-(B_{t_{2}}-B_{t_{1}}) \end{aligned}\]
Hence, \(\tilde{B}(t_{2})-\tilde{B}(t_{1})\) is again a Gaussian random variable with mean \(0\) and variance \(t_{2}-t_{1}\). Hence, condition (2) is satisfied.
(3) Assume that \(0\leq t_{1}\leq\ldots\leq t_{k-1}\leq s\leq t_{k}\leq\ldots\leq t_{n}\). From the above discussion, the increments \(\tilde{B}(t_{2})-\tilde{B}(t_{1})\), \(\ldots\), \(\tilde{B}(s)-\tilde{B}(t_{k-1})\), \(\tilde{B}(t_{k})-\tilde{B}(s)\), \(\ldots\), \(\tilde{B}(t_{n})-\tilde{B}(t_{n-1})\) are independent. The increment \(\tilde{B}(t_{k})-\tilde{B}(t_{k-1})\) is a function only of the random variables \(\tilde{B}(s)-\tilde{B}(t_{k-1})\) and \(\tilde{B}(t_{k})-\tilde{B}(s)\). Thus, \(\tilde{B}(t_{2})-\tilde{B}(t_{1})\), \(\ldots\), \(\tilde{B}(t_{k})-\tilde{B}(t_{k-1})\), \(\ldots\), \(\tilde{B}(t_{n})-\tilde{B}(t_{n-1})\) are independent. ◻
(Time Reversal). Let \((B_{t},t\geq0)\) be a Brownian motion. Show that the process \((B_{1}-B_{1-t},t\in[0,1])\) has the distribution of a standard Brownian motion on \([0,1]\).
Proof. (1) At \(t=0\), \(B(1)-B(1-t)=B(1)-B(1)=0\).
(2) Let \(s<t\). Then, \(1-t<1-s\). So, the increment:
\[\begin{aligned} (B(1)-B(1-t))-(B(1)-B(1-s)) & =B(1-s)-B(1-t) \end{aligned}\]
has a Gaussian distribution. Its mean is \(0\) and its variance is \((1-s)-(1-t)=t-s\).
(3) Let \(0\leq t_{1}\leq t_{2}\leq\ldots\leq t_{n}\). Then:
\[1-t_{n}\leq\ldots\leq1-t_{k}\leq1-t_{k-1}\leq\ldots\leq1-t_{2}\leq1-t_{1}\]
Consider the increments of the process for \(k=1,2,\ldots,n\):
\[\begin{aligned} (B(1)-B(1-t_{k}))-(B(1)-B(1-t_{k-1})) & =B(1-t_{k-1})-B(1-t_{k}) \end{aligned}\]
They are independent random variables. Hence, condition (3) is satisfied. ◻
(Evaluating Brownian Probabilities). Let’s compute the probability that \(B_{1}>0\) and \(B_{2}>0\). We know from the definition that \((B_{1},B_{2})\) is a Gaussian vector with mean \(0\) and covariance matrix:
\[\begin{aligned} C & =\left[\begin{array}{cc} 1 & 1\\ 1 & 2 \end{array}\right] \end{aligned}\]
The determinant of \(C\) is \(1\). By performing row operations on the augmented matrix \([C|I]\) we find that:
\[\begin{aligned} C^{-1} & =\left[\begin{array}{cc} 2 & -1\\ -1 & 1 \end{array}\right] \end{aligned}\]
Thus, the probability \(\mathbb{P}(B_{1}>0,B_{2}>0)\) can be expressed as:
\[\begin{aligned} \mathbb{P}(B_{1}>0,B_{2}>0) & =\frac{1}{\sqrt{(2\pi)^{2}}}\int_{0}^{\infty}\int_{0}^{\infty}\exp\left[-\frac{1}{2}\left(2x_{1}^{2}-2x_{1}x_{2}+x_{2}^{2}\right)\right]dx_{2}dx_{1} \end{aligned}\]
This integral can be evaluated using a calculator or software and is equal to \(3/8\). The probability can also be computed using the independence of increments. The increments \((B_{1},B_{2}-B_{1})\) are IID standard Gaussians. We know their joint PDF. It remains to integrate over the correct region of \(\mathbf{R}^{2}\) which in this case will be:
\[\begin{aligned} D^{*} & =\{(z_{1},z_{2}):(z_{1}>0,z_{1}+z_{2}>0)\} \end{aligned}\]
We have:
\[\begin{aligned} \mathbb{P}(B_{1}>0,B_{2}>0) & =\frac{1}{2\pi}\int_{0}^{\infty}\int_{z_{2}=-z_{1}}^{z_{2}=\infty}e^{-(z_{1}^{2}+z_{2}^{2})/2}dz_{2}dz_{1} \end{aligned}\]
It turns out that this integral can be evaluated exactly. Indeed by writing \(B_{1}=Z_{1}\) and \(Z_{2}=B_{2}-B_{1}\) and splitting the probability on the event \(\{Z_{2}\geq0\}\) and its complement, we have that \(\mathbb{P}(B_{1}\geq0,B_{2}\geq0)\) equals:
\[\begin{aligned} \mathbb{P}(B_{1}\geq0,B_{2}\geq0) & =\mathbb{P}(Z_{1}\geq0,Z_{1}+Z_{2}>0,Z_{2}\geq0)+\mathbb{P}(Z_{1}\geq0,Z_{1}+Z_{2}>0,Z_{2}<0)\\ & =\mathbb{P}(Z_{1}\geq0,Z_{2}\geq0)+\mathbb{P}(Z_{1}\geq0,Z_{1}>-Z_{2},-Z_{2}>0)\\ & =\mathbb{P}(Z_{1}\geq0,Z_{2}\geq0)+\mathbb{P}(Z_{1}\geq0,Z_{1}>Z_{2},Z_{2}>0)\\ & =\frac{1}{4}+\frac{1}{8}\\ & =\frac{3}{8} \end{aligned}\]
Note that, by symmetry, \(\mathbb{P}(Z_{1}\geq0,Z_{1}>Z_{2},Z_{2}>0)=\mathbb{P}(Z_{1}\geq0,Z_{1}\leq Z_{2},Z_{2}>0)=\frac{1}{8}\).
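The value \(3/8\) is easy to confirm by Monte Carlo, using exactly the representation above with the IID standard Gaussian increments \(Z_{1}=B_{1}\) and \(Z_{2}=B_{2}-B_{1}\) (a sketch; the seed and sample size are arbitrary):

import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Z1 = B_1 and Z2 = B_2 - B_1 are IID standard Gaussians
Z1 = rng.standard_normal(n)
Z2 = rng.standard_normal(n)

print(np.mean((Z1 > 0) & (Z1 + Z2 > 0)))  # close to 3/8 = 0.375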
(Another look at the Ornstein-Uhlenbeck process.) Consider the process \((X_{t},t\in\mathbf{R})\) defined by:
\[\begin{aligned} X_{t} & =\frac{e^{-2t}}{\sqrt{2}}B(e^{4t}),\quad t\in\mathbf{R} \end{aligned}\]
Here the process \((B_{e^{4t}},t\in\mathbf{R})\) is called a time change of Brownian motion, since time is now given by an increasing function of \(t\), namely \(e^{4t}\). The example \((B(\lambda t),t\geq0)\) in the scaling property is another example of a time change.
It turns out that \((X_{t},t\in\mathbf{R})\) is a stationary Ornstein-Uhlenbeck process. (Here the index set is \(\mathbf{R}\) instead of \([0,\infty)\), but the definition still applies since the process is stationary.) Since the original Brownian motion \(B(t)\) is a Gaussian process, any finite-dimensional vector \((B(t_{1}),\ldots,B(t_{n}))\) is Gaussian. It follows that:
\[(X_{t_{1}},\ldots,X_{t_{n}})=\frac{1}{\sqrt{2}}(e^{-2t_{1}}B(e^{4t_{1}}),\ldots,e^{-2t_{n}}B(e^{4t_{n}}))\]
is also a Gaussian vector. (Note, once we fix \(t_{1},t_{2},\ldots,t_{n}\), the factors \(e^{-2t_{1}},\ldots,e^{-2t_{n}}\) are constants.) Hence, \((X_{t},t\in\mathbf{R})\) is a Gaussian process.
The mean of \((X_{t},t\in\mathbf{R})\) is:
\[\begin{aligned} \mathbb{E}[X_{t}] & =\frac{e^{-2t}}{\sqrt{2}}\mathbb{E}[B(e^{4t})]=0 \end{aligned}\]
And if \(s<t\),
\[\begin{aligned} \mathbb{E}[X_{s}X_{t}] & =\frac{e^{-2(s+t)}}{2}\mathbb{E}[B(e^{4s})B(e^{4t})]\\ & =\frac{e^{-2(s+t)}}{2}e^{4s}\\ & =\frac{e^{-2(t-s)}}{2} \end{aligned}\]
Two Gaussian processes having the same mean and covariance have the same distribution. Hence, it proves the claim that \((X_{t})\) is a stationary OU process.
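The covariance computation can be checked numerically by sampling the time-changed Brownian motion directly (a sketch only; the times \(s,t\), seed and sample size are arbitrary choices):

import numpy as np

rng = np.random.default_rng(3)
n_paths = 200_000
s, t = 0.2, 0.5

# Sample (B(e^{4s}), B(e^{4t})) via independent Brownian increments, then apply the time change
T1, T2 = np.exp(4 * s), np.exp(4 * t)
B_T1 = rng.normal(0.0, np.sqrt(T1), size=n_paths)
B_T2 = B_T1 + rng.normal(0.0, np.sqrt(T2 - T1), size=n_paths)
X_s = np.exp(-2 * s) / np.sqrt(2) * B_T1
X_t = np.exp(-2 * t) / np.sqrt(2) * B_T2

print(np.mean(X_s * X_t))        # empirical covariance
print(np.exp(-2 * (t - s)) / 2)  # theoretical value e^{-2(t-s)}/2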
Properties of the paths.
First we review the definitions of the Riemann integral and the Riemann-Stieltjes integral from calculus.
A partition \(P\) of \([a,b]\) is a finite set of points from \([a,b]\) that includes both \(a\) and \(b\). The notational convention is to always list the points of a partition \(P=\{a=x_{0},x_{1},x_{2},\ldots,x_{n}=b\}\) in increasing order. Thus:
\[a=x_{0}<x_{1}<\ldots<x_{k-1}<x_{k}<\ldots<x_{n}=b\]
For each subinterval \([x_{k-1},x_{k}]\) of \(P\), let
\[\begin{aligned} m_{k} & =\inf\{f(x):x\in[x_{k-1},x_{k}]\}\\ M_{k} & =\sup\{f(x):x\in[x_{k-1},x_{k}]\} \end{aligned}\]
The lower sum of \(f\) with respect to \(P\) is given by :
\[\begin{aligned} L(f,P) & =\sum_{k=1}^{n}m_{k}(x_{k}-x_{k-1}) \end{aligned}\]
The upper sum of \(f\) with respect to \(P\) is given by:
\[\begin{aligned} U(f,P) & =\sum_{k=1}^{n}M_{k}(x_{k}-x_{k-1}) \end{aligned}\]
For a particular partition \(P\), it is clear that \(U(f,P)\geq L(f,P)\) because \(M_{k}\geq m_{k}\) for all \(k=1,2,\ldots,n\).
A partition \(Q\) is called a refinement of \(P\) if \(Q\) contains all of the points of \(P\); that is, \(P\subseteq Q\).
If \(P\subseteq Q\), then \(L(f,P)\leq L(f,Q)\) and \(U(f,Q)\leq U(f,P)\).
Proof. Consider what happens when we refine \(P\) by adding a single point \(z\) to some subinterval \([x_{k-1},x_{k}]\) of \(P\). We have:
\[\begin{aligned} m_{k}(x_{k}-x_{k-1}) & =m_{k}(x_{k}-z)+m_{k}(z-x_{k-1})\\ & \leq m_{k}'(x_{k}-z)+m_{k}''(z-x_{k-1}) \end{aligned}\]
where
\[\begin{aligned} m_{k}' & =\inf\{f(x):x\in[z,x_{k}]\}\\ m_{k}'' & =\inf\{f(x):x\in[x_{k-1},z]\} \end{aligned}\]
By induction we have:
\[\begin{aligned} L(f,P) & \leq L(f,Q)\\ U(f,Q) & \leq U(f,P) \end{aligned}\] ◻
If \(P_{1}\) and \(P_{2}\) are any two partitions of \([a,b]\), then \(L(f,P_{1})\leq U(f,P_{2})\).
Proof. Let \(Q=P_{1}\cup P_{2}\). Then, \(P_{1}\subseteq Q\) and \(P_{2}\subseteq Q\). Thus, \(L(f,P_{1})\leq L(f,Q)\leq U(f,Q)\leq U(f,P_{2})\). ◻
Let \(\mathcal{P}\) be the collection of all possible partitions of the interval \([a,b]\). The upper integral of \(f\) is defined to be:
\[\begin{aligned} U(f) & =\inf\{U(f,P):P\in\mathcal{P}\} \end{aligned}\]
The lower integral of \(f\) is defined by:
\[\begin{aligned} L(f) & =\sup\{L(f,P):P\in\mathcal{P}\} \end{aligned}\]
Consider the set of all upper sums of \(f\), namely \(\{U(f,P):P\in\mathcal{P}\}\). Take an arbitrary partition \(P'\in\mathcal{P}\). Since \(L(f,P')\leq U(f,P)\) for all \(P\in\mathcal{P}\), the set of upper sums is bounded below, so by the Axiom of Completeness (AoC), \(\inf\{U(f,P):P\in\mathcal{P}\}\) exists. We can argue similarly for the supremum of all lower Riemann sums.
For any bounded function \(f\) on \([a,b]\), it is always the case that \(U(f)\geq L(f)\).
Proof. By the properties of the infimum of a set, for every \(\epsilon>0\) there exists a partition \(P(\epsilon)\) such that \(U(f)\leq U(f,P(\epsilon))<U(f)+\epsilon\). Pick \(\epsilon=1,\frac{1}{2},\frac{1}{3},\ldots,\frac{1}{n},\ldots\). Thus, we can produce a sequence of partitions \(P_{n}\) such that:
\[U(f)\leq U(f,P_{n})<U(f)+\frac{1}{n}\]
Consequently, \(\lim U(f,P_{n})=U(f)\). Similarly, we can produce a sequence of partitions \((Q_{m})\) such that :
\[L(f)-\frac{1}{m}<L(f,Q_{m})\leq L(f)\]
We know that:
\[\begin{aligned} L(f,Q_{m}) & \leq U(f,P_{n}) \end{aligned}\]
Keeping \(m\) fixed and passing to the limit, as \(n\to\infty\) on both sides, we have:
\[\begin{aligned} \lim_{n\to\infty}L(f,Q_{m}) & \leq\lim_{n\to\infty}U(f,P_{n})\quad\left\{ \text{Order Limit Theorem}\right\} \\ L(f,Q_{m}) & \leq U(f) \end{aligned}\]
Now, passing to the limit, as \(m\to\infty\) on both sides, we have:
\[\begin{aligned} \lim_{m\to\infty}L(f,Q_{m}) & \leq\lim_{m\to\infty}U(f)\quad\left\{ \text{Order Limit Theorem}\right\} \\ L(f) & \leq U(f) \end{aligned}\] ◻
(Riemann Integrability). A bounded function \(f\) on the interval \([a,b]\) is said to be Riemann integrable if \(U(f)=L(f)\). In this case, we define \(\int_{a}^{b}f\) or \(\int_{a}^{b}f(x)dx\) to be the common value:
\[\begin{aligned} \int_{a}^{b}f(x)dx & =U(f)=L(f) \end{aligned}\]
(Integrability Criterion) A bounded function \(f\) is integrable on \([a,b]\) if and only if, for every \(\epsilon>0\), there exists a partition \(P_{\epsilon}\) of \([a,b]\) such that:
\[\begin{aligned} U(f,P_{\epsilon})-L(f,P_{\epsilon}) & <\epsilon \end{aligned}\]
Proof. (\(\Longleftarrow\) direction.) Let \(\epsilon>0\). If such a partition \(P_{\epsilon}\) exists, then:
\[U(f)-L(f)\leq U(f,P_{\epsilon})-L(f,P_{\epsilon})<\epsilon\]
Because \(\epsilon\) is arbitrary, it follows that \(U(f)=L(f)\) and hence \(f\) is Riemann integrable.
(\(\Longrightarrow\) direction.) Let \(f\) be a bounded function on \([a,b]\) such that \(f\) is Riemann integrable.
Pick an arbitrary \(\epsilon>0\).
Then, since \(U(f)=\inf\{U(f,P):P\in\mathcal{P}\}\), there exists \(P_{1}\in\mathcal{P}\) such that \(U(f)\leq U(f,P_{1})<U(f)+\frac{\epsilon}{2}\). Since \(L(f)=\sup\{L(f,P):P\in\mathcal{P}\}\), there exists \(P_{2}\in\mathcal{P}\) such that \(L(f)-\frac{\epsilon}{2}<L(f,P_{2})\leq L(f)\). Let \(P_{\epsilon}=P_{1}\cup P_{2}\) be their common refinement, so that \(U(f,P_{\epsilon})\leq U(f,P_{1})\) and \(L(f,P_{\epsilon})\geq L(f,P_{2})\). Consequently,
\[\begin{aligned} U(f,P_{\epsilon})-L(f,P_{\epsilon}) & <U(f)+\frac{\epsilon}{2}-\left(L(f)-\frac{\epsilon}{2}\right)\\ & =U(f)-L(f)+\epsilon\\ & =\epsilon \end{aligned}\] ◻
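The criterion is easy to see in action numerically. The sketch below (my own illustration; the test function, interval and the 50-point sub-grid used to approximate the suprema and infima are arbitrary choices) computes \(U(f,P)\) and \(L(f,P)\) on uniform partitions and shows the difference shrinking as the partition is refined.

import numpy as np

def upper_lower_sums(f, a, b, n):
    # Uniform partition with n subintervals; sup/inf on each subinterval are
    # approximated by sampling f on a fine sub-grid (adequate for continuous f)
    x = np.linspace(a, b, n + 1)
    U = L = 0.0
    for k in range(n):
        vals = f(np.linspace(x[k], x[k + 1], 50))
        U += vals.max() * (x[k + 1] - x[k])
        L += vals.min() * (x[k + 1] - x[k])
    return U, L

for n in (4, 16, 64, 256):
    U, L = upper_lower_sums(np.sin, 0.0, np.pi, n)
    print(n, U, L, U - L)  # U - L shrinks; both approach 2, the integral of sin over [0, pi]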
Functions considered in Stochastic Calculus.
A point \(c\) is called a discontinuity of the first kind or jump point if both one-sided limits \(g(c+)=\lim_{t\downarrow c}g(t)\) and \(g(c-)=\lim_{t\uparrow c}g(t)\) exist and are not equal. The jump at \(c\) is defined as \(\Delta g(c)=g(c+)-g(c-)\). Any other discontinuity is said to be of the second kind.
Consider the function
\[\begin{aligned} f(x) & =\sin\left(\frac{1}{x}\right) \end{aligned}\]
Let \(x_{n}=\frac{1}{2n\pi}\). Then, \(f(x_{n})=0\) for all \(n\). Next, consider \(y_{n}=\frac{1}{\pi/2+2n\pi}\). Then, \(f(y_{n})=1\) for all \(n\). Both sequences approach \(0\) from the right, yet \(f(x_{n})\to0\) and \(f(y_{n})\to1\). Consequently, the right-hand limit of \(f\) at \(0\) does not exist, and the discontinuity at \(0\) is of the second kind.
Functions in stochastic calculus are functions without discontinuities of the second kind, that is, functions that have both left- and right-hand limits at any interior point of the domain and one-sided limits at the boundary. These functions are called regular functions. It is often agreed to identify functions if they have the same right and left limits at every point.
The class \(D=D[0,T]\) of right-continuous functions on \([0,T]\) with left limits has a special name, cadlag functions (from the French abbreviation of "right continuous with left limits"). Sometimes these processes are called R.R.C., for regular right continuous. Notice that this class includes \(C\), the class of continuous functions.
Let \(g\in D\) be a cadlag function. Then, by definition, all the discontinuities of \(g\) are jumps. An important result in analysis is that such a function can have no more than a countable number of discontinuities.
Variation of a function.
If \(g\) is a function of a real variable, its variation over the interval \([a,b]\) is defined as:
\[\begin{aligned} V_{g}([a,b]) & =\sup\left\{ \sum_{i=1}^{n}\left|g(t_{i})-g(t_{i-1})\right|\right\} \label{eq:total-variation-of-a-function} \end{aligned}\]
where the supremum is taken over all partitions \(P=\{a=t_{0}<t_{1}<\ldots<t_{n}=b\}\) of \([a,b]\).
Clearly, by the Triangle Inequality, the sums in ([eq:total-variation-of-a-function]) increase as new points are added to the partitions. Therefore, the variation of \(g\) is:
\[\begin{aligned} V_{g}([a,b]) & =\lim_{||\Delta_{n}||\to0}\sum_{i=1}^{n}\left|g(t_{i})-g(t_{i-1})\right| \end{aligned}\]
where \(||\Delta_{n}||=\max_{1\leq i\leq n}(t_{i}-t_{i-1})\). If \(V_{g}([a,b])\) is finite, then \(g\) is said to be a function of finite variation on \([a,b]\). If \(g\) is a function of \(t\geq0\), then the variation of \(g\) as a function of \(t\) is defined by:
\[\begin{aligned} V_{g}(t) & =V_{g}([0,t]) \end{aligned}\]
Clearly, \(V_{g}(t)\) is an increasing function of \(t\).
\(g\) is a function of finite variation if \(V_{g}(t)<\infty\) for all \(t\in[0,\infty)\). \(g\) is of bounded variation if \(\sup_{t}V_{g}(t)<\infty\); in other words, there exists a constant \(C\), independent of \(t\), such that \(V_{g}(t)<C\) for all \(t\).
(1) If \(g(t)\) is increasing then for any \(i\), \(g(t_{i})\geq g(t_{i-1})\), resulting in a telescopic sum, where all terms excluding the first and the last cancel out, leaving
\[\begin{aligned} V_{g}(t) & =g(t)-g(0) \end{aligned}\]
(2) If \(g(t)\) is decreasing, then similarly,
\[\begin{aligned} V_{g}(t) & =g(0)-g(t) \end{aligned}\]
If \(g(t)\) is differentiable with continuous derivative \(g'(t)\), so that \(g(t)=g(0)+\int_{0}^{t}g'(s)ds\), then
\[\begin{aligned} V_{g}(t) & =\int_{0}^{t}|g'(s)|ds \end{aligned}\]
Proof. By definition,
\[\begin{aligned} V_{g}(t) & =\lim_{||\Delta_{n}||\to0}\sum_{i=1}^{n}|g(t_{i})-g(t_{i-1})| \end{aligned}\]
Since \(g\) is continuous on \([t_{i-1},t_{i}]\) and differentiable on \((t_{i-1},t_{i})\), by the Mean Value Theorem there exists \(z_{i}\in(t_{i-1},t_{i})\) such that \(g(t_{i})-g(t_{i-1})=g'(z_{i})(t_{i}-t_{i-1})\). Therefore, we can write:
\[\begin{aligned} V_{g}(t) & =\lim_{||\Delta_{n}||\to0}\sum_{i=1}^{n}|g'(z_{i})|(t_{i}-t_{i-1})\\ & =\int_{0}^{t}|g'(s)|ds \end{aligned}\]
The last sum is a Riemann sum of the continuous function \(|g'|\), so it converges to the integral. ◻
If \(g\) is continuous, \(g'\) exists and \(\int_{0}^{t}|g'(s)|ds\) is finite, then \(g\) is of finite variation.
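As an illustration (a sketch; the choice \(g=\sin\) on \([0,3\pi]\) is arbitrary), the partition sums \(\sum_{i}|g(t_{i})-g(t_{i-1})|\) over finer and finer uniform partitions approach \(\int_{0}^{t}|g'(s)|ds\):

import numpy as np

g = np.sin
t_final = 3 * np.pi

for n in (10, 100, 1000, 10_000):
    grid = np.linspace(0.0, t_final, n + 1)
    variation = np.sum(np.abs(np.diff(g(grid))))
    print(n, variation)  # approaches the integral of |cos| over [0, 3*pi], which is 6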
The function \(g(t)=t\sin(1/t)\) for \(t>0\) and \(g(0)=0\) is continuous on \([0,1]\) and differentiable at all points except zero, but is not of bounded variation on any interval that includes \(0\). Consider the partition \(\{x_{n}\}=\left\{ \frac{1}{\pi/2+n\pi}\right\}\). Thus,
\[\begin{aligned} \sin(\frac{1}{x_{n}}) & =\begin{cases} 1 & \text{if }n\text{ is even}\\ -1 & \text{if }n\text{ is odd} \end{cases} \end{aligned}\]
Thus,
\[\begin{aligned} g(x_{n}) & =\begin{cases} x_{n} & n\text{ is even}\\ -x_{n} & n\text{ is odd} \end{cases} \end{aligned}\]
Therefore,
\[\begin{aligned} \sum_{n=1}^{m}|g(x_{n})-g(x_{n-1})| & =\sum_{n=1}^{m}(x_{n}+x_{n-1})\\ & =x_{0}+x_{m}+2\sum_{n=1}^{m-1}x_{n}\\ & \geq\sum_{n=1}^{m-1}x_{n} \end{aligned}\]
This is a lower bound on the variation of \(g\) over the partition \(\{0,x_{m},\ldots,x_{1},x_{0},1\}\). Now, letting \(m\) approach infinity, \(\sum\frac{1}{\pi/2+n\pi}\) is a divergent series. Consequently, \(V_{g}([0,1])=\infty\); that is, \(g\) is not of bounded variation on \([0,1]\).
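The divergence can also be seen numerically by evaluating the sums above along the partitions \(\{0,x_{m},\ldots,x_{0},1\}\) (a sketch of the computation in the example; nothing here is part of the argument itself):

import numpy as np

def g(x):
    # g(0) = 0 and g(x) = x * sin(1/x) for x > 0
    out = np.zeros_like(x)
    mask = x > 0
    out[mask] = x[mask] * np.sin(1.0 / x[mask])
    return out

for m in (10, 100, 1000, 10_000):
    n = np.arange(m + 1)
    x = 1.0 / (np.pi / 2 + n * np.pi)                    # x_0 > x_1 > ... > x_m
    partition = np.concatenate(([0.0], x[::-1], [1.0]))  # {0, x_m, ..., x_1, x_0, 1}
    print(m, np.sum(np.abs(np.diff(g(partition)))))      # grows without bound, roughly like log(m)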
Jordan Decomposition.
Any function \(g:[0,\infty)\to\mathbf{R}\) is of finite variation if and only if it can be expressed as the difference of two increasing functions:
\[\begin{aligned} g(t) & =a(t)-b(t) \end{aligned}\]
Proof. (\(\Longrightarrow\) direction). If \(g\) is of finite variation, \(V_{g}(t)<\infty\) for all \(t\), and we can write:
\[\begin{aligned} g(t) & =V_{g}(t)-(V_{g}(t)-g(t)) \end{aligned}\]
Let \(a(t)=V_{g}(t)\) and \(b(t)=V_{g}(t)-g(t)\). Clearly, both \(a(t)\) and \(b(t)\) are increasing functions.
(\(\Longleftarrow\) direction). Suppose a function \(g\) can be expressed as the difference of two increasing functions, \(g=a-b\). Then, for any partition of \([0,t]\), the triangle inequality gives:
\[\begin{aligned} \sum_{i=1}^{n}|(a(t_{i})-b(t_{i}))-(a(t_{i-1})-b(t_{i-1}))| & \leq\sum_{i=1}^{n}(a(t_{i})-a(t_{i-1}))+\sum_{i=1}^{n}(b(t_{i})-b(t_{i-1}))\\ & \quad\{\text{ Telescoping sums }\}\\ & =a(t)-a(0)+b(t)-b(0) \end{aligned}\]
Taking the supremum over partitions, \(V_{g}(t)\leq a(t)-a(0)+b(t)-b(0)<\infty\), so \(g\) is of finite variation. ◻
Riemann-Stieltjes Integral.
Let \(g\) be a monotonically increasing function on a finite closed interval \([a,b]\). A bounded function \(f\) defined on \([a,b]\) is said to be Riemann-Stieltjes integrable with respect to \(g\) if the following limit exists:
\[\begin{aligned} \int_{a}^{b}f(t)dg(t) & =\lim_{||\Delta_{n}||\to0}\sum_{i=1}^{n}f(\tau_{i})(g(t_{i})-g(t_{i-1}))\label{eq:riemann-stieltjes-integral} \end{aligned}\]
where \(\tau_{i}\) is an evaluation point in the interval \([t_{i-1},t_{i}]\). It is a well-known fact that continuous functions are Riemann integrable and Riemann-Stieltjes integrable with respect to any monotonically increasing function on \([a,b]\).
We ask the following question. For any continuous functions \(f\) and \(g\) on \([a,b]\), can we define the integral \(\int_{a}^{b}f(t)dg(t)\) by Equation ([eq:riemann-stieltjes-integral])?
Consider the special case \(f=g\), namely, the integral:
\[\int_{a}^{b}f(t)df(t)\]
Let \(\Delta_{n}=\{a=t_{0},t_{1},\ldots,t_{n}=b\}\) be a partition of \([a,b]\). Let \(L_{n}\) and \(R_{n}\) denote the corresponding Riemann sums with the evaluation points \(\tau_{i}=t_{i-1}\) and \(\tau_{i}=t_{i}\), respectively, namely,
\[\begin{aligned} L_{n} & =\sum_{i=1}^{n}f(t_{i-1})(f(t_{i})-f(t_{i-1}))\label{eq:left-riemann-sum}\\ R_{n} & =\sum_{i=1}^{n}f(t_{i})(f(t_{i})-f(t_{i-1}))\label{eq:right-riemann-sum} \end{aligned}\]
Is it true that, \(\lim L_{n}=\lim R_{n}\) as \(||\Delta_{n}||\to0\)? Observe that:
\[R_{n}-L_{n}=\sum_{i=1}^{n}(f(t_{i})-f(t_{i-1}))^{2}\label{eq:quadratic-variation}\]
\[R_{n}+L_{n}=\sum_{i=1}^{n}(f(t_{i})^{2}-f(t_{i-1})^{2})=f(b)^{2}-f(a)^{2}\label{eq:sum-of-left-and-right-riemann-sums}\]
Therefore, \(R_{n}\) and \(L_{n}\) are given by:
\[R_{n}=\frac{1}{2}\left(f(b)^{2}-f(a)^{2}+\sum_{i=1}^{n}(f(t_{i})-f(t_{i-1}))^{2}\right)\]
\[L_{n}=\frac{1}{2}\left(f(b)^{2}-f(a)^{2}-\sum_{i=1}^{n}(f(t_{i})-f(t_{i-1}))^{2}\right)\]
The limit of the right-hand side of equation ([eq:quadratic-variation]) is called the quadratic variation of the function \(f\) on \([a,b]\). Obviously, \(\lim_{||\Delta_{n}||\to0}R_{n}\neq\lim_{||\Delta_{n}||\to0}L_{n}\) if and only if the quadratic variation of the function \(f\) is non-zero.
Let \(f\) be a \(C^{1}\)-function, that is, \(f'(t)\) is a continuous function. Then, by the mean value theorem:
\[\begin{aligned} |R_{n}-L_{n}| & =\sum_{i=1}^{n}(f(t_{i})-f(t_{i-1}))^{2}\\ & =\sum_{i=1}^{n}(f'(t_{i}^{*})(t_{i}-t_{i-1}))^{2}\\ & \quad\{\text{Mean Value Theorem}\}\\ & \leq\sum_{i=1}^{n}\left\Vert f'\right\Vert _{\infty}^{2}(t_{i}-t_{i-1})^{2}\\ & \quad\{f'\text{ is bounded on }[a,b]\text{ by the Extreme Value Theorem}\}\\ & \leq\left\Vert f'\right\Vert _{\infty}^{2}\left\Vert \Delta_{n}\right\Vert \sum_{i=1}^{n}(t_{i}-t_{i-1})\\ & =\left\Vert f'\right\Vert _{\infty}^{2}\left\Vert \Delta_{n}\right\Vert (b-a) \end{aligned}\]
where \(\left\Vert f'\right\Vert _{\infty}=\sup_{x\in[a,b]}|f'(x)|\). Thus, the distance \(|R_{n}-L_{n}|\) approaches zero as \(\left\Vert \Delta_{n}\right\Vert \to0\). Thus, \(\lim L_{n}=\lim R_{n}\) as \(\left\Vert \Delta_{n}\right\Vert \to0\) and the Riemann-Stieltjes integral exists. By equation ([eq:sum-of-left-and-right-riemann-sums]), we have:
\[\lim_{\left\Vert \Delta_{n}\right\Vert \to0}L_{n}=\lim_{\left\Vert \Delta_{n}\right\Vert \to0}R_{n}=\frac{1}{2}(f(b)^{2}-f(a)^{2})\]
On the other hand, for such a \(C^{1}\)-function \(f\), we may simply define the integral \(\int_{a}^{b}f(t)df(t)\) by:
\[\begin{aligned} \int_{a}^{b}f(t)df(t) & =\int_{a}^{b}f(t)f'(t)dt \end{aligned}\]
Then, by the fundamental theorem of Calculus:
\[\int_{a}^{b}f(t)df(t)=\int_{a}^{b}f(t)f'(t)dt=\frac{1}{2}f(t)^{2}|_{a}^{b}=\frac{1}{2}(f(b)^{2}-f(a)^{2})\]
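Numerically, the left and right Riemann-Stieltjes sums \(L_{n}\) and \(R_{n}\) for a \(C^{1}\) integrand indeed converge to the same value \(\frac{1}{2}(f(b)^{2}-f(a)^{2})\) (a sketch with an arbitrary choice of \(f\), \(a\), \(b\)):

import numpy as np

f, a, b = np.sin, 0.0, 2.0

for n in (10, 100, 1000):
    t = np.linspace(a, b, n + 1)
    vals = f(t)
    diffs = np.diff(vals)
    L_n = np.sum(vals[:-1] * diffs)  # evaluation points tau_i = t_{i-1}
    R_n = np.sum(vals[1:] * diffs)   # evaluation points tau_i = t_i
    print(n, L_n, R_n)               # both approach (f(b)^2 - f(a)^2) / 2

print(0.5 * (f(b) ** 2 - f(a) ** 2))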
There is a very close relationship between functions of bounded variation and functions for which the classical integral makes sense. For the Ito integral, the quadratic variation plays a similar role. The quadratic variation of a smooth function \(f\in C^{1}([0,t])\) is zero.
Suppose \(f\) is a continuous function satisfying the condition
\[\begin{aligned} |f(t)-f(s)| & \geq C|t-s|^{1/2} \end{aligned}\]
for some constant \(C>0\).
In this case we have:
\[|R_{n}-L_{n}|=\sum_{i=1}^{n}(f(t_{i})-f(t_{i-1}))^{2}\geq C^{2}\sum_{i=1}^{n}(t_{i}-t_{i-1})=C^{2}(b-a)\]
Hence, \(\lim R_{n}\neq\lim L_{n}\) as \(\left\Vert \Delta_{n}\right\Vert \to0\) when \(a\neq b\). Consequently, the integral \(\int_{a}^{b}f(t)df(t)\) cannot be defined for such a function \(f\) by the Riemann-Stieltjes sums. Observe that the quadratic variation of the function is at least \(C^{2}(b-a)\) (non-zero).
We see from the above examples that defining the integral \(\int_{a}^{b}f(t)dg(t)\) even when \(f=g\) is a non-trivial problem. Consider the question posed earlier: if \(f\) and \(g\) are continuous functions on \([a,b]\), can we define the integral \(\int_{a}^{b}f(t)dg(t)\)? There is no simple answer to this question. But then, in view of example ([ex:non-zero-quadratic-variation-example]), we can ask another question:
Question. Are there continuous functions \(f\) whose increments satisfy a condition of the form
\[\begin{aligned} |f(t)-f(s)| & \approx C|t-s|^{1/2}\,? \end{aligned}\]
As we will see below, the paths of Brownian motion behave in exactly this way: they are continuous, but their quadratic variation on \([0,t]\) is non-zero (it equals \(t\)), so the Riemann-Stieltjes approach breaks down for them.
Brownian motion as the limit of a symmetric random walk.
Consider a random walk starting at \(0\) with jumps \(h\) and \(-h\), each with probability \(1/2\), at times \(\delta\), \(2\delta\), \(\ldots\), where \(h\) and \(\delta\) are positive numbers. More precisely, let \(\{X_{n}\}_{n=1}^{\infty}\) be a sequence of independent and identically distributed random variables with:
\[\begin{aligned} \mathbb{P}\{X_{j}=h\} & =\mathbb{P}\{X_{j}=-h\}=\frac{1}{2} \end{aligned}\]
Let \(Y_{\delta,h}(0)=0\) and put:
\[\begin{aligned} Y_{\delta,h}(n\delta) & =X_{1}+X_{2}+\ldots+X_{n} \end{aligned}\]
For \(t>0\), define \(Y_{\delta,h}(t)\) by linear interpolation, that is, for \(n\delta<t<(n+1)\delta\), define:
\[\begin{aligned} Y_{\delta,h}(t) & =\frac{(n+1)\delta-t}{\delta}Y_{\delta,h}(n\delta)+\frac{t-n\delta}{\delta}Y_{\delta,h}((n+1)\delta) \end{aligned}\]
We can think of \(Y_{\delta,h}(t)\) as the position of the random walk at time \(t\). In particular, \(X_{1}+X_{2}+\ldots+X_{n}\) is the position of this random walk at time \(n\delta\).
Question. What is the limit of the random walk \(Y_{\delta,h}\) as \(\delta,h\to0\)?
Recall that the characteristic function of a random variable \(X\) is \(\phi_{X}(\lambda)=\mathbb{E}\exp[i\lambda X]\). In order to find out the answer, let us compute the following limit of the characteristic function of \(Y_{\delta,h}(t)\):
\[\lim_{\delta,h\to0}\mathbb{E}\exp\left[i\lambda Y_{\delta,h}(t)\right]\]
where \(\lambda\in\mathbf{R}\) is fixed. For a heuristic derivation, let \(t=n\delta\), so that \(n=t/\delta\). Then we have:
\[\begin{aligned} \mathbb{E}\exp\left[i\lambda Y_{\delta,h}(t)\right] & =\prod_{j=1}^{n}\mathbb{E}e^{i\lambda X_{j}}\\ & =\prod_{j=1}^{n}\left(\frac{1}{2}e^{i\lambda h}+\frac{1}{2}e^{-i\lambda h}\right)\\ & =\left(\frac{1}{2}e^{i\lambda h}+\frac{1}{2}e^{-i\lambda h}\right)^{n}\\ & =\left(\cos\lambda h\right)^{n}\\ & =\left(\cos\lambda h\right)^{t/\delta} \end{aligned}\]
For fixed \(t\) and \(\lambda\), when \(\delta\) and \(h\) approach \(0\) independently, the limit of \(\mathbb{E}\exp\left[i\lambda Y_{\delta,h}(t)\right]\) may not exist. For example, holding \(h\) constant (with \(|\cos\lambda h|<1\)) and letting \(\delta\to0\), the function \(\left(\cos\lambda h\right)^{t/\delta}\to0\). Holding \(\delta\) constant and letting \(h\to0\), the function \(\left(\cos\lambda h\right)^{t/\delta}\to1\). In order for the limit to exist, we impose a certain relationship between \(\delta\) and \(h\). However, depending on the relationship, we may obtain different limits.
Let \(u=(\cos\lambda h)^{1/\delta}\). Then \(\ln u=\frac{1}{\delta}\ln\cos(\lambda h)\). Note that, for small \(h\),
\[\begin{aligned} \cos(\lambda h) & \approx1-\frac{1}{2}\lambda^{2}h^{2} \end{aligned}\]
And \(\ln(1+x)\approx x\). Hence,
\[\ln\cos(\lambda h)\approx\ln\left(1-\frac{1}{2}\lambda^{2}h^{2}\right)\approx-\frac{1}{2}\lambda^{2}h^{2}\]
Therefore, for small \(h\), we have \(\ln u\approx-\frac{1}{2\delta}\lambda^{2}h^{2}\) and so:
\[\begin{aligned} u & \approx\exp\left[-\frac{1}{2\delta}\lambda^{2}h^{2}\right] \end{aligned}\]
In particular, if \(\delta\) and \(h\) are related by \(h^{2}=\delta\), then
\[\begin{aligned} \lim_{\delta\to0}\mathbb{E}\exp\left[i\lambda Y_{\delta,h}(t)\right] & =e^{-\frac{1}{2}\lambda^{2}t} \end{aligned}\]
But, \(e^{-\frac{1}{2}\lambda^{2}t}\) is the characteristic function of a Gaussian random variable with mean \(0\) and variance \(t\). Thus, we have derived the following theorem about the limit of the random walk \(Y_{\delta,h}\) as \(\delta,h\to0\) in such a way that \(h^{2}=\delta\).
Let \(Y_{\delta,h}(t)\) be the random walk starting at \(0\) with jumps \(h\) and \(-h\) equally likely at times \(\delta\), \(2\delta\), \(3\delta\), \(\ldots\). Assume that \(h^{2}=\delta\). Then, for each \(t\geq0\), the limit:
\[\begin{aligned} \lim_{\delta\to0}Y_{\delta,h}(t) & =B(t) \end{aligned}\] exists in distribution. Moreover, we have:
\[\begin{aligned} \mathbb{E}e^{i\lambda B(t)} & =e^{-\frac{1}{2}\lambda^{2}t} \end{aligned}\]
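The theorem can be checked by simulation (a sketch, not part of the derivation; \(t\), \(\delta\), the seed and the sample size are arbitrary choices): with \(h^{2}=\delta\), the random walk value \(Y_{\delta,h}(t)\) has approximately mean \(0\) and variance \(t\).

import numpy as np

rng = np.random.default_rng(4)
t, delta = 2.0, 1e-4
h = np.sqrt(delta)          # the scaling h^2 = delta
n_steps = int(t / delta)
n_paths = 100_000

# Y(t) is a sum of n_steps independent +-h steps,
# i.e. h * (2 * Binomial(n_steps, 1/2) - n_steps)
Y_t = h * (2.0 * rng.binomial(n_steps, 0.5, size=n_paths) - n_steps)

print(Y_t.mean())  # close to 0
print(Y_t.var())   # close to t = 2.0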
(Quadratic Variation of a Brownian motion). Let \((B_{t},t\ge0)\) be a standard Brownian motion. Then, for any sequence of partitions \(0=t_{0}<t_{1}<\ldots<t_{n}=t\) of \([0,t]\) whose mesh \(\left\Vert \Delta_{n}\right\Vert =\max_{j}(t_{j+1}-t_{j})\) tends to \(0\), we have:
\[\begin{aligned} \left\langle B\right\rangle _{t} & =\sum_{j=0}^{n-1}(B_{t_{j+1}}-B_{t_{j}})^{2}\stackrel{L^{2}}{\to}t \end{aligned}\]
where the convergence is in the \(L^{2}\) sense.
It is reasonable to have some sort of convergence as we are dealing with a sum of independent random variables. However, the conclusion would not hold if the increments were not squared. So there is something more at play here.
Proof. We have:
\[\begin{aligned} \mathbb{E}\left[\left(\sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right] & =\mathbb{E}\left[\left(\sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2}-\sum_{j=0}^{n-1}(t_{j+1}-t_{j})\right)^{2}\right]\\ & =\mathbb{E}\left[\left(\sum_{j=0}^{n-1}\left\{ (B(t_{j+1})-B(t_{j}))^{2}-(t_{j+1}-t_{j})\right\} \right)^{2}\right] \end{aligned}\]
For simplicity, we define the variables \(X_{j}=(B(t_{j+1})-B(t_{j}))^{2}-(t_{j+1}-t_{j})\). Then, we may write:
\[\begin{aligned} \mathbb{E}\left[\left(\sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right] & =\mathbb{E}\left[\left(\sum_{j=0}^{n-1}X_{j}\right)^{2}\right]\\ & =\mathbb{E}\left[\sum_{i=0}^{n-1}\sum_{j=0}^{n-1}X_{i}X_{j}\right]\\ & =\sum_{i=0}^{n-1}\sum_{j=0}^{n-1}\mathbb{E}[X_{i}X_{j}] \end{aligned}\]
Now, the random variables \(X_{j}\) are independent.
The expectation of \(X_{j}\) is \(\mathbb{E}[X_{j}]=\mathbb{E}(B(t_{j+1})-B(t_{j}))^{2}-(t_{j+1}-t_{j})=0\).
Since, \(X_{i}\) and \(X_{j}\) are independent, for \(i\neq j\), \(\mathbb{E}[X_{i}X_{j}]=\mathbb{E}X_{i}\cdot\mathbb{E}X_{j}=0\).
Hence, we have:
\[\begin{aligned} \mathbb{E}\left[\left(\sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right] & =\sum_{i=0}^{n-1}\mathbb{E}[X_{i}^{2}] \end{aligned}\]
We now develop the expectation of the square of \(X_{i}\). We have:
\[\begin{aligned} \mathbb{E}[X_{i}^{2}] & =\mathbb{E}\left[\left((B(t_{i+1})-B(t_{i}))^{2}-(t_{i+1}-t_{i})\right)^{2}\right]\\ & =\mathbb{E}\left[((B(t_{i+1})-B(t_{i}))^{4}-2(B(t_{i+1})-B(t_{i}))^{2}(t_{i+1}-t_{i})+(t_{i+1}-t_{i})^{2}\right] \end{aligned}\]
The MGF of the random variable \(B(t_{i+1})-B(t_{i})\) is :
\[\begin{aligned} \phi(\lambda) & =\exp\left[\frac{\lambda^{2}(t_{i+1}-t_{i})}{2}\right]\\ \phi'(\lambda) & =\lambda(t_{i+1}-t_{i})\exp\left[\frac{\lambda^{2}(t_{i+1}-t_{i})}{2}\right]\\ \phi''(\lambda) & =\left[(t_{i+1}-t_{i})+\lambda^{2}(t_{i+1}-t_{i})^{2}\right]\exp\left[\frac{\lambda^{2}(t_{i+1}-t_{i})}{2}\right]\\ \phi^{(3)}(\lambda) & =\left[3\lambda(t_{i+1}-t_{i})^{2}+\lambda^{3}(t_{i+1}-t_{i})^{3}\right]\exp\left[\frac{\lambda^{2}(t_{i+1}-t_{i})}{2}\right]\\ \phi^{(4)}(\lambda) & =\left[3(t_{i+1}-t_{i})^{2}+6\lambda^{2}(t_{i+1}-t_{i})^{3}+\lambda^{4}(t_{i+1}-t_{i})^{4}\right]\exp\left[\frac{\lambda^{2}(t_{i+1}-t_{i})}{2}\right] \end{aligned}\]
Thus, \(\mathbb{E}[(B(t_{i+1})-B(t_{i}))^{4}]=3(t_{i+1}-t_{i})^{2}\). Consequently,
\[\begin{aligned} \mathbb{E}[X_{i}^{2}] & =\mathbb{E}[(B(t_{i+1})-B(t_{i}))^{4}]-2(t_{i+1}-t_{i})\mathbb{E}[(B(t_{i+1})-B(t_{i}))^{2}]+(t_{i+1}-t_{i})^{2}\\ & =3(t_{i+1}-t_{i})^{2}-2(t_{i+1}-t_{i})^{2}+(t_{i+1}-t_{i})^{2}\\ & =2(t_{i+1}-t_{i})^{2} \end{aligned}\]
Putting all this together, we finally have that:
\[\begin{aligned} \mathbb{E}\left[\left(\sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right] & =2\sum_{i=0}^{n-1}(t_{i+1}-t_{i})^{2}\label{eq:second-moment-of-qv}\\ & \leq2\left\Vert \Delta_{n}\right\Vert \sum_{i=0}^{n-1}(t_{i+1}-t_{i})\nonumber \\ & =2\left\Vert \Delta_{n}\right\Vert \cdot t\nonumber \end{aligned}\]
As \(n\to\infty\), \(\left\Vert \Delta_{n}\right\Vert \to0\). Hence,
\[\begin{aligned} \lim_{n\to\infty}\mathbb{E}\left[\left(\sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right] & =0 \end{aligned}\]
Hence, the sequence of random variables
\[\begin{aligned} \sum_{j=0}^{n-1}(B(t_{j+1})-B(t_{j}))^{2} & \stackrel{L^{2}}{\to}t \end{aligned}\] ◻
(Quadratic Variation of a Brownian Motion Path). Let \((B_{s},s\geq0)\) be a Brownian motion. For every \(n\in\mathbf{N}\), consider the dyadic partition \((t_{j},j\leq2^{n})\) of \([0,t]\) where \(t_{j}=\frac{j}{2^{n}}t\). Then we have that:
\[\begin{aligned} \left\langle B\right\rangle _{t} & =\sum_{j=0}^{2^{n}-1}(B_{t_{j+1}}-B_{t_{j}})^{2}\stackrel{a.s.}{\to}t \end{aligned}\]
Proof. We have \((t_{i+1}-t_{i})=\frac{t}{2^{n}}.\) Borrowing equation ([eq:second-moment-of-qv]) from the proof of theorem ([th:quadratic-variation-of-bm-approaches-t-in-mean-square]), we have that:
\[\begin{aligned} \mathbb{E}\left[\left(\sum_{j=0}^{2^{n}-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right] & =2\sum_{i=0}^{2^{n}-1}\left(\frac{t}{2^{n}}\right)^{2}\\ & =2\cdot(2^{n})\cdot\frac{t^{2}}{2^{2n}}\\ & =\frac{2t^{2}}{2^{n}} \end{aligned}\]
By Chebyshev’s inequality,
\[\begin{aligned} \mathbb{P}\left(\left|\sum_{j=0}^{2^{n}-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right|>\epsilon\right) & \leq\frac{1}{\epsilon^{2}}\mathbb{E}\left[\left(\sum_{j=0}^{2^{n}-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right)^{2}\right]\\ & \leq\frac{1}{\epsilon^{2}}\cdot\frac{2t^{2}}{2^{n}} \end{aligned}\]
Define \(A_{n}:=\left\{ \left|\sum_{j=0}^{2^{n}-1}(B(t_{j+1})-B(t_{j}))^{2}-t\right|>\epsilon\right\}\). Since, \(\sum\frac{1}{2^{n}}\) is a convergent series, any multiple of it, \((2t^{2}/\epsilon^{2})\sum\frac{1}{2^{n}}\) also converges. Now, \(0\leq\mathbb{P}(A_{n})\leq\frac{(2t^{2}/\epsilon^{2})}{2^{n}}\). By the comparison test, \(\sum\mathbb{P}(A_{n})\) converges to a finite value. By Theorem ([th:sufficient-condition-for-almost-sure-convergence]),
\[\begin{aligned} \sum_{j=0}^{2^{n}-1}(B(t_{j+1})-B(t_{j}))^{2} & \stackrel{a.s.}{\to}t \end{aligned}\] ◻
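Both results are easy to observe numerically (a sketch; the level \(n\), the value of \(t\) and the seed are arbitrary choices): over a fine dyadic partition the squared increments sum to approximately \(t\), while the absolute increments sum to something large, foreshadowing the next theorem.

import numpy as np

rng = np.random.default_rng(5)
t, n = 1.5, 14                  # dyadic level n, i.e. 2^n subintervals of [0, t]
m = 2 ** n

# Brownian increments over the dyadic partition t_j = j * t / 2^n
increments = rng.normal(0.0, np.sqrt(t / m), size=m)

print(np.sum(increments ** 2))     # quadratic variation, close to t = 1.5
print(np.sum(np.abs(increments)))  # variation along this partition, of order 2^{n/2}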
We are now ready to show that every Brownian motion path has infinite variation.
If \(g\) is a \(C^{1}\) function,
\[\begin{aligned} \int_{0}^{t}|g'(s)|ds & =\int_{0}^{t}\sqrt{g'(s)^{2}}ds\\ & \leq\int_{0}^{t}\sqrt{1+g'(s)^{2}}ds\\ & =l_{g}(t) \end{aligned}\]
where \(l_{g}(t)\) is the arc length of the graph of \(g\) over \([0,t]\). So, \(V_{g}(t)\leq l_{g}(t)\), and further:
\[\begin{aligned} l_{g}(t) & =\int_{0}^{t}\sqrt{1+g'(s)^{2}}ds\\ & \leq\int_{0}^{t}\left(1+\sqrt{g'(s)^{2}}\right)ds\\ & =t+V_{g}(t) \end{aligned}\]
Consequently,
\[\begin{aligned} V_{g}(t) & \leq l_{g}(t)\leq t+V_{g}(t) \end{aligned}\]
The total variation of the function is finite if and only if its arc length is.
Hence, intuitively, our claim is that a Brownian motion path on \([0,T]\) has infinite arc length. Since \(g\in C^{1}([a,b])\Longrightarrow V_{g}(t)<\infty\), it follows by contraposition that \(V_{g}(t)=\infty\Longrightarrow g\notin C^{1}\).
(Brownian Motion paths have unbounded total variation.) Let \((B_{s},s\geq0)\) be a Brownian motion. Then, the random functions \(B(s,\omega)\) on the interval \([0,t]\) have unbounded variation almost surely.
Proof. Take the sequence of dyadic partitions of \([0,t]\): \(t_{j}=\frac{j}{2^{n}}t\), \(n\in\mathbf{N}\), \(j\leq2^{n}\). By pulling out the largest increment, we have the trivial bound for every \(\omega\):
\[\begin{aligned} \sum_{j=0}^{2^{n}-1}(B_{t_{j+1}}(\omega)-B_{t_{j}}(\omega))^{2} & \leq\max_{0\leq j\leq2^{n}-1}\left|B_{t_{j+1}}(\omega)-B_{t_{j}}(\omega)\right|\cdot\sum_{j=0}^{2^{n}-1}\left|B_{t_{j+1}}(\omega)-B_{t_{j}}(\omega)\right|\label{eq:trivial-upper-bound-on-quadratic-variation} \end{aligned}\]
We proceed by contradiction. Let \(A'\) be the set of all \(\omega\) for which the Brownian motion path has finite total variation on \([0,t]\), and let \(A\) be the event that the path has unbounded variation.
By the definition of total variation, for every \(\omega\in A'\) there exists \(M(\omega)<\infty\) such that:
\[\begin{aligned} \sum_{j=0}^{2^{n}-1}\left|B_{t_{j+1}}(\omega)-B_{t_{j}}(\omega)\right| & \leq M(\omega)\quad\text{for all }n \end{aligned}\]
Since Brownian motion paths are continuous on the compact set \([0,t]\), they are uniformly continuous there. So, as \(n\to\infty\), \(\max_{0\leq j\leq2^{n}-1}|t_{j+1}-t_{j}|\to0\) and therefore \(\max_{0\leq j\leq2^{n}-1}\left|B_{t_{j+1}}(\omega)-B_{t_{j}}(\omega)\right|\to0\).
Thus, for every \(\omega\in A'\), the right-hand side of the inequality ([eq:trivial-upper-bound-on-quadratic-variation]) converges to \(0\), and therefore so does the left-hand side. But the left-hand side converges to \(t>0\) almost surely, since \(\left\langle B\right\rangle _{t}\stackrel{a.s.}{\to}t\). So \(A'\) is a null set: \(\mathbb{P}(A')=0\) and \(\mathbb{P}(A)=1\). This closes the proof. ◻
What exactly is \((\Omega,\mathcal{F},\mathbb{P})\) in mathematical finance?
If we make the simplifying assumption that the process paths are continuous, we obtain the set of all continuous functions on \([0,T]\), denoted by \(C[0,T]\). This is a very rich space. In a more general model, it is assumed that the process paths are right continuous with left limits (regular right-continuous RRC, cadlag) functions.
Let the sample space \(\Omega=D[0,T]\) be the set of all RRC functions on \([0,T]\). An element of this set is a RRC function from \([0,T]\) into \(\mathbf{R}\). First we must decide what kind of sets of these functions are measurable? The simplest set for which we would like to calculate the probabilities are sets of the form \(\{a\leq S(t_{1})\leq b\}\) for some \(t_{1}\). If \(S(t)\) represents the price of a stock at time \(t\), then the probability of such a set gives the probability that the stock price at time \(t_{1}\) is between \(a\) and \(b\). We are also interested in how the price of the stock at time \(t_{1}\) affects the price at another time \(t_{2}\). Thus, we need to talk about the joint distribution of stock prices \(S(t_{1})\) and \(S(t_{2})\). This means that we need to define probability on the sets of the form \(\{S(t_{1})\in B_{1},S(t_{2})\in B_{2}\}\) where \(B_{1}\) and \(B_{2}\) are intervals on the line. More generally, we would like to have all the finite-dimensional distributions of the process \(S(t)\), that is, the probabilities of the sets: \(\{S(t_{1})\in B_{1},S(t_{2})\in B_{2},\ldots,S(t_{n})\in B_{n}\}\) for any choice of \(0\leq t_{1}\leq\ldots\leq t_{n}\leq T\).
The sets of the form \(A=\{\omega(\cdot)\in D[0,T]:\omega(t_{1})\in B_{1},\ldots,\omega(t_{n})\in B_{n}\}\), where \(B_{i}\)’s are borel subsets of \(\mathbf{R}\), are called cylinder sets or finite-dimensional rectangles.
The stochastic process \(S(t)\) is just a (function-valued) random variable on this sample space, which takes some value \(\omega(t)\) - the value of the function \(\omega\) at \(t\).
Let \(\mathcal{R}\) be the collection of all cylindrical subsets of \(D[0,T]\). Obviously \(\mathcal{R}\) is not a \(\sigma\)-field.
The probability is first defined on the elements of \(\mathcal{R}\). Let \(A\in\mathcal{R}\) be the cylinder set determined by the times \(0<t_{1}<\ldots<t_{n}\leq T\) and the Borel sets \(B_{1},\ldots,B_{n}\). Then (with the conventions \(t_{0}=0\) and \(u_{0}=0\)):
\[\begin{aligned} \mathbb{P}(A) & =\int_{B_{1}}\cdots\int_{B_{n}}\prod_{i=1}^{n}\frac{1}{\sqrt{(2\pi)(t_{i}-t_{i-1})}}\exp\left[-\frac{(u_{i}-u_{i-1})^{2}}{2(t_{i}-t_{i-1})}\right]du_{1}\cdots du_{n} \end{aligned}\]
and then extended to the \(\sigma\)-field generated by the cylinders, that is, the smallest \(\sigma\)-algebra containing all the cylindrical subsets of \(D[0,T]\). Thus, \(\mathcal{F}=\mathcal{B}(D[0,T])\).
Hence, \((\Omega,\mathcal{F},\mathbb{P})=(D[0,T],\mathcal{B}(D[0,T]),\mathbb{P})\) is a probability space. It is called the Wiener space, and \(\mathbb{P}\) here is called the Wiener measure.
Continuity and Regularity of paths.
As discussed in the previous section, a stochastic process is determined by its finite-dimensional distributions. In studying stochastic processes, it is often natural to think of them as function-valued random variables. Let \(S(t)\) be defined for \(0\leq t\leq T\); then for a fixed \(\omega\), it is a function of \(t\), called the sample path or a realization of \(S\). Finite-dimensional distributions do not determine the continuity properties of sample paths. The following example illustrates this.
Let \(X(t)=0\) for all \(t\), \(0\leq t\leq1\), and let \(\tau\) be a uniformly distributed random variable on \([0,1]\). Let \(Y(t)=0\) for \(t\neq\tau\) and \(Y(t)=1\) if \(t=\tau\). Then, for any fixed \(t\), \(\mathbb{P}(Y(t)\neq0)=\mathbb{P}(\tau=t)=0\), and hence \(\mathbb{P}(Y(t)=0)=1\). So all one-dimensional distributions of \(X(t)\) and \(Y(t)\) are the same. Similarly, all finite-dimensional distributions of \(X\) and \(Y\) are the same. However, the sample paths of the process \(X\), that is, the functions \((X(t))_{0\leq t\leq1}\), are continuous in \(t\), whereas every sample path \((Y(t))_{0\leq t\leq1}\) has a jump at the (random) point \(\tau\). Notice that \(\mathbb{P}(X(t)=Y(t))=1\) for all \(t\), \(0\leq t\leq1\).
Two stochastic processes are called versions (modifications) of one another if
\[\mathbb{P}(X(t)=Y(t))=1\quad\text{for all }0\leq t\leq T\]
Thus, the two processes in example ([ex:modifications-of-a-stochastic-process]) are versions of one another; one has continuous sample paths, the other does not. If we agree to pick any version of the process we want, then we can pick the continuous version when it exists. In general, we choose the smoothest possible version of the process.
For two processes \(X\) and \(Y\), denote \(N_{t}=\{X(t)\neq Y(t)\}\), \(0\leq t\leq T\). In the above example, \(\mathbb{P}(N_{t})=\mathbb{P}(\tau=t)=0\) for any \(t\), \(0\leq t\leq1\). However, \(\mathbb{P}(\bigcup_{0\leq t\leq1}N_{t})=\mathbb{P}(\tau=t\:\text{for some }t\:\text{in }[0,1])=1\). Although each \(N_{t}\) is a \(\mathbb{P}\)-null set, the union \(N=\bigcup_{0\leq t\leq1}N_{t}\) is an uncountable union of null sets, and in this particular case it is a set of probability one.
If it happens that \(\mathbb{P}(N)=0\), then \(N\) is called an evanescent set, and the processes \(X\) and \(Y\) are called indistinguishable. Note that in this case, \(\mathbb{P}(\{\omega:\exists t:X(t)\neq Y(t)\})=\mathbb{P}(\bigcup_{0\leq t\leq1}\{X(t)\neq Y(t)\})=0\) and \(\mathbb{P}(\bigcap_{0\leq t\leq1}\{X(t)=Y(t)\})=1\). It is clear that if time is discrete, then any two versions of the process are indistinguishable. It is also not hard to see that if \(X(t)\) and \(Y(t)\) are versions of one another and both are right-continuous, then they are indistinguishable.
(Paul Levy’s construction of Brownian Motion). Standard Brownian motion exists.
Proof. I reproduce the standard proof as presented in Brownian Motion by Morters and Peres. I added some remarks for greater clarity.
Let
\[\begin{aligned} \mathcal{D}_{n} & =\left\{ \frac{k}{2^{n}}:k=0,1,2,\ldots,2^{n}\right\} \end{aligned}\]
be a finite set of dyadic points.
Let
\[\begin{aligned} \mathcal{D} & =\bigcup_{n=0}^{\infty}\mathcal{D}_{n} \end{aligned}\]
Let \(\{Z_{t}:t\in\mathcal{D}\}\) be a collection of independent, standard normally distributed random variables. This is a countable set of random variables.
Let \(B(0):=0\) and \(B(1):=Z_{1}\).
For each \(n\in\mathbf{N}\), we define the random variables \(B(d)\), \(d\in\mathcal{D}_{n}\) such that, the following invariant holds:
(1) for all \(r<s<t\) in \(\mathcal{D}_{n}\) the random variable \(B(t)-B(s)\) is normally distributed with mean zero and variance \(t-s\) and is independent of \(B(s)-B(r)\).
(2) the vectors \((B(d):d\in\mathcal{D}_{n})\) and \((Z_{t}:t\in\mathcal{D}\setminus\mathcal{D}_{n})\) are independent.
Note that we have already done this for \(\mathcal{D}_{0}=\{0,1\}\). Proceeding inductively, assume that the above holds for some \(n-1\). We want to prove that the invariant also holds for \(n\).
We define \(B(d)\) for \(d\in\mathcal{D}_{n}\backslash\mathcal{D}_{n-1}\) by:
\[\begin{aligned} B(d) & =\frac{B(d-2^{-n})+B(d+2^{-n})}{2}+\frac{Z_{d}}{2^{(n+1)/2}} \end{aligned}\]
Note that, the points \(0,\frac{1}{2^{n-1}},\ldots,\frac{k}{2^{n-1}},\frac{k+1}{2^{n-1}},\ldots,1\) belong to \(\mathcal{D}_{n-1}\). The first summand is the linear interpolation of the values of \(B\) at the neighbouring points of \(d\) in \(\mathcal{D}_{n-1}\). That is,
\[\begin{aligned} B\left(\frac{2k+1}{2^{n}}\right) & =\frac{B\left(\frac{k}{2^{n-1}}\right)+B\left(\frac{k+1}{2^{n-1}}\right)}{2}+\frac{Z_{d}}{2^{(n+1)/2}} \end{aligned}\]
Since the invariant holds for \(n-1\), \(B(d-2^{-n})\) and \(B(d+2^{-n})\) have no dependence on \((Z_{t}:t\in\mathcal{D}\setminus\mathcal{D}_{n-1})\). Consequently, \(B(d)\) has no dependence on \((Z_{t}:t\in\mathcal{D}\setminus\mathcal{D}_{n})\) and the second property is fulfilled.
Moreover, as \(\frac{1}{2}[B(d+2^{-n})-B(d-2^{-n})]\) depends only on \((Z_{t}:t\in\mathcal{D}_{n-1})\), it is independent of \(\frac{Z_{d}}{2^{(n+1)/2}}\). By our induction assumption, both are normally distributed with mean \(0\) and variance \(\frac{1}{2^{n+1}}\).
So, their sum and difference random variables
\[\begin{aligned} B(d)-B(d-2^{-n}) & =\frac{B(d+2^{-n})-B(d-2^{-n})}{2}+\frac{Z_{d}}{2^{(n+1)/2}}\\ B(d+2^{-n})-B(d) & =\frac{B(d+2^{-n})-B(d-2^{-n})}{2}-\frac{Z_{d}}{2^{(n+1)/2}} \end{aligned}\]
are also independent, with mean \(0\) and variance \(\frac{1}{2^{n}}\) (the variance of independent random variables is the sum of the variances).
Indeed, all increments \(B(d)-B(d-2^{-n})\) for \(d\in\mathcal{D}_{n}\setminus\{0\}\) are independent. To see this, it suffices to show that they are pairwise independent. We have seen in the previous paragraph that pairs \(B(d)-B(d-2^{-n})\) and \(B(d+2^{-n})-B(d)\) with \(d\in\mathcal{D}_{n}\setminus\mathcal{D}_{n-1}\) are independent. The other possibility is that the increments are over intervals separated by some \(d\in\mathcal{D}_{n-1}\). For concreteness, if \(n\) were \(3\), then the increments \(B_{7/8}-B_{6/8}\) and \(B_{5/8}-B_{4/8}\) are separated by \(d=\frac{3}{4}\in\mathcal{D}_{2}\). Choose \(d\in\mathcal{D}_{j}\) with this property and minimal \(j\), so that the two intervals are contained in \([d-2^{-j},d]\) and \([d,d+2^{-j}]\) respectively. By induction, the increments over these two intervals of length \(2^{-j}\) are independent, and the increments over the intervals of length \(2^{-n}\) are constructed from the independent increments \(B(d)-B(d-2^{-j})\) and \(B(d+2^{-j})-B(d)\) using disjoint sets of variables \((Z_{t}:t\in\mathcal{D}_{n})\). Hence, they are independent, which gives pairwise independence and implies the first property. Consequently, the vector of increments \((B(d)-B(d-2^{-n}):d\in\mathcal{D}_{n}\setminus\{0\})\) is Gaussian with independent components.
Having thus chosen the value of the process on all the dyadic points, we interpolate between them. Formally, we define:
\[\begin{aligned} F_{0}(t) & =\begin{cases} Z_{1} & \text{for }t=1\\ 0 & \text{for }t=0\\ \text{linear in between} \end{cases} \end{aligned}\]
and for each \(n\geq1\),
\[\begin{aligned} F_{n}(t) & =\begin{cases} \frac{Z_{t}}{2^{(n+1)/2}} & \text{for }t\in\mathcal{D}_{n}\setminus\mathcal{D}_{n-1}\\ 0 & \text{for }t\in\mathcal{D}_{n-1}\\ \text{linear between consecutive points in }\mathcal{D}_{n} \end{cases} \end{aligned}\]
These functions are continuous on \([0,1]\) and for all \(n\) and \(d\in\mathcal{D}_{n}\), we have:
\[\begin{aligned} B(d) & =\sum_{i=0}^{n}F_{i}(d)=\sum_{i=0}^{\infty}F_{i}(d)\label{eq:claim-of-induction-for-bd} \end{aligned}\]
To see this, assume that above equation holds for all \(d\in\mathcal{D}_{n-1}\).
Let’s consider the point \(d\in\mathcal{D}_{n}\setminus\mathcal{D}_{n-1}\).
\[\begin{aligned} B(d) & =\frac{B(d-2^{-n})+B(d+2^{-n})}{2}+\frac{Z_{d}}{2^{(n+1)/2}}\nonumber \\ & =\sum_{i=0}^{n-1}\frac{F_{i}(d-2^{-n})+F_{i}(d+2^{-n})}{2}+\frac{Z_{d}}{2^{(n+1)/2}}\label{eq:expression-for-bd} \end{aligned}\]
Now, \(d-2^{-n}\) and \(d+2^{-n}\) are consecutive points of \(\mathcal{D}_{n-1}\). Therefore, for \(i=0,1,\ldots,n-1\), the function \(F_{i}\) (whose breakpoints lie in \(\mathcal{D}_{i}\subseteq\mathcal{D}_{n-1}\)) is linear on the interval \([d-2^{-n},d+2^{-n}]\), so the points \((d-2^{-n},F_{i}(d-2^{-n}))\) and \((d+2^{-n},F_{i}(d+2^{-n}))\) lie on a straight line with \((d,F_{i}(d))\) as their midpoint. In particular, \(F_{i}(d)=[F_{i}(d-2^{-n})+F_{i}(d+2^{-n})]/2\) for every \(i=0,1,\ldots,n-1\).
To summarize, the first term on the right hand side of expression ([eq:expression-for-bd]) is equal to \(\sum_{i=0}^{n-1}F_{i}(d)\). By mathematical induction, it follows that the claim ([eq:claim-of-induction-for-bd]) is true for all \(n\in\mathbf{N}\).
It is easy to find an upper bound on the probability in the Gaussian tail. Suppose \(X\sim N(0,1)\) and let \(x>0\). We are interested in the tail probability \(\mathbb{P}(X>x)\). Since the normalizing constant \(\frac{1}{\sqrt{2\pi}}\) is less than \(1\), we have:
\[\begin{aligned} \mathbb{P}(X>x) & =\frac{1}{\sqrt{2\pi}}\int_{x}^{\infty}e^{-x^{2}/2}dx\leq\int_{x}^{\infty}e^{-x^{2}/2}dx=\int_{x}^{\infty}\frac{xe^{-x^{2}/2}}{x}dx \end{aligned}\]
Integrating by parts with \(u=\frac{1}{x}\) and \(dv=xe^{-x^{2}/2}dx\), so that \(du=-\frac{1}{x^{2}}dx\) and \(v=-e^{-x^{2}/2}\), we obtain:
\[\begin{aligned} \mathbb{P}(X>x) & \leq-\left.\frac{1}{x}e^{-x^{2}/2}\right|_{x}^{\infty}-\int_{x}^{\infty}\frac{e^{-x^{2}/2}}{x^{2}}dx\\ & =\frac{e^{-x^{2}/2}}{x}-\int_{x}^{\infty}\frac{e^{-x^{2}/2}}{x^{2}}dx\\ & \quad\left\{ I(x)=\int_{x}^{\infty}\frac{e^{-x^{2}/2}}{x^{2}}dx\geq0\right\} \\ & \leq\frac{e^{-x^{2}/2}}{x} \end{aligned}\]
Thus, for \(c>1\) and large \(n\) (so that \(\frac{2}{c\sqrt{n}}\leq1\)), we have:
\[\begin{aligned} \mathbb{P}(|Z_{d}|\geq c\sqrt{n}) & \leq\frac{2}{c\sqrt{n}}e^{-c^{2}n/2}\leq\exp\left(-\frac{c^{2}n}{2}\right) \end{aligned}\]
So, by the union bound, the series:
\[\begin{aligned} \sum_{n=0}^{\infty}\mathbb{P}\left\{ \text{There exists at least one }d\in\mathcal{D}_{n}\text{ with }|Z_{d}|\geq c\sqrt{n}\right\} & \leq\sum_{n=0}^{\infty}\sum_{d\in\mathcal{D}_{n}}\mathbb{P}\left\{ |Z_{d}|\geq c\sqrt{n}\right\} \\ & \leq\sum_{n=0}^{\infty}(2^{n}+1)\exp\left(-\frac{c^{2}n}{2}\right) \end{aligned}\]
Now, the sequence \((a_{n})\) given by \(a_{n}:=(2^{n}+1)e^{-c^{2}n/2}\) has ratio of successive terms:
\[\begin{aligned} \lim\left|\frac{a_{n+1}}{a_{n}}\right| & =\lim_{n\to\infty}\frac{2^{n+1}+1}{2^{n}+1}\cdot\frac{e^{c^{2}n/2}}{e^{c^{2}(n+1)/2}}\\ & =\lim_{n\to\infty}\frac{2+\frac{1}{2^{n}}}{1+\frac{1}{2^{n}}}\cdot\frac{1}{e^{c^{2}/2}}\\ & =\frac{2}{e^{c^{2}/2}} \end{aligned}\]
If this ratio is less than unity, that is, if \(c>\sqrt{2\log2}\), then by the ratio test, \(\sum(2^{n}+1)e^{-c^{2}n/2}\) converges to a finite value. Fix such a \(c\).
By BCL1 (the first Borel-Cantelli Lemma), if \(A_{n}:=\left\{ \text{There exists at least one }d\in\mathcal{D}_{n}\text{ with }|Z_{d}|\geq c\sqrt{n}\right\}\) and \(\sum_{n=0}^{\infty}\mathbb{P}(A_{n})\) is finite, then with probability \(1\) only finitely many of the events \(A_{n}\) occur. That is, almost surely there exists a (random) \(N\in\mathbf{N}\) such that for all \(n\geq N\) and all \(d\in\mathcal{D}_{n}\), \(|Z_{d}|<c\sqrt{n}\). It follows that, for all \(n\geq N\):
\[\begin{aligned} \sup_{t\in[0,1]}|F_{n}(t)| & \leq\frac{c\sqrt{n}}{2^{(n+1)/2}} \end{aligned}\]
Define
\[\begin{aligned} M_{n} & =\frac{c\sqrt{n}}{2^{(n+1)/2}} \end{aligned}\]
Since \(\sum M_{n}\) converges, by the Weierstrass \(M\)-test, the series of functions \(\sum_{n=0}^{\infty}F_{n}(t)\) converges uniformly on \([0,1]\) (almost surely). Since each \(F_{n}(t)\) is piecewise linear and continuous, by the term-by-term continuity theorem, the limit \(B(t)=\sum_{n=0}^{\infty}F_{n}(t)\) is continuous on \([0,1]\). ◻
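The construction above translates directly into a short simulation routine (a sketch of the level-by-level refinement; the number of levels and the seed are arbitrary, and only the values at dyadic points are produced):

import numpy as np

def levy_brownian(levels, rng):
    # Values of B at the dyadic points k / 2^levels, built level by level via
    # B(d) = (B(d - 2^-n) + B(d + 2^-n)) / 2 + Z_d / 2^{(n+1)/2}
    B = np.array([0.0, rng.standard_normal()])  # B(0) = 0 and B(1) = Z_1
    for n in range(1, levels + 1):
        midpoints = 0.5 * (B[:-1] + B[1:]) + rng.standard_normal(B.size - 1) / 2 ** ((n + 1) / 2)
        refined = np.empty(2 * B.size - 1)
        refined[0::2] = B           # keep the values at the old dyadic points
        refined[1::2] = midpoints   # insert the new dyadic points
        B = refined
    return B

rng = np.random.default_rng(6)
path = levy_brownian(levels=10, rng=rng)
print(np.var(np.diff(path)), 2.0 ** -10)  # increment variance close to the step size 2^{-10}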
A point of comparison: The Poisson Process.
Like the Brownian motion, the Poisson process is defined as a process with stationary and independent increments.
A process \((N_{t},t\geq0)\) defined on \((\Omega,\mathcal{F},\mathbb{P})\) has the distribution of the Poisson process with rate \(\lambda>0\), if and only if the following hold:
(1) \(N_{0}=0\).
(2) For any \(s<t\), the increment \(N_{t}-N_{s}\) is a Poisson random variable with parameter \(\lambda(t-s).\)
(3) For any \(n\in\mathbf{N}\) and any choice \(0<t_{1}<t_{2}<\ldots<t_{n}<\infty\), the increments \(N_{t_{2}}-N_{t_{1}},N_{t_{3}}-N_{t_{2}},\ldots,N_{t_{n}}-N_{t_{n-1}}\) are independent.
Poisson paths can be sampled using this definition. By construction, it is not hard to see that the paths of Poisson processes are piecewise constant, integer-valued and non-decreasing. In particular, the paths of Poisson processes have finite variation. Poisson paths are much simpler than those of Brownian motion in many ways!
(Simulating the Poisson Process.) Use the definition ([def:poisson-process]) to generate \(10\) paths of the Poisson process with rate \(1\) on the interval \([0,10]\) with step-size \(0.01\).
import numpy as np

def generatePoissonProcess(lam, T, stepSize):
    # Number of grid steps on [0, T]
    N = int(T / stepSize)
    # Independent Poisson increments over intervals of length stepSize, each with parameter lam * stepSize
    x = np.random.poisson(lam=lam * stepSize, size=N)
    # Partial sums give the path on the grid 0, stepSize, 2*stepSize, ..., T
    y = np.cumsum(x)
    y = np.concatenate([[0.0], y])
    return y
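With this function, the ten requested paths can be generated, for instance, as follows (plotting them against the time grid \(0,0.01,\ldots,10\) is then straightforward):

paths = [generatePoissonProcess(lam=1.0, T=10.0, stepSize=0.01) for _ in range(10)]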
We can construct a Poisson process as follows. Consider \((\tau_{j},j\in\mathbf{N})\), IID exponential random variables with mean \(1/\lambda\). One should think of \(\tau_{j}\) as the waiting time from the \((j-1)\)st to the \(j\)th jump. Then, one defines:
\[\begin{aligned} N_{t} & =\#\{k:\tau_{1}+\tau_{2}+\ldots+\tau_{k}\leq t\}\\ & =\text{Number of jumps upto and including time }t \end{aligned}\]
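This waiting-time construction is also easy to implement (a sketch only; the function name, rate, horizon and seed are my own choices):

import numpy as np

def poisson_from_waiting_times(lam, T, rng):
    # Jump times are partial sums of IID exponential waiting times with mean 1/lam;
    # N_t is the number of jump times up to and including t
    jump_times = []
    s = rng.exponential(1.0 / lam)
    while s <= T:
        jump_times.append(s)
        s += rng.exponential(1.0 / lam)
    return np.array(jump_times)

rng = np.random.default_rng(7)
jumps = poisson_from_waiting_times(lam=1.0, T=10.0, rng=rng)
print(len(jumps))  # N_10, a Poisson random variable with mean 10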
Now, here is an idea! What about defining a new process with stationary and independent increments using a given distribution other than Poisson and Gaussian? Is this even possible? The answer is yes, but only if the distribution satisfies the property of being infinitely divisible. To see this, consider the value of the process at time \(1\), \(N_{1}\). Then, no matter how many subintervals we chop the interval \([0,1]\) into, we must have the increments add up to \(N_{1}\). In other words, we must be able to write \(N_{1}\) as a sum of \(n\) IID random variables for every possible \(n\). This is certainly true for Poisson random variables and Gaussian random variables. Another example is the Cauchy distribution. In general, processes that can be constructed using independent, stationary increments are called Levy processes.
Time Inversion. Let \((B_{t},t\geq0)\) be a standard Brownian motion. We consider the process:
\[\begin{aligned} X_{t} & =tB_{1/t}\quad\text{for }t>0 \end{aligned}\]
This property relates the behavior of the process for large \(t\) to its behavior for small \(t\).
(a) Show that \((X_{t},t>0)\) has the distribution of Brownian motion on \(t>0\).
Proof.
Like \(B(t)\), it is an easy exercise to prove that \(X(t)\) is also a Gaussian process.
We have, \(\mathbb{E}[X_{s}]=0\).
Let \(s<t\). We have:
\[\begin{aligned} Cov(X_{s},X_{t}) & =\mathbb{E}[sB(1/s)\cdot tB(1/t)]\\ & =st\mathbb{E}[B(1/s)\cdot B(1/t)]\\ & =st\cdot\frac{1}{t}\\ & \quad\left\{ \because\frac{1}{t}<\frac{1}{s}\right\} \\ & =s \end{aligned}\]
Since \(X\) is a Gaussian process with mean \(0\) and covariance \(\mathbb{E}[X_{s}X_{t}]=s=\min\{s,t\}\) for \(s<t\), it has the same finite-dimensional distributions as Brownian motion. Consequently, \((X_{t},t>0)\) has the distribution of a Brownian motion.
(b) Argue that \(X(t)\) converges to \(0\) as \(t\to0\) in the sense of \(L^{2}\)-convergence. It is possible to show convergence almost surely so that \((X_{t},t\geq0)\) is really a Brownian motion for \(t\geq0\).
Solution.
Let \((t_{n})\) be any arbitrary sequence of positive real numbers approaching \(0\) and consider the sequence of random variables \((X(t_{n}))_{n=1}^{\infty}\). We have:
\[\begin{aligned} \mathbb{E}\left[X(t_{n})^{2}\right] & =\mathbb{E}\left[t_{n}^{2}B(1/t_{n})^{2}\right]\\ & =t_{n}^{2}\mathbb{E}\left[B(1/t_{n})^{2}\right]\\ & =t_{n}^{2}\cdot\frac{1}{t_{n}}\\ & =t_{n} \end{aligned}\]
Hence,
\[\begin{aligned} \lim\mathbb{E}\left[X(t_{n})^{2}\right] & =\lim t_{n}=0 \end{aligned}\]
Since \((t_{n})\) was an arbitrary sequence, it follows that \(\lim_{t\to0}\mathbb{E}[(X(t))^{2}]=0\).
(c) Use this property of Brownian motion to show the law of large numbers for Brownian motion:
\[\begin{aligned} \lim_{t\to\infty}\frac{B(t)}{t} & =0\quad\text{almost surely} \end{aligned}\]
Solution.
What we need to do is to show that \(X(t)\to0\) as \(t\to0\) almost surely. That would show that \(\frac{B(1/t)}{1/t}\to0\) as \(t\to0\) almost surely, which is the same as showing \(\frac{B(t)}{t}\to0\) as \(t\to\infty\), which is the law of large numbers for Brownian motion.
What we have done in part (b) is to prove the claim that \(\mathbb{E}[X(t)^{2}]\to0\) as \(t\to0\), which shows convergence in the \(L^{2}\) sense and hence convergence in probability. This is in fact the weak law of large numbers: \(\frac{B(t)}{t}\stackrel{\mathbb{\mathbf{P}}}{\to}0\) as \(t\to\infty\).
For \(t>0\), continuity of \(X\) is clear. What remains is to prove that \(X(t)\to0\) as \(t\to0\) almost surely.
Note that, the limit \(X(t)\to0\) as \(t\to0\) if and only if \((\forall n\geq1)\), \((\exists m\geq1)\), such that \(\forall r\in\mathbb{Q}\cap(0,\frac{1}{m}]\), we have \(|X(r)|=\left|rB\left(\frac{1}{r}\right)\right|\leq\frac{1}{n}\).
To understand the above, we just recall the \(\epsilon-\delta\) definition of continuity. Note that \(\frac{1}{n}\) plays the role of \(\epsilon\) and \(\frac{1}{m}\) works as \(\delta\).
That is,
\[\begin{aligned} \Omega^{X}:=\left\{ \lim_{t\to0}X(t)=0\right\} & =\bigcap_{n\geq1}\bigcup_{m\geq1}\bigcap_{r\in\mathbb{Q}\cap(0,\frac{1}{m}]}\left\{ \left|X(r)\right|\leq\frac{1}{n}\right\} \end{aligned}\]
Also, note that \(X(t)\) is continuous on \((0,\infty)\), so the event \(\{\lim_{t\to0}X(t)=0\}\) is determined by the values of \(X\) at rational times; this justifies restricting to \(r\in\mathbb{Q}\) in the display above. We already know from part (a) that \((X(t))_{t>0}\) and \((B(t))_{t>0}\) have the same finite-dimensional distributions, and \(\Omega^{X}\) is built from countably many such coordinates. Therefore, \(\Omega^{X}\) has the same probability as \(\Omega^{B}:=\bigcap_{n\geq1}\bigcup_{m\geq1}\bigcap_{r\in\mathbb{Q}\cap(0,\frac{1}{m}]}\left\{ \left|B(r)\right|\leq\frac{1}{n}\right\}\). Since \(B(t)\to0\) as \(t\to0\) almost surely (by continuity of \(B\) at \(0\)), the event \(\Omega^{B}\) has probability \(1\). Thus, \(\mathbb{P}\left\{ \lim_{t\to0}X(t)=0\right\} =1\).
This actually shows that \(X(t)\), with \(X(0):=0\), is a bona fide standard Brownian motion on \([0,\infty)\), as we have established continuity at \(0\) as well.
References
- Introduction to Stochastic Calculus with Applications, Fima C Klebaner
- Brownian Motion Calculus, Ubbo Wiersema