(For new readers and those sending friend requests, please read my bulletin board first.)

After discussing the fundamentals in my first two blogs, we are now ready to talk about the really practically useful subject of stochastic processes. First we start with **Stochastic Sequences**: . . . , x_{1}, x_{2}, x_{3}, . . . , x_{i}, . . . , x_{t}, x_{t+1}, . . . , which is nothing but a sequence of random variables indexed by an independent integer variable i = . . . , 1, 2, . . . , t, t+1, . . . . Everything we said about n r.v.s in the first blog, **PSPT1**, applies. If we concentrate on the r.v.s 1, . . . , t, then for specification we need the joint probability density function p(x_{1}, x_{2}, x_{3}, . . . , x_{t}), and for rough characterization we have the mean t-vector [E(x_{i})], whose typical element is the mean of the r.v. x_{i}, and the t×t covariance matrix [s_{ij}], whose typical element s_{ij} is the covariance of x_{i} with x_{j}. The first row of [s_{ij}] consists of the elements s_{11}, s_{12}, s_{13}, . . . , s_{1t}, where s_{11} is the variance of x_{1}. (If the above statements are not obvious to you, then you need to go back to **PSPT1** to review the material.) However, here the terminology is by tradition slightly different. Instead of the covariances of x_{1} with x_{2}, x_{3}, . . . , x_{t}, we call this a **correlation sequence** and denote it with the notation x_{1i}, i = 1, 2, . . . , t. The rationale for this notational complication is this:

In general x_{i} could itself be a vector r.v., and we need to talk about the covariance among its elements. Thus, for dependence between r.v.s across time we use the term correlation instead. Covariances are then restricted to r.v.s occurring at the same instant of time to avoid confusion. But it should be emphasized that mathematically these two types of second-order characterization are exactly the same. Everything you need conceptually to deal with stochastic sequences has already been covered in **PSPT1**, except for computational considerations.
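
To make the mean t-vector and the t×t matrix [s_{ij}] concrete, here is a minimal sketch that estimates both from simulated realizations. The random-walk sequence, the seed, the sample size, and all function names are my own illustrative assumptions, not part of the tutorial; for this particular sequence the exact covariance of x_{i} with x_{j} is min(i, j).

```python
import random

def sample_mean_and_cov(realizations):
    """Estimate the mean t-vector and the t-by-t matrix [s_ij]
    from a list of realizations, each a list [x_1, ..., x_t]."""
    n = len(realizations)
    t = len(realizations[0])
    mean = [sum(r[i] for r in realizations) / n for i in range(t)]
    cov = [
        [
            sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in realizations) / (n - 1)
            for j in range(t)
        ]
        for i in range(t)
    ]
    return mean, cov

def random_walk(t, rng):
    """Hypothetical example sequence: x_i = x_{i-1} + unit-variance Gaussian noise."""
    x, path = 0.0, []
    for _ in range(t):
        x += rng.gauss(0.0, 1.0)
        path.append(x)
    return path

rng = random.Random(0)  # fixed seed so the run is repeatable
realizations = [random_walk(4, rng) for _ in range(20000)]
mean, cov = sample_mean_and_cov(realizations)

# First row of [s_ij]: the dependence of x_1 on x_1, x_2, ..., x_t.
correlation_sequence = cov[0]
```

Even at t = 4 the matrix already has 16 entries; at t = 1000 it has a million, which is the computational burden discussed next.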

In computations dealing with time, the number of time instants we need is often in the hundreds or thousands. Dealing with a thousand-by-thousand matrix or a thousand-variable density function is cumbersome at best and often impractical. We need to consider special cases which simplify things. The first drastic simplification is to have an Independent Stochastic Sequence, i.e.,

p(x_{1}, . . . , x_{t}) = p(x_{1}) p(x_{2}) . . . p(x_{t})

Colloquially such a sequence is called a **white noise** sequence. If furthermore

p(x_{1}) = p(x_{2}) = . . . = p(x_{t}),

then we have a **stationary white noise** sequence. It is then also referred to as an **i.i.d. sequence** (**i**ndependent and **i**dentically **d**istributed).
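
As a rough illustration of how much the independence assumption buys, the sketch below evaluates a thousand-variable joint density using only a single one-variable function. The standard Gaussian marginal and the function names are assumptions of mine for the example; the sum of logarithms is used because the raw product of a thousand densities would underflow in floating point.

```python
import math

def gaussian_log_pdf(x, mean=0.0, std=1.0):
    """Log of the one-variable Gaussian density (assumed marginal for this sketch)."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def iid_log_joint_density(xs, log_marginal=gaussian_log_pdf):
    """log p(x_1, ..., x_t) = log p(x_1) + ... + log p(x_t)
    for a stationary white noise (i.i.d.) sequence."""
    return sum(log_marginal(x) for x in xs)

# A thousand-variable density evaluated at the origin.
log_joint = iid_log_joint_density([0.0] * 1000)
```

The whole sequence is specified by one function of one variable, instead of one function of a thousand variables.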

But i.i.d. is often too restrictive an assumption for most stochastic sequences observed in practice. Thus, we introduce the next order of complication:

**The Markov Assumption.** This assumption is often captured by the statement that “**knowledge of the present separates the past from the future**”. Mathematically, we say,

p(x_{t+1}/x_{t}, x_{t-1}, . . . , x_{1}) = p(x_{t+1}/x_{t}) for all t     (1)

Eq(1) immediately simplifies a t-variable function into products of 2-variable functions, i.e.,

p(x_{t}, x_{t-1}, . . . , x_{1}) = p(x_{t}/x_{t-1}) p(x_{t-1}/x_{t-2}) . . . p(x_{2}/x_{1}) p(x_{1})     (2)
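
The factorization in Eq. (2) can be tried out on a small discrete example. The two-state chain below, its probabilities, and the function names are hypothetical values chosen only for illustration. The sketch builds the joint probability of a path from p(x_{1}) and the transition probabilities, then recovers the conditional in Eq. (1) from that joint.

```python
from itertools import product

# Hypothetical two-state chain with states 0 and 1.
initial = [0.6, 0.4]        # p(x_1)
transition = [              # transition[a][b] = p(x_{k+1} = b / x_k = a)
    [0.9, 0.1],
    [0.3, 0.7],
]

def markov_joint(path):
    """Eq. (2): p(x_t, ..., x_1) = p(x_t/x_{t-1}) ... p(x_2/x_1) p(x_1)."""
    p = initial[path[0]]
    for prev, nxt in zip(path, path[1:]):
        p *= transition[prev][nxt]
    return p

def conditional_next(history, nxt):
    """p(x_{k+1} = nxt / x_k, ..., x_1), computed as a ratio of joint probabilities."""
    return markov_joint(history + (nxt,)) / markov_joint(history)

t = 10
# Sum the joint over all 2**t paths; a valid joint must total 1.
total = sum(markov_joint(path) for path in product((0, 1), repeat=t))

# Two histories that differ in the distant past but share the same present state.
p_a = conditional_next((0, 1, 1, 0), 1)
p_b = conditional_next((1, 1, 1, 0), 1)
```

Here the full joint table for t = 10 has 2^10 = 1024 entries, yet it is generated from only six numbers: two for p(x_{1}) and four for the transitions. The two conditionals `p_a` and `p_b` agree because only the present state matters.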

Of course, one can object that the Markov assumption is unrealistic: when does something, x_{t}, depend only on its immediate past x_{t-1} but not on its recent past, for example, x_{t-2}? Theoretically we can easily overcome this objection. Redefine

y_{t} = [x_{t}, x_{t-1}]

Then a little thought will convince you that y_{t} is a Markov stochastic sequence whenever x_{t} depends only on x_{t-1} and x_{t-2}. Thus, **any sequence that depends on the finite past can be converted into a Markov sequence**. This justifies the huge amount of theoretical literature on the study of Markov processes. Of course, computationally this is only a notational change. Simplification in computation must be dealt with separately. In particular, we shall see in the next article that Gauss-Markov sequences occupy a unique role of generality, practicality, and computational simplicity in the study of stochastic systems.
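
The redefinition y_{t} = [x_{t}, x_{t-1}] can be sketched on a simple second-order recursion. The linear model, its coefficients, and the function names below are my own assumptions for illustration; the tutorial does not prescribe any particular model. The same noise samples drive both forms, so the two simulations should produce the same x values.

```python
import random

A1, A2 = 0.5, 0.3  # hypothetical coefficients of x_{t-1} and x_{t-2}

def simulate_second_order(noise, x0, x_minus1):
    """x_t = A1*x_{t-1} + A2*x_{t-2} + w_t: x alone is not a Markov sequence."""
    prev, prev2, xs = x0, x_minus1, []
    for w in noise:
        x = A1 * prev + A2 * prev2 + w
        xs.append(x)
        prev, prev2 = x, prev
    return xs

def step_augmented(y, w):
    """One step of y_t = [x_t, x_{t-1}]; it needs only y_{t-1} and the new noise w_t."""
    x_prev, x_prev2 = y
    return (A1 * x_prev + A2 * x_prev2 + w, x_prev)

def simulate_augmented(noise, y0):
    y, ys = y0, []
    for w in noise:
        y = step_augmented(y, w)
        ys.append(y)
    return ys

rng = random.Random(1)  # fixed seed so the run is repeatable
noise = [rng.gauss(0.0, 1.0) for _ in range(50)]
xs = simulate_second_order(noise, x0=1.0, x_minus1=0.0)
ys = simulate_augmented(noise, y0=(1.0, 0.0))
```

Note that `step_augmented` does exactly the same arithmetic as the second-order loop, which is the point made above: the conversion is a change of notation, not a saving in computation.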

In the meantime, I cannot emphasize more strongly that

(i) a stochastic sequence is nothing more than a collection of indexed random variables (the properties of which are already covered in **PSPT1**);

(ii) a Markov stochastic sequence permits very general study of any random phenomenon in time that depends on the finite past;

(iii) “Gaussian-ness”, “stationary”, “Markov”, “i.i.d.”, etc., are adjectives that can be selectively applied to a stochastic sequence to simplify notation and/or computation about these sequences.

To test your understanding of this blog article, here is an exercise:

“**Does the following description make sense?** *A system output (possibly a vector) is characterized by a non-Gaussian Markov stochastic sequence with an output covariance matrix that has nonzero off-diagonal elements.*”

**With this article we can go directly to continuous time stochastic
processes with no new required concepts except the caveat in PSPT2 about
continuous time**.

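Note 1. We further distinguish **wide sense stationary** and **strict sense stationary**. The former only requires that s_{ij} depend only on the difference i-j (the usual definition also asks for a constant mean), while the latter requires that p(x_{i}, x_{j}) depend only on the difference between the indexes i-j for all i, j (and, more generally, that every finite joint density be unchanged by a shift of the indexes).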
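
The wide sense condition in Note 1 can be checked mechanically on a given matrix [s_{ij}]. The sketch below is only an illustration for a scalar sequence: the helper name, the tolerance, and the two example matrices are my own assumptions. The first matrix has entries proportional to 0.8^{|i-j|}, the second has entries min(i, j), as for a random walk started at zero.

```python
def depends_only_on_lag(cov, tol=1e-9):
    """Return True if every s_ij equals the first-row entry with the same lag |i-j|."""
    t = len(cov)
    for i in range(t):
        for j in range(t):
            if abs(cov[i][j] - cov[0][abs(i - j)]) > tol:
                return False
    return True

t = 5
a = 0.8  # hypothetical decay factor

# s_ij = a**|i-j| / (1 - a**2): a function of the lag i-j alone.
lag_only_cov = [[a ** abs(i - j) / (1.0 - a * a) for j in range(t)] for i in range(t)]

# s_ij = min(i, j) with 1-based indexes: the variance grows with time.
random_walk_cov = [[float(min(i, j) + 1) for j in range(t)] for i in range(t)]

first_is_wide_sense = depends_only_on_lag(lag_only_cov)
second_is_wide_sense = depends_only_on_lag(random_walk_cov)
```

When the check passes, the first row of [s_{ij}] carries all the second-order information, so t numbers replace t×t.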

http://blog.sciencenet.cn/blog-1565-666599.html
