何毓琦's personal blog http://blog.sciencenet.cn/u/何毓琦 Harvard (1961-2001) Tsinghua (2001-date)

Blog post

Probability and Stochastic Process Tutorial (3) (Featured)

9705 reads | 2013-3-2 23:27 | Category: Teaching notes

(For new readers and those who send friend requests, please read my 公告栏 [announcement page] first)

 

After discussing the fundamentals in my first two blogs, we are now ready to talk about the practically useful subject of stochastic processes. We start with a Stochastic Sequence

. . ., x1, x2, x3, . . ., xi, . . ., xt, xt+1, . . .

which is nothing but a sequence of random variables indexed by an integer variable i = . . ., 1, 2, . . ., t, t+1, . . . Everything we said about n r.v.s in the first blog, PSPT1, applies. If we concentrate on the r.v.s x1, . . ., xt, then for a complete specification we need the joint probability density function p(x1, x2, x3, . . ., xt), and for a rough characterization we have the mean t-vector [E(xi)], whose typical element is the mean of the r.v. xi, and the t×t covariance matrix [sij], whose typical element sij is the covariance of xi with xj. The first row of [sij] consists of the elements s11, s12, s13, . . ., s1t. (If these statements are not obvious to you, you need to go back to PSPT1 and review the material.) However, here the terminology is by tradition slightly different. Instead of the covariances of x1 with x2, x3, . . ., xt, we call this a correlation sequence and denote it by s1i, i = 1, 2, . . ., t. The rationale for this notational complication is this:

In general xi could be a vector r.v. itself, and we need to talk about covariances among its elements. Thus, for dependence between r.v.s across time we use the term correlation instead; covariances are then restricted to r.v.s occurring at the same instant of time, to avoid confusion. But it should be emphasized that mathematically these two types of second-order characterization are exactly the same. Everything you need conceptually to deal with stochastic sequences has already been covered in PSPT1, except for computational considerations.
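As a concrete sketch of these rough characterizations, here is a small NumPy example (the sequence itself, its coefficient 0.8, and the length t = 5 are illustrative assumptions, not from the text) that estimates the mean t-vector [E(xi)] and the t×t covariance matrix [sij] from many sample paths:

```python
import numpy as np

# A hypothetical stochastic sequence: x_{i+1} = 0.8*x_i + w_i, w_i ~ N(0, 1).
# The coefficient 0.8 and the length t = 5 are illustrative choices.
rng = np.random.default_rng(0)
t, n_paths = 5, 200_000

x = np.zeros((n_paths, t))
x[:, 0] = rng.standard_normal(n_paths)
for i in range(1, t):
    x[:, i] = 0.8 * x[:, i - 1] + rng.standard_normal(n_paths)

mean_vec = x.mean(axis=0)          # the mean t-vector [E(x_i)]
cov_mat = np.cov(x, rowvar=False)  # the t-by-t covariance matrix [s_ij]

# The first row of [s_ij]: the covariances ("correlations in time")
# of x_1 with x_1, x_2, . . ., x_t.
print(cov_mat[0])
```

For this particular sequence the first row comes out close to 1, 0.8, 0.64, 0.512, 0.4096, the theoretical covariances of x1 with the later xi.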

 

In computations dealing with time, the number of time instants we need is often in the hundreds or thousands. Dealing with a thousand-by-thousand matrix or a thousand-variable density function is cumbersome at best and often impractical. We need to consider special cases which simplify things. The first drastic simplification is to have an Independent Stochastic Sequence, i.e.,

p(x1, . . . , xt)=p(x1)p(x2). . . p(xt)

Colloquially such a sequence is called a white noise sequence. If furthermore

p(x1) = p(x2) = . . . = p(xt),

then we have a stationary white noise sequence. It is then also referred to as an i.i.d. sequence (independent and identically distributed; see also Note 1 below).
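A minimal sketch of such a stationary white noise (i.i.d.) sequence — the Gaussian distribution and the sizes below are illustrative assumptions — showing that independence makes the off-diagonal covariances vanish:

```python
import numpy as np

# A stationary white noise (i.i.d.) sequence: every x_i drawn independently
# from the same distribution (here N(0, 1), an illustrative choice).
rng = np.random.default_rng(1)
t, n_paths = 4, 100_000
x = rng.standard_normal((n_paths, t))   # p(x1) = p(x2) = ... = p(xt), independent

cov = np.cov(x, rowvar=False)
# Independence => off-diagonal covariances are (close to) zero;
# identical distributions => identical diagonal variances.
off_diag = cov - np.diag(np.diag(cov))
print(np.abs(off_diag).max())   # near 0
```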

 

But i.i.d. is often too restrictive an assumption for most stochastic sequences observed in practice. Thus, we introduce the next order of complication by

 

The Markov Assumption – This assumption is often captured by the statement that "knowledge of the present separates the past from the future". Mathematically, we say

            p(xt+1 | xt, xt-1, . . ., x1) = p(xt+1 | xt)       for all t                                                            (1)

 

Eq. (1) immediately simplifies a t-variable function into a product of 2-variable functions, i.e.,

 

            p(xt, xt-1, . . ., x1) = p(xt | xt-1) p(xt-1 | xt-2) . . . p(x2 | x1) p(x1)                                  (2)
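Eq. (2) can be illustrated with a small two-state Markov chain (the transition matrix and initial distribution below are made-up examples): the joint probability of any path is just p(x1) times a product of one-step conditionals.

```python
import numpy as np
from itertools import product

# Illustrating Eq. (2) with a 2-state Markov chain (states 0 and 1).
# The transition matrix P and initial distribution p1 are made-up examples.
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])   # P[a, b] = p(x_{t+1} = b | x_t = a)
p1 = np.array([0.5, 0.5])    # p(x_1)

def path_probability(path):
    """Joint probability via the Markov factorization (2):
    p(x1, ..., xt) = p(x1) * prod over t of p(x_{t+1} | x_t)."""
    prob = p1[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a, b]
    return prob

# Sanity check: summing over all 2**3 length-3 paths gives total probability 1.
total = sum(path_probability(p) for p in product([0, 1], repeat=3))
print(total)   # 1.0
```

Note that only the 2-variable conditionals P[a, b] and the 1-variable p(x1) are ever stored — exactly the simplification Eq. (2) promises.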

 

Of course, one can object that the Markov assumption is unrealistic: rarely does something, xt, depend only on its immediate past xt-1 and not also on earlier values, for example xt-2. Theoretically we can easily overcome this objection. Re-define

yt =[xt , xt-1]

Then a little thought will convince you that yt is a Markov stochastic sequence. Thus, any sequence that depends on the finite past can be converted into a Markov sequence. This justifies the huge amount of theoretical literature on the study of Markov processes. Of course, computationally this is only a notational change; simplification in computation must be dealt with separately. In particular, we shall see in the next article that Gauss-Markov sequences occupy a unique role of generality, practicality, and computational simplicity in the study of stochastic systems.
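The re-definition above can be sketched numerically. Here is a hypothetical second-order linear sequence (the coefficients a and b are illustrative assumptions): in xt alone it depends on two past values, but the augmented state yt = [xt, xt-1] evolves using yt only, i.e., it is Markov.

```python
import numpy as np

# Sketch: a sequence with two-step memory,
#     x_{t+1} = a*x_t + b*x_{t-1} + w_t,
# is not Markov in x_t alone, but y_t = [x_t, x_{t-1}] is Markov:
#     y_{t+1} = A @ y_t + [w_t, 0].
# The coefficients a, b are illustrative assumptions.
a, b = 0.5, 0.3
A = np.array([[a,   b],
              [1.0, 0.0]])   # one-step map for the augmented state

rng = np.random.default_rng(2)
y = np.array([0.0, 0.0])     # start from x_0 = x_{-1} = 0
xs, ws = [], []
for _ in range(10):
    w = rng.standard_normal()
    y = A @ y + np.array([w, 0.0])   # depends on y_t only: Markov
    ws.append(w)
    xs.append(y[0])
print(xs[:3])
```

The second row of A simply copies xt into the second slot of yt+1; that bookkeeping is all it takes to absorb the finite past into the present state.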

 

In the meantime, I cannot emphasize strongly enough that

 

(i)            a stochastic sequence is nothing more than a collection of indexed random variables (whose properties were already covered in PSPT1);

(ii)          a Markov stochastic sequence permits very general study of any random phenomenon in time that depends on the finite past;

(iii)         "Gaussian", "stationary", "Markov", "i.i.d.", etc., are adjectives that can be selectively applied to a stochastic sequence to simplify notation and/or computation about these sequences.

To test your understanding of this blog article, here is an exercise:

Does the description "a system output (possibly a vector) is characterized by a non-Gaussian Markov stochastic sequence with an output covariance matrix that has nonzero off-diagonal elements" make sense?

 

With this article we can go directly to continuous-time stochastic processes with no new concepts required, except for the caveat about continuous time in PSPT2.

 

Note 1. We further distinguish wide-sense stationary from strict-sense stationary. The former only requires that sij depend on the difference i - j, while the latter requires that p(xi, xj) depend only on the difference i - j, for all i, j.
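A sketch of the wide-sense property, using an illustrative stationary first-order autoregressive sequence (the coefficient a and the start from the stationary distribution are assumptions, not from the text): the estimated sij agree along each diagonal of the covariance matrix, i.e., depend only on the lag i - j.

```python
import numpy as np

# A stationary AR(1) sequence (illustrative): x_{i+1} = a*x_i + w_i,
# started from its stationary distribution N(0, 1/(1 - a^2)).
a = 0.6
rng = np.random.default_rng(3)
n_paths, t = 200_000, 6

x = np.zeros((n_paths, t))
x[:, 0] = rng.standard_normal(n_paths) / np.sqrt(1 - a**2)  # stationary start
for i in range(1, t):
    x[:, i] = a * x[:, i - 1] + rng.standard_normal(n_paths)

cov = np.cov(x, rowvar=False)
# Wide-sense stationarity: along each diagonal (fixed lag k = i - j) the
# estimates agree, matching the theoretical value a**k / (1 - a**2).
for k in range(3):
    print(k, np.diag(cov, k=k).round(2))
```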



http://blog.sciencenet.cn/blog-1565-666599.html
