Time Series Analysis (Best MSE Predictor & Best Linear Predictor)

Time Series Analysis

Best MSE (Mean Square Error) Predictor

Among all possible prediction functions \(f(X_{n})\), we look for the \(f\) that minimizes \(\mathbb{E}\big[\big(X_{n+h} - f(X_{n})\big)^{2}\big]\). Such a predictor is denoted \(m(X_{n})\) and is called the best MSE predictor, i.e.,

\[m(X_{n}) = \mathop{\text{argmin}}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big] \]

We know that the solution to \(\mathop{\text{argmin}}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]\) is:

\[\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \]


Proof:

Minimizing \(\mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big]\) can be carried out conditionally on \(X_{n}\); indeed, by the tower property \(\mathbb{E}[\,\cdot\,] = \mathbb{E}\big[\mathbb{E}[\,\cdot ~|~ X_{n}]\big]\), a minimizer of the conditional expectation also minimizes the unconditional one:

\[\mathop{\text{argmin}}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} \big] \iff \mathop{\text{argmin}}\limits_{f} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] \]


  • Arguably a more rigorous way to write this is \(\mathop{\text{argmin}}\limits_{f} ~ \mathbb{E}\Big[\Big(X_{n+h} - f\big( X_{n}\big)\Big)^{2} ~ \big| ~ \mathcal{F}_{n}\Big]\), where \(\left\{ \mathcal{F}_{t}\right\}_{t\geq 0}\) is the natural filtration associated with \(\left\{ X_{t} \right\}_{t\geq 0}\), but whatever.

Expanding the right-hand side:

\[\begin{align*} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] & = \mathbb{E}[X_{n+h}^{2} ~ | ~ X_{n}] - 2f(X_{n})\mathbb{E}[X_{n+h} ~ | ~ X_{n}] + f^{2}(X_{n}) \end{align*} \]

Now, since:

\[\begin{align*} \text{Var}(X_{n+h} ~ | ~ X_{n}) & = \mathbb{E}\Big[ \big( X_{n+h} - \mathbb{E}\big[ X_{n+h} ~ | ~ X_{n} \big] \big)^{2} ~ \Big| ~ X_{n} \Big] \\ & = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - 2\mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \\ & = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \end{align*} \]

which gives that:

\[\implies \text{Var}(X_{n+h} ~ | ~ X_{n}) = \mathbb{E}\big[ X_{n+h}^{2} ~ \big| ~ X_{n} \big] - \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \]

Therefore,

\[\begin{align*} \mathbb{E}\big[ \big( X_{n+h} - f(X_{n}) \big)^{2} ~ \big| ~ X_{n} \big] & = \text{Var}(X_{n+h} ~ | ~ X_{n}) + \mathbb{E}^{2}\big[ X_{n+h} ~ \big| ~ X_{n}\big] - 2f(X_{n})\mathbb{E}[X_{n+h} ~ | ~ X_{n}] + f^{2}(X_{n}) \\ & = \text{Var}(X_{n+h} ~ | ~ X_{n}) + \Big( \mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n}\big] - f(X_{n}) \Big)^{2} \end{align*} \]

The variance \(\text{Var}(X_{n+h} ~ | ~ X_{n})\) does not depend on \(f\), so the optimal solution \(m(X_{n})\) is immediate:

\[m(X_{n}) = \mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big] \]
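A quick numerical sanity check of this claim (a minimal sketch with a toy model of our own choosing, not part of the derivation): take \(X_{n+h} = X_{n}^{2} + \varepsilon\) with \(\varepsilon \sim N(0, 1)\), so the conditional mean is \(X_{n}^{2}\), and compare its MSE against a few other predictors.

```python
import numpy as np

# Toy model (illustrative only): X_{n+h} = X_n**2 + eps, eps ~ N(0, 1),
# so E[X_{n+h} | X_n] = X_n**2. The conditional mean should attain the
# smallest mean squared error among the candidate predictors below.
rng = np.random.default_rng(0)
x_n = rng.normal(size=1_000_000)
x_nh = x_n**2 + rng.normal(size=x_n.size)

candidates = {
    "conditional mean f(x) = x^2":  x_n**2,
    "identity         f(x) = x":    x_n,
    "constant         f(x) = mean": np.full_like(x_n, x_nh.mean()),
}
for name, pred in candidates.items():
    print(f"{name:32s} MSE = {np.mean((x_nh - pred) ** 2):.4f}")
# Expected: the conditional mean reaches MSE ~ 1 (the noise variance),
# strictly below the other candidates.
```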


Now suppose that \(\left\{ X_{t} \right\}\) is a stationary Gaussian time series, i.e.,

\[\begin{pmatrix} X_{n+h}\\ X_{n} \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ \mu \end{pmatrix}, ~ \begin{pmatrix} \gamma(0) & \gamma(h) \\ \gamma(h) & \gamma(0) \end{pmatrix} \right) \]
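This step uses the standard conditional law of a bivariate normal vector, stated here for reference:

\[\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_{X} \\ \mu_{Y} \end{pmatrix}, ~ \begin{pmatrix} \sigma_{X}^{2} & \sigma_{XY} \\ \sigma_{XY} & \sigma_{Y}^{2} \end{pmatrix} \right) \quad\implies\quad X ~ | ~ Y \sim N\left( \mu_{X} + \frac{\sigma_{XY}}{\sigma_{Y}^{2}}\big(Y - \mu_{Y}\big), ~ \sigma_{X}^{2} - \frac{\sigma_{XY}^{2}}{\sigma_{Y}^{2}} \right) \]

Here \(\sigma_{X}^{2} = \sigma_{Y}^{2} = \gamma(0)\) and \(\sigma_{XY} = \gamma(h)\), so \(\sigma_{XY}/\sigma_{Y}^{2} = \rho(h)\) and \(\sigma_{X}^{2} - \sigma_{XY}^{2}/\sigma_{Y}^{2} = \gamma(0)\big(1 - \rho^{2}(h)\big)\).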

Then we have:

\[X_{n+h} ~ | ~ X_{n} \sim N\Big( \mu + \rho(h)\big(X_{n} - \mu\big), ~ \gamma(0)\big(1 - \rho^{2}(h)\big) \Big) \]

where \(\rho(h)\) is the ACF of \(\left\{ X_{t} \right\}\). Therefore,

\[\mathbb{E}\big[ X_{n+h} ~ \big| ~ X_{n} \big] = m(X_{n}) = \mu + \rho(h) \big( X_{n} - \mu \big) \]
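For concreteness, here is a small simulation sketch (our own example; the AR(1) model and all parameter names are illustrative assumptions): for a stationary Gaussian AR(1) with \(\mu = 0\), the ACF is \(\rho(h) = \phi^{h}\), so regressing \(X_{n+h}\) on \(X_{n}\) over a long path should recover slope \(\approx \rho(h)\) and intercept \(\approx (1-\rho(h))\mu = 0\).

```python
import numpy as np

# Simulate a stationary Gaussian AR(1): X_t = phi * X_{t-1} + eps_t, with mu = 0.
# Then rho(h) = phi**h, and the best MSE predictor is m(X_n) = phi**h * X_n.
rng = np.random.default_rng(1)
phi, sigma, n, h = 0.7, 1.0, 500_000, 2
x = np.empty(n)
x[0] = rng.normal(scale=sigma / np.sqrt(1 - phi**2))  # draw X_0 from the stationary law
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal(scale=sigma)

# Linear regression of X_{n+h} on X_n recovers the conditional mean here,
# because in the Gaussian case the conditional mean is exactly linear.
slope, intercept = np.polyfit(x[:-h], x[h:], 1)
print(f"fitted slope     = {slope:.4f}   (theory rho(h) = phi**h = {phi**h:.4f})")
print(f"fitted intercept = {intercept:.4f}   (theory (1 - rho(h)) * mu = 0)")
```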

Note:

If \(\left\{ X_{t} \right\}\) is a Gaussian time series, the best MSE predictor can always be computed as above. If \(\left\{ X_{t} \right\}\) is not Gaussian, however, the computation is usually very involved.

For this reason, instead of the best MSE predictor we usually look for the best linear predictor.


Best Linear Predictor (BLP)

Under the BLP assumption, we look for a predictor of the form \(f(X_{n}) = aX_{n} + b\).

The objective is then:

\[\text{minimize: } ~ S(a,b) = \mathbb{E} \big[ \big( X_{n+h} - aX_{n} - b \big)^{2} \big] \]


Derivation:

Take partial derivatives with respect to \(a\) and \(b\). First, with respect to \(b\):

\[\begin{align*} \frac{\partial}{\partial b} S(a, b) & = \frac{\partial}{\partial b} \mathbb{E} \big[ \big( X_{n+h} - aX_{n} - b \big)^{2} \big] \\ & = -2 \mathbb{E} \big[ X_{n+h} - aX_{n} - b \big] \end{align*} \]

Setting:

\[\frac{\partial}{\partial b} S(a, b) = 0 \]

then:

\[\begin{align*} -2 \cdot & \mathbb{E} \big[ X_{n+h} - aX_{n} - b \big] = 0 \\ \implies & \qquad \mathbb{E}[X_{n+h}] - a\mathbb{E}[X_{n}] - b = 0\\ \implies & \qquad \mu - a\mu - b = 0 \\ \implies & \qquad b^{\star} = (1 - a^{\star}) \mu \end{align*} \]

Substituting this back and taking the partial derivative with respect to \(a\):

\[\begin{align*} \frac{\partial}{\partial a} S(a, b) & = \frac{\partial}{\partial a} \mathbb{E} \big[ \big( X_{n+h} - aX_{n} - (1 - a)\mu \big)^{2} \big] \\ & = \frac{\partial}{\partial a} \mathbb{E} \Big[ \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)^{2} \Big] \\ & = -2\, \mathbb{E} \Big[ \big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] \end{align*} \]

Setting:

\[\frac{\partial}{\partial a} S(a, b) = 0 \]

then:

\[\begin{align*} & -2\, \mathbb{E} \Big[ \big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] = 0 \\ \implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \Big( \big(X_{n+h} - \mu \big) - \big( X_{n} - \mu \big) a \Big)\Big] = 0 \\ \implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \big(X_{n+h} - \mu \big) - a \big( X_{n} - \mu \big)^{2} \Big] = 0 \\ \implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mu \big) \big(X_{n+h} - \mu \big) \Big] = a \cdot \mathbb{E} \Big[\big( X_{n} - \mu \big)^{2} \Big] \\ \implies & \qquad \mathbb{E} \Big[\big( X_{n} - \mathbb{E}[X_{n}] \big) \big(X_{n+h} - \mathbb{E}[X_{n+h}] \big) \Big] = a \cdot \mathbb{E} \Big[\big( X_{n} - \mathbb{E}[X_{n}] \big)^{2} \Big] \\ \implies & \qquad \text{Cov}(X_{n}, X_{n+h}) = a \cdot \text{Var}(X_{n}) \\ \implies & \qquad a^{\star} = \frac{\gamma(h)}{\gamma(0)} = \rho(h) \end{align*} \]

In summary, the BLP of the time series \(\left\{ X_{t} \right\}\) is:

\[f(X_{n}) = l(X_{n}) = \mu + \rho(h) \big( X_{n} - \mu \big) \]

and the corresponding MSE of the BLP is:

\[\begin{align*} \text{MSE} & = \mathbb{E}\big[ \big( X_{n+h} - l(X_{n}) \big)^{2} \big] \\ & = \mathbb{E} \Big[ \Big( X_{n+h} - \mu - \rho(h) \big( X_{n} - \mu \big) \Big)^{2} \Big] \\ & = \gamma(0) \cdot \big( 1 - \rho^{2}(h) \big) \end{align*} \]
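In practice the BLP and its MSE can be estimated from a sample path via the sample mean and sample autocovariances. Below is a minimal sketch (our own example; sample_acvf, the AR(1) simulation, and all parameters are illustrative assumptions) that checks \(a^{\star} \approx \rho(h)\), \(b^{\star} \approx (1 - \rho(h))\mu\), and \(\text{MSE} \approx \gamma(0)\big(1 - \rho^{2}(h)\big)\).

```python
import numpy as np

def sample_acvf(x, lag):
    """Sample autocovariance at the given lag (divides by len(x), the usual convention)."""
    n, mu = len(x), x.mean()
    return np.sum((x[: n - lag] - mu) * (x[lag:] - mu)) / n

# Simulate an AR(1) path with mu = 0 (illustrative model only).
rng = np.random.default_rng(2)
phi, n, h = 0.5, 200_000, 1
x = np.empty(n)
x[0] = rng.normal(scale=1 / np.sqrt(1 - phi**2))
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

# Plug-in estimates of the BLP coefficients and of the theoretical MSE.
mu_hat  = x.mean()
gamma0  = sample_acvf(x, 0)
rho_hat = sample_acvf(x, h) / gamma0
blp     = mu_hat + rho_hat * (x[:-h] - mu_hat)   # l(X_n) along the path

print(f"a* = rho(h)          ≈ {rho_hat:.4f}   (theory {phi**h:.4f})")
print(f"b* = (1 - rho(h))*mu ≈ {(1 - rho_hat) * mu_hat:.4f}   (theory 0)")
print(f"empirical MSE        ≈ {np.mean((x[h:] - blp) ** 2):.4f}")
print(f"gamma(0)(1-rho(h)^2) ≈ {gamma0 * (1 - rho_hat**2):.4f}")
```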
