Sample Variance, Covariance, and the Covariance Matrix

I. Sample Variance

Let the sample mean be $\bar x$, the sample variance $S^2$, the population mean $\mu$, and the population variance $\sigma^2$. The sample variance is then

$S^2 = \frac{1}{n - 1}\sum\limits_{i = 1}^n \left( x_i - \bar x \right)^2$
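
As a quick illustration of the formula above, here is a minimal Python sketch using only the standard library. The function name `sample_variance` and the data values are made up for this example; `statistics.variance` also divides by n - 1 and serves as a cross-check.

```python
import statistics


def sample_variance(xs):
    """Sample variance: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative values only
print(sample_variance(data))       # hand-written formula
print(statistics.variance(data))   # standard library, same n - 1 divisor
```

For this data the mean is 5 and the squared deviations sum to 32, so both calls should agree on 32/7.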

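Derivation: suppose first that we divide by $n$, as we would if the sample were the whole population (throughout the derivation, $S^2$ denotes this $1/n$ version):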

$S^2 = \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \bar x \right)^2$

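Now draw samples repeatedly and denote the variance computed from each sample by $S_1^2, S_2^2, S_3^2, \ldots$. The average of these values over many draws approaches the expectation $E(S^2)$, and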

$E\left( S^2 \right) = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \bar x \right)^2 \right)$

$ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( \left( x_i - \mu \right) - \left( \bar x - \mu \right) \right)^2 \right)$

Since

$\frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right) = \frac{1}{n}\sum\limits_{i = 1}^n x_i - \mu = \bar x - \mu$

continuing from the expression above,

$E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( \left( x_i - \mu \right) - \left( \bar x - \mu \right) \right)^2 \right) = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 - \frac{1}{n}\sum\limits_{i = 1}^n 2\left( x_i - \mu \right)\left( \bar x - \mu \right) + \frac{1}{n}\sum\limits_{i = 1}^n \left( \bar x - \mu \right)^2 \right)$

$ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 - 2\left( \bar x - \mu \right)\left( \bar x - \mu \right) + \left( \bar x - \mu \right)^2 \right)$

$ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 - \left( \bar x - \mu \right)^2 \right)$

$ = E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 \right) - E\left( \left( \bar x - \mu \right)^2 \right) \le \sigma^2$

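So the estimate that divides by $n$ is, in expectation, no larger than the population variance. More precisely, for independent observations $E\left( \left( \bar x - \mu \right)^2 \right) = \mathrm{Var}(\bar x) = \sigma^2 / n$, so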

$E\left( \frac{1}{n}\sum\limits_{i = 1}^n \left( x_i - \mu \right)^2 \right) - E\left( \left( \bar x - \mu \right)^2 \right) = \sigma^2 - \frac{1}{n}\sigma^2 = \frac{n - 1}{n}\sigma^2$

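So the expectation of the $1/n$ estimate is $(n-1)/n$ times the population variance. Multiplying by $n/(n-1)$, that is, dividing by $n - 1$ instead of $n$, removes this bias and gives the sample variance formula stated at the start of this section.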
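
The $(n-1)/n$ factor can be checked numerically. The following is a rough simulation sketch, not part of the original derivation: it assumes normally distributed data with mean 0, and the function names, sample size, trial count, and seed are all arbitrary choices made for illustration.

```python
import random


def biased_variance(xs):
    """Variance estimate that divides by n (not n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def average_biased_variance(n, sigma, trials, seed=0):
    """Average the 1/n estimate over many independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total += biased_variance(sample)
    return total / trials


n, sigma = 5, 2.0
estimate = average_biased_variance(n, sigma, trials=20000)
expected = (n - 1) / n * sigma ** 2  # (n-1)/n * sigma^2 from the derivation
print(estimate, expected, sigma ** 2)
```

With these settings the averaged estimate should land close to `expected` and visibly below the true variance `sigma ** 2`; the exact figure depends on the random draws.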

 

II. Covariance

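Covariance measures the degree of linear dependence in the joint distribution of two random variables. For fixed variances, the stronger the linear relationship, the larger the absolute value of the covariance (positive when the variables move together, negative when they move in opposite directions); when there is no linear relationship at all, the covariance is zero.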

Cov(x,y) = E[(x-E(x))(y-E(y))]

As a special case, when there is only one variable x, the covariance of x with itself equals its variance, written Var(x):

Cov(x,x) =Var(x)= E[(x-E(x))(x-E(x))]

Sample covariance

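For a multivariate random variable $Q(x_1, x_2, x_3, \ldots, x_n)$, let the $j$-th sample be $[x_{1j}, x_{2j}, \ldots, x_{nj}]$ ($j = 1, 2, \ldots, m$), where $m$ is the number of samples. For any two dimensions $a, b$ ($a, b = 1, 2, \ldots, n$), the sample covariance is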

$\mathrm{cov}\left( x_a, x_b \right) = \frac{\sum\nolimits_{j = 1}^m \left( x_{aj} - \bar x_a \right)\left( x_{bj} - \bar x_b \right)}{m - 1}$
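
The sample covariance formula above translates almost directly into code. This is a minimal sketch in plain Python; the function name `sample_covariance` and the two data lists are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def sample_covariance(xa, xb):
    """Sample covariance of two equally long sequences, dividing by m - 1."""
    m = len(xa)
    if m != len(xb):
        raise ValueError("both dimensions need the same number of samples")
    if m < 2:
        raise ValueError("need at least two samples")
    mean_a = sum(xa) / m
    mean_b = sum(xb) / m
    return sum((a - mean_a) * (b - mean_b) for a, b in zip(xa, xb)) / (m - 1)


xa = [1.0, 2.0, 3.0, 4.0]  # illustrative values for dimension a
xb = [2.0, 4.0, 6.0, 8.0]  # dimension b, here exactly 2 * xa

print(sample_covariance(xa, xb))  # covariance between the two dimensions
print(sample_covariance(xa, xa))  # covariance of a dimension with itself: its sample variance
```

Here the products of deviations sum to 10, so the covariance is 10/3, and passing the same list twice gives the sample variance 5/3, matching Cov(x,x) = Var(x).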

 

III. Covariance Matrix

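For a multivariate random variable $Q(x_1, x_2, x_3, \ldots, x_n)$ we want the linear relationship between every pair of variables $(x_i, x_j)$. Collecting the covariance of every pair into an $n \times n$ array gives the covariance matrix.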

$\mathrm{Cov}(x_i, x_j) = E[(x_i - E(x_i))(x_j - E(x_j))]$

\[{\rm{cov}}\left( {{x_i},{x_j}} \right) = \left[ {\begin{array}{*{20}{c}}
{{\rm{cov}}\left( {{x_1},{x_1}} \right)}&{{\rm{cov}}\left( {{x_1},{x_2}} \right)}&{{\rm{cov}}\left( {{x_1},{x_3}} \right)}& \cdots &{{\rm{cov}}\left( {{x_1},{x_n}} \right)}\\
{{\rm{cov}}\left( {{x_2},{x_1}} \right)}&{{\rm{cov}}\left( {{x_2},{x_2}} \right)}&{{\rm{cov}}\left( {{x_2},{x_3}} \right)}& \cdots &{{\rm{cov}}\left( {{x_2},{x_n}} \right)}\\
{{\rm{cov}}\left( {{x_3},{x_1}} \right)}&{{\rm{cov}}\left( {{x_3},{x_2}} \right)}&{{\rm{cov}}\left( {{x_3},{x_3}} \right)}& \cdots &{{\rm{cov}}\left( {{x_3},{x_n}} \right)}\\
 \vdots & \vdots & \vdots & \ddots & \vdots \\
{{\rm{cov}}\left( {{x_n},{x_1}} \right)}&{{\rm{cov}}\left( {{x_n},{x_2}} \right)}&{{\rm{cov}}\left( {{x_n},{x_3}} \right)}& \cdots &{{\rm{cov}}\left( {{x_n},{x_n}} \right)}
\end{array}} \right]\]
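
To make the matrix concrete, here is a small sketch that builds the sample covariance matrix from a list of observations, dividing by m - 1 as in the sample covariance section. The function name `covariance_matrix`, the row-per-sample layout, and the data are assumptions made for this example.

```python
def covariance_matrix(samples):
    """Build the n x n sample covariance matrix.

    samples: list of m observations, each a list of n values
    (one value per dimension).
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two samples")
    n = len(samples[0])
    # Per-dimension means
    means = [sum(row[k] for row in samples) / m for k in range(n)]
    # Entry (a, b) is the sample covariance of dimensions a and b
    return [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in samples) / (m - 1)
            for b in range(n)
        ]
        for a in range(n)
    ]


samples = [  # illustrative data: m = 4 samples, n = 3 dimensions
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 0.0],
    [4.0, 8.0, 1.0],
]

cov = covariance_matrix(samples)
for row in cov:
    print(row)
```

The result is symmetric, since cov(x_i, x_j) = cov(x_j, x_i), and its diagonal holds the sample variance of each dimension.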

 


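Copyright notice: this is an original article by fujj, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Article link: https://www.cnblogs.com/fujj/p/9720357.html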