6.2 Multivariate Normal Distribution

6.2.1 Basic Operational Properties of Multivariate Distributions

  1. \(E(tr(AX))=tr(E(AX))=tr(AE(X))\)

  2. \(Cov(AX,BY)=ACov(X,Y)B', \, Cov(AX)=ACov(X)A'\)

  3. \(E(X'AX)=tr(A\Sigma)+\mu'A\mu\)

Proof:

\[ \begin{aligned} E(X'AX)&=E[tr(X'AX)] \\ &=E[tr(AXX')] \\ &=tr(AE(XX')) \\ &=tr(A(\Sigma+\mu\mu')) \\ &=tr(A\Sigma)+tr(A\mu\mu') \\ &=tr(A\Sigma)+tr(\mu'A\mu) \\ &=tr(A\Sigma)+\mu'A\mu \end{aligned} \tag{6.18} \]

where \(\mu=E(X)\) and \(Cov(X)=\Sigma=E(XX')-\mu\mu'\), so that \(E(XX')=\Sigma+\mu\mu'\).
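The identity in (6.18) also holds exactly for sample moments when the covariance is computed with the \(1/n\) normalization, which gives a quick numerical sanity check. The following is a minimal sketch; the sample, the matrix `A`, and all variable names are illustrative assumptions, not part of the original notes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 3

# Rows of X are draws of a p-dimensional random vector (illustrative data).
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p)) + np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(p, p))  # A need not be symmetric

mu_hat = X.mean(axis=0)
# 1/n normalization, so that Sigma_hat = mean(x x') - mu_hat mu_hat'
Sigma_hat = (X - mu_hat).T @ (X - mu_hat) / n

lhs = np.mean(np.einsum("ni,ij,nj->n", X, A, X))      # sample mean of x' A x
rhs = np.trace(A @ Sigma_hat) + mu_hat @ A @ mu_hat   # tr(A Sigma) + mu' A mu

print(lhs, rhs)
```

The two printed numbers agree up to floating-point rounding, because the proof only uses linearity of the expectation and the cyclic property of the trace.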

  4. If \(X\) has characteristic function \(\varphi_X(t)\) and \(Y=AX+a\), then \(\varphi_Y(t)=\exp(it'a)\varphi_X(A't)\).

Proof:

\[ \begin{aligned} \varphi_Y(t)&=E(e^{it'Y}) \\ &= E(e^{it'(AX+a)}) \\ &= e^{it'a}E(e^{it'AX}) \\ &= e^{it'a}E(e^{i(A't)'X}) \\ &= e^{it'a}\varphi_X(A't) \end{aligned} \tag{6.19} \]
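The derivation in (6.19) is purely algebraic, so it also holds for the empirical characteristic function of any sample. A small sketch under that assumption follows; the helper `ecf` and the numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 500, 3, 2

X = rng.normal(size=(n, p))   # rows are draws of X (any distribution would do)
A = rng.normal(size=(q, p))   # q x p constant matrix
a = rng.normal(size=q)        # q x 1 constant vector
t = np.array([0.3, -0.7])     # argument of the characteristic function of Y


def ecf(samples, t):
    """Empirical characteristic function: mean of exp(i t'x) over the rows."""
    return np.mean(np.exp(1j * samples @ t))


Y = X @ A.T + a                               # each row is A x + a
lhs = ecf(Y, t)                               # phi_Y(t)
rhs = np.exp(1j * t @ a) * ecf(X, A.T @ t)    # exp(i t'a) phi_X(A't)

print(lhs, rhs)
```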

  5. If \(X\) and \(Y\) are independent and have the same dimension, then \(\varphi_{X+Y}(t)=\varphi_X(t)\varphi_Y(t)\).

Proof:

\[ \begin{aligned} \varphi_{X+Y}(t)&=E(e^{it'(X+Y)}) \\ &=E(e^{it'X}e^{it'Y}) \\ &= E(e^{it'X})E(e^{it'Y}) \\ &= \varphi_X(t)\varphi_Y(t) \end{aligned} \tag{6.20} \]

Characteristic function of the univariate normal distribution \(N(\mu,\sigma^2)\):
\[ \begin{aligned} E(e^{itX})&=\int_{-\infty}^{\infty}\exp(itx)\frac{1}{\sqrt{2\pi}\sigma}\exp[-\frac{(x-\mu)^2}{2\sigma^2}]dx \\ &= \frac{1}{\sqrt{2\pi}\sigma} \int_{-\infty}^{\infty}\exp[-\frac{(x-\mu)^2-2\sigma^2itx}{2\sigma^2}]dx \\ &=\frac{1}{\sqrt{2\pi}\sigma} \int_{-\infty}^{\infty}\exp[-\frac{(x-\sigma^2it-\mu)^2+\sigma^4t^2-2\sigma^2it\mu}{2\sigma^2}]dx \\ &= \frac{1}{\sqrt{2\pi}\sigma} \exp(\frac{-\sigma^2t^2}{2}+it\mu) \int_{-\infty}^{\infty}\exp[-\frac{(x-\sigma^2it-\mu)^2}{2\sigma^2}]dx \\ &\stackrel{y=(\frac{x-\sigma^2it-\mu}{\sigma})}\Longrightarrow \frac{1}{\sqrt{2\pi}} \exp(\frac{-\sigma^2t^2}{2}+it\mu)\int_{-\infty}^{\infty} \exp[-\frac{y^2}{2}]dy \\ &=\exp(\frac{-\sigma^2t^2}{2}+it\mu) \end{aligned} \tag{6.21} \]
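The closed form in (6.21) can be compared with a direct numerical evaluation of the defining integral. The sketch below uses a plain Riemann sum on a fine grid; the parameter values and the integration range of twelve standard deviations are assumptions chosen for illustration.

```python
import numpy as np

mu, sigma, t = 1.5, 2.0, 0.8   # illustrative parameter values

# Fine grid covering mu +/- 12 sigma; the density is negligible outside it.
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 200_001)
dx = x[1] - x[0]
pdf = np.exp(-(x - mu) ** 2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

numeric = np.sum(np.exp(1j * t * x) * pdf) * dx           # Riemann sum for E(e^{itX})
closed_form = np.exp(1j * t * mu - sigma**2 * t**2 / 2)   # right-hand side of (6.21)

print(numeric, closed_form)
```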

6.2.2 Definitions of the Multivariate Normal Distribution

  1. Probability density function

\[ f(x)=\frac{1}{(2\pi)^{\frac{p}{2}}|\Sigma|^{\frac{1}{2}}}\exp(-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)) \tag{6.22} \]

This definition requires \(\Sigma > 0\) (positive definite); we write \(X\sim N_p(\mu, \Sigma)\).

  2. Characteristic function

\[ \varphi_X(t)=\exp(it'\mu-\frac{1}{2}t'\Sigma t) \tag{6.23} \]

This definition only requires \(\Sigma \geq 0\) (positive semidefinite); we write \(X\sim N_p(\mu, \Sigma)\).

  3. Linear combination, first form

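    Let \(Y_1,...,Y_q \stackrel{iid}\sim N(0,1)\), let \(A\) be a \(p \times q\) constant matrix, and let \(\mu\) be a \(p \times 1\) constant vector. The distribution of the linear combination \(X=AY+\mu\) of the \(q\)-dimensional random vector \(Y=(Y_1,...,Y_q)'\) is called the \(p\)-dimensional normal distribution, written \(X\sim N_p(\mu, \Sigma)\), where \(\Sigma=AA'\).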

  4. Linear combination, second form

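    If every linear combination of the components of the \(p\)-dimensional random vector \(X=(X_1,...,X_p)'\) follows a univariate normal distribution, then \(X\) is said to follow a \(p\)-dimensional normal distribution, written \(X\sim N_p(\mu, \Sigma)\), where \(\mu=E(X)\) and \(\Sigma=Cov(X)\).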
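The density definition needs \(\Sigma>0\), while the characteristic-function and linear-combination definitions also cover singular \(\Sigma\). The sketch below builds such a case from \(X=AY+\mu\) with \(p=3\) and \(q=2\); the particular `A` and `mu` are illustrative assumptions.

```python
import numpy as np

# p = 3, q = 2: the third coordinate of A Y is the sum of the first two.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
mu = np.array([0.0, 1.0, -1.0])
Sigma = A @ A.T

rank = np.linalg.matrix_rank(Sigma)         # rank of Sigma is at most q
min_eig = np.linalg.eigvalsh(Sigma).min()   # smallest eigenvalue of Sigma

rng = np.random.default_rng(2)
Y = rng.standard_normal(size=(5, 2))        # rows are draws of Y
X = Y @ A.T + mu                            # rows are draws of X = A Y + mu

# The centered third coordinate is an exact linear function of the first two,
# so X lives on a 2-dimensional plane in R^3 and has no density there.
residual = (X[:, 2] - mu[2]) - (X[:, 0] - mu[0]) - (X[:, 1] - mu[1])

print(rank, min_eig, residual)
```

Here \(\Sigma\) is positive semidefinite but not positive definite, so (6.22) cannot be evaluated, while (6.23) remains well defined.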

Probability density function of the bivariate normal distribution \(N(\mu_1,\mu_2,\sigma_1^2,\sigma_2^2,\rho)\):
\[ f(x,y)=\frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}}\exp\{-\frac{[\frac{(x-\mu_1)^2}{\sigma_1^2}-2\rho\frac{(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2}+\frac{(y-\mu_2)^2}{\sigma_2^2}]}{2(1-\rho^2)}\} \tag{6.24} \]
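The bivariate density (6.24) is the special case \(p=2\) of the general density (6.22) with \(\Sigma\) built from \(\sigma_1,\sigma_2,\rho\). The following sketch evaluates both at one point; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

mu1, mu2, s1, s2, rho = 1.0, -0.5, 2.0, 0.7, 0.6   # illustrative parameters
mu = np.array([mu1, mu2])
Sigma = np.array([[s1**2, rho * s1 * s2],
                  [rho * s1 * s2, s2**2]])


def pdf_general(x, mu, Sigma):
    """General p-variate normal density, as in (6.22)."""
    p = len(mu)
    d = x - mu
    quad = d @ np.linalg.solve(Sigma, d)   # (x - mu)' Sigma^{-1} (x - mu)
    return np.exp(-0.5 * quad) / ((2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(Sigma)))


def pdf_bivariate(x, y):
    """Bivariate normal density written with sigma_1, sigma_2, rho, as in (6.24)."""
    zx, zy = (x - mu1) / s1, (y - mu2) / s2
    quad = (zx**2 - 2 * rho * zx * zy + zy**2) / (1 - rho**2)
    return np.exp(-0.5 * quad) / (2 * np.pi * s1 * s2 * np.sqrt(1 - rho**2))


point = np.array([0.3, 0.1])
print(pdf_general(point, mu, Sigma), pdf_bivariate(*point))
```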

6.2.3 Conditional Distributions and Independence for the Normal Distribution

6.2.3.1 Conditional Distribution

Let \(X \sim N_p(\mu, \Sigma)\) with \(p \geq 2\), and partition \(X\), \(\mu\), and \(\Sigma\) conformably: \(X=\begin{pmatrix} X^{(1)} \\ X^{(2)} \end{pmatrix}, \mu=\begin{pmatrix} \mu^{(1)} \\ \mu^{(2)} \end{pmatrix}, \Sigma=\begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}>0\), where \(X^{(1)}\) is a \(q\)-dimensional vector and \(X^{(2)}\) is a \((p-q)\)-dimensional vector.

  1. Given \(X^{(2)}=x^{(2)}\), \((X^{(1)}|X^{(2)}=x^{(2)}) \sim N_q(\mu_{1 \cdot 2},\Sigma_{11 \cdot 2})\), where \(\mu_{1 \cdot 2}=\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}), \, \Sigma_{11 \cdot 2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\).

Proof 1:

By the formula for the inverse of a partitioned matrix,

\[ \Sigma^{-1}=\begin{pmatrix} \Sigma_{11 \cdot 2}^{-1} & -\Sigma_{11 \cdot 2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} \\ -\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11 \cdot 2}^{-1} & \Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11 \cdot 2}^{-1}\Sigma_{12}\Sigma_{22}^{-1} \end{pmatrix} \tag{6.25} \]

By the definition of the conditional density,

\[ \begin{aligned} f(x^{(1)}|x^{(2)}) &= \frac{f(x^{(1)},x^{(2)})}{f_{X^{(2)}}(x^{(2)})} \\ &=\frac{(2\pi)^{-\frac{p}{2}}|\Sigma|^{-\frac{1}{2}} \exp(-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu))}{(2\pi)^{-\frac{p-q}{2}}|\Sigma_{22}|^{-\frac{1}{2}} \exp(-\frac{1}{2}(x^{(2)}-\mu^{(2)})'\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}))} \end{aligned} \]

Using \((x-\mu)'=\begin{pmatrix} x^{(1)}-\mu^{(1)} \\ x^{(2)}-\mu^{(2)} \end{pmatrix}'\) and the partitioned inverse of \(\Sigma\) in (6.25), we obtain

\[ \begin{aligned} &\quad (x-\mu)'\Sigma^{-1}(x-\mu)-(x^{(2)}-\mu^{(2)})'\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}) \\ &=(x^{(1)}-\mu^{(1)})'\Sigma_{11 \cdot 2}^{-1}(x^{(1)}-\mu^{(1)})-(x^{(2)}-\mu^{(2)})'(\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11 \cdot 2}^{-1})(x^{(1)}-\mu^{(1)}) \\ &-(x^{(1)}-\mu^{(1)})'\Sigma_{11 \cdot 2}^{-1}\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}) \\ &+ (x^{(2)}-\mu^{(2)})'(\Sigma_{22}^{-1}+\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11 \cdot 2}^{-1}\Sigma_{12}\Sigma_{22}^{-1})(x^{(2)}-\mu^{(2)}) \\ &- (x^{(2)}-\mu^{(2)})'\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}) \\ &= (x^{(1)}-\mu^{(1)})'\Sigma_{11 \cdot 2}^{-1}(x^{(1)}-\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)})) \\ &-(x^{(2)}-\mu^{(2)})'\Sigma_{22}^{-1}\Sigma_{21}\Sigma_{11 \cdot 2}^{-1}(x^{(1)}-\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)})) \\ &= ((x^{(1)}-\mu^{(1)})'-(x^{(2)}-\mu^{(2)})'\Sigma_{22}^{-1}\Sigma_{21})\Sigma_{11\cdot 2}^{-1}(x^{(1)}-\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)})) \\ &\stackrel{\mu_{1 \cdot 2}=\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)})}\Rightarrow (x^{(1)}-\mu_{1\cdot 2})'\Sigma_{11 \cdot 2}^{-1}(x^{(1)}-\mu_{1\cdot 2}) \end{aligned} \tag{6.26} \]

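Since \(|\Sigma|=|\Sigma_{22}||\Sigma_{11 \cdot 2}|\), substituting (6.26) into the conditional density gives

\[ \begin{aligned} f(x^{(1)}|x^{(2)})&=(2\pi)^{-\frac{q}{2}}|\Sigma_{11\cdot 2}|^{-\frac{1}{2}} \exp(-\frac{1}{2}(x^{(1)}-\mu_{1\cdot 2})'\Sigma_{11 \cdot 2}^{-1}(x^{(1)}-\mu_{1\cdot 2})) \end{aligned} \tag{6.27} \]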

Hence \((X^{(1)}|X^{(2)}=x^{(2)}) \sim N_q(\mu_{1 \cdot 2},\Sigma_{11 \cdot 2})\).

Proof 2:

Let \(Y^{(1)}=X^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}X^{(2)}\) and \(Y^{(2)}=X^{(2)}\). Then

\[ Y= \begin{pmatrix} Y^{(1)} \\ Y^{(2)} \end{pmatrix}= \begin{pmatrix} I & -\Sigma_{12}\Sigma_{22}^{-1} \\ 0 & I \end{pmatrix} \begin{pmatrix} X^{(1)} \\ X^{(2)} \end{pmatrix}= AX \tag{6.28} \]

Since \(X \sim N_p(\mu, \Sigma)\), we have \(Y \sim N_p(A\mu, A\Sigma A')\),

where

\[ A\mu = \begin{pmatrix} \mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}\mu^{(2)} \\ \mu^{(2)} \end{pmatrix} \tag{6.29} \]

\[ \begin{aligned} A\Sigma A' &= \begin{pmatrix} I & -\Sigma_{12}\Sigma_{22}^{-1} \\ 0 & I \end{pmatrix} \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \begin{pmatrix} I & 0 \\ -\Sigma_{22}^{-1}\Sigma_{21} & I \end{pmatrix} \\ &= \begin{pmatrix} \Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} & 0 \\ 0 & \Sigma_{22} \end{pmatrix} \\ &= \begin{pmatrix} \Sigma_{11 \cdot 2}& 0 \\ 0 & \Sigma_{22} \end{pmatrix} \end{aligned} \tag{6.30} \]

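Since \(Y\) is jointly normal and the off-diagonal blocks of its covariance matrix are zero, \(Y^{(1)}\) and \(Y^{(2)}\) are independent.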

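We know that \(Y^{(1)} \sim N_q(\mu^{(1)}-\Sigma_{12}\Sigma_{22}^{-1}\mu^{(2)},\Sigma_{11 \cdot 2})\), and because \(Y^{(1)}\) is independent of \(Y^{(2)}=X^{(2)}\), conditioning on \(X^{(2)}=x^{(2)}\) (that is, \(Y^{(2)}=x^{(2)}\)) does not change the distribution of \(Y^{(1)}\). Hence, given \(X^{(2)}=x^{(2)}\), \(X^{(1)}=Y^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}x^{(2)} \sim N_q(\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}),\Sigma_{11 \cdot 2})\).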

This again shows that, given \(X^{(2)}=x^{(2)}\), \((X^{(1)}|X^{(2)}=x^{(2)}) \sim N_q(\mu_{1 \cdot 2},\Sigma_{11 \cdot 2})\), where \(\mu_{1 \cdot 2}=\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(x^{(2)}-\mu^{(2)}), \, \Sigma_{11 \cdot 2}=\Sigma_{11}-\Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\).
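The conditional mean and covariance can be computed directly from the partitioned \(\mu\) and \(\Sigma\). The sketch below does so for an illustrative \(3 \times 3\) covariance matrix with \(q=2\) and checks the result against the top-left block of \(\Sigma^{-1}\) from (6.25); the function `conditional` and all numerical values are assumptions made for this example.

```python
import numpy as np

# Illustrative positive definite covariance matrix and mean vector (p = 3).
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 2.0, 0.5],
                  [0.8, 0.5, 1.0]])
mu = np.array([1.0, 0.0, -1.0])
q = 2                      # X^(1) = first q coordinates, X^(2) = the rest
x2 = np.array([0.5])       # observed value of X^(2)


def conditional(mu, Sigma, q, x2):
    """Mean and covariance of X^(1) given X^(2) = x2 for X ~ N_p(mu, Sigma)."""
    S11, S21, S22 = Sigma[:q, :q], Sigma[q:, :q], Sigma[q:, q:]
    # W = Sigma_12 Sigma_22^{-1}, obtained as (Sigma_22^{-1} Sigma_21)' by symmetry
    W = np.linalg.solve(S22, S21).T
    mu_c = mu[:q] + W @ (x2 - mu[q:])   # mu_{1.2}
    Sigma_c = S11 - W @ S21             # Sigma_{11.2}
    return mu_c, Sigma_c


mu_c, Sigma_c = conditional(mu, Sigma, q, x2)
print(mu_c)
print(Sigma_c)

# By (6.25), the top-left block of Sigma^{-1} is the inverse of Sigma_{11.2}.
top_left = np.linalg.inv(Sigma)[:q, :q]
```

Note that \(\Sigma_{11 \cdot 2}\) does not depend on the observed value \(x^{(2)}\); only the conditional mean does.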

6.2.3.2 Independence

Let \(X \sim N_p(\mu,\Sigma)\), \(Y=AX+a\), and \(Z= BX+b\). Then \(Y\) and \(Z\) are independent if and only if \(A\Sigma B'=0\).

Proof:

\[ \begin{gathered} W=\begin{pmatrix} Y \\ Z \end{pmatrix}=\begin{pmatrix} A \\ B \end{pmatrix}X+\begin{pmatrix} a \\ b \end{pmatrix} \\ W\sim N\begin{pmatrix} \begin{pmatrix} A\mu+a \\ B\mu +b\end{pmatrix},\begin{pmatrix} A\Sigma A' & A\Sigma B' \\ B\Sigma A' & B\Sigma B'\end{pmatrix} \end{pmatrix} \end{gathered} \tag{6.31} \]

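When \(A\Sigma B'=0\), the off-diagonal blocks of the covariance matrix of \(W\) vanish, and since \(W\) is jointly normal, \(Y\) and \(Z\) are independent. Conversely, if \(Y\) and \(Z\) are independent, then \(Cov(Y,Z)=A\Sigma B'=0\).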
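As a concrete instance of the criterion \(A\Sigma B'=0\), consider the sum and the difference of two jointly normal components. The sketch below, with illustrative covariance matrices, shows that they are independent when the two variances are equal and dependent otherwise.

```python
import numpy as np

A = np.array([[1.0, 1.0]])    # Y = X_1 + X_2
B = np.array([[1.0, -1.0]])   # Z = X_1 - X_2

# Equal variances: Cov(Y, Z) = Var(X_1) - Var(X_2) = 0, so Y and Z are independent.
Sigma_equal = np.array([[2.0, 0.5],
                        [0.5, 2.0]])
cross_equal = A @ Sigma_equal @ B.T

# Unequal variances: the cross-covariance is nonzero, so Y and Z are dependent.
Sigma_unequal = np.array([[2.0, 0.5],
                          [0.5, 1.0]])
cross_unequal = A @ Sigma_unequal @ B.T

print(cross_equal, cross_unequal)
```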

6.2.4 Partial and Multiple Correlation Coefficients

6.2.4.1 Partial Correlation Coefficient

For \((X^{(1)}|X^{(2)}=x^{(2)})\sim N_q(\mu_{1\cdot 2},\Sigma_{11 \cdot 2})\), we have \(E(X^{(1)}|X^{(2)})=\mu_{1 \cdot 2}=\mu^{(1)}+\Sigma_{12}\Sigma_{22}^{-1}(X^{(2)}-\mu^{(2)})\). This has the same form as the conditional-expectation regression in regression analysis, so \(\Sigma_{12}\Sigma_{22}^{-1}\) can be viewed as the matrix of regression coefficients of \(X^{(1)}\) on \(X^{(2)}\). Write \(\Sigma_{11 \cdot 2}=(\sigma_{ij \cdot q+1,...,p})_{q \times q}\); then \(r_{ij \cdot q+1,...,p}=\frac{\sigma_{ij \cdot q+1,...,p}}{(\sigma_{ii \cdot q+1,...,p}\sigma_{jj \cdot q+1,...,p})^{\frac{1}{2}}}\) is called the partial correlation coefficient of \(X_i\) and \(X_j\) given \(X^{(2)}\).

In other words, it is the ordinary correlation coefficient computed from the conditional covariance matrix.
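A partial correlation can therefore be computed by forming \(\Sigma_{11 \cdot 2}\) and normalizing its off-diagonal entry. The sketch below does this for an illustrative \(3 \times 3\) covariance matrix with \(q=2\). As a cross-check it uses the standard identity that, when \(q=2\), the same value equals \(-P_{12}/\sqrt{P_{11}P_{22}}\) with \(P=\Sigma^{-1}\); that identity is not stated in the notes and is brought in here only as an independent check.

```python
import numpy as np

# Illustrative covariance matrix of (X_1, X_2, X_3)'; we condition on X_3.
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 2.0, 0.5],
                  [0.8, 0.5, 1.0]])
q = 2

S11, S12 = Sigma[:q, :q], Sigma[:q, q:]
S21, S22 = Sigma[q:, :q], Sigma[q:, q:]
Sigma_cond = S11 - S12 @ np.linalg.solve(S22, S21)   # Sigma_{11.2}

# Partial correlation of X_1 and X_2 given X_3.
r_partial = Sigma_cond[0, 1] / np.sqrt(Sigma_cond[0, 0] * Sigma_cond[1, 1])

# Cross-check through the precision matrix P = Sigma^{-1} (valid here because q = 2).
P = np.linalg.inv(Sigma)
r_from_precision = -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])

# Ordinary (marginal) correlation of X_1 and X_2, for comparison.
r_marginal = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])

print(r_partial, r_from_precision, r_marginal)
```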

6.2.4.2 Multiple Correlation Coefficient

For a random vector \(X\) and a random variable \(y\), let \[ Z=\begin{pmatrix} X \\ y \end{pmatrix} \sim N\begin{pmatrix} \begin{pmatrix} \mu_X \\ \mu_y \end{pmatrix} , \begin{pmatrix} \Sigma_{XX} & \Sigma_{Xy} \\ \Sigma_{yX} & \sigma_{yy} \end{pmatrix} \end{pmatrix} \]

Then \[R=\begin{pmatrix} \frac{\Sigma_{yX}\Sigma_{XX}^{-1}\Sigma_{Xy}}{\sigma_{yy}} \end{pmatrix}^{\frac{1}{2}} \] is called the multiple correlation coefficient of \(y\) and \(X\).

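Here \(Z\) stacks the random vector \(X\) on top of the random variable \(y\); \(\Sigma_{yX}\) is a \(1 \times p\) row vector and \(\Sigma_{Xy}\) is a \(p \times 1\) column vector.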

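As a mnemonic: take two scalars from the full covariance matrix, namely \(\Sigma_{yX}\Sigma_{XX}^{-1}\Sigma_{Xy}\) and \(\sigma_{yy}\), and form their ratio with \(\sigma_{yy}\) in the denominator; the square root of this ratio is the multiple correlation coefficient.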

In particular, \(R=\max\limits_{Cov(a'X)=1} corr(y,a'X)\).

Proof:

\[ \begin{aligned} (corr(y,a'X))^2&= \frac{Cov^2(y,a'X)}{Cov(y)Cov(a'X)} \\ &= \frac{(\Sigma_{yX}a)^2}{\sigma_{yy}a'\Sigma_{XX}a} \\ &\leq \frac{(\Sigma_{Xy}'\Sigma_{XX}^{-1}\Sigma_{Xy})(a'\Sigma_{XX}a)}{\sigma_{yy}a'\Sigma_{XX}a} \\ &= \frac{(\Sigma_{Xy}'\Sigma_{XX}^{-1}\Sigma_{Xy})}{\sigma_{yy}} \\ &= \frac{\Sigma_{yX}\Sigma_{XX}^{-1}\Sigma_{Xy}}{\sigma_{yy}} \\ &= R^2 \end{aligned} \tag{6.32} \]

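The inequality step uses the Cauchy-Schwarz inequality: for \(B>0\), \((x'y)^2 \leq (x'Bx)(y'B^{-1}y)\). Taking \(B=\Sigma_{XX}\), with \(x=a\) and \(y=\Sigma_{Xy}\) in the inequality, cancels the factor \(a'\Sigma_{XX}a\) in the denominator. Equality holds when \(a\) is a positive multiple of \(\Sigma_{XX}^{-1}\Sigma_{Xy}\), so the upper bound \(R\) is attained.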
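The maximization property can be checked numerically: the direction \(a=\Sigma_{XX}^{-1}\Sigma_{Xy}\) should give a correlation equal to \(R\), and no other direction should exceed it. The sketch below uses an illustrative joint covariance matrix; since the correlation does not change when \(a\) is rescaled by a positive constant, the constraint \(Cov(a'X)=1\) is not imposed explicitly. The helper `corr_with_y` is a hypothetical name.

```python
import numpy as np

# Illustrative joint covariance of Z = (X', y)'; the last coordinate is y.
Sigma = np.array([[2.0, 0.5, 0.9],
                  [0.5, 1.0, 0.3],
                  [0.9, 0.3, 1.5]])
S_XX, S_Xy, s_yy = Sigma[:2, :2], Sigma[:2, 2], Sigma[2, 2]


def corr_with_y(a):
    """corr(y, a'X) computed from the covariance blocks."""
    return (S_Xy @ a) / np.sqrt(s_yy * (a @ S_XX @ a))


# Multiple correlation coefficient R of y and X.
R = np.sqrt(S_Xy @ np.linalg.solve(S_XX, S_Xy) / s_yy)

# Direction that attains equality in the Cauchy-Schwarz step.
a_star = np.linalg.solve(S_XX, S_Xy)

# Correlations for random directions, none of which should exceed R.
rng = np.random.default_rng(3)
trial = [corr_with_y(rng.normal(size=2)) for _ in range(1000)]

print(R, corr_with_y(a_star), max(trial))
```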