MAALA 3.9: Elementary Matrices and Equivalence

Note: these are study notes on Section 3.9, Elementary Matrices and Equivalence, of the book Matrix Analysis and Applied Linear Algebra.

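Decomposing a complicated object into a combination of basic objects is a common way of attacking mathematical problems; factoring is one example. Similarly, in matrix algebra a general matrix can sometimes be decomposed into a product of elementary matrices.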

Matrices of the form $\mathbf I - \mathbf{uv}^T$, where $\mathbf u$ and $\mathbf v$ are $n \times 1$ columns such that $\mathbf v^T \mathbf u \ne 1$, are called elementary matrices, and we know from the Sherman–Morrison formula that all such matrices are nonsingular and $$(\mathbf I - \mathbf{uv}^T)^{-1} = \mathbf I - \frac{\mathbf{uv}^T}{\mathbf v^T \mathbf u - 1}.$$ Notice that inverses of elementary matrices are elementary matrices.
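The inverse formula is easy to sanity-check numerically. The following is a minimal sketch, assuming NumPy is available (the notes themselves contain no code); the vectors `u` and `v` are arbitrary illustrative choices with $\mathbf v^T \mathbf u \ne 1$.

```python
import numpy as np

# Arbitrary illustrative vectors (not from the book); here v^T u = 3, so v^T u != 1.
u = np.array([[1.0], [2.0], [3.0]])
v = np.array([[1.0], [1.0], [0.0]])

I = np.eye(3)
vtu = (v.T @ u).item()               # the scalar v^T u
E = I - u @ v.T                      # elementary matrix I - u v^T
E_inv = I - (u @ v.T) / (vtu - 1.0)  # inverse predicted by the formula above

# Both products should reproduce the identity up to rounding error.
print(np.allclose(E @ E_inv, I), np.allclose(E_inv @ E, I))
```

Note that `E_inv` has the form $\mathbf I - \mathbf u \mathbf w^T$ with $\mathbf w = \mathbf v / (\mathbf v^T \mathbf u - 1)$, which is why the inverse is again an elementary matrix.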

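We are particularly interested in the elementary matrices associated with the elementary row and column operations. These operations are defined as follows:

Type I: interchange rows (columns) $i$ and $j$.
Type II: multiply row (column) $i$ by a scalar $\alpha$ ($\alpha \ne 0$).
Type III: add a multiple of row (column) $i$ to row (column) $j$.

The elementary matrices corresponding to these three operations are $$\begin{aligned}\mathbf E_1&=\mathbf I-\mathbf u \mathbf u^T, \quad \mathbf u=\mathbf e_j-\mathbf e_i, \\ \mathbf E_2&=\mathbf I-(1-\alpha)\, \mathbf e_i \mathbf e_i^T, \\ \mathbf E_3&=\mathbf I+\alpha\, \mathbf e_j \mathbf e_i^T. \end{aligned}$$ For Type III, note that as a left-hand multiplier $\mathbf E_3$ adds $\alpha$ times row $i$ to row $j$, whereas as a right-hand multiplier it adds $\alpha$ times column $j$ to column $i$, so the roles of $i$ and $j$ are interchanged in the column version. One can verify that these matrices have the following properties: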

When used as a left-hand multiplier, an elementary matrix of Type I, II, or III executes the corresponding row operation.

When used as a right-hand multiplier, an elementary matrix of Type I, II, or III executes the corresponding column operation.

For example, for Type I (the displays below are drawn for $i < j$): $$\begin{aligned}\mathbf E_1 \mathbf A &=(\mathbf I-(\mathbf e_j-\mathbf e_i)(\mathbf e_j-\mathbf e_i)^T)\mathbf A\\&=\mathbf A-(\mathbf e_j \mathbf e_j^T+\mathbf e_i \mathbf e_i^T-\mathbf e_i \mathbf e_j^T-\mathbf e_j \mathbf e_i^T)\mathbf A\\&=\mathbf A-(\mathbf e_j \mathbf A_{j*}+\mathbf e_i \mathbf A_{i*}-\mathbf e_i \mathbf A_{j*}-\mathbf e_j \mathbf A_{i*}) \\ &=\mathbf A-\left(\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf 0 \\ \vdots \\ \mathbf A_{j*} \\ \vdots \\ \mathbf 0 \end{matrix}\right]+\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf A_{i*} \\ \vdots \\ \mathbf 0\\ \vdots \\ \mathbf 0 \end{matrix}\right]-\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf A_{j*} \\ \vdots \\ \mathbf 0 \\ \vdots \\ \mathbf 0 \end{matrix}\right]-\left[\begin{matrix}\mathbf 0 \\ \vdots \\ \mathbf 0 \\ \vdots \\ \mathbf A_{i*} \\ \vdots \\ \mathbf 0 \end{matrix}\right] \right),\\ \mathbf A \mathbf E_1 &=\mathbf A(\mathbf I-(\mathbf e_j-\mathbf e_i)(\mathbf e_j-\mathbf e_i)^T)\\&=\mathbf A-\mathbf A(\mathbf e_j \mathbf e_j^T+\mathbf e_i \mathbf e_i^T-\mathbf e_i \mathbf e_j^T-\mathbf e_j \mathbf e_i^T)\\&=\mathbf A-(\mathbf A_{*j} \mathbf e_j ^T+ \mathbf A_{*i} \mathbf e_i^T- \mathbf A_{*j} \mathbf e_i^T- \mathbf A_{*i} \mathbf e_j^T) \\ &=\mathbf A-\Big(\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf 0 & \cdots & \mathbf A_{*j} & \cdots & \mathbf 0 \end{matrix}\right]+\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf A_{*i} & \cdots & \mathbf 0 & \cdots & \mathbf 0 \end{matrix}\right]\\&\qquad\quad-\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf A_{*j} & \cdots & \mathbf 0 & \cdots & \mathbf 0 \end{matrix}\right]-\left[\begin{matrix}\mathbf 0 & \cdots & \mathbf 0 & \cdots & \mathbf A_{*i} & \cdots & \mathbf 0 \end{matrix}\right]\Big).\end{aligned}$$ In the first case, row $i$ of the result is $\mathbf A_{i*}-\mathbf A_{i*}+\mathbf A_{j*}=\mathbf A_{j*}$ and row $j$ is $\mathbf A_{i*}$, so $\mathbf E_1 \mathbf A$ is $\mathbf A$ with rows $i$ and $j$ interchanged; in the second case, $\mathbf A \mathbf E_1$ is $\mathbf A$ with columns $i$ and $j$ interchanged. Types II and III are handled similarly.
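The row and column behaviour of all three types can also be checked numerically. This is a minimal sketch assuming NumPy; the size `n`, the 0-based indices `i`, `j`, the scalar `alpha`, and the test matrix `A` are illustrative choices, not values from the book.

```python
import numpy as np

n = 3
I = np.eye(n)
e = [I[:, [k]] for k in range(n)]        # e[k] is a unit column (0-based index k)

i, j = 0, 2                              # 0-based stand-ins for the indices i and j
alpha = 5.0

u = e[j] - e[i]
E1 = I - u @ u.T                         # Type I
E2 = I - (1 - alpha) * (e[i] @ e[i].T)   # Type II
E3 = I + alpha * (e[j] @ e[i].T)         # Type III

A = np.arange(1.0, 10.0).reshape(n, n)

# Expected results, built by operating on rows and columns directly.
row_swap = A.copy()
row_swap[[i, j]] = A[[j, i]]
row_scale = A.copy()
row_scale[i] *= alpha
row_add = A.copy()
row_add[j] += alpha * A[i]               # alpha * (row i) added to row j

col_swap = A.copy()
col_swap[:, [i, j]] = A[:, [j, i]]
col_scale = A.copy()
col_scale[:, i] *= alpha
col_add = A.copy()
col_add[:, i] += alpha * A[:, j]         # alpha * (column j) added to column i

print(np.allclose(E1 @ A, row_swap), np.allclose(E2 @ A, row_scale), np.allclose(E3 @ A, row_add))
print(np.allclose(A @ E1, col_swap), np.allclose(A @ E2, col_scale), np.allclose(A @ E3, col_add))
```

The last check makes the index convention for Type III explicit: the same matrix $\mathbf I+\alpha \mathbf e_j \mathbf e_i^T$ acts on row $j$ from the left but on column $i$ from the right.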

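In other words, performing an elementary row (column) operation on a matrix is the same as multiplying it on the left (right) by the corresponding elementary matrix.
From this we obtain the following result, which is the decomposition of a general matrix into a product of elementary matrices mentioned at the beginning: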

$\mathbf A$ is a nonsingular matrix if and only if $\mathbf A$ is the product of elementary matrices of Type I, II, or III.

$\Longrightarrow$:
If $\mathbf A$ is nonsingular, then elementary row operations reduce it to its reduced row echelon form, which is exactly the identity matrix $\mathbf I$. Let $\mathbf G_1,\mathbf G_2, \cdots, \mathbf G_k$ denote the elementary matrices corresponding to the successive row operations. Then $$\mathbf G_k \cdots \mathbf G_2 \mathbf G_1 \mathbf A=\mathbf I\text{ or, equivalently, }\mathbf A=\mathbf G_1^{-1}\mathbf G_2^{-1}\cdots \mathbf G_k^{-1}.$$ From the definitions, the inverse of an elementary matrix of Type I, II, or III is again an elementary matrix of Type I, II, or III, so $\mathbf A$ is a product of elementary matrices of these types.
$\Longleftarrow$:
If $\mathbf A=\mathbf E_1 \mathbf E_2 \cdots \mathbf E_k$ is a product of elementary matrices, then $\mathbf A$ is nonsingular, because elementary matrices are nonsingular and a product of nonsingular matrices is nonsingular.
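The forward direction of this proof is constructive, and can be sketched in code. The function below is a hypothetical helper (the name `elementary_factors` and the test matrix are assumptions for illustration, and NumPy is assumed to be available): it runs Gauss–Jordan elimination with partial pivoting and records each row operation as an elementary matrix $\mathbf G_k$.

```python
import numpy as np

def elementary_factors(A):
    """Reduce a nonsingular A to I by Gauss-Jordan elimination and
    return the elementary matrices G_1, ..., G_k that were applied."""
    M = np.array(A, dtype=float)
    n = M.shape[0]
    Gs = []
    for c in range(n):
        p = c + int(np.argmax(np.abs(M[c:, c])))   # partial pivoting
        if np.isclose(M[p, c], 0.0):
            raise ValueError("matrix is singular")
        if p != c:                                  # Type I: interchange rows c and p
            G = np.eye(n)
            G[[c, p]] = G[[p, c]]
            M = G @ M
            Gs.append(G)
        if not np.isclose(M[c, c], 1.0):            # Type II: scale the pivot to 1
            G = np.eye(n)
            G[c, c] = 1.0 / M[c, c]
            M = G @ M
            Gs.append(G)
        for r in range(n):                          # Type III: clear the rest of column c
            if r != c and not np.isclose(M[r, c], 0.0):
                G = np.eye(n)
                G[r, c] = -M[r, c]
                M = G @ M
                Gs.append(G)
    return Gs

A = np.array([[0.0, 2.0, 1.0], [1.0, 1.0, 0.0], [2.0, 0.0, 3.0]])
Gs = elementary_factors(A)

prod = np.eye(3)
for G in Gs:                 # builds G_k ... G_2 G_1
    prod = G @ prod

recon = np.eye(3)
for G in Gs:                 # builds G_1^{-1} G_2^{-1} ... G_k^{-1}
    recon = recon @ np.linalg.inv(G)

print(np.allclose(prod @ A, np.eye(3)), np.allclose(recon, A))
```

Here `np.linalg.inv` is used only for brevity; each inverse could equally be written down directly as an elementary matrix of the same type.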

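Having identified elementary row (column) operations with left (right) multiplication by elementary matrices, we now introduce the notion of equivalence.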

Whenever $\mathbf B$ can be derived from $\mathbf A$ by a combination of elementary row and column operations, we write $\mathbf A \sim \mathbf B$, and we say that $\mathbf A$ and $\mathbf B$ are equivalent matrices. $$\mathbf A \sim \mathbf B \iff \mathbf{PAQ} = \mathbf B \text{ for nonsingular } \mathbf P \text{ and } \mathbf Q.$$

Whenever $\mathbf B$ can be obtained from $\mathbf A$ by performing a sequence of elementary row operations only, we write $\mathbf A \overset{\text{row}}{\sim} \mathbf B$, and we say that $\mathbf A$ and $\mathbf B$ are row equivalent. In other words, $$\mathbf A \overset{\text{row}}{\sim} \mathbf B \iff \mathbf{PA} = \mathbf B \text{ for a nonsingular }\mathbf P.$$

Whenever $\mathbf B$ can be obtained from $\mathbf A$ by performing a sequence of elementary column operations only, we write $\mathbf A \overset{\text{col}}{\sim} \mathbf B$, and we say that $\mathbf A$ and $\mathbf B$ are column equivalent. In other words, $$\mathbf A \overset{\text{col}}{\sim} \mathbf B \iff \mathbf{AQ} = \mathbf B \text{ for a nonsingular }\mathbf Q.$$

Two properties follow easily:

$\mathbf A \sim \mathbf B \iff \mathbf B \sim \mathbf A$ (because $\mathbf P$ and $\mathbf Q$ are invertible)
$\mathbf A \sim \mathbf B \text{ and } \mathbf B \sim \mathbf C \Longrightarrow \mathbf A \sim \mathbf C$

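When the reduced row echelon form was introduced, it was stated that for every matrix $\mathbf A$ the reduced row echelon form $\mathbf{E_A}$ is unique. With equivalence in hand, we can now give a rigorous proof of this fact.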

Without loss of generality, assume that $\mathbf A$ is square (if it is not, pad it with zeros to make it square; the reduced row echelon form then merely gains some zero rows or zero columns).
Suppose the reduced row echelon form is not unique, and let $\mathbf E_1$ and $\mathbf E_2$ be two such forms, so that $\mathbf A \overset{\text{row}}{\sim} \mathbf E_1$ and $\mathbf A \overset{\text{row}}{\sim} \mathbf E_2$. We first show that the basic columns of $\mathbf E_1$ and $\mathbf E_2$ occupy the same positions.
Introduce a new form $\mathbf T_i$: an upper-triangular matrix, obtained from $\mathbf E_i$ by interchanging rows, whose basic columns are in the same positions as those of $\mathbf E_i$ but whose pivotal 1's all lie on the main diagonal. For example, $$\mathbf E=\begin{pmatrix}1 & 2 & 0 \\ 0 & 0& 1 \\ 0 & 0 & 0\end{pmatrix},\text{ then }\mathbf T=\begin{pmatrix}1 & 2 & 0 \\ 0 & 0& 0 \\ 0 & 0 & 1\end{pmatrix}.$$
Next introduce $\mathbf U_i$, which is obtained from $\mathbf T_i$ by Type I operations: $$\mathbf U_i=\mathbf Q_i^T \mathbf T_i \mathbf Q_i=\begin{pmatrix} \mathbf I_{r_i} & \mathbf J_i \\ \mathbf 0 & \mathbf 0 \end{pmatrix},$$ where $\mathbf Q_i$ is a product of Type I elementary matrices that moves the basic columns to the left (right multiplication by $\mathbf Q_i$ permutes the columns, and left multiplication by $\mathbf Q_i^T$ applies the same permutation to the rows). In the example above, $$\mathbf U=\begin{pmatrix}1 & 0 & 2 \\ 0 & 1& 0 \\ 0 & 0 & 0\end{pmatrix}.$$ It is also easy to verify that $\mathbf U_i=\mathbf E_i \mathbf Q_i$.

A short aside: Type I elementary matrices are permutation matrices, and in particular they satisfy $\mathbf E_1^{-1}=\mathbf E_1^T$ (in this aside only, $\mathbf E_1$ denotes the Type I elementary matrix $\mathbf I-\mathbf u \mathbf u^T$ with $\mathbf u=\mathbf e_j-\mathbf e_i$, not an echelon form). This is because $\mathbf u^T \mathbf u=2$, so $$\mathbf E_1^{-1}=(\mathbf I-\mathbf u \mathbf u^T)^{-1}=\mathbf I-\frac{\mathbf u \mathbf u^T}{\mathbf u^T \mathbf u-1}=\mathbf I-\frac{\mathbf u \mathbf u^T}{( \mathbf e_j-\mathbf e_i)^T( \mathbf e_j-\mathbf e_i)-1}=\mathbf I-\mathbf u \mathbf u^T=\mathbf E_1^T\ (=\mathbf E_1).$$ It follows that a product of Type I elementary matrices is also a permutation matrix, whose inverse is its transpose.

By the aside, $\mathbf Q_i$ is a permutation matrix, so $\mathbf T_i=\mathbf Q_i \mathbf U_i \mathbf Q_i^T$. Moreover $\mathbf U_i^2=\mathbf U_i$, and hence $$\mathbf T_i^2=(\mathbf Q_i \mathbf U_i \mathbf Q_i^T)(\mathbf Q_i \mathbf U_i \mathbf Q_i^T)=\mathbf Q_i \mathbf U_i^2 \mathbf Q_i^T=\mathbf Q_i \mathbf U_i \mathbf Q_i^T=\mathbf T_i.$$ From $\mathbf E_1 \overset{\text{row}}{\sim} \mathbf T_1$, $\mathbf E_2 \overset{\text{row}}{\sim} \mathbf T_2$, $\mathbf E_1 \overset{\text{row}}{\sim} \mathbf A \overset{\text{row}}{\sim} \mathbf E_2$, and the transitivity of row equivalence, we get $\mathbf T_1 \overset{\text{row}}{\sim} \mathbf T_2$, so there is a nonsingular matrix $\mathbf R$ with $\mathbf R \mathbf T_1=\mathbf T_2$. Therefore $$\mathbf T_2=\mathbf R \mathbf T_1=\mathbf R \mathbf T_1 \mathbf T_1=\mathbf T_2 \mathbf T_1 \text{ and } \mathbf T_1=\mathbf R^{-1}\mathbf T_2=\mathbf R^{-1} \mathbf T_2 \mathbf T_2=\mathbf T_1 \mathbf T_2.$$ Because the $\mathbf T_i$ are upper triangular, $\mathbf T_1 \mathbf T_2$ and $\mathbf T_2 \mathbf T_1$ have exactly the same diagonal entries (the products of the corresponding diagonal entries of $\mathbf T_1$ and $\mathbf T_2$), so $\mathbf T_1$ and $\mathbf T_2$ have the same diagonal. The 1's on the diagonal of $\mathbf T_i$ mark its basic columns, so the basic columns of $\mathbf T_1$ are in the same positions as those of $\mathbf T_2$. Since the basic columns of $\mathbf T_i$ are in the same positions as those of $\mathbf E_i$, the basic columns of $\mathbf E_1$ are in the same positions as those of $\mathbf E_2$. We may therefore take $\mathbf Q_1=\mathbf Q_2=\mathbf Q$, that is, $$\mathbf E_1 \mathbf Q=\begin{pmatrix} \mathbf I_r & \mathbf J_1 \\ \mathbf 0 & \mathbf 0 \end{pmatrix} \text{ and }\mathbf E_2 \mathbf Q=\begin{pmatrix} \mathbf I_r & \mathbf J_2 \\ \mathbf 0 & \mathbf 0 \end{pmatrix}.$$
Since $\mathbf E_1 \overset{\text{row}}{\sim} \mathbf E_2$, there is a nonsingular matrix $\mathbf P$ with $\mathbf P \mathbf E_1=\mathbf E_2$. Multiplying both sides on the right by $\mathbf Q$ gives $$\mathbf P \mathbf E_1 \mathbf Q=\mathbf E_2 \mathbf Q,\text{ or } \begin{pmatrix} \mathbf P_{11} & \mathbf P_{12} \\ \mathbf P_{21} & \mathbf P_{22} \end{pmatrix}\begin{pmatrix} \mathbf I_r & \mathbf J_1 \\ \mathbf 0 & \mathbf 0 \end{pmatrix}=\begin{pmatrix} \mathbf I_r & \mathbf J_2 \\ \mathbf 0 & \mathbf 0 \end{pmatrix},$$ and expanding yields $\mathbf P_{11}=\mathbf I_r$ and $\mathbf P_{11}\mathbf J_1=\mathbf J_2$. Hence $\mathbf J_1=\mathbf J_2$ and $\mathbf E_1=\mathbf E_2$.

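Elementary row operations reduce a matrix to its reduced row echelon form; if we continue with elementary column operations, we obtain an even simpler form, the rank normal form.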

If $\mathbf A$ is an $m \times n$ matrix such that $\operatorname{rank}(\mathbf A)=r$, then $$\mathbf A \sim \mathbf N_r=\begin{pmatrix}\mathbf I_r & \mathbf 0\\ \mathbf 0 & \mathbf 0 \end{pmatrix}.$$ $\mathbf N_r$ is called the rank normal form for $\mathbf A$, and it is the end product of a complete reduction of $\mathbf A$ by using both row and column operations.

Proof: Elementary row operations reduce $\mathbf A$ to its reduced row echelon form $\mathbf{E_A}$, which can be written as $\mathbf A \overset{\text{row}}{\sim} \mathbf{E_A}$ or $\mathbf{PA}=\mathbf{E_A}$. If $\operatorname{rank}(\mathbf A)=r$, then the basic columns of $\mathbf{E_A}$ are $r$ unit columns. Move them to the far left by column interchanges; this step can be represented by a product $\mathbf Q_1$ of Type I elementary matrices, so that $$\mathbf{PAQ}_1=\mathbf{E_A}\mathbf Q_1=\begin{pmatrix}\mathbf I_r & \mathbf J\\ \mathbf 0 & \mathbf 0 \end{pmatrix}.$$ Multiplying both sides on the right by $\mathbf Q_2=\begin{pmatrix}\mathbf I_r & -\mathbf J\\ \mathbf 0 & \mathbf I \end{pmatrix}$ gives $$\mathbf{PAQ}_1\mathbf Q_2=\begin{pmatrix}\mathbf I_r & \mathbf J\\ \mathbf 0 & \mathbf 0 \end{pmatrix}\begin{pmatrix}\mathbf I_r & -\mathbf J\\ \mathbf 0 & \mathbf I \end{pmatrix}=\begin{pmatrix}\mathbf I_r & \mathbf 0\\ \mathbf 0 & \mathbf 0 \end{pmatrix}.$$ Since $\mathbf P$ and $\mathbf Q=\mathbf Q_1 \mathbf Q_2$ are nonsingular, $\mathbf A \sim \mathbf N_r$.
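As a small worked instance of this construction (the matrix is chosen only for illustration and is assumed to be already in reduced row echelon form, so that $\mathbf P=\mathbf I$), interchange columns 2 and 3: $$\mathbf{E_A}=\begin{pmatrix}1&2&0\\0&0&1\\0&0&0\end{pmatrix},\quad \mathbf Q_1=\begin{pmatrix}1&0&0\\0&0&1\\0&1&0\end{pmatrix},\quad \mathbf{E_A}\mathbf Q_1=\begin{pmatrix}1&0&2\\0&1&0\\0&0&0\end{pmatrix}.$$ Here $r=2$ and $\mathbf J=\begin{pmatrix}2\\0\end{pmatrix}$, so $$\mathbf Q_2=\begin{pmatrix}1&0&-2\\0&1&0\\0&0&1\end{pmatrix},\qquad \mathbf{E_A}\mathbf Q_1\mathbf Q_2=\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix}=\mathbf N_2.$$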

Example: explain why $$\operatorname{rank}\begin{pmatrix}\mathbf A & \mathbf 0 \\ \mathbf 0 & \mathbf B \end{pmatrix}=\operatorname{rank}(\mathbf A)+\operatorname{rank}(\mathbf B).$$
Suppose $\operatorname{rank}(\mathbf A)=r$ and $\operatorname{rank}(\mathbf B)=s$, with $\mathbf P_r \mathbf A \mathbf Q_r=\mathbf N_r$ and $\mathbf P_s \mathbf B \mathbf Q_s=\mathbf N_s$ for nonsingular $\mathbf P_r, \mathbf Q_r, \mathbf P_s, \mathbf Q_s$. Then $$\begin{pmatrix}\mathbf P_r & \mathbf 0\\ \mathbf 0 & \mathbf P_s \end{pmatrix}\begin{pmatrix}\mathbf A & \mathbf 0\\ \mathbf 0 & \mathbf B \end{pmatrix}\begin{pmatrix}\mathbf Q_r & \mathbf 0\\ \mathbf 0 & \mathbf Q_s \end{pmatrix}=\begin{pmatrix}\mathbf N_r & \mathbf 0\\ \mathbf 0 & \mathbf N_s \end{pmatrix}.$$ Since the block-diagonal multipliers are nonsingular, the original matrix has the same rank as the matrix on the right, so $$\operatorname{rank}\begin{pmatrix}\mathbf A & \mathbf 0\\ \mathbf 0 & \mathbf B \end{pmatrix}=\operatorname{rank}\begin{pmatrix}\mathbf N_r & \mathbf 0\\ \mathbf 0 & \mathbf N_s \end{pmatrix}=r+s.$$
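A quick numerical illustration of this identity, assuming NumPy; the two small matrices are arbitrary examples, and `np.linalg.matrix_rank` estimates rank from singular values, so it is a floating-point check rather than a proof.

```python
import numpy as np

A = np.array([[1, 2], [2, 4]])          # second row is twice the first
B = np.array([[1, 0, 1], [0, 1, 1]])    # two independent rows

# Assemble the block-diagonal matrix [[A, 0], [0, B]].
block = np.block([
    [A, np.zeros((2, 3), dtype=int)],
    [np.zeros((2, 2), dtype=int), B],
])

r = np.linalg.matrix_rank(A)
s = np.linalg.matrix_rank(B)
total = np.linalg.matrix_rank(block)
print(r, s, total)
```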

Now that equivalence has been defined, how do we decide whether $\mathbf A \sim \mathbf B$, $\mathbf A \overset{\text{row}}{\sim} \mathbf B$, or $\mathbf A \overset{\text{col}}{\sim} \mathbf B$? Explicitly searching for a sequence of operations that turns $\mathbf A$ into $\mathbf B$ is cumbersome; there are simpler tests:

Testing for Equivalence
For $m \times n$ matrices $\mathbf A$ and $\mathbf B$ the following statements are true.

$\mathbf A \sim \mathbf B \iff \operatorname{rank}(\mathbf A)=\operatorname{rank}(\mathbf B).$

$\mathbf A \overset{\text{row}}{\sim} \mathbf B \iff \mathbf{E_A}=\mathbf{E_B}.$

$\mathbf A \overset{\text{col}}{\sim} \mathbf B \iff \mathbf E_{\mathbf A^T}=\mathbf E_{\mathbf B^T}.$
Corollary: Multiplication by nonsingular matrices cannot change rank.

Proof:

First statement, $\Longrightarrow$: Let $\operatorname{rank}(\mathbf A)=r$ and $\operatorname{rank}(\mathbf B)=s$, so that $\mathbf A \sim \mathbf N_r$ and $\mathbf B \sim \mathbf N_s$. If $\mathbf A \sim \mathbf B$, then by symmetry and transitivity $\mathbf N_r \sim \mathbf N_s$, which forces $r=s$.
First statement, $\Longleftarrow$: Let $\operatorname{rank}(\mathbf A)=\operatorname{rank}(\mathbf B)=r$. Then $\mathbf A \sim \mathbf N_r$ and $\mathbf B \sim \mathbf N_r$, hence $\mathbf A \sim \mathbf B$.
Second statement, $\Longrightarrow$: Since $\mathbf A \overset{\text{row}}{\sim} \mathbf B$ and $\mathbf B \overset{\text{row}}{\sim} \mathbf{E_B}$, we have $\mathbf A \overset{\text{row}}{\sim} \mathbf{E_B}$. Because the reduced row echelon form of a matrix is uniquely determined, $\mathbf{E_B}=\mathbf{E_A}$.
Second statement, $\Longleftarrow$: $\mathbf A \overset{\text{row}}{\sim} \mathbf{E_A}=\mathbf{E_B} \overset{\text{row}}{\sim} \mathbf B \Longrightarrow \mathbf A \overset{\text{row}}{\sim} \mathbf B.$
Third statement: $\mathbf A \overset{\text{col}}{\sim} \mathbf B \iff \mathbf{AQ}=\mathbf B \iff (\mathbf{AQ})^T=\mathbf B^T \iff \mathbf Q^T \mathbf A^T=\mathbf B^T \iff \mathbf A^T \overset{\text{row}}{\sim} \mathbf B^T$, and the second statement applied to $\mathbf A^T$ and $\mathbf B^T$ gives the claim.
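The three tests translate directly into code once a reduced row echelon form routine is available. The sketch below uses only the Python standard library with exact `Fraction` arithmetic; the helper names (`rref`, `equivalent`, `row_equivalent`, `col_equivalent`) and the example matrices are assumptions made for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return (E, pivots): the reduced row echelon form and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((k for k in range(r, m) if M[k][c] != 0), None)
        if p is None:
            continue                              # no pivot in this column
        M[r], M[p] = M[p], M[r]                   # Type I
        M[r] = [x / M[r][c] for x in M[r]]        # Type II
        for k in range(m):                        # Type III
            if k != r and M[k][c] != 0:
                f = M[k][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

def transpose(rows):
    return [list(col) for col in zip(*rows)]

def rank(rows):
    return len(rref(rows)[1])

def equivalent(A, B):          # A ~ B  iff  rank(A) = rank(B)
    return rank(A) == rank(B)

def row_equivalent(A, B):      # A ~row B  iff  E_A = E_B
    return rref(A)[0] == rref(B)[0]

def col_equivalent(A, B):      # A ~col B  iff  E_{A^T} = E_{B^T}
    return row_equivalent(transpose(A), transpose(B))

A = [[1, 2, 3], [2, 4, 7]]
B = [[3, 6, 10], [1, 2, 4]]    # rows are (row1 + row2) and (row2 - row1) of A
C = [[1, 0, 0], [0, 1, 0]]

print(equivalent(A, C), row_equivalent(A, B), row_equivalent(A, C), col_equivalent(A, C))
```

All three functions assume the two matrices have the same shape, as in the statement of the tests.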

From the criteria above, it is easy to deduce that transposition (and conjugate transposition) does not change the rank of a matrix.

Transposition does not change the rank, i.e., for all $m\times n$ matrices, $\operatorname{rank}(\mathbf A)=\operatorname{rank}(\mathbf A^T)$ and $\operatorname{rank}(\mathbf A)=\operatorname{rank}(\mathbf A^*)$.

Proof: Let $\operatorname{rank}(\mathbf A)=r$, and let $\mathbf P$ and $\mathbf Q$ be nonsingular matrices such that $$\mathbf{PAQ}=\mathbf N_r=\begin{pmatrix} \mathbf I_r &\mathbf 0_{r \times (n-r)} \\ \mathbf 0_{(m-r)\times r} &\mathbf 0_{(m-r)\times(n-r)}\end{pmatrix}.$$ Transposing both sides gives $\mathbf Q^T\mathbf A^T\mathbf P^T=\mathbf N_r^T$. Since $\mathbf Q^T$ and $\mathbf P^T$ are nonsingular, $\mathbf A^T \sim \mathbf N_r^T$. Therefore $$\operatorname{rank}(\mathbf A^T)=\operatorname{rank}(\mathbf N_r^T)=\operatorname{rank}\begin{pmatrix} \mathbf I_r &\mathbf 0_{r \times (m-r)} \\ \mathbf 0_{(n-r)\times r} &\mathbf 0_{(n-r)\times(m-r)}\end{pmatrix}=r=\operatorname{rank}(\mathbf A).$$ To prove $\operatorname{rank}(\mathbf A)=\operatorname{rank}(\mathbf A^*)$, it remains only to show that $\operatorname{rank}(\bar{\mathbf A})=\operatorname{rank}(\mathbf A)$. Similarly, conjugating both sides of the original equation gives $$\mathbf N_r=\overline{\mathbf N_r}=\overline{\mathbf{PAQ}}=\bar{\mathbf P}\,\bar{\mathbf A}\,\bar{\mathbf Q}.$$ The conjugate of a nonsingular matrix is nonsingular (because $\bar{\mathbf K}^{-1}=\overline{\mathbf K^{-1}}$), so $\mathbf N_r \sim \bar{\mathbf A}$, and therefore $$\operatorname{rank}(\mathbf A)=\operatorname{rank}(\bar{\mathbf A})=\operatorname{rank}(\bar{\mathbf A}^T)=\operatorname{rank}(\mathbf A^*).$$
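For a concrete complex example, the four ranks can be compared numerically. This is only an illustrative check assuming NumPy; the matrix is an arbitrary rank-one example.

```python
import numpy as np

# Illustrative complex matrix: row 2 = i * row 1 and row 3 = 2 * row 1, so the rank is 1.
A = np.array([[1, 1j], [1j, -1], [2, 2j]])

ranks = {
    "A": np.linalg.matrix_rank(A),
    "A^T": np.linalg.matrix_rank(A.T),
    "conj(A)": np.linalg.matrix_rank(A.conj()),
    "A^*": np.linalg.matrix_rank(A.conj().T),
}
print(ranks)
```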

Some exercises with useful conclusions

Suppose that $\mathbf A$ is an $m\times n$ matrix.

If $[\mathbf A|\mathbf I_m]$ is row reduced to a matrix $[\mathbf B|\mathbf P]$, explain why $\mathbf P$ must be a nonsingular matrix such that $\mathbf{PA}=\mathbf B$.

If $\left[\begin{array}{c}\mathbf A \\ \hline \mathbf I_n\end{array}\right]$ is column reduced to $\left[\begin{array}{c}\mathbf C \\ \hline \mathbf Q\end{array}\right]$, explain why $\mathbf Q$ must be a nonsingular matrix such that $\mathbf{AQ}=\mathbf C$.

For the first part, let $\mathbf G_1,\cdots,\mathbf G_k$ be the elementary matrices corresponding to the row operations that take $[\mathbf A|\mathbf I]$ to $[\mathbf B|\mathbf P]$. Then $$\begin{aligned}\mathbf G_k \cdots \mathbf G_2\mathbf G_1[\mathbf A|\mathbf I]=[\mathbf B|\mathbf P] &\Longrightarrow [\mathbf G_k \cdots \mathbf G_2\mathbf G_1\mathbf A\,|\,\mathbf G_k \cdots \mathbf G_2\mathbf G_1\mathbf I]=[\mathbf B|\mathbf P] \\ &\Longrightarrow \mathbf G_k \cdots \mathbf G_2\mathbf G_1\mathbf A=\mathbf B, \quad \mathbf G_k \cdots \mathbf G_2\mathbf G_1=\mathbf P \\ &\Longrightarrow \mathbf{PA}=\mathbf B.\end{aligned}$$ Moreover, $\mathbf P$ is a product of elementary matrices and is therefore nonsingular.
The second part is analogous, with left multiplication replaced by right multiplication.
In practice one often takes $\mathbf B=\mathbf{E_A}$, which yields the nonsingular matrix $\mathbf P$ (a product of elementary matrices) that reduces $\mathbf A$ to its reduced row echelon form.
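The augmented-matrix trick can be sketched in a few lines of standard-library Python with exact `Fraction` arithmetic. The function name `row_reduce_augmented`, the helper `matmul`, and the example matrix are assumptions for illustration; pivots are searched only in the $\mathbf A$ block, while every row operation is applied to the whole augmented row.

```python
from fractions import Fraction

def row_reduce_augmented(A):
    """Row reduce [A | I_m] to [E_A | P] and return (E_A, P)."""
    m, n = len(A), len(A[0])
    M = [[Fraction(x) for x in row] + [Fraction(int(i == k)) for k in range(m)]
         for i, row in enumerate(A)]
    r = 0
    for c in range(n):                            # look for pivots in the A block only
        p = next((k for k in range(r, m) if M[k][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]                   # Type I
        M[r] = [x / M[r][c] for x in M[r]]        # Type II
        for k in range(m):                        # Type III
            if k != r and M[k][c] != 0:
                f = M[k][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[r])]
        r += 1
        if r == m:
            break
    return [row[:n] for row in M], [row[n:] for row in M]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 1], [2, 4, 3], [3, 6, 4]]
E_A, P = row_reduce_augmented(A)

print(E_A)
print(P)
print(matmul(P, A) == E_A)
```

The right-hand block simply accumulates the product $\mathbf G_k \cdots \mathbf G_1$ as the reduction proceeds, which is exactly the argument given above.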

Prove that $\operatorname{rank}(\mathbf A_{m\times n}) = 1$ if and only if there are nonzero columns $\mathbf u_{m\times 1}$ and $\mathbf v_{n\times 1}$ such that $\mathbf A=\mathbf{uv}^T$.

Prove that if $\operatorname{rank}(\mathbf A_{n\times n})=1$, then $\mathbf A^2 = \tau \mathbf A$, where $\tau = \operatorname{tr}(\mathbf A)$.

For the first statement, $\Longrightarrow$: Since $\operatorname{rank}(\mathbf A_{m\times n}) = 1$, there exist nonsingular matrices $\mathbf P,\mathbf Q$ such that $\mathbf{PAQ}=\mathbf N_1=\mathbf e_1 \mathbf e_1^T$, where $\mathbf e_1$ is $m \times 1$ and $\mathbf e_1^T$ is $1 \times n$. Then $$\mathbf A=\mathbf P^{-1}\mathbf e_1 \mathbf e_1^T \mathbf Q^{-1}=(\mathbf P^{-1})_{*1}(\mathbf Q^{-1})_{1*}=\mathbf{uv}^T,$$ where $\mathbf u=(\mathbf P^{-1})_{*1}$ and $\mathbf v^T=(\mathbf Q^{-1})_{1*}$ are nonzero because $\mathbf P^{-1}$ and $\mathbf Q^{-1}$ are nonsingular.
$\Longleftarrow$: If $\mathbf u$ and $\mathbf v$ are nonzero columns, then $$\mathbf u \overset{\text{row}}{\sim} \mathbf e_1 \text{ and }\mathbf v^T \overset{\text{col}}{\sim} \mathbf e_1^T \Longrightarrow \mathbf A=\mathbf{uv}^T \sim \mathbf e_1 \mathbf e_1^T=\mathbf N_1 \Longrightarrow \operatorname{rank}(\mathbf A)=1.$$
For the second statement, use the first to write $\mathbf A=\mathbf{uv}^T$. Since $\tau=\operatorname{tr}(\mathbf{uv}^T)=\mathbf v^T \mathbf u$, $$\mathbf A^2=(\mathbf{uv}^T)(\mathbf{uv}^T)=\mathbf u(\mathbf v^T \mathbf u)\mathbf v^T=\tau \mathbf{uv}^T=\tau \mathbf A.$$
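Both facts are easy to observe on a concrete outer product. The following sketch assumes NumPy, and the vectors `u` and `v` are arbitrary illustrative choices.

```python
import numpy as np

# Arbitrary nonzero columns (illustrative values); v^T u = 4 + 2 - 3 = 3.
u = np.array([[1.0], [2.0], [3.0]])
v = np.array([[4.0], [1.0], [-1.0]])

A = u @ v.T                      # rank-one matrix u v^T
tau = np.trace(A)                # trace of u v^T, which equals v^T u

print(np.linalg.matrix_rank(A))
print(np.isclose(tau, (v.T @ u).item()))
print(np.allclose(A @ A, tau * A))
```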

If $\operatorname{rank}(\mathbf A_{m\times n})= r$, show that there exist matrices $\mathbf B_{m\times r}$ and $\mathbf C_{r\times n}$ such that $\mathbf A = \mathbf{BC}$, where $\operatorname{rank}(\mathbf B)= \operatorname{rank}(\mathbf C)= r$. Such a factorization is called a full-rank factorization. Hint: Consider the basic columns of $\mathbf A$ and the nonzero rows of $\mathbf{E_A}$.

Solution 1: Since $\operatorname{rank}(\mathbf A_{m\times n})= r$, there exist nonsingular matrices $\mathbf P,\mathbf Q$ such that $\mathbf{PAQ}=\mathbf N_r$. Hence $$\mathbf A=\mathbf P^{-1} \mathbf N_r \mathbf Q^{-1}=\left(\mathbf P^{-1} \begin{bmatrix}\mathbf I_r \\ \mathbf 0_{(m-r)\times r}\end{bmatrix}\right)\left(\begin{bmatrix}\mathbf I_r & \mathbf 0_{r \times (n-r)}\end{bmatrix}\mathbf Q^{-1}\right)=\mathbf B \mathbf C.$$ Because $\mathbf P^{-1}$ and $\mathbf Q^{-1}$ are nonsingular, and multiplication by a nonsingular matrix does not change rank, it follows that $\operatorname{rank}(\mathbf B)= \operatorname{rank}(\mathbf C)= r$.

Solution 2: Let $\mathbf B_{m \times r}=[\mathbf A_{*b_1}\ \mathbf A_{*b_2}\ \cdots\ \mathbf A_{*b_r}]$ consist of the basic columns of $\mathbf A$, and let $\mathbf C_{r \times n}$ consist of the nonzero rows of $\mathbf{E_A}$. If $\mathbf A_{*k}$ is a basic column, say $\mathbf A_{*k}=\mathbf A_{*b_j}$, then $\mathbf C_{*k}=\mathbf e_j$, and $$(\mathbf{BC})_{*k}=\mathbf B\mathbf C_{*k}=\mathbf B \mathbf e_j=\mathbf B_{*j}=\mathbf A_{*b_j}=\mathbf A_{*k}.$$ If $\mathbf A_{*k}$ is not a basic column, then $\mathbf C_{*k}$ is not a basic column of $\mathbf C$ either, and it has the form $$\begin{aligned}\mathbf C_{*k}=\begin{pmatrix}\mu_1\\\mu_2\\\vdots\\\mu_j\\\vdots\\0\end{pmatrix}&=\mu_1\begin{pmatrix}1\\0\\\vdots\\0\\\vdots\\0\end{pmatrix}+\mu_2\begin{pmatrix}0\\1\\\vdots\\0\\\vdots\\0\end{pmatrix}+\cdots+\mu_j\begin{pmatrix}0\\0\\\vdots\\1\\\vdots\\0\end{pmatrix}\\ &=\mu_1 \mathbf e_1+\mu_2 \mathbf e_2+\cdots+\mu_j \mathbf e_j.\end{aligned}$$ Since $\mathbf A \overset{\text{row}}{\sim} \mathbf{E_A}$ and $\mathbf C_{r \times n}$ consists of the nonzero rows of $\mathbf{E_A}$, the relationships among the columns of $\mathbf A$ are exactly the same as those among the columns of $\mathbf C$. That is, $$\mathbf A_{*k}=\mu_1\mathbf A_{*b_1}+\mu_2\mathbf A_{*b_2}+\cdots+\mu_j\mathbf A_{*b_j},$$ and therefore $$\begin{aligned}(\mathbf{BC})_{*k}&=\mathbf B\mathbf C_{*k}=\mathbf B(\mu_1 \mathbf e_1+\mu_2 \mathbf e_2+\cdots+\mu_j \mathbf e_j)\\ &=\mu_1\mathbf B_{*1}+\mu_2\mathbf B_{*2}+\cdots+\mu_j\mathbf B_{*j}\\ &=\mu_1\mathbf A_{*b_1}+\mu_2\mathbf A_{*b_2}+\cdots+\mu_j\mathbf A_{*b_j}\\&=\mathbf A_{*k}.\end{aligned}$$ Thus $\mathbf{BC}=\mathbf A$. Finally, $\mathbf C$ is in reduced row echelon form with $r$ nonzero rows, and the same row operations that reduce $\mathbf A$ to $\mathbf{E_A}$ reduce $\mathbf B$ to $\begin{pmatrix}\mathbf I_r \\ \mathbf 0\end{pmatrix}$, so $\operatorname{rank}(\mathbf B)=\operatorname{rank}(\mathbf C)=r$.
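Solution 2 is effectively an algorithm: take the pivot columns of $\mathbf A$ as $\mathbf B$ and the nonzero rows of $\mathbf{E_A}$ as $\mathbf C$. Below is a minimal standard-library sketch with exact `Fraction` arithmetic; the function names and the example matrix are assumptions for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return (E, pivots): the reduced row echelon form and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    m, n = len(M), len(M[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((k for k in range(r, m) if M[k][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for k in range(m):
            if k != r and M[k][c] != 0:
                f = M[k][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    return M, pivots

def full_rank_factorization(A):
    """Return (B, C) with A = BC, B = basic columns of A, C = nonzero rows of E_A."""
    E, pivots = rref(A)
    B = [[row[c] for c in pivots] for row in A]   # basic columns of A
    C = E[:len(pivots)]                           # the nonzero rows come first in E_A
    return B, C

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 2, 1, 3], [2, 4, 3, 8], [1, 2, 2, 5]]
B, C = full_rank_factorization(A)

print(B)
print(C)
print(matmul(B, C) == A)
```

In this example the basic columns are the first and third, so $\mathbf B$ is $3 \times 2$ and $\mathbf C$ is $2 \times 4$.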
