[Optimization Methods] [Matrix Analysis] Derivative Relationships Among Scalars, Vectors, and Matrices
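1.1 Derivative of a scalar with respect to a scalar
Let $x\in D\subseteq R$ and $y=f(x)$. The derivative of $y$ with respect to $x$ is
$$\frac{dy}{dx}=\frac{df}{dx}=f^\prime(x)$$
That is, differentiating a scalar with respect to a scalar is the ordinary derivative of a scalar function.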
For example, if $y=f(x)=x^3$, then $\frac{dy}{dx}=3x^2$.
1.2 Derivative of a scalar with respect to a vector
Let $\boldsymbol{\vec{x}}\in R^n$, $y=f(\boldsymbol{\vec{x}})$, $f:D\subseteq R^n\rightarrow R$. The derivative of $y$ with respect to $\boldsymbol{\vec{x}}$ is
$$\frac{dy}{d\boldsymbol{\vec{x}}}=\nabla f(\boldsymbol{\vec{x}})=\begin{bmatrix} \frac{\partial y}{\partial x_1} \\ \frac{\partial y}{\partial x_2} \\ \vdots \\ \frac{\partial y}{\partial x_n} \end{bmatrix}$$
That is, differentiating a scalar with respect to a vector of variables gives the gradient of the multivariate function, written as a column vector with the same shape as $\boldsymbol{\vec{x}}$.
For example, if $y=x_1^2+x_2^3+x_3^4$, then
$$\frac{dy}{d\boldsymbol{\vec{x}}}=\begin{bmatrix} 2x_1 \\ 3x_2^2 \\ 4x_3^3 \end{bmatrix}$$
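A hand-derived gradient like the one above can be sanity-checked numerically. The following is a minimal sketch, assuming a central-difference approximation with step `h`; the helper names (`analytic_gradient`, `numeric_gradient`) and the test point are illustrative choices, not part of the original text.

```python
def f(x):
    """y = x1^2 + x2^3 + x3^4 from the example above."""
    return x[0] ** 2 + x[1] ** 3 + x[2] ** 4


def analytic_gradient(x):
    """Gradient derived by hand: [2*x1, 3*x2^2, 4*x3^3]."""
    return [2 * x[0], 3 * x[1] ** 2, 4 * x[2] ** 3]


def numeric_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h   # perturb only the i-th variable
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad


point = [1.0, 2.0, 3.0]  # arbitrary test point (an assumption)
print(analytic_gradient(point))  # [2.0, 12.0, 108.0]
print(numeric_gradient(f, point))  # close to the line above, up to rounding
```

The two printed lists should agree to several decimal places; a large mismatch in one component usually points to a mistake in that partial derivative.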
1.3 Derivative of a scalar with respect to a matrix
Let $\boldsymbol{X}\in R^{m\times n}$, $y=f(\boldsymbol{X})=f(x_{11},x_{12},\dots,x_{mn})$, $f:D\subseteq R^{m\times n}\rightarrow R$. Then
$$\frac{dy}{d\boldsymbol{X}}=\begin{bmatrix} \frac{\partial y}{\partial x_{11}}&\frac{\partial y}{\partial x_{12}}&\cdots&\frac{\partial y}{\partial x_{1n}}\\ \frac{\partial y}{\partial x_{21}}&\frac{\partial y}{\partial x_{22}}&\cdots&\frac{\partial y}{\partial x_{2n}}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{\partial y}{\partial x_{m1}}&\frac{\partial y}{\partial x_{m2}}&\cdots&\frac{\partial y}{\partial x_{mn}} \end{bmatrix}$$
That is, to differentiate a scalar function with respect to a matrix of variables, differentiate the function with respect to the variable at each position of the matrix and arrange the results as a matrix of the same shape as $\boldsymbol{X}$.
For example, if $\boldsymbol{X}=\begin{bmatrix}x_1&x_2\\x_3&x_4\end{bmatrix}$ and $y=x_1+x_2^2+x_3^3+x_4^4$, then
$$\frac{dy}{d\boldsymbol{X}}=\begin{bmatrix}1&2x_2\\3x_3^2&4x_4^3\end{bmatrix}$$
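The "same shape as the matrix" rule can be illustrated with a short sketch: perturb one entry of the matrix at a time and store the resulting difference quotient at the same position. The function names and the sample matrix `X0` are assumptions made for this illustration.

```python
def y(X):
    """y = x1 + x2^2 + x3^3 + x4^4 with X = [[x1, x2], [x3, x4]]."""
    (x1, x2), (x3, x4) = X
    return x1 + x2 ** 2 + x3 ** 3 + x4 ** 4


def dy_dX_analytic(X):
    """Hand-derived result: [[1, 2*x2], [3*x3^2, 4*x4^3]]."""
    (x1, x2), (x3, x4) = X
    return [[1.0, 2 * x2], [3 * x3 ** 2, 4 * x4 ** 3]]


def dy_dX_numeric(func, X, h=1e-6):
    """Central differences; entry (i, j) approximates dy / dx_ij."""
    rows, cols = len(X), len(X[0])
    D = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            D[i][j] = (func(plus) - func(minus)) / (2 * h)
    return D


X0 = [[1.0, 2.0], [3.0, 4.0]]  # arbitrary sample matrix (an assumption)
print(dy_dX_analytic(X0))  # [[1.0, 4.0], [27.0, 256.0]]
print(dy_dX_numeric(y, X0))
```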
1.4 Derivative of a vector with respect to a scalar
Let $x\in R$, $\boldsymbol{\vec{y}}=\boldsymbol{\vec{f}}(x)=\begin{bmatrix}f_1(x)&f_2(x)&\cdots&f_n(x)\end{bmatrix}^T\in R^n$, $\boldsymbol{\vec{f}}:D\subseteq R\rightarrow R^n$. Then
$$\frac{d\boldsymbol{\vec{y}}}{dx}=\begin{bmatrix} \frac{df_1(x)}{dx}&\frac{df_2(x)}{dx}&\cdots&\frac{df_n(x)}{dx} \end{bmatrix}^T$$
That is, the derivative of a vector function with respect to a scalar is obtained by differentiating each component of the vector with respect to that scalar.
For example, if $\boldsymbol{\vec{y}}=\begin{bmatrix}x^2&\ln x&\text{e}^x&\sin x\end{bmatrix}^T$, then $\frac{d\boldsymbol{\vec{y}}}{dx}=\begin{bmatrix}2x&\frac{1}{x}&\text{e}^x&\cos x\end{bmatrix}^T$.
1.5 Derivative of a vector with respect to a vector
Let $\boldsymbol{\vec{x}}\in R^n$, $\boldsymbol{\vec{y}}=\boldsymbol{\vec{f}}(\boldsymbol{\vec{x}})=\begin{bmatrix}f_1(\boldsymbol{\vec{x}})&f_2(\boldsymbol{\vec{x}})&\cdots&f_m(\boldsymbol{\vec{x}})\end{bmatrix}^T\in R^m$, $\boldsymbol{\vec{f}}:D\subseteq R^n\rightarrow R^m$. Then
$$\frac{d\boldsymbol{\vec{y}}}{d\boldsymbol{\vec{x}}}=\begin{bmatrix}\frac{\partial f_1(\boldsymbol{\vec{x}})}{\partial x_1}&\frac{\partial f_2(\boldsymbol{\vec{x}})}{\partial x_1}&\cdots&\frac{\partial f_m(\boldsymbol{\vec{x}})}{\partial x_1}\\ \frac{\partial f_1(\boldsymbol{\vec{x}})}{\partial x_2}&\frac{\partial f_2(\boldsymbol{\vec{x}})}{\partial x_2}&\cdots&\frac{\partial f_m(\boldsymbol{\vec{x}})}{\partial x_2}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_1(\boldsymbol{\vec{x}})}{\partial x_n}&\frac{\partial f_2(\boldsymbol{\vec{x}})}{\partial x_n}&\cdots&\frac{\partial f_m(\boldsymbol{\vec{x}})}{\partial x_n}\end{bmatrix}\triangleq \boldsymbol{J}$$
That is, differentiating a vector function with respect to a vector of variables amounts to computing the Jacobian matrix of the vector function. In the layout used here, row $i$ holds the derivatives with respect to $x_i$ and column $j$ is the gradient of $f_j$, so $\boldsymbol{J}$ is $n\times m$. This is the transpose of the $m\times n$ Jacobian used in many textbooks.
For example, if $\boldsymbol{\vec{y}}=\begin{bmatrix} x_1+x_2^2&{\rm e}^{2x_1+x_2} \end{bmatrix}^T$, then
$$\frac{d\boldsymbol{\vec{y}}}{d\boldsymbol{\vec{x}}}=\begin{bmatrix}1&2{\rm e}^{2x_1+x_2}\\2x_2&{\rm e}^{2x_1+x_2}\end{bmatrix}$$
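Because the row/column convention is easy to mix up, it can help to build the matrix explicitly. The sketch below, which assumes a central-difference approximation and an arbitrary evaluation point `x0`, fills entry (i, j) with the derivative of $f_j$ with respect to $x_i$, matching the layout of this section; transposing it gives the other common convention. All helper names are illustrative.

```python
import math


def f(x):
    """y = [x1 + x2^2, exp(2*x1 + x2)] from the example above."""
    x1, x2 = x
    return [x1 + x2 ** 2, math.exp(2 * x1 + x2)]


def derivative_denominator_layout(func, x, h=1e-6):
    """Return the n x m matrix whose (i, j) entry approximates d f_j / d x_i."""
    m = len(func(x))
    D = []
    for i in range(len(x)):
        plus, minus = list(x), list(x)
        plus[i] += h
        minus[i] -= h
        fp, fm = func(plus), func(minus)
        D.append([(fp[j] - fm[j]) / (2 * h) for j in range(m)])
    return D


def transpose(M):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*M)]


x0 = [0.0, 1.0]  # arbitrary evaluation point (an assumption)
J = derivative_denominator_layout(f, x0)
print(J)             # approximately [[1, 2e], [2, e]], the layout of this section
print(transpose(J))  # approximately [[1, 2], [2e, e]], the m x n convention
```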
1.6 Derivative of a vector with respect to a matrix
Let $\boldsymbol{X}\in R^{m\times n}$, $\boldsymbol{\vec{y}}=\boldsymbol{\vec{f}}(\boldsymbol{X})=\begin{bmatrix} f_1(\boldsymbol{X})&f_2(\boldsymbol{X})&\cdots&f_r(\boldsymbol{X}) \end{bmatrix}^T\in R^r$, $\boldsymbol{\vec{f}}:D\subseteq R^{m\times n}\rightarrow R^r$. Then, with the arguments of $f_k$ omitted in the expanded form for brevity,
$$\frac{d\boldsymbol{\vec{y}}}{d\boldsymbol{X}}=\begin{bmatrix}\frac{df_1(\boldsymbol{X})}{d\boldsymbol{X}}\\ \frac{df_2(\boldsymbol{X})}{d\boldsymbol{X}}\\ \vdots\\ \frac{df_r(\boldsymbol{X})}{d\boldsymbol{X}}\end{bmatrix}=\begin{bmatrix}\frac{\partial f_1}{\partial x_{11}}&\frac{\partial f_1}{\partial x_{12}}&\cdots&\frac{\partial f_1}{\partial x_{1n}}\\ \frac{\partial f_1}{\partial x_{21}}&\frac{\partial f_1}{\partial x_{22}}&\cdots&\frac{\partial f_1}{\partial x_{2n}}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_1}{\partial x_{m1}}&\frac{\partial f_1}{\partial x_{m2}}&\cdots&\frac{\partial f_1}{\partial x_{mn}}\\ \vdots&\vdots&\vdots&\vdots\\ \frac{\partial f_r}{\partial x_{11}}&\frac{\partial f_r}{\partial x_{12}}&\cdots&\frac{\partial f_r}{\partial x_{1n}}\\ \frac{\partial f_r}{\partial x_{21}}&\frac{\partial f_r}{\partial x_{22}}&\cdots&\frac{\partial f_r}{\partial x_{2n}}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_r}{\partial x_{m1}}&\frac{\partial f_r}{\partial x_{m2}}&\cdots&\frac{\partial f_r}{\partial x_{mn}}\end{bmatrix}$$
That is, to differentiate a vector function with respect to a matrix of variables, differentiate each component of the vector function with respect to every entry of the matrix, arrange the results for that component as a matrix of the same shape as $\boldsymbol{X}$, and then stack the $r$ resulting blocks vertically in order. The result is an $(rm)\times n$ matrix.
For example, if $\boldsymbol{X}=\begin{bmatrix}x_1&x_2\\x_3&x_4\end{bmatrix}$ and $\boldsymbol{\vec{y}}=\begin{bmatrix}x_1+x_2^2+x_3^3+x_4^4\\ \sin (x_1+2x_3)+\ln{(x_2x_4)}\end{bmatrix}$, then
$$\frac{d\boldsymbol{\vec{y}}}{d\boldsymbol{X}}=\begin{bmatrix}1&2x_2\\ 3x_3^2&4x_4^3\\ \cos{(x_1+2x_3)}&\frac{1}{x_2}\\ 2\cos{(x_1+2x_3)}&\frac{1}{x_4}\end{bmatrix}$$
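The stacking rule of this section comes down to an index mapping: the derivative of component $f_k$ with respect to $x_{ij}$ lands in row $k\cdot m+i$ and column $j$ (zero-based). A minimal sketch under that assumption, using central differences and an arbitrary sample matrix `X0`; the function names are illustrative.

```python
import math


def f(X):
    """The two-component example above, with X = [[x1, x2], [x3, x4]]."""
    (x1, x2), (x3, x4) = X
    return [x1 + x2 ** 2 + x3 ** 3 + x4 ** 4,
            math.sin(x1 + 2 * x3) + math.log(x2 * x4)]


def dy_dX(func, X, h=1e-6):
    """Stack the r same-shape blocks d f_k / d X vertically: (r*m) x n."""
    r = len(func(X))
    m, n = len(X), len(X[0])
    D = [[0.0] * n for _ in range(r * m)]
    for i in range(m):
        for j in range(n):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            fp, fm = func(plus), func(minus)
            for k in range(r):
                # block k occupies rows k*m .. k*m + m - 1
                D[k * m + i][j] = (fp[k] - fm[k]) / (2 * h)
    return D


X0 = [[1.0, 2.0], [3.0, 4.0]]  # arbitrary sample with x2*x4 > 0 (an assumption)
stacked = dy_dX(f, X0)
for row in stacked:
    print(row)
```

With this sample, the first two printed rows should be close to `[1, 4]` and `[27, 256]`, and the last two close to `[cos 7, 1/2]` and `[2 cos 7, 1/4]`.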
1.7 Derivative of a matrix with respect to a scalar
Let $x\in R$, $\boldsymbol{F}(x)=\begin{bmatrix}f_{11}(x)&f_{12}(x)&\cdots&f_{1n}(x)\\ f_{21}(x)&f_{22}(x)&\cdots&f_{2n}(x)\\ \vdots&\vdots&\ddots&\vdots\\ f_{m1}(x)&f_{m2}(x)&\cdots&f_{mn}(x)\end{bmatrix}\in R^{m\times n}$, $\boldsymbol{F}:D\subseteq R\rightarrow R^{m\times n}$. Then
$$\frac{d\boldsymbol{F}}{dx}=\begin{bmatrix}\frac{df_{11}(x)}{dx}&\frac{df_{12}(x)}{dx}&\cdots&\frac{df_{1n}(x)}{dx}\\ \frac{df_{21}(x)}{dx}&\frac{df_{22}(x)}{dx}&\cdots&\frac{df_{2n}(x)}{dx}\\ \vdots&\vdots&\ddots&\vdots\\ \frac{df_{m1}(x)}{dx}&\frac{df_{m2}(x)}{dx}&\cdots&\frac{df_{mn}(x)}{dx}\end{bmatrix}$$
That is, the derivative of a matrix of functions with respect to a scalar is obtained by differentiating each function entry of the matrix with respect to that scalar.
For example, if $\boldsymbol{F}=\begin{bmatrix}x\ln x&x^2+{\rm e}^x\\x+\sin{x}&x{\rm e}^x\end{bmatrix}$, then
$$\frac{d\boldsymbol{F}}{dx}=\begin{bmatrix}\ln{x}+1&2x+{\rm e}^x\\ 1+\cos{x}&(1+x){\rm e}^x\end{bmatrix}$$
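Since the scalar case is purely entry-wise, a single pair of function evaluations is enough to approximate the whole derivative matrix. A brief sketch, assuming central differences and an arbitrary point `x0 > 0` (required by the logarithm); the names are illustrative.

```python
import math


def F(x):
    """The 2 x 2 matrix of functions from the example above."""
    return [[x * math.log(x), x ** 2 + math.exp(x)],
            [x + math.sin(x), x * math.exp(x)]]


def dF_analytic(x):
    """Entry-wise derivatives derived by hand."""
    return [[math.log(x) + 1, 2 * x + math.exp(x)],
            [1 + math.cos(x), (1 + x) * math.exp(x)]]


def dF_numeric(func, x, h=1e-6):
    """Entry-wise central differences; the shape matches F(x)."""
    Fp, Fm = func(x + h), func(x - h)
    return [[(a - b) / (2 * h) for a, b in zip(row_p, row_m)]
            for row_p, row_m in zip(Fp, Fm)]


x0 = 1.5  # arbitrary positive point (an assumption)
print(dF_analytic(x0))
print(dF_numeric(F, x0))
```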
1.8 Derivative of a matrix with respect to a vector
Let $\boldsymbol{\vec{x}}\in R^s$, $\boldsymbol{F}(\boldsymbol{\vec{x}})=\begin{bmatrix}f_{11}(\boldsymbol{\vec{x}})&f_{12}(\boldsymbol{\vec{x}})&\cdots&f_{1n}(\boldsymbol{\vec{x}})\\ f_{21}(\boldsymbol{\vec{x}})&f_{22}(\boldsymbol{\vec{x}})&\cdots&f_{2n}(\boldsymbol{\vec{x}})\\ \vdots&\vdots&\ddots&\vdots\\ f_{m1}(\boldsymbol{\vec{x}})&f_{m2}(\boldsymbol{\vec{x}})&\cdots&f_{mn}(\boldsymbol{\vec{x}})\end{bmatrix}\in R^{m\times n}$, $\boldsymbol{F}:D\subseteq R^s\rightarrow R^{m\times n}$. Then, with the arguments of $f_{ij}$ omitted in the expanded form for brevity,
$$\frac{d\boldsymbol{F}}{d\boldsymbol{\vec{x}}}= \begin{bmatrix} \nabla f_{11} & \nabla f_{12} & \cdots & \nabla f_{1n} \\ \nabla f_{21} & \nabla f_{22} & \cdots & \nabla f_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \nabla f_{m1} & \nabla f_{m2} & \cdots & \nabla f_{mn} \end{bmatrix} =\begin{bmatrix} \frac{\partial f_{11}}{\partial x_1} & \frac{\partial f_{12}}{\partial x_1}&\cdots&\frac{\partial f_{1n}}{\partial x_1}\\ \frac{\partial f_{11}}{\partial x_2} & \frac{\partial f_{12}}{\partial x_2}&\cdots&\frac{\partial f_{1n}}{\partial x_2}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_{11}}{\partial x_s} & \frac{\partial f_{12}}{\partial x_s}&\cdots&\frac{\partial f_{1n}}{\partial x_s}\\ \vdots & \vdots & \vdots & \vdots \\ \frac{\partial f_{m1}}{\partial x_1} & \frac{\partial f_{m2}}{\partial x_1}&\cdots&\frac{\partial f_{mn}}{\partial x_1}\\ \frac{\partial f_{m1}}{\partial x_2} & \frac{\partial f_{m2}}{\partial x_2}&\cdots&\frac{\partial f_{mn}}{\partial x_2}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_{m1}}{\partial x_s} & \frac{\partial f_{m2}}{\partial x_s}&\cdots&\frac{\partial f_{mn}}{\partial x_s} \end{bmatrix}$$
That is, each entry $f_{ij}$ of the matrix is replaced by its gradient $\nabla f_{ij}$, an $s\times 1$ column as in Section 1.2, so the result is an $(ms)\times n$ matrix.
For example, if $\boldsymbol{\vec{x}}\in R^4$ and $\boldsymbol{F}=\begin{bmatrix}x_1+x_2+x_3&\sin{x_1}+\cos{x_4}\\ x_1\ln{(x_1+x_2x_4)}&x_3^2\end{bmatrix}$, then
$$\frac{d\boldsymbol{F}}{d\boldsymbol{\vec{x}}}= \begin{bmatrix} 1 & \cos{x_1} \\ 1 & 0 \\ 1 & 0 \\ 0 & -\sin{x_4} \\ \ln(x_1+x_2x_4)+\frac{x_1}{x_1+x_2x_4} & 0 \\ \frac{x_1x_4}{x_1+x_2x_4} & 0 \\ 0 & 2x_3 \\ \frac{x_1x_2}{x_1+x_2x_4} & 0 \end{bmatrix}$$
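In this layout the derivative of $f_{ij}$ with respect to $x_k$ sits in row $i\cdot s+k$ and column $j$ (zero-based), since each entry expands into a column of length $s$. The sketch below assembles the $8\times 2$ matrix of the example numerically; central differences, the point `x0`, and the function names are assumptions made for illustration.

```python
import math


def F(x):
    """The 2 x 2 matrix function of x in R^4 from the example above."""
    x1, x2, x3, x4 = x
    return [
        [x1 + x2 + x3, math.sin(x1) + math.cos(x4)],
        [x1 * math.log(x1 + x2 * x4), x3 ** 2],
    ]


def dF_dx(func, x, h=1e-6):
    """Row (i*s + k), column j approximates d f_ij / d x_k."""
    base = func(x)
    m, n, s = len(base), len(base[0]), len(x)
    D = [[0.0] * n for _ in range(m * s)]
    for k in range(s):
        plus, minus = list(x), list(x)
        plus[k] += h
        minus[k] -= h
        Fp, Fm = func(plus), func(minus)
        for i in range(m):
            for j in range(n):
                D[i * s + k][j] = (Fp[i][j] - Fm[i][j]) / (2 * h)
    return D


x0 = [1.0, 2.0, 3.0, 4.0]  # arbitrary point with x1 + x2*x4 > 0 (an assumption)
D = dF_dx(F, x0)
for row in D:
    print(row)
```

Rows 0 to 3 of the output correspond to the first row of $\boldsymbol{F}$ and rows 4 to 7 to the second, in the same order as the matrix in the example.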
1.9 Derivative of a matrix with respect to a matrix
Let $\boldsymbol{X}\in R^{m\times n}$, $\boldsymbol{F}(\boldsymbol{X})= \begin{bmatrix} f_{11}(\boldsymbol{X})&f_{12}(\boldsymbol{X})&\cdots&f_{1s}(\boldsymbol{X})\\ f_{21}(\boldsymbol{X})&f_{22}(\boldsymbol{X})&\cdots&f_{2s}(\boldsymbol{X})\\ \vdots & \vdots & \ddots & \vdots \\ f_{r1}(\boldsymbol{X})&f_{r2}(\boldsymbol{X})&\cdots&f_{rs}(\boldsymbol{X}) \end{bmatrix}$, $\boldsymbol{F}:D\subseteq R^{m\times n}\rightarrow R^{r\times s}$. Then
$$\frac{d\boldsymbol{F}}{d\boldsymbol{X}}= \begin{bmatrix} \frac{df_{11}}{d\boldsymbol{X}} & \frac{df_{12}}{d\boldsymbol{X}} & \cdots & \frac{df_{1s}}{d\boldsymbol{X}}\\ \frac{df_{21}}{d\boldsymbol{X}} & \frac{df_{22}}{d\boldsymbol{X}} & \cdots & \frac{df_{2s}}{d\boldsymbol{X}}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{df_{r1}}{d\boldsymbol{X}} & \frac{df_{r2}}{d\boldsymbol{X}} & \cdots & \frac{df_{rs}}{d\boldsymbol{X}} \end{bmatrix}$$
where each block $\frac{df_{ij}}{d\boldsymbol{X}}$ is the $m\times n$ matrix defined in Section 1.3.
Written out entry by entry, this is the $(rm)\times(sn)$ matrix
$$\frac{d\boldsymbol{F}}{d\boldsymbol{X}}= \begin{bmatrix} \frac{\partial f_{11}}{\partial x_{11}} & \cdots & \frac{\partial f_{11}}{\partial x_{1n}} & \cdots & \frac{\partial f_{1s}}{\partial x_{11}} & \cdots & \frac{\partial f_{1s}}{\partial x_{1n}}\\ \vdots & \ddots & \vdots & & \vdots & \ddots & \vdots \\ \frac{\partial f_{11}}{\partial x_{m1}} & \cdots & \frac{\partial f_{11}}{\partial x_{mn}} & \cdots & \frac{\partial f_{1s}}{\partial x_{m1}} & \cdots & \frac{\partial f_{1s}}{\partial x_{mn}}\\ \vdots & & \vdots & \ddots & \vdots & & \vdots \\ \frac{\partial f_{r1}}{\partial x_{11}} & \cdots & \frac{\partial f_{r1}}{\partial x_{1n}} & \cdots & \frac{\partial f_{rs}}{\partial x_{11}} & \cdots & \frac{\partial f_{rs}}{\partial x_{1n}}\\ \vdots & \ddots & \vdots & & \vdots & \ddots & \vdots \\ \frac{\partial f_{r1}}{\partial x_{m1}} & \cdots & \frac{\partial f_{r1}}{\partial x_{mn}} & \cdots & \frac{\partial f_{rs}}{\partial x_{m1}} & \cdots & \frac{\partial f_{rs}}{\partial x_{mn}} \end{bmatrix}$$
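Therefore, the derivative of a matrix of functions with respect to a matrix of variables is obtained by differentiating each function with respect to each variable in turn and placing the results, in order, at the corresponding positions.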
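The text gives no worked example for the matrix-by-matrix case, so here is a small hypothetical one. It assumes $\boldsymbol{F}(\boldsymbol{X})=\boldsymbol{X}\boldsymbol{X}$ for a $2\times 2$ matrix (chosen only for illustration, not taken from the original) and places the derivative of $f_{ij}$ with respect to $x_{pq}$ at row $i\cdot m+p$, column $j\cdot n+q$ (zero-based), which is the block layout described above. Derivatives are approximated by central differences.

```python
def F(X):
    """Hypothetical example: F(X) = X @ X for a square matrix X."""
    size = len(X)
    return [[sum(X[i][k] * X[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def dF_dX(func, X, h=1e-6):
    """Assemble the (r*m) x (s*n) block matrix of d f_ij / d x_pq."""
    base = func(X)
    r, s = len(base), len(base[0])
    m, n = len(X), len(X[0])
    D = [[0.0] * (s * n) for _ in range(r * m)]
    for p in range(m):
        for q in range(n):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[p][q] += h
            minus[p][q] -= h
            Fp, Fm = func(plus), func(minus)
            for i in range(r):
                for j in range(s):
                    # block (i, j) is the m x n matrix d f_ij / d X
                    D[i * m + p][j * n + q] = (Fp[i][j] - Fm[i][j]) / (2 * h)
    return D


X0 = [[1.0, 2.0], [3.0, 4.0]]  # arbitrary sample matrix (an assumption)
D = dF_dX(F, X0)
print(len(D), len(D[0]))  # 4 4
for row in D:
    print(row)
```

For this particular choice of $\boldsymbol{F}$, the derivative of $f_{ij}$ with respect to $x_{pq}$ equals $x_{qj}$ when $i=p$, plus $x_{ip}$ when $j=q$, which gives a closed form to compare the printed blocks against.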