Matrix Reference Manual
Matrix Calculus
Contents of Calculus Section
Notation
Derivatives of Linear, Quadratic and Cubic Products
Derivatives of Inverses, Trace and Determinant
Jacobians and Hessian matrices
Notation
d/dx (y), with y a vector and x a scalar, is a vector whose (i) element is dy(i)/dx
d/dx (y), with y a scalar and x a vector, is a vector whose (i) element is dy/dx(i)
d/dx (y^T), with x and y vectors, is a matrix whose (i,j) element is dy(j)/dx(i)
d/dx (Y), with x a scalar, is a matrix whose (i,j) element is dy(i,j)/dx
d/dX (y), with y a scalar, is a matrix whose (i,j) element is dy/dx(i,j)
xR and xI are the real and imaginary parts of x
x* is the complex conjugate of x
j is the square root of -1
An expression, y, can only be differentiated with respect to a complex x if it satisfies the Cauchy-Riemann
equations: dy/dxI = j dy/dxR. Expressions involving the complex conjugate or Hermitian transpose do not
normally satisfy this requirement, so separate expressions for dy/dxR and dy/dxI are given in these cases.
In the expressions below matrices and vectors A, B, C do not depend on X.
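As an illustration of the Cauchy-Riemann condition (not part of the original manual), the two partial derivatives can be estimated by central differences and compared. This is a minimal sketch using Python's built-in complex type; the helper names `partials` and `satisfies_cauchy_riemann`, the test point and the tolerance are assumptions chosen for the example.

```python
# Finite-difference check of the Cauchy-Riemann condition dy/dxI = j dy/dxR.
# Helper names and numeric values are illustrative, not taken from the manual.

def partials(f, x, h=1e-6):
    """Return central-difference estimates of (dy/dxR, dy/dxI) at complex x."""
    d_real = (f(x + h) - f(x - h)) / (2 * h)            # perturb the real part
    d_imag = (f(x + 1j * h) - f(x - 1j * h)) / (2 * h)  # perturb the imaginary part
    return d_real, d_imag

def satisfies_cauchy_riemann(f, x, tol=1e-6):
    """True if dy/dxI equals j * dy/dxR at x, within tol."""
    d_real, d_imag = partials(f, x)
    return abs(d_imag - 1j * d_real) < tol

x0 = 0.7 - 0.3j
analytic = satisfies_cauchy_riemann(lambda x: x * x, x0)               # y = x^2
conjugate = satisfies_cauchy_riemann(lambda x: x.conjugate() * x, x0)  # y = x* x

# For y = x* x the two partials are real and must be stated separately.
d_real, d_imag = partials(lambda x: x.conjugate() * x, x0)
```

Here `analytic` comes out true and `conjugate` false, which is why expressions containing a conjugate are listed with separate d/dxR and d/dxI results.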
Derivatives of Linear Products
d/dx (AYB) = A * d/dx (Y) * B
  d/dx (Ay) = A * d/dx (y)
d/dx (x^T A) = A
  d/dx (x^T) = I
  d/dx (x^T a) = d/dx (a^T x) = a
d/dX (a^T X b) = a b^T
  d/dX (a^T X a) = d/dX (a^T X^T a) = a a^T
d/dX (a^T X^T b) = b a^T
d/dx (YZ) = Y * d/dx (Z) + d/dx (Y) * Z
d/dxR (Y^H) = ( d/dxR (Y) )^H
d/dxI (Y^H) = ( d/dxI (Y) )^H
d/dxR (x^H A) = A
  d/dxR (x^H) = I
d/dxI (x^H A) = -jA
  d/dxI (x^H) = -jI
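The linear-product identities can be spot-checked numerically. The sketch below, which is an illustration rather than part of the manual, compares d/dX (a^T X b) = a b^T against element-wise central differences in plain Python. The helper names `bilinear` and `numeric_gradient` and all test values are assumptions.

```python
# Central-difference check of d/dX (a^T X b) = a b^T, element by element.
# All names and test values here are illustrative assumptions.

def bilinear(a, X, b):
    """Scalar a^T X b for a list-of-lists matrix X."""
    return sum(a[i] * X[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def numeric_gradient(f, X, h=1e-6):
    """Matrix whose (i,j) element approximates df/dX(i,j)."""
    grad = [[0.0] * len(row) for row in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad

a = [1.0, -2.0]
b = [0.5, 3.0, 4.0]
X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

numeric = numeric_gradient(lambda M: bilinear(a, M, b), X)
closed_form = [[ai * bj for bj in b] for ai in a]  # outer product a b^T
max_error = max(abs(numeric[i][j] - closed_form[i][j])
                for i in range(len(a)) for j in range(len(b)))
```

Because a^T X b is linear in X, the difference quotient agrees with a b^T up to rounding, and the result has the same shape as X, as the notation d/dX (y) requires.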
Derivatives of Quadratic Products
d/dx ((Ax+b)^T C (Dx+e)) = A^T C (Dx+e) + D^T C^T (Ax+b)
  d/dx (x^T C x) = (C+C^T) x
    [C=C^T]: d/dx (x^T C x) = 2Cx
    d/dx (x^T x) = 2x
  d/dx ((Ax+b)^T (Dx+e)) = A^T (Dx+e) + D^T (Ax+b)
    d/dx ((Ax+b)^T (Ax+b)) = 2 A^T (Ax+b)
  [C=C^T]: d/dx ((Ax+b)^T C (Ax+b)) = 2 A^T C (Ax+b)
d/dX (a^T X^T X b) = X (a b^T + b a^T)
  d/dX (a^T X^T X a) = 2 X a a^T
d/dX (a^T X^T C X b) = C^T X a b^T + C X b a^T
  d/dX (a^T X^T C X a) = (C+C^T) X a a^T
  [C=C^T]: d/dX (a^T X^T C X a) = 2 C X a a^T
d/dX ((Xa+b)^T C (Xa+b)) = (C+C^T)(Xa+b) a^T
d/dxR ((Ax+b)^H C (Dx+e)) = A^H C (Dx+e) + D^T C^T (Ax+b)*
  d/dxR (x^H C x) = Cx + C^T x* = Cx + (x^H C)^T
    [C=C^T]: d/dxR (x^H C x) = 2 C xR
    [C=C^H]: d/dxR (x^H C x) = 2 (Cx)R
    d/dxR (x^H x) = 2 xR
d/dxI ((Ax+b)^H C (Dx+e)) = j( D^T C^T (Ax+b)* - A^H C (Dx+e) )
  d/dxI (x^H C x) = j( C^T x* - Cx ) = j( (x^H C)^T - Cx )
    [C=C^T]: d/dxI (x^H C x) = 2 C xI
    [C=C^H]: d/dxI (x^H C x) = 2 (Cx)I
    d/dxI (x^H x) = 2 xI
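The complex quadratic results can be checked the same way. The following sketch, an illustration under assumed test values, perturbs the real and imaginary parts of each component of x separately and compares the results with d/dxR (x^H C x) = Cx + C^T x* and d/dxI (x^H C x) = j(C^T x* - Cx). The helper names `quad` and `partial` are hypothetical.

```python
# Check the real-part and imaginary-part derivatives of x^H C x numerically.
# C, x and the helper names are illustrative assumptions.

def quad(x, C):
    """Scalar x^H C x for a complex vector x and complex matrix C."""
    n = len(x)
    return sum(x[i].conjugate() * C[i][j] * x[j]
               for i in range(n) for j in range(n))

def partial(f, x, k, step):
    """Central difference of f along the complex direction `step` in component k."""
    plus, minus = list(x), list(x)
    plus[k] += step
    minus[k] -= step
    return (f(plus) - f(minus)) / (2 * abs(step))

C = [[1 + 2j, 0.5 - 1j], [-3j, 2 + 0j]]   # deliberately neither symmetric nor Hermitian
x = [0.3 + 0.4j, -1.2 + 0.1j]
n, h = len(x), 1e-6

Cx = [sum(C[i][j] * x[j] for j in range(n)) for i in range(n)]                # C x
CTxc = [sum(C[j][i] * x[j].conjugate() for j in range(n)) for i in range(n)]  # C^T x*

errors = []
for k in range(n):
    d_real = partial(lambda v: quad(v, C), x, k, h)       # step in xR(k)
    d_imag = partial(lambda v: quad(v, C), x, k, 1j * h)  # step in xI(k)
    errors.append(abs(d_real - (Cx[k] + CTxc[k])))
    errors.append(abs(d_imag - 1j * (CTxc[k] - Cx[k])))
max_error = max(errors)
```

Using a C with no special structure exercises the general formulas; the [C=C^T] and [C=C^H] special cases follow by substitution.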
Derivatives of Cubic Products
d/dx (x^T A x x^T) = (A+A^T) x x^T + x^T A x I
Derivatives of Inverses
d/dx (Y^-1) = -Y^-1 d/dx (Y) Y^-1 [2.1]
d/dX (a^T X^-1 b) = -X^-T a b^T X^-T [2.6]
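A small numerical sketch may help make the first identity concrete. It assumes a hypothetical 2x2 matrix function Y(x) of a scalar x, with its derivative written out by hand, and compares -Y^-1 d/dx (Y) Y^-1 with a central difference of the inverse. The names `inv2`, `matmul`, `Y` and `dY` are illustrative and not taken from the manual.

```python
# Compare d/dx (Y^-1) = -Y^-1 d/dx (Y) Y^-1 with a central difference.
# Y(x) below is an arbitrary example chosen to be non-singular near x0.

def inv2(M):
    """Inverse of a 2x2 matrix via the adjugate."""
    (p, q), (r, s) = M
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def matmul(P, Q):
    """Product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def Y(x):
    return [[2.0 + x, x * x], [1.0, 3.0 - x]]

def dY(x):
    """Element-wise derivative of Y with respect to the scalar x."""
    return [[1.0, 2.0 * x], [0.0, -1.0]]

x0, h = 0.5, 1e-6
Y_inv = inv2(Y(x0))
closed_form = [[-v for v in row]
               for row in matmul(matmul(Y_inv, dY(x0)), Y_inv)]

plus, minus = inv2(Y(x0 + h)), inv2(Y(x0 - h))
numeric = [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
           for i in range(2)]
max_error = max(abs(numeric[i][j] - closed_form[i][j])
                for i in range(2) for j in range(2))
```

The closed form needs only one inverse and the derivative of Y, whereas the finite-difference estimate inverts Y twice and depends on the step size.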
Derivative of Trace
Note: matrix dimensions must result in an n*n argument for tr().
d/dX (tr(X)) = d/dX (tr(X^T)) = I [2.4]
d/dX (tr(X^k)) = k (X^(k-1))^T
d/dX (tr(A X^k)) = SUM[r=0:k-1] (X^r A X^(k-r-1))^T
d/dX (tr(A X^-1 B)) = -(X^-1 B A X^-1)^T = -X^-T A^T B^T X^-T [2.5]
  d/dX (tr(A X^-1)) = d/dX (tr(X^-1 A)) = -X^-T A^T X^-T
d/dX (tr(A^T X B^T)) = d/dX (tr(B X^T A)) = AB [2.4]
  d/dX (tr(X A^T)) = d/dX (tr(A^T X)) = d/dX (tr(X^T A)) = d/dX (tr(A X^T)) = A
d/dX (tr(A X B X^T C)) = A^T C^T X B^T + C A X B
  d/dX (tr(X A X^T)) = d/dX (tr(A X^T X)) = d/dX (tr(X^T X A)) = X (A+A^T)
  d/dX (tr(X^T A X)) = d/dX (tr(A X X^T)) = d/dX (tr(X X^T A)) = (A+A^T) X
d/dX (tr(AXBX)) = A^T X^T B^T + B^T X^T A^T
[C: symmetric] d/dX (tr((X^T C X)^-1 A)) = d/dX (tr(A (X^T C X)^-1)) = -(C X (X^T C X)^-1)(A+A^T)(X^T C X)^-1
[B,C: symmetric] d/dX (tr((X^T C X)^-1 (X^T B X))) = d/dX (tr((X^T B X)(X^T C X)^-1)) = -2(C X (X^T C X)^-1) X^T B X (X^T C X)^-1 + 2 B X (X^T C X)^-1
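The power-of-X trace formula is the least obvious entry in this list, so a numerical spot check may be useful. The sketch below is an illustration with assumed 2x2 test matrices and k = 3. It sums (X^r A X^(k-r-1))^T over r = 0..k-1 and compares the result with central differences of tr(A X^k). All helper names are hypothetical.

```python
# Check d/dX (tr(A X^k)) = SUM[r=0:k-1] (X^r A X^(k-r-1))^T for k = 3.
# A, X, K and the helper names are illustrative assumptions.

K = 3
A = [[1.0, 2.0], [3.0, 4.0]]
X = [[0.5, -1.0], [2.0, 0.3]]

def matmul(P, Q):
    """Product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def matpow(M, k):
    """M raised to a non-negative integer power k (k = 0 gives the identity)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

def trace_a_xk(M):
    """Scalar tr(A M^K)."""
    P = matmul(A, matpow(M, K))
    return sum(P[i][i] for i in range(len(P)))

# Closed form: accumulate the transpose of each term X^r A X^(K-r-1).
closed_form = [[0.0, 0.0], [0.0, 0.0]]
for r in range(K):
    term = matmul(matmul(matpow(X, r), A), matpow(X, K - r - 1))
    for i in range(2):
        for j in range(2):
            closed_form[i][j] += term[j][i]

# Numerical gradient by central differences on each element of X.
h = 1e-6
numeric = [[0.0, 0.0], [0.0, 0.0]]
for i in range(2):
    for j in range(2):
        plus = [row[:] for row in X]
        minus = [row[:] for row in X]
        plus[i][j] += h
        minus[i][j] -= h
        numeric[i][j] = (trace_a_xk(plus) - trace_a_xk(minus)) / (2 * h)

max_error = max(abs(numeric[i][j] - closed_form[i][j])
                for i in range(2) for j in range(2))
```

With A set to the identity, every term of the sum equals (X^(k-1))^T, so the total reduces to k (X^(k-1))^T, matching the entry for tr(X^k).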
Derivative of Determinant
Note: matrix dimensions must result in an n*n argument for det(). Some of the expressions below involve
inverses: these forms apply only if the quantity being inverted is square and non-singular.
d/dX (det(X)) = d/dX (det(X^T)) = ADJ(X)^T = det(X) * X^-T
  d/dX (det(AXB)) = A^T ADJ(AXB)^T B^T = det(AXB) * A^T (AXB)^-T B^T = det(AXB) * X^-T
  d/dX (ln(det(AXB))) = A^T (AXB)^-T B^T = X^-T
d/dX (det(X^k)) = k * det(X^k) * X^-T
  d/dX (ln(det(X^k))) = k X^-T
[Real] d/dX (det(X^T C X)) = det(X^T C X) * (C+C^T) X (X^T C X)^-1
  [C: Real, Symmetric] d/dX (det(X^T C X)) = 2 det(X^T C X) * C X (X^T C X)^-1
[C: Real, Symmetric] d/dX (ln(det(X^T C X))) = 2 C X (X^T C X)^-1
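For a 2x2 matrix the first determinant identity can be verified almost by inspection, and a short numerical check confirms it. This sketch is an illustration with an assumed test matrix of positive determinant, so that ln(det(X)) is defined. It compares central differences of det(X) and ln(det(X)) with det(X) * X^-T and X^-T. The helper names are hypothetical.

```python
import math

# Check d/dX (det(X)) = det(X) * X^-T and d/dX (ln(det(X))) = X^-T for a 2x2 X.
# The test matrix and helper names are illustrative assumptions.

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inverse_transpose2(M):
    """X^-T for a non-singular 2x2 matrix."""
    d = det2(M)
    return [[M[1][1] / d, -M[1][0] / d], [-M[0][1] / d, M[0][0] / d]]

def numeric_gradient(f, M, h=1e-6):
    """Matrix whose (i,j) element approximates df/dM(i,j)."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            plus = [row[:] for row in M]
            minus = [row[:] for row in M]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2 * h)
    return grad

X = [[3.0, 1.0], [2.0, 4.0]]   # det(X) = 10 > 0
X_inv_T = inverse_transpose2(X)

grad_det = numeric_gradient(det2, X)
grad_log_det = numeric_gradient(lambda M: math.log(det2(M)), X)

det_error = max(abs(grad_det[i][j] - det2(X) * X_inv_T[i][j])
                for i in range(2) for j in range(2))
log_det_error = max(abs(grad_log_det[i][j] - X_inv_T[i][j])
                    for i in range(2) for j in range(2))
```

The ln(det()) forms are often more convenient because the determinant factor cancels, leaving only the inverse transpose.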
Jacobian
If y is a function of x, then dy^T/dx is the Jacobian matrix of y with respect to x.
Its determinant, |dy^T/dx|, is the Jacobian of y with respect to x and represents the ratio of the hypervolumes dy and dx. The Jacobian occurs when changing variables in an integration: Integral(f(y)dy) = Integral(f(y(x)) |dy^T/dx| dx).
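A familiar instance is the change from polar to Cartesian coordinates, where the Jacobian equals the radius r. The sketch below is an illustration rather than part of the manual. It builds the Jacobian matrix numerically with the convention that element (i,j) is dy(j)/dx(i), then takes its determinant. The function names and the evaluation point are assumptions.

```python
import math

# Jacobian of the polar-to-Cartesian map y = (r cos(theta), r sin(theta)),
# with x = (r, theta). Names and the evaluation point are illustrative.

def polar_to_cartesian(x):
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian_matrix(func, x, h=1e-6):
    """Matrix dy^T/dx: element (i,j) approximates dy(j)/dx(i)."""
    rows = []
    for i in range(len(x)):
        plus, minus = list(x), list(x)
        plus[i] += h
        minus[i] -= h
        y_plus, y_minus = func(plus), func(minus)
        rows.append([(p - m) / (2 * h) for p, m in zip(y_plus, y_minus)])
    return rows

point = [2.0, 0.75]                      # r = 2, theta = 0.75 rad
J = jacobian_matrix(polar_to_cartesian, point)
jacobian = J[0][0] * J[1][1] - J[0][1] * J[1][0]   # determinant of the 2x2 matrix
```

The determinant comes out equal to r (here 2) up to rounding, which is the factor that turns dy(1) dy(2) into r dr dtheta in an integral.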
Hessian matrix
If f is a function of x then the symmetric matrix d^2f/dx^2 = d/dx^T (df/dx) is the Hessian matrix of f(x). A
value of x for which df/dx = 0 corresponds to a minimum, maximum or saddle point according to whether
the Hessian is positive definite, negative definite or indefinite.
d^2/dx^2 (a^T x) = 0
d^2/dx^2 ((Ax+b)^T C (Dx+e)) = A^T C D + D^T C^T A
  d^2/dx^2 (x^T C x) = C+C^T
    d^2/dx^2 (x^T x) = 2I
  d^2/dx^2 ((Ax+b)^T (Dx+e)) = A^T D + D^T A
    d^2/dx^2 ((Ax+b)^T (Ax+b)) = 2 A^T A
  [C: symmetric]: d^2/dx^2 ((Ax+b)^T C (Ax+b)) = 2 A^T C A
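To connect the Hessian formulas with the classification of stationary points, the following sketch estimates the Hessian of f(x) = x^T C x by second differences at the stationary point x = 0 and classifies it from the eigenvalues of the resulting 2x2 matrix. It is an illustration under assumed test matrices. The names `quadratic`, `numeric_hessian` and `classify` are hypothetical, and the "inconclusive" branch for a semi-definite Hessian is an addition made for the example.

```python
import math

# Estimate the Hessian of f(x) = x^T C x (expected: C + C^T) and classify x = 0.
# Test matrices and helper names are illustrative assumptions.

def quadratic(C):
    """Return the function f(x) = x^T C x for a 2x2 matrix C."""
    return lambda x: sum(x[i] * C[i][j] * x[j]
                         for i in range(2) for j in range(2))

def numeric_hessian(f, x, h=1e-4):
    """Second-difference estimate of the matrix of second partial derivatives."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                v = list(x)
                v[i] += si * h
                v[j] += sj * h
                return f(v)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

def classify(H, tol=1e-6):
    """Classify a stationary point from the eigenvalues of a symmetric 2x2 Hessian."""
    mean = (H[0][0] + H[1][1]) / 2
    radius = math.hypot((H[0][0] - H[1][1]) / 2, (H[0][1] + H[1][0]) / 2)
    low, high = mean - radius, mean + radius
    if low > tol:
        return "minimum"          # positive definite
    if high < -tol:
        return "maximum"          # negative definite
    if low < -tol and high > tol:
        return "saddle point"     # indefinite
    return "inconclusive"         # semi-definite within tolerance

C_min = [[2.0, 1.0], [0.0, 3.0]]        # C + C^T = [[4, 1], [1, 6]]
C_saddle = [[2.0, 1.0], [-3.0, -1.0]]   # C + C^T = [[4, -2], [-2, -2]]

H_min = numeric_hessian(quadratic(C_min), [0.0, 0.0])
kind_min = classify(H_min)
kind_saddle = classify(numeric_hessian(quadratic(C_saddle), [0.0, 0.0]))
```

Note that the Hessian is C + C^T even when C itself is not symmetric, so only the symmetric part of C affects the classification.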
The Matrix Reference Manual is written by Mike Brookes, Imperial College, London, UK. Please send any
comments or suggestions to mike.brookes@ic.ac.uk