Math rules

This page contains mathematical rules we’ll use in this course that may be beyond what is covered in a linear algebra course.

Matrix calculus

Definition of gradient

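Let $x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}$ be a $k \times 1$ vector and $f(x)$ be a function of $x$.

Then $\nabla_x f$, the gradient of $f$ with respect to $x$, is

$$\nabla_x f = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \frac{\partial f}{\partial x_2} \\ \vdots \\ \frac{\partial f}{\partial x_k} \end{bmatrix}$$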
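
As a rough numerical illustration of this definition, the sketch below approximates each partial derivative with a central difference and stacks the results into a vector. The helper `numerical_gradient`, the step size `h`, and the example function `f` are assumptions made for this illustration only.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of f at x, one partial derivative per entry."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        # Central difference approximation of the partial derivative df/dx_i
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def f(x):
    # Hypothetical example function: f(x) = x_1^2 + 3 * x_2
    return x[0] ** 2 + 3 * x[1]


grad = numerical_gradient(f, [2.0, 5.0])
print(grad)  # close to the analytic gradient [2 * x_1, 3] = [4, 3]
```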


Gradient of $x^\mathsf{T} z$

Let $x$ be a $k \times 1$ vector and $z$ be a $k \times 1$ vector, such that $z$ is not a function of $x$.

The gradient of $x^\mathsf{T} z$ with respect to $x$ is

$$\nabla_x \, x^\mathsf{T} z = z$$


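Gradient of $x^\mathsf{T} A x$

Let $x$ be a $k \times 1$ vector and $A$ be a $k \times k$ matrix, such that $A$ is not a function of $x$.

Then the gradient of $x^\mathsf{T} A x$ with respect to $x$ is

$$\nabla_x \, x^\mathsf{T} A x = (Ax + A^\mathsf{T} x) = (A + A^\mathsf{T}) x$$

If $A$ is symmetric, then

$$(A + A^\mathsf{T}) x = 2Ax$$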
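
The quadratic-form rule can be sanity-checked numerically. The sketch below compares the closed-form gradient with a central-difference approximation for a small non-symmetric matrix. The function names, the matrix `A`, and the point `x` are hypothetical choices for this example.

```python
def quad_form(A, x):
    """Compute x^T A x for a square matrix A (list of rows) and vector x."""
    k = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(k) for j in range(k))


def analytic_gradient(A, x):
    """Compute (A + A^T) x, the closed-form gradient of x^T A x."""
    k = len(x)
    return [sum((A[i][j] + A[j][i]) * x[j] for j in range(k)) for i in range(k)]


def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


A = [[1.0, 2.0], [0.0, 3.0]]  # deliberately not symmetric
x = [1.0, -2.0]

analytic = analytic_gradient(A, x)                          # [-2.0, -10.0]
numeric = numerical_gradient(lambda v: quad_form(A, v), x)  # approximately the same
print(analytic, numeric)
```

Because `A` is not symmetric here, `2Ax` would give a different answer; the shortcut applies only in the symmetric case.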


Hessian matrix

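The Hessian matrix, $\nabla_x^2 f$, is a $k \times k$ matrix of partial second derivatives

$$\nabla_x^2 f = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_k} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_k} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_k \partial x_1} & \frac{\partial^2 f}{\partial x_k \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_k^2}
\end{bmatrix}$$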
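
To make the layout of the Hessian concrete, the following sketch fills in each entry with a finite-difference approximation of the corresponding second partial derivative. The helper `numerical_hessian`, the step size `h`, and the example function `f` are assumptions for illustration.

```python
def numerical_hessian(f, x, h=1e-4):
    """Approximate the k x k matrix of second partial derivatives of f at x."""
    k = len(x)

    def shifted(i, si, j, sj):
        # Evaluate f after moving coordinate i by si*h and coordinate j by sj*h
        y = list(x)
        y[i] += si * h
        y[j] += sj * h
        return f(y)

    hess = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # Central difference for the (i, j) second partial derivative
            hess[i][j] = (
                shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                - shifted(i, -1, j, 1) + shifted(i, -1, j, -1)
            ) / (4 * h * h)
    return hess


def f(x):
    # Hypothetical example function: f(x) = x_1^2 * x_2 + x_2^3
    return x[0] ** 2 * x[1] + x[1] ** 3


H = numerical_hessian(f, [1.0, 2.0])
# Analytic Hessian: [[2*x_2, 2*x_1], [2*x_1, 6*x_2]] = [[4, 2], [2, 12]] at (1, 2)
print(H)
```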

Expected value

Expected value of random variable X

The expected value of a random variable X is a weighted average, i.e., the mean value of the possible values a random variable can take weighted by the probability of the outcomes.

Let $f_X(x)$ be the probability distribution of $X$. If $X$ is continuous then

$$E(X) = \int_{-\infty}^{\infty} x f_X(x) \, dx$$

If $X$ is discrete then

$$E(X) = \sum_{x \in X} x f_X(x) = \sum_{x \in X} x P(X = x)$$
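
As a small worked case of the discrete formula, the sketch below computes the expected value of a fair six-sided die. The die and the helper `expected_value` are assumptions chosen for illustration; exact fractions are used so the result has no rounding error.

```python
from fractions import Fraction


def expected_value(pmf):
    """Sum of x * P(X = x) over all values x in the support."""
    return sum(x * p for x, p in pmf.items())


# Hypothetical example: a fair die, P(X = x) = 1/6 for x = 1, ..., 6
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expected_value(die)
print(mean)  # 7/2, i.e. 3.5
```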


Expected value of vector z

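Let $z = \begin{bmatrix} z_1 \\ \vdots \\ z_p \end{bmatrix}$ be a $p \times 1$ vector of random variables.

Then $E(z) = E\begin{bmatrix} z_1 \\ \vdots \\ z_p \end{bmatrix} = \begin{bmatrix} E(z_1) \\ \vdots \\ E(z_p) \end{bmatrix}$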


Expected value of vector Az

Let $A$ be an $n \times p$ matrix of constants and $z$ a $p \times 1$ vector of random variables. Then

$$E(Az) = A E(z)$$


Expected value of Az+C

Let $A$ be an $n \times p$ matrix of constants, $C$ an $n \times 1$ vector of constants, and $z$ a $p \times 1$ vector of random variables. Then

$$E(Az + C) = E(Az) + E(C) = A E(z) + C$$
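
The rule can be checked exactly on a small discrete example. In the sketch below, the joint distribution `z_outcomes`, the matrix `A`, the vector `C`, and the helper functions are all hypothetical; the left-hand side is computed by transforming every outcome first, and the right-hand side by transforming the mean.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def expectation(outcomes):
    """Element-wise expected value for a list of (probability, vector) pairs."""
    dim = len(outcomes[0][1])
    return [sum(p * v[i] for p, v in outcomes) for i in range(dim)]


# Hypothetical joint distribution of z = (z_1, z_2)
z_outcomes = [
    (Fraction(1, 2), [0, 1]),
    (Fraction(1, 4), [2, 1]),
    (Fraction(1, 4), [2, 5]),
]
A = [[1, 2], [0, -1], [3, 1]]  # n = 3, p = 2
C = [10, 20, 30]

# E(Az + C): apply the transformation to each outcome, then average
transformed = [(p, [a + c for a, c in zip(mat_vec(A, z), C)]) for p, z in z_outcomes]
lhs = expectation(transformed)

# A E(z) + C: average first, then apply the transformation
rhs = [a + c for a, c in zip(mat_vec(A, expectation(z_outcomes)), C)]

print(lhs == rhs)  # True
```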

Expected value of $AXA^\mathsf{T}$

Let $A$ be an $n \times p$ matrix of constants and $X$ a $p \times p$ matrix of random variables. Then

$$E(AXA^\mathsf{T}) = A E(X) A^\mathsf{T}$$

Variance

Variance of random variable X

The variance of a random variable X is a measure of the spread of a distribution about its mean.

$$\text{Var}(X) = E[(X - E(X))^2] = E(X^2) - E(X)^2$$
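
Both forms of the variance can be compared on a simple discrete distribution. The sketch below uses a fair six-sided die as an assumed example and exact fractions so the two results can be compared without rounding error.

```python
from fractions import Fraction

# Hypothetical example: a fair die, P(X = x) = 1/6 for x = 1, ..., 6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())

# Definition: E[(X - E(X))^2]
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Shortcut: E(X^2) - E(X)^2
var_shortcut = sum(x ** 2 * p for x, p in pmf.items()) - mean ** 2

print(var_definition, var_shortcut)  # 35/12 35/12
```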


Variance of vector z

Let $z = \begin{bmatrix} z_1 \\ \vdots \\ z_p \end{bmatrix}$ be a $p \times 1$ vector of random variables. Then

$$\text{Var}(z) = E[(z - E(z))(z - E(z))^\mathsf{T}]$$


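This produces the variance-covariance matrix

$$\text{Var}(z) = \begin{bmatrix}
\text{Var}(z_1) & \text{Cov}(z_1, z_2) & \cdots & \text{Cov}(z_1, z_p) \\
\text{Cov}(z_2, z_1) & \text{Var}(z_2) & \cdots & \text{Cov}(z_2, z_p) \\
\vdots & \vdots & \ddots & \vdots \\
\text{Cov}(z_p, z_1) & \text{Cov}(z_p, z_2) & \cdots & \text{Var}(z_p)
\end{bmatrix}$$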


Variance of Az

Let $A$ be an $n \times p$ matrix of constants and $z$ a $p \times 1$ vector of random variables. Then

$$\text{Var}(Az) = E[(Az - E(Az))(Az - E(Az))^\mathsf{T}] = A \, \text{Var}(z) \, A^\mathsf{T}$$
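
This identity can also be verified exactly on a small discrete example. In the sketch below, the distribution `z_outcomes`, the matrix `A`, and the helper functions are hypothetical; the variance of the transformed vector is computed directly from its definition and compared with the sandwich formula.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def mat_mul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def covariance(outcomes):
    """E[(z - E(z))(z - E(z))^T] for a list of (probability, vector) pairs."""
    dim = len(outcomes[0][1])
    mean = [sum(p * v[i] for p, v in outcomes) for i in range(dim)]
    return [
        [sum(p * (v[i] - mean[i]) * (v[j] - mean[j]) for p, v in outcomes)
         for j in range(dim)]
        for i in range(dim)
    ]


# Hypothetical joint distribution of z = (z_1, z_2)
z_outcomes = [
    (Fraction(1, 2), [0, 1]),
    (Fraction(1, 4), [2, 1]),
    (Fraction(1, 4), [2, 5]),
]
A = [[1, 2], [0, -1], [3, 1]]  # n = 3, p = 2

# Var(Az) from the definition, applied to the transformed outcomes
lhs = covariance([(p, mat_vec(A, z)) for p, z in z_outcomes])

# A Var(z) A^T
rhs = mat_mul(mat_mul(A, covariance(z_outcomes)), transpose(A))

print(lhs == rhs)  # True
```

Note that the result is a $3 \times 3$ matrix even though $z$ has only two components, since $Az$ is $n \times 1$.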

Probability distributions

Multivariate normal distribution

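Let $z$ be a $p \times 1$ vector of random variables, such that $z$ follows a multivariate normal distribution with mean $\mu$ and variance $\Sigma$. Then the probability density function of $z$ is

$$f(z) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{-\frac{1}{2}(z - \mu)^\mathsf{T} \Sigma^{-1} (z - \mu)\right\}$$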
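
To show how the pieces of the density fit together, here is a minimal sketch that evaluates it for the special case $p = 2$, where the determinant and inverse of $\Sigma$ can be written out by hand. The function name `bivariate_normal_pdf` and the example inputs are assumptions for illustration; a general implementation would need a proper matrix inverse and determinant.

```python
import math


def bivariate_normal_pdf(z, mu, sigma):
    """Evaluate the multivariate normal density for p = 2.

    sigma is a 2 x 2 covariance matrix stored as a list of rows.
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    # Closed-form inverse of a 2 x 2 matrix
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [z[0] - mu[0], z[1] - mu[1]]
    # Quadratic form (z - mu)^T Sigma^{-1} (z - mu)
    quad = sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))
    p = 2
    norm_const = (2 * math.pi) ** (p / 2) * math.sqrt(det)
    return math.exp(-0.5 * quad) / norm_const


# At the mean with identity covariance, the density is 1 / (2 * pi)
density = bivariate_normal_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(density)
```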

Linear transformation of normal random variable

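Suppose $z$ is a multivariate normal random variable with mean $\mu$ and variance $\Sigma$. A linear transformation of $z$, with $A$ a matrix of constants and $B$ a vector of constants of conformable dimensions, is also multivariate normal, such that

$$Az + B \sim N(A\mu + B, A \Sigma A^\mathsf{T})$$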