SLR: Matrix representation

Prof. Maria Tackett

Jan 21, 2025

Announcements

  • Lab 01 due TODAY at 11:59pm

    • Push work to GitHub repo

    • Submit final PDF on Gradescope + mark pages for each question

  • HW 01 will be assigned on Thursday

Topics

  • Application exercise on model assessment
  • Matrix representation of simple linear regression
    • Model form
    • Least squares estimate
    • Predicted (fitted) values
    • Residuals

Model assessment

Two statistics

  • Root mean square error, RMSE: A measure of the average error (average difference between observed and predicted values of the outcome)

    $$RMSE = \sqrt{\frac{\sum_{i=1}^n (y_i - \hat{y}_i)^2}{n}} = \sqrt{\frac{\sum_{i=1}^n e_i^2}{n}}$$

  • R-squared, R2 : Percentage of variability in the outcome explained by the regression model (in the context of SLR, the predictor)

$$R^2 = \frac{SSM}{SST} = 1 - \frac{SSR}{SST}$$
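
Below is a minimal NumPy sketch of how these two statistics can be computed from observed and fitted values; the numbers are made up purely for illustration.

```python
import numpy as np

# Hypothetical observed and fitted values, for illustration only
y = np.array([3.1, 4.5, 5.0, 6.2, 7.8])
y_hat = np.array([3.0, 4.3, 5.4, 6.0, 7.9])

e = y - y_hat                      # residuals
rmse = np.sqrt(np.mean(e**2))      # RMSE: square root of the mean squared residual

sst = np.sum((y - y.mean())**2)    # total sum of squares
ssr = np.sum(e**2)                 # sum of squared residuals
r_squared = 1 - ssr / sst          # R^2 = 1 - SSR / SST

print(rmse, r_squared)
```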

Application exercise

📋 sta221-sp25.netlify.app/ae/ae-01-model-assessment.html

Open ae-01 from last class. Complete Part 2.

Matrix representation of simple linear regression

SLR: Statistical model (population)

When we have a quantitative response, Y, and a single quantitative predictor, X, we can use a simple linear regression model to describe the relationship between Y and X.

$$Y = \beta_0 + \beta_1 X + \epsilon$$


  • $\beta_1$: Population (true) slope of the relationship between $X$ and $Y$
  • $\beta_0$: Population (true) intercept of the relationship between $X$ and $Y$
  • $\epsilon$: Error terms centered at 0 with variance $\sigma^2_{\epsilon}$
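
One way to make the model concrete is to simulate from it. The sketch below assumes normally distributed errors and made-up values of $\beta_0$, $\beta_1$, and $\sigma_{\epsilon}$, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(221)

n = 100
beta_0, beta_1, sigma_eps = 2.0, 0.5, 1.0   # hypothetical population values

x = rng.uniform(0, 10, size=n)              # a single quantitative predictor
eps = rng.normal(0, sigma_eps, size=n)      # errors centered at 0 with variance sigma_eps^2
y = beta_0 + beta_1 * x + eps               # Y = beta_0 + beta_1 X + epsilon
```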

SLR in matrix form

The simple linear regression model can be represented using vectors and matrices as

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}$$

  • $\mathbf{y}$: Vector of responses

  • $\mathbf{X}$: Design matrix (columns for predictors + intercept)

  • $\boldsymbol{\beta}$: Vector of model coefficients

  • $\boldsymbol{\epsilon}$: Vector of error terms centered at 0 with variance $\sigma^2_{\epsilon}\mathbf{I}$

SLR in matrix form

$$\underbrace{\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}}_{\mathbf{y}} = \underbrace{\begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}}_{\mathbf{X}} \underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}}_{\boldsymbol{\beta}} + \underbrace{\begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{bmatrix}}_{\boldsymbol{\epsilon}}$$


What are the dimensions of y, X, β, and ϵ?
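
As a quick check on the dimensions, the sketch below builds each piece for a toy data set with $n = 5$ (values made up for illustration) and prints the shapes: $\mathbf{y}$ is $n \times 1$, $\mathbf{X}$ is $n \times 2$, $\boldsymbol{\beta}$ is $2 \times 1$, and $\boldsymbol{\epsilon}$ is $n \times 1$.

```python
import numpy as np

n = 5
x = np.arange(1.0, n + 1)                       # toy predictor values
X = np.column_stack([np.ones(n), x])            # design matrix: intercept column + predictor column
beta = np.array([[2.0], [0.5]])                 # beta_0 and beta_1 as a column vector
eps = np.zeros((n, 1))                          # error vector (all zeros here, just to show its shape)
y = X @ beta + eps                              # y = X beta + epsilon

print(y.shape, X.shape, beta.shape, eps.shape)  # (5, 1) (5, 2) (2, 1) (5, 1)
```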

Derive least squares estimator for β

Goal: Find the estimator $\hat{\boldsymbol{\beta}} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{bmatrix}$ that minimizes the sum of squared errors

$$\sum_{i=1}^n \epsilon_i^2 = \boldsymbol{\epsilon}^T\boldsymbol{\epsilon} = (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$$

Gradient

Let $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_k \end{bmatrix}$ be a $k \times 1$ vector and $f(\mathbf{x})$ be a function of $\mathbf{x}$.

Then $\nabla_{\mathbf{x}} f$, the gradient of $f$ with respect to $\mathbf{x}$, is

$$\nabla_{\mathbf{x}} f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \dfrac{\partial f}{\partial x_2} \\ \vdots \\ \dfrac{\partial f}{\partial x_k} \end{bmatrix}$$

Property 1

Let $\mathbf{x}$ and $\mathbf{z}$ be $k \times 1$ vectors, such that $\mathbf{z}$ is not a function of $\mathbf{x}$.

The gradient of $\mathbf{x}^T\mathbf{z}$ with respect to $\mathbf{x}$ is

$$\nabla_{\mathbf{x}}\, \mathbf{x}^T\mathbf{z} = \mathbf{z}$$

Side note: Property 1

$$\mathbf{x}^T\mathbf{z} = \begin{bmatrix} x_1 & x_2 & \dots & x_k \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_k \end{bmatrix} = x_1 z_1 + x_2 z_2 + \dots + x_k z_k = \sum_{i=1}^k x_i z_i$$

Side note: Property 1

$$\nabla_{\mathbf{x}}\, \mathbf{x}^T\mathbf{z} = \begin{bmatrix} \dfrac{\partial\, \mathbf{x}^T\mathbf{z}}{\partial x_1} \\ \dfrac{\partial\, \mathbf{x}^T\mathbf{z}}{\partial x_2} \\ \vdots \\ \dfrac{\partial\, \mathbf{x}^T\mathbf{z}}{\partial x_k} \end{bmatrix} = \begin{bmatrix} \dfrac{\partial}{\partial x_1}(x_1 z_1 + x_2 z_2 + \dots + x_k z_k) \\ \dfrac{\partial}{\partial x_2}(x_1 z_1 + x_2 z_2 + \dots + x_k z_k) \\ \vdots \\ \dfrac{\partial}{\partial x_k}(x_1 z_1 + x_2 z_2 + \dots + x_k z_k) \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_k \end{bmatrix} = \mathbf{z}$$
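
A small numerical sanity check of Property 1: approximate the gradient of $f(\mathbf{x}) = \mathbf{x}^T\mathbf{z}$ by finite differences and compare it to $\mathbf{z}$. The vectors are made up for illustration.

```python
import numpy as np

z = np.array([1.0, -2.0, 3.0])          # fixed vector (not a function of x)
f = lambda x: x @ z                     # f(x) = x^T z

x0 = np.array([0.5, 1.5, -1.0])         # point at which to evaluate the gradient
h = 1e-6
grad = np.array([
    (f(x0 + h * np.eye(3)[i]) - f(x0 - h * np.eye(3)[i])) / (2 * h)
    for i in range(3)
])

print(grad)                             # approximately [1, -2, 3], i.e., z
```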

Property 2

Let $\mathbf{x}$ be a $k \times 1$ vector and $\mathbf{A}$ be a $k \times k$ matrix, such that $\mathbf{A}$ is not a function of $\mathbf{x}$.

Then the gradient of $\mathbf{x}^T\mathbf{A}\mathbf{x}$ with respect to $\mathbf{x}$ is

$$\nabla_{\mathbf{x}}\, \mathbf{x}^T\mathbf{A}\mathbf{x} = \mathbf{A}\mathbf{x} + \mathbf{A}^T\mathbf{x} = (\mathbf{A} + \mathbf{A}^T)\mathbf{x}$$

If $\mathbf{A}$ is symmetric, then

$$(\mathbf{A} + \mathbf{A}^T)\mathbf{x} = 2\mathbf{A}\mathbf{x}$$

Proof in HW 01.
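
A similar finite-difference check for Property 2, using a small made-up symmetric matrix; the approximate gradient of $f(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x}$ matches both $(\mathbf{A} + \mathbf{A}^T)\mathbf{x}$ and $2\mathbf{A}\mathbf{x}$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])              # symmetric matrix, for illustration
f = lambda x: x @ A @ x                 # f(x) = x^T A x

x0 = np.array([0.7, -0.4])
h = 1e-6
grad = np.array([
    (f(x0 + h * np.eye(2)[i]) - f(x0 - h * np.eye(2)[i])) / (2 * h)
    for i in range(2)
])

print(grad)                             # finite-difference gradient
print((A + A.T) @ x0, 2 * A @ x0)       # both agree with it, since A is symmetric
```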

Derive least squares estimator

Find $\hat{\boldsymbol{\beta}}$ that minimizes

$$\begin{aligned} \boldsymbol{\epsilon}^T\boldsymbol{\epsilon} &= (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})^T(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) \\ &= (\mathbf{y}^T - \boldsymbol{\beta}^T\mathbf{X}^T)(\mathbf{y} - \mathbf{X}\boldsymbol{\beta}) \\ &= \mathbf{y}^T\mathbf{y} - \mathbf{y}^T\mathbf{X}\boldsymbol{\beta} - \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{y} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} \\ &= \mathbf{y}^T\mathbf{y} - 2\boldsymbol{\beta}^T\mathbf{X}^T\mathbf{y} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} \end{aligned}$$

The last step uses the fact that $\mathbf{y}^T\mathbf{X}\boldsymbol{\beta}$ is a scalar, so it equals its transpose $\boldsymbol{\beta}^T\mathbf{X}^T\mathbf{y}$.

Derive least squares estimator

$$\nabla_{\boldsymbol{\beta}}\, \boldsymbol{\epsilon}^T\boldsymbol{\epsilon} = \nabla_{\boldsymbol{\beta}}\left(\mathbf{y}^T\mathbf{y} - 2\boldsymbol{\beta}^T\mathbf{X}^T\mathbf{y} + \boldsymbol{\beta}^T\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}\right) = -2\mathbf{X}^T\mathbf{y} + 2\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}$$

Find $\hat{\boldsymbol{\beta}}$ that satisfies

$$-2\mathbf{X}^T\mathbf{y} + 2\mathbf{X}^T\mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{0}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$
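
A sketch of the closed-form estimate on simulated data (all values made up), compared against NumPy's built-in least squares routine. Solving the normal equations with `np.linalg.solve` avoids forming $(\mathbf{X}^T\mathbf{X})^{-1}$ explicitly, which is the more numerically stable way to apply this formula.

```python
import numpy as np

rng = np.random.default_rng(221)
n = 100
x = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, size=n)   # hypothetical data from an SLR model

X = np.column_stack([np.ones(n), x])             # design matrix

# beta_hat = (X^T X)^{-1} X^T y, computed by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Same answer from NumPy's least squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)      # close to the true (2.0, 0.5) used to simulate
print(beta_lstsq)
```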

Did we find a minimum?

Hessian matrix

The Hessian matrix, $\nabla^2_{\mathbf{x}} f$, is a $k \times k$ matrix of second-order partial derivatives

$$\nabla^2_{\mathbf{x}} f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \dots & \dfrac{\partial^2 f}{\partial x_1 \partial x_k} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \dots & \dfrac{\partial^2 f}{\partial x_2 \partial x_k} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_k \partial x_1} & \dfrac{\partial^2 f}{\partial x_k \partial x_2} & \dots & \dfrac{\partial^2 f}{\partial x_k^2} \end{bmatrix}$$

Using the Hessian matrix

If the Hessian matrix is…

  • positive-definite, then we have found a minimum.

  • negative-definite, then we have found a maximum.

  • indefinite (neither positive- nor negative-definite), then we have found a saddle point.

Did we find a minimum?

$$\nabla^2_{\boldsymbol{\beta}}\, \boldsymbol{\epsilon}^T\boldsymbol{\epsilon} = \nabla_{\boldsymbol{\beta}}\left(-2\mathbf{X}^T\mathbf{y} + 2\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}\right) = -2\nabla_{\boldsymbol{\beta}}(\mathbf{X}^T\mathbf{y}) + 2\nabla_{\boldsymbol{\beta}}(\mathbf{X}^T\mathbf{X}\boldsymbol{\beta}) \propto \mathbf{X}^T\mathbf{X}$$

Show that $\mathbf{X}^T\mathbf{X}$ is positive definite in HW 01.
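
A quick numerical look at this result: for a design matrix whose predictor column is not constant, the eigenvalues of $\mathbf{X}^T\mathbf{X}$ come out positive, consistent with a positive definite Hessian and hence a minimum. The predictor values below are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0, 9.0])       # toy predictor values
X = np.column_stack([np.ones(len(x)), x])     # design matrix

XtX = X.T @ X
eigenvalues = np.linalg.eigvalsh(XtX)         # XtX is symmetric, so eigvalsh applies

print(eigenvalues)                            # all positive -> positive definite
```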

Predicted values and residuals

Predicted (fitted) values

Now that we have $\hat{\boldsymbol{\beta}}$, let's predict values of $\mathbf{y}$ using the model

$$\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}} = \underbrace{\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T}_{\mathbf{H}}\mathbf{y} = \mathbf{H}\mathbf{y}$$

Hat matrix: $\mathbf{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$

  • $\mathbf{H}$ is an $n \times n$ matrix
  • Maps the vector of observed values $\mathbf{y}$ to the vector of fitted values $\hat{\mathbf{y}}$
  • It is a function of $\mathbf{X}$ only, not of $\mathbf{y}$
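
A short sketch of the hat matrix on toy data (numbers made up for illustration), checking that $\mathbf{H}\mathbf{y}$ reproduces the fitted values $\mathbf{X}\hat{\boldsymbol{\beta}}$ and that $\mathbf{H}$ is idempotent, as expected of a projection matrix.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0, 9.0])      # toy predictor values
y = np.array([2.3, 3.1, 4.0, 6.2, 7.1])      # toy responses
X = np.column_stack([np.ones(len(x)), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T         # hat matrix: n x n, a function of X only

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat

print(np.allclose(H @ y, y_hat))             # True: H maps observed y to fitted values
print(np.allclose(H @ H, H))                 # True: H is idempotent (a projection)
```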

Residuals

Recall that the residuals are the difference between the observed and predicted values

$$\mathbf{e} = \mathbf{y} - \hat{\mathbf{y}} = \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{y} - \mathbf{H}\mathbf{y} = (\mathbf{I} - \mathbf{H})\mathbf{y}$$
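
Continuing the toy example above, the residuals can be formed directly as $(\mathbf{I} - \mathbf{H})\mathbf{y}$; as a bonus check, they are orthogonal to the columns of $\mathbf{X}$, which follows from the normal equations.

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0, 9.0])      # same toy data as above
y = np.array([2.3, 3.1, 4.0, 6.2, 7.1])
X = np.column_stack([np.ones(len(x)), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T
e = (np.eye(len(x)) - H) @ y                 # e = (I - H) y

y_hat = H @ y
print(np.allclose(e, y - y_hat))             # True: identical to y - y_hat
print(X.T @ e)                               # approximately zero: residuals orthogonal to X
```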

Recap

  • Introduced matrix representation for simple linear regression

    • Model form
    • Least squares estimate
    • Predicted (fitted) values
    • Residuals

For next class

  • Complete Prepare for Lecture 05 - SLR: matrix representation cont’d

🔗 STA 221 - Spring 2025
