Logistic Regression: Inference

Prof. Maria Tackett

Apr 08, 2025

Deviance

The deviance measures how far the predicted values are from the observed values. It compares the current model to a “saturated” model, one that fits the observed data perfectly.

In logistic regression,

$D = - 2 \log L$

$D \sim \chi_{n - p - 1}^{2}$ ($D$ follows a chi-square distribution with $n - p - 1$ degrees of freedom)

Note: $n - p - 1$ is the degrees of freedom associated with the error in the model (like residuals).

Drop-in-deviance test statistic

Let $L_{0}$ and $L_{a}$ be the likelihood functions of the model under $H_{0}$ and $H_{a}$, respectively. The test statistic is

$\begin{aligned} G = D_{0} - D_{a} & = (- 2 \log L_{0}) - (- 2 \log L_{a}) \\ & = - 2 (\log L_{0} - \log L_{a}) \\ & = - 2 \sum_{i = 1}^{n} \left[ y_{i} \log \left( \frac{\hat{\pi}^{0}}{\hat{\pi}_{i}^{a}} \right) + (1 - y_{i}) \log \left( \frac{1 - \hat{\pi}^{0}}{1 - \hat{\pi}_{i}^{a}} \right) \right] \end{aligned}$

where $\hat{\pi}^{0}$ is the predicted probability under $H_{0}$ and $\hat{\pi}_{i}^{a} = \frac{\exp \{x_{i}^{T} \beta\}}{1 + \exp \{x_{i}^{T} \beta\}}$ is the predicted probability under $H_{a}$.
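To see that the deviance difference and the summation form of $G$ agree, here is a minimal sketch in Python. The responses `y` and the fitted probabilities `pi_a` are made-up illustration values, not output from a fitted model, and the null model is assumed to be intercept-only, so its fitted probability is the sample proportion of ones.

```python
import math

def log_lik(y, p):
    """Bernoulli log-likelihood: sum of y*log(p) + (1 - y)*log(1 - p)."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

# Hypothetical responses and fitted probabilities under H_a (illustration only).
y = [0, 0, 1, 0, 1, 1, 0, 1]
pi_a = [0.10, 0.20, 0.60, 0.30, 0.70, 0.90, 0.40, 0.80]

# Assumption: H_0 is the intercept-only model, whose fitted probability
# is the same for every observation (the sample proportion of ones).
pi_0 = [sum(y) / len(y)] * len(y)

# Route 1: difference of the two deviances, D = -2 log L.
d_0 = -2 * log_lik(y, pi_0)
d_a = -2 * log_lik(y, pi_a)
g_from_deviances = d_0 - d_a

# Route 2: the summation form of the test statistic.
g_from_sum = -2 * sum(
    yi * math.log(p0 / pa) + (1 - yi) * math.log((1 - p0) / (1 - pa))
    for yi, p0, pa in zip(y, pi_0, pi_a)
)

print(round(g_from_deviances, 4), round(g_from_sum, 4))
```

Both routes give the same number because the summation is just $\log L_{0} - \log L_{a}$ written out term by term.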

term	estimate	std.error	statistic	p.value
(Intercept)	-1.72294	0.0436342	-39.486	0
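Assuming the output above is the intercept-only (null) model, its single coefficient can be converted into the predicted probability $\hat{\pi}^{0}$ with the inverse logit. A short sketch, using only the estimate printed in the table:

```python
import math

# Intercept estimate copied from the table above.
intercept = -1.72294

# Inverse logit: converts log-odds to a probability.
pi_0 = math.exp(intercept) / (1 + math.exp(intercept))

print(f"Predicted probability under H_0: {pi_0:.4f}")
```

The result is roughly 0.15, and it is the same for every observation because the null model has no predictors.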

Why use overall test?

Why do we use a test for overall significance instead of just looking at the tests for individual coefficients?

Suppose we have a model such that $p = 100$ and $H_{0} : \beta_{1} = \dots = \beta_{100} = 0$ is true.

About 5% of the p-values for individual coefficients will be below 0.05 by chance.
So we expect to see about 5 small p-values even if no linear association actually exists.
Therefore, it is very likely we will see at least one small p-value by chance.
The overall test of significance does not have this problem. There is only a 5% chance we will get a p-value below 0.05 if a relationship truly does not exist.
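The arithmetic behind this argument can be sketched as follows. The calculation assumes the 100 individual tests are independent, which is a simplification made here for illustration; tests on coefficients from the same model are generally correlated.

```python
alpha = 0.05    # significance level for each individual test
n_tests = 100   # number of coefficients tested, p = 100

# Expected number of p-values below alpha when every null hypothesis is true.
expected_false_positives = alpha * n_tests

# Probability of at least one p-value below alpha,
# ASSUMING the tests are independent (a simplification).
prob_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"Expected small p-values: {expected_false_positives:.0f}")
print(f"P(at least one small p-value): {prob_at_least_one:.3f}")
```

Under that assumption the chance of at least one false positive is above 99%, compared with 5% for the single overall test.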

Drop-in-deviance test

$\begin{aligned} & H_{0} : \beta_{q + 1} = \dots = \beta_{p} = 0 \\ & H_{a} : \beta_{j} \neq 0 \text{ for at least one } j \end{aligned}$

The test statistic is

$\begin{aligned} G = D_{\text{reduced}} - D_{\text{full}} & = (- 2 \log L_{\text{reduced}}) - (- 2 \log L_{\text{full}}) \\ & = - 2 (\log L_{\text{reduced}} - \log L_{\text{full}}) \end{aligned}$

The p-value is calculated using a $\chi_{\Delta df}^{2}$ distribution, where $\Delta df$ is the number of parameters being tested (the difference in number of parameters between the full and reduced model).

Example: should `education` be added to the model? Comparing the reduced and full models:

term	df.residual	residual.deviance	df	deviance	p.value
high_risk ~ age + totChol + currentSmoker	4082	3224.812	NA	NA	NA
high_risk ~ age + totChol + currentSmoker + education	4079	3217.600	3	7.212	0.065
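The `deviance`, `df`, and `p.value` columns in the second row can be reproduced from the two residual deviances. The sketch below uses only the Python standard library; `chi2_sf` is a hypothetical helper written for this example (not a library function) that computes the chi-square upper-tail probability for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    half_x = x / 2
    # Base cases: df = 1 uses the complementary error function,
    # df = 2 is an exponential tail.
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half_x)), 1
    else:
        tail, k = math.exp(-half_x), 2
    # Step up two degrees of freedom at a time.
    while k < df:
        k += 2
        tail += half_x ** (k / 2 - 1) * math.exp(-half_x) / math.gamma(k / 2)
    return tail

# Values copied from the table above.
dev_reduced, df_reduced = 3224.812, 4082
dev_full, df_full = 3217.600, 4079

g = dev_reduced - dev_full        # drop in deviance
delta_df = df_reduced - df_full   # number of parameters being tested
p_value = chi2_sf(g, delta_df)

print(f"G = {g:.3f}, df = {delta_df}, p-value = {p_value:.3f}")
```

The p-value is just above 0.05, so at the 0.05 level the data do not provide sufficient evidence that `education` improves the model once age, total cholesterol, and smoking status are included.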

Example: should interactions with `currentSmoker` be added? Comparing the reduced and full models:

term	df.residual	residual.deviance	df	deviance	p.value
high_risk ~ age + totChol + currentSmoker	4082	3224.812	NA	NA	NA
high_risk ~ age + totChol + currentSmoker + currentSmoker * age + currentSmoker * totChol	4080	3222.377	2	2.435	0.296
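The same calculation applies to the interaction terms. With 2 degrees of freedom the chi-square upper tail has the closed form $\exp(-G/2)$, so this sketch needs nothing beyond the standard library; the numbers are copied from the table above.

```python
import math

# Values copied from the table above.
dev_reduced, df_reduced = 3224.812, 4082
dev_full, df_full = 3222.377, 4080

g = dev_reduced - dev_full        # drop in deviance
delta_df = df_reduced - df_full   # two interaction terms are being tested

# For a chi-square with exactly 2 degrees of freedom, P(X > g) = exp(-g / 2).
p_value = math.exp(-g / 2)

print(f"G = {g:.3f}, df = {delta_df}, p-value = {p_value:.3f}")
```

The large p-value indicates that the data do not provide sufficient evidence that the interaction terms improve the model.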

Coefficient for `age`

term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	-6.673	0.378	-17.647	0.000	-7.423	-5.940
age	0.082	0.006	14.344	0.000	0.071	0.094
totChol	0.002	0.001	1.940	0.052	0.000	0.004
currentSmoker1	0.443	0.094	4.733	0.000	0.260	0.627

Conclusion:

The p-value is very small, so we reject $H_{0}$. The data provide sufficient evidence that age is a statistically significant predictor of whether someone is at high risk of heart disease, after accounting for total cholesterol and smoking status.
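The quantities in the `age` row can be approximated from the rounded estimate and standard error. This is a sketch of the Wald test and Wald interval under a normal approximation, with 1.96 as the 95% critical value. Small differences from the table are expected because its entries are rounded, and its interval may have been computed by a different method.

```python
import math

# Rounded values copied from the age row of the table above.
estimate, std_error = 0.082, 0.006

# Wald test statistic and two-sided p-value from the standard normal.
z = estimate / std_error
p_value = math.erfc(abs(z) / math.sqrt(2))

# Approximate 95% Wald confidence interval on the log-odds scale.
z_crit = 1.96
ci_low = estimate - z_crit * std_error
ci_high = estimate + z_crit * std_error

# Exponentiate to interpret in terms of the odds.
odds_ratio = math.exp(estimate)
odds_ci = (math.exp(ci_low), math.exp(ci_high))

print(f"z = {z:.2f}, p-value = {p_value:.3g}")
print(f"95% CI for the coefficient: ({ci_low:.3f}, {ci_high:.3f})")
print(f"Odds ratio = {odds_ratio:.3f}, 95% CI: ({odds_ci[0]:.3f}, {odds_ci[1]:.3f})")
```

On the odds scale, each additional year of age is associated with the odds of being high risk multiplying by about 1.09, holding total cholesterol and smoking status constant.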

