STA 221 - Spring 2025 – Logistic Regression: Model comparison

term	estimate	std.error	statistic	p.value	conf.low	conf.high
(Intercept)	-6.673	0.378	-17.647	0.000	-7.423	-5.940
age	0.082	0.006	14.344	0.000	0.071	0.094
totChol	0.002	0.001	1.940	0.052	0.000	0.004
currentSmoker1	0.443	0.094	4.733	0.000	0.260	0.627

term	estimate
(Intercept)	-6.673
age	0.082
totChol	0.002
currentSmoker1	0.443

term	estimate
(Intercept)	-6.456
age	0.080
totChol	0.002
currentSmoker1	0.445
education2	-0.270
education3	-0.232
education4	-0.035

Drop-in-deviance test

We will use a drop-in-deviance test (also known as a likelihood ratio test) to test:

the overall statistical significance of a logistic regression model
the statistical significance of a subset of coefficients in the model

Deviance

The deviance is a measure of the degree to which the predicted values are different from the observed values (compares the current model to a “saturated” model)

In logistic regression,

$D = - 2 \log L$

$D \sim χ_{n - p - 1}^{2}$ ($D$ follows a Chi-square distribution with $n - p - 1$ degrees of freedom)

Note: $n - p - 1$ is the degrees of freedom associated with the error in the model (like residuals)
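A minimal sketch of the deviance formula in R, assuming simulated data (the variables `x` and `y` below are hypothetical and are not the heart disease data). For a binary 0/1 response, the deviance reported by `glm()` should match $-2 \log L$, and the residual degrees of freedom should equal $n - p - 1$.

```r
# Simulated data for illustration only (not the heart disease data)
set.seed(221)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x))

# Logistic regression with p = 1 predictor
fit <- glm(y ~ x, family = "binomial")

# Deviance reported by glm() and the value computed as -2 * log-likelihood
D_glm <- deviance(fit)
D_manual <- -2 * as.numeric(logLik(fit))
c(D_glm = D_glm, D_manual = D_manual)

# Residual degrees of freedom: n - p - 1
df.residual(fit)
```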

$χ^{2}$ distribution

Test for overall significance

We can test the overall significance for a logistic regression model, i.e., whether there is at least one predictor with a non-zero coefficient

$\begin{aligned} & H_{0} : β_{1} = \dots = β_{p} = 0 \\ & H_{a} : β_{j} \neq 0 \text{ for at least one } j \end{aligned}$

The drop-in-deviance test for overall significance compares the fit of a model with no predictors to the current model.

Drop-in-deviance test statistic

Let $L_{0}$ and $L_{a}$ be the likelihood functions of the model under $H_{0}$ and $H_{a}$ , respectively. The test statistic is

$\begin{aligned} G = D_{0} - D_{a} & = (- 2 \log L_{0}) - (- 2 \log L_{a}) \\ & = - 2 (\log L_{0} - \log L_{a}) \\ & = - 2 \sum_{i = 1}^{n} \left[ y_{i} \log \left( \frac{{\hat{π}}^{0}}{{\hat{π}}_{i}^{a}} \right) + (1 - y_{i}) \log \left( \frac{1 - {\hat{π}}^{0}}{1 - {\hat{π}}_{i}^{a}} \right) \right] \end{aligned}$

where ${\hat{π}}^{0}$ is the predicted probability under $H_{0}$ and ${\hat{π}}_{i}^{a} = \frac{\exp \{x_{i}^{T} \hat{β}\}}{1 + \exp \{x_{i}^{T} \hat{β}\}}$ is the predicted probability under $H_{a}$

Drop-in-deviance test statistic

$G = - 2 \sum_{i = 1}^{n} [y_{i} \log (\frac{{\hat{π}}^{0}}{{\hat{π}}_{i}^{a}}) + (1 - y_{i}) \log (\frac{1 - {\hat{π}}^{0}}{1 - {\hat{π}}_{i}^{a}})]$

When $n$ is large, $G \sim χ_{p}^{2}$ ($G$ follows a Chi-square distribution with $p$ degrees of freedom)
The p-value is calculated as $P (χ^{2} > G)$
Large values of $G$ (small p-values) indicate at least one $β_{j}$ is non-zero
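The sum form of $G$ can be checked against the difference in deviances. The sketch below assumes simulated data with two hypothetical predictors (`x1`, `x2`), so $p = 2$; it is an illustration, not the heart disease analysis.

```r
# Simulated data for illustration only (not the heart disease data)
set.seed(221)
n <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x1))

# Model under H0 (intercept only) and model under Ha (p = 2 predictors)
null_fit <- glm(y ~ 1, family = "binomial")
alt_fit <- glm(y ~ x1 + x2, family = "binomial")

pi_0 <- fitted(null_fit)  # the same predicted probability for every observation
pi_a <- fitted(alt_fit)   # observation-specific predicted probabilities

# G from the sum formula and from the drop in deviance
G_sum <- -2 * sum(y * log(pi_0 / pi_a) + (1 - y) * log((1 - pi_0) / (1 - pi_a)))
G_dev <- deviance(null_fit) - deviance(alt_fit)
c(G_sum = G_sum, G_dev = G_dev)

# p-value from a Chi-square distribution with p = 2 degrees of freedom
p_value <- pchisq(G_dev, df = 2, lower.tail = FALSE)
p_value
```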

Heart disease model: drop-in-deviance test

$\begin{aligned} & H_{0} : β_{age} = β_{totChol} = β_{currentSmoker} = 0 \\ & H_{a} : β_{j} \neq 0 \text{ for at least one } j \end{aligned}$

Fit the null model (we’ve already fit the alternative model)

null_model <- glm(high_risk ~ 1, data = heart_disease, family = "binomial")

term	estimate	std.error	statistic	p.value
(Intercept)	-1.72294	0.0436342	-39.486	0

Heart disease model: drop-in-deviance test

Calculate the log-likelihood for the null and alternative models

(L_0 <- glance(null_model)$logLik)

[1] -1737.735

(L_a <- glance(high_risk_fit)$logLik)

[1] -1612.406

Calculate the likelihood ratio test statistic

(G <- -2 * (L_0 - L_a))

[1] 250.6572
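As a quick arithmetic check, the same statistic can be written as a difference in deviances, using the log-likelihoods printed above. The small gap from the R output comes from rounding the log-likelihoods to three decimals.

$\begin{aligned} D_{0} & = - 2 (- 1737.735) = 3475.470 \\ D_{a} & = - 2 (- 1612.406) = 3224.812 \\ G & = D_{0} - D_{a} = 3475.470 - 3224.812 = 250.658 \end{aligned}$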

Heart disease model: likelihood ratio test

Calculate the p-value

(p_value <- pchisq(G, df = 3, lower.tail = FALSE))

[1] 4.717158e-54

Conclusion

The p-value is small, so we reject $H_{0}$. The data provide evidence that at least one predictor in the model has a non-zero coefficient.

Why use overall test?

Why do we use a test for overall significance instead of just looking at the test for individual coefficients?

Suppose we have a model with $p = 100$ predictors and $H_{0} : β_{1} = \dots = β_{100} = 0$ is true

About 5% of the p-values for individual coefficients will be below 0.05 by chance.
So we expect to see about 5 small p-values even if no association actually exists.
Therefore, it is very likely we will see at least one small p-value by chance.
The overall test of significance does not have this problem. There is only a 5% chance we will get a p-value below 0.05, if a relationship truly does not exist.
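The numbers behind this argument can be sketched in a few lines of R. The calculation assumes the 100 individual tests are independent, which is a simplification made here for illustration; in a fitted model the tests are generally correlated.

```r
alpha <- 0.05  # significance threshold for each individual test
p <- 100       # number of predictors, all with true coefficient 0

# Expected number of individual p-values below alpha under H0
expected_small <- p * alpha
expected_small

# Probability of at least one p-value below alpha,
# assuming the p tests are independent (an illustrative assumption)
prob_at_least_one <- 1 - (1 - alpha)^p
prob_at_least_one
```

Under the independence assumption, the probability of seeing at least one small p-value is about 0.994, compared with 0.05 for the single overall test.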

term	df.residual	residual.deviance	df	deviance	p.value
high_risk ~ age + totChol + currentSmoker	4082	3224.812	NA	NA	NA
high_risk ~ age + totChol + currentSmoker + education	4079	3217.600	3	7.212	0.065
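The test statistic, degrees of freedom, and p-value in this table can be reproduced from the residual deviances alone. The sketch below only uses the numbers reported in the table; the variable names are chosen here for illustration.

```r
# Values copied from the table above
dev_reduced <- 3224.812  # residual deviance, model without education
df_reduced <- 4082
dev_full <- 3217.600     # residual deviance, model with education
df_full <- 4079

# Drop in deviance and its degrees of freedom
G <- dev_reduced - dev_full
df <- df_reduced - df_full

# p-value from the Chi-square distribution
p_value <- pchisq(G, df = df, lower.tail = FALSE)
round(c(G = G, df = df, p_value = p_value), 3)
```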

term	df.residual	residual.deviance	df	deviance	p.value
high_risk ~ age + totChol + currentSmoker	4082	3224.812	NA	NA	NA
high_risk ~ age + totChol + currentSmoker + currentSmoker * age + currentSmoker * totChol	4080	3222.377	2	2.435	0.296
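A comparison table of this kind can be produced with `anova()` on two nested `glm` fits. The sketch below assumes simulated data (the data frame `sim` and its columns are hypothetical stand-ins, not the heart disease data) and tests a single interaction term.

```r
# Simulated data for illustration only (not the heart disease data)
set.seed(221)
n <- 1000
sim <- data.frame(
  age = round(runif(n, min = 35, max = 70)),
  smoker = factor(rbinom(n, size = 1, prob = 0.5))
)
sim$y <- rbinom(n, size = 1,
                prob = plogis(-6 + 0.08 * sim$age + 0.4 * (sim$smoker == "1")))

# Reduced model and full model with an added interaction term
reduced <- glm(y ~ age + smoker, data = sim, family = "binomial")
full <- glm(y ~ age + smoker + age:smoker, data = sim, family = "binomial")

# Drop-in-deviance test comparing the nested models
cmp <- anova(reduced, full, test = "Chisq")
cmp
```

The `Deviance` column in the second row is the drop in deviance, and `Pr(>Chi)` is the p-value from a Chi-square distribution with `Df` degrees of freedom.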