AE 01: Model assessment

Income inequality, healthcare expenditure, and life expectancy

Published

January 16, 2025

Important

Go to the course GitHub organization and locate your ae-01 repo to get started.

If you do not see an ae-01 repo, use the link below to create one:

https://classroom.github.com/a/6jpkfA8n

Render, commit, and push your responses to GitHub by the end of class to submit your AE.



This AE will not count towards your participation grade.

library(tidyverse)    # data wrangling and visualization
library(tidymodels)   # includes the broom and yardstick packages
library(knitr)        # format output

Data

The data set comes from Zarulli et al. (2021), who analyze the effects of a country’s healthcare expenditure and other factors on the country’s life expectancy. The data are originally from the Human Development Database and the World Health Organization.

  • life_exp: The average number of years that a newborn could expect to live, if he or she were to pass through life exposed to the sex- and age-specific death rates prevailing at the time of his or her birth, for a specific year, in a given country, territory, or geographic area. (from the World Health Organization)

  • income_inequality: Measure of the deviation of the distribution of income among individuals or households within a country from a perfectly equal distribution. A value of 0 represents absolute equality, a value of 100 absolute inequality (based on the Gini coefficient). (from Zarulli et al. (2021))

  • health_expend: Per capita current spending on healthcare goods and services, expressed in international Purchasing Power Parity (PPP) dollars. (from the World Health Organization)

  • health_pct_gdp: Spending on healthcare goods and services, expressed as a percentage of GDP. It excludes capital health expenditures such as buildings, machinery, information technology, and stocks of vaccines for emergencies or outbreaks. (from Zarulli et al. (2021))

life_exp <- read_csv("data/life_exp.csv")

Part 1

Exercise 1

Fit a model using income inequality (income_inequality) to understand variability in life expectancy (life_exp). Neatly display the results using 3 digits.

# add code here
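
As a reminder of the general pattern, here is a minimal sketch of fitting a simple linear regression and displaying the coefficient table with 3 digits. It uses R's built-in mtcars data rather than the AE data, and the names demo_fit and demo_table are placeholders chosen for illustration. It is one possible workflow with lm(), tidy(), and kable(), not a prescribed solution.

```r
library(tidymodels)   # attaches broom, which provides tidy()
library(knitr)        # provides kable()

# Illustration only: built-in mtcars data, not the AE data
demo_fit <- lm(mpg ~ wt, data = mtcars)

# One row per coefficient (intercept and slope) with estimates and standard errors
demo_table <- tidy(demo_fit)

# Format the table for the rendered document, rounded to 3 digits
kable(demo_table, digits = 3)
```

For the exercise, the same steps would apply with the response and predictor swapped for the variables in the life_exp data.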

Exercise 2

  • Interpret the slope in the context of the data.

  • Does it make sense to interpret the intercept? If so, interpret it in the context of the data. Otherwise, explain why not.

Part 2

We now want to understand the relationship between a country’s healthcare expenditure and its life expectancy. The data set contains two measures for healthcare expenditure: health_expend and health_pct_gdp.

Exercise 3

Fit a model using health_expend to understand variability in life_exp. Compute $R^2$ and RMSE for this model.
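
For reference, the two statistics can be written as follows. These are the standard textbook definitions. The notation is an assumption made here, not taken from the course materials: $y_i$ is the observed life expectancy for country $i$, $\hat{y}_i$ is the model's prediction, $\bar{y}$ is the mean observed life expectancy, and $n$ is the number of countries.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
```

$R^2$ is unitless, while RMSE is in the same units as the response.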

# add code here

  • Interpret $R^2$ in the context of the data.

  • Interpret RMSE in the context of the data.
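
The sketch below shows one way to compute both statistics with broom and yardstick. It again uses the built-in mtcars data instead of the AE data, and the demo_ names are placeholders chosen for illustration. For a simple linear regression evaluated on the data used to fit it, the squared correlation that rsq() reports matches the usual 1 minus SSR/SST form.

```r
library(tidymodels)   # attaches broom (augment) and yardstick (rsq, rmse)

# Illustration only: built-in mtcars data, not the AE data
demo_fit <- lm(mpg ~ wt, data = mtcars)

# Add fitted values (.fitted) and residuals (.resid) to the model data
demo_aug <- augment(demo_fit)

# Compare the observed response to the fitted values
demo_rsq  <- rsq(demo_aug, truth = mpg, estimate = .fitted)
demo_rmse <- rmse(demo_aug, truth = mpg, estimate = .fitted)

demo_rsq
demo_rmse
```

Each call returns a one-row tibble whose .estimate column holds the value. The RMSE here is in miles per gallon, the units of the response in this toy example.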

Exercise 4

Which measure of healthcare expenditure would you choose as a predictor of life expectancy: health_expend or health_pct_gdp? Briefly explain, using $R^2$ and/or RMSE to support your choice.

# add code here

Submission

Important

To submit the AE:

  • Render the document to produce the PDF with all of your work from today’s class.
  • Commit and push all your work to your AE repo on GitHub. You’re done! 🎉

References

Zarulli, Virginia, Elizaveta Sopina, Veronica Toffolutti, and Adam Lenart. 2021. “Health Care System Efficiency and Life Expectancy: A 140-Country Study.” Edited by Srinivas Goli. PLOS ONE 16 (7): e0253450. https://doi.org/10.1371/journal.pone.0253450.