Mar 04, 2025
The data set comes from Zarulli et al. (2021), who analyze the effects of a country’s healthcare expenditures and other factors on the country’s life expectancy. The data are originally from the Human Development Database and the World Health Organization.
There are 140 countries (observations) in the data set.
life_exp
: The average number of years that a newborn could expect to live, if he or she were to pass through life exposed to the sex- and age-specific death rates prevailing at the time of his or her birth, for a specific year, in a given country, territory, or geographic area. (from the World Health Organization)
income_inequality
: Measure of the deviation of the distribution of income among individuals or households within a country from a perfectly equal distribution. A value of 0 represents absolute equality, a value of 100 absolute inequality (based on Gini coefficient). (from Zarulli et al. (2021))
education
: Indicator of whether a country’s education index is above (High) or below (Low) the median index for the 140 countries in the data set.
health_expend
: Per capita current spending on healthcare goods and services, expressed in respective currency - international Purchasing Power Parity (PPP) dollar (from the World Health Organization)
The goal is to use income inequality and education to understand variability in health expenditure.
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 2070.599 | 534.653 | 3.873 | 0.000 |
income_inequality | -64.346 | 18.626 | -3.455 | 0.001 |
educationHigh | 1039.298 | 359.736 | 2.889 | 0.004 |
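To make the fitted coefficients concrete, the following is a minimal Python sketch that plugs values into the estimated equation from the table above. The function name `predict_health_expend` and the input values (income inequality of 20 and 40) are hypothetical, chosen only for illustration; they are not taken from the data set.

```python
# Coefficients copied from the fitted-model table above.
INTERCEPT = 2070.599
B_INCOME_INEQUALITY = -64.346
B_EDUCATION_HIGH = 1039.298


def predict_health_expend(income_inequality: float, education: str) -> float:
    """Predicted health expenditure (PPP dollars) from the untransformed model.

    `education` must be "High" or "Low"; "Low" is the baseline level.
    """
    if education not in ("High", "Low"):
        raise ValueError("education must be 'High' or 'Low'")
    is_high = 1 if education == "High" else 0
    return (
        INTERCEPT
        + B_INCOME_INEQUALITY * income_inequality
        + B_EDUCATION_HIGH * is_high
    )


# Hypothetical inputs, for illustration only.
low_20 = predict_health_expend(20, "Low")
high_20 = predict_health_expend(20, "High")
low_40 = predict_health_expend(40, "Low")

print(f"income_inequality=20, Low:  {low_20:.3f}")
print(f"income_inequality=20, High: {high_20:.3f}")
print(f"income_inequality=40, Low:  {low_40:.3f}")
```

For the hypothetical Low-education country with income inequality of 40, the untransformed model returns a negative predicted expenditure, which is not a possible value for spending. This is one practical symptom that the model on the original scale may be a poor fit.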
What model assumption(s) appear to be violated?
Typically, a “fan-shaped” residual plot indicates the need for a transformation of the response variable Y.
There are multiple ways to transform a variable, e.g., $\sqrt{Y}$, $1/Y$, $\log(Y)$.
When building a model:
Choose a transformation and build the model on the transformed data
Reassess the residual plots
If the residual plots did not sufficiently improve, try a new transformation!
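The workflow above can be sketched in Python using only the standard library. This is an illustration on simulated data (not the health expenditure data): the response is generated with multiplicative errors so that the untransformed fit shows a fan shape. The helper names `fit_line` and `spread_ratio`, the seed, and the simulation parameters are all assumptions made for this sketch. The ratio it prints is a crude numeric stand-in for looking at the residual plot, not a replacement for it.

```python
import math
import random
import statistics


def fit_line(x, y):
    """Least-squares intercept and slope for simple linear regression."""
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def spread_ratio(x, y):
    """SD of residuals in the upper half of fitted values divided by the
    SD in the lower half. A ratio well above 1 suggests a fan shape."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    pairs = sorted(zip(fitted, resid))  # order residuals by fitted value
    half = len(pairs) // 2
    lower = [r for _, r in pairs[:half]]
    upper = [r for _, r in pairs[half:]]
    return statistics.stdev(upper) / statistics.stdev(lower)


# Simulated data with multiplicative errors (assumed parameters).
rng = random.Random(221)
x = [rng.uniform(0, 10) for _ in range(400)]
y = [math.exp(1 + 0.3 * xi + rng.gauss(0, 0.3)) for xi in x]

# Candidate transformations of the response to try in turn.
candidates = {
    "identity": lambda v: v,
    "sqrt": math.sqrt,
    "log": math.log,
}

# Refit on each transformed response and reassess the residual spread.
ratios = {
    name: spread_ratio(x, [transform(v) for v in y])
    for name, transform in candidates.items()
}
for name, ratio in ratios.items():
    print(f"{name:>8}: upper/lower residual SD ratio = {ratio:.2f}")
```

In this simulation the log transformation should bring the ratio close to 1, because the data were generated to be linear with constant variance on the log scale. With real data the best transformation is not known in advance, which is why the residual plots are reassessed after each attempt.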
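We fit the model in terms of $\log(Y)$, that is, $\log(\hat{Y}) = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \dots + \hat{\beta}_p X_p$, but interpret the coefficients in terms of $Y$.
Intercept: When $X_1 = \dots = X_p = 0$, $Y$ is expected to be $e^{\hat{\beta}_0}$.
Coefficient of $X_j$: For every one-unit increase in $X_j$, $Y$ is expected to multiply by a factor of $e^{\hat{\beta}_j}$, holding all other predictors constant.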
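A short derivation shows why a coefficient in a log-response model acts multiplicatively on the original scale. The notation here is an assumption made for this sketch: $\hat{Y}_{x_j}$ denotes the predicted response when predictor $j$ equals $x_j$, $\hat{\beta}_j$ is its estimated coefficient, and all other predictors are held fixed so their terms cancel.

```latex
\begin{aligned}
\log(\hat{Y}_{x_j + 1}) - \log(\hat{Y}_{x_j})
  &= \hat{\beta}_j (x_j + 1) - \hat{\beta}_j x_j = \hat{\beta}_j \\
\log\!\left(\frac{\hat{Y}_{x_j + 1}}{\hat{Y}_{x_j}}\right)
  &= \hat{\beta}_j \\
\frac{\hat{Y}_{x_j + 1}}{\hat{Y}_{x_j}}
  &= e^{\hat{\beta}_j}
\end{aligned}
```

So a one-unit increase in the predictor multiplies the predicted response by $e^{\hat{\beta}_j}$, rather than adding a fixed amount to it.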
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 7.096 | 0.324 | 21.895 | 0 |
income_inequality | -0.065 | 0.011 | -5.714 | 0 |
educationHigh | 1.117 | 0.218 | 5.121 | 0 |
Interpret each of the following in terms of health expenditure:
Intercept
income_inequality
education
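One way to check an interpretation is to exponentiate the estimates from the log-transformed model table. The sketch below assumes the response was transformed with the natural log; the dictionary name `coefs` and the variable names are illustrative.

```python
import math

# Estimates copied from the log-transformed model table above.
# Assumption: the response is the natural log of health_expend.
coefs = {
    "(Intercept)": 7.096,
    "income_inequality": -0.065,
    "educationHigh": 1.117,
}

# Exponentiate to move from the log scale back to PPP dollars.
factors = {term: math.exp(est) for term, est in coefs.items()}

for term, factor in factors.items():
    print(f"{term:>18}: exp(estimate) = {factor:.3f}")

# Percent change in expected health expenditure for a one-point
# increase in income inequality, holding education constant.
pct_change = (factors["income_inequality"] - 1) * 100
print(f"percent change per point of income inequality: {pct_change:.1f}%")
```

Under that assumption, the exponentiated intercept is roughly 1207 PPP dollars, the expected health expenditure for a Low-education country with income inequality of 0. Each additional point of income inequality multiplies expected health expenditure by about 0.937 (roughly a 6.3% decrease), holding education constant. Expected health expenditure for High-education countries is about 3.06 times that of Low-education countries, holding income inequality constant.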
See Log Transformations in Linear Regression for more details about interpreting regression models with log-transformed variables.