STAT 440 - Forecasting

The data file USPop.xlsx, from this website, gives U.S. population in each decennial census, 1790 through 2020.

Working with your group member(s), you are to fit three regression models to these data. For each model, predict the 2020 population and give a confidence interval for this value. Compare this forecast with the actual 2020 population.

MODEL 1: The basic linear regression model, with year as the independent variable (X) and population as the dependent variable (Y).

MODEL 2: A logarithmic regression model: use the logarithm of population as the dependent variable (Y).

MODEL 3: A model of growth rate: use growth rate as the dependent variable. For example, between 1790 and 1800, U.S. population grew by (5,308,463 - 3,929,214)/3,929,214 = 0.351, or 35.1%. NOTE that you will have one fewer data point for this model, since you won't have data on the 2020-to-2030 growth rate.

Which model seems best, and why? What explains why some models work better than others?
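As a starting point, the point forecasts for the three models might be sketched in Python as below. This is only an illustration under stated assumptions: the assignment does not prescribe any software, NumPy is an arbitrary choice, and the function names (growth_rates, fit_and_predict, forecast_linear, forecast_log, forecast_growth) are hypothetical. Reading USPop.xlsx is left out because its column layout is not given here, and the confidence intervals the assignment asks for are not computed. The only real data used are the 1790 and 1800 counts quoted above.

```python
import numpy as np

def growth_rates(pop):
    """Decade-over-decade growth: (next - current) / current."""
    pop = np.asarray(pop, dtype=float)
    return (pop[1:] - pop[:-1]) / pop[:-1]

def fit_and_predict(x, y, x_new):
    """Least-squares line of y on x, evaluated at x_new."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    center = x.mean()  # centering the years keeps the fit well conditioned
    slope, intercept = np.polyfit(x - center, y, 1)
    return slope * (x_new - center) + intercept

def forecast_linear(years, pop, target_year):
    """MODEL 1: population regressed on year."""
    return fit_and_predict(years, pop, target_year)

def forecast_log(years, pop, target_year):
    """MODEL 2: log(population) regressed on year, then transformed back."""
    return float(np.exp(fit_and_predict(years, np.log(pop), target_year)))

def forecast_growth(years, pop):
    """MODEL 3: growth rate regressed on each decade's starting year.

    The fitted line gives a growth rate for the decade that starts at the
    last observed census; applying it to the last observed population
    projects one census ahead. This reading of Model 3 is an assumption.
    """
    years = np.asarray(years, dtype=float)
    pop = np.asarray(pop, dtype=float)
    rates = growth_rates(pop)  # one fewer value than pop
    next_rate = fit_and_predict(years[:-1], rates, years[-1])
    return pop[-1] * (1.0 + next_rate)

# Check against the worked example in the assignment (1790 to 1800).
first_rate = float(growth_rates([3_929_214, 5_308_463])[0])
print(round(first_rate, 3))  # 0.351
```

To compare a forecast with the actual 2020 count, one option is to pass in only the censuses before 2020 and forecast 2020 from them; whether the models should be fitted with or without the 2020 observation is not settled by the text above, so treat that as a choice to state in your write-up.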