The Long Term Impact of Adolescent Risky Behaviors and Family Environment. OLS Regression


Ordinary Least Squares (OLS) regression is used to model continuous outcome variables; the usual significance tests additionally assume that the error term is approximately normally distributed. In this report, percent time employed between the end of formal schooling and age 33 is estimated using OLS regression.

Because OLS regression assumes a linear relationship, its estimated coefficients are straightforward to interpret. Each coefficient measures the average change in the outcome variable associated with a one-unit change in an explanatory variable, holding the other explanatory variables constant. For example, in estimating percent time employed, we obtained an estimated coefficient of 0.02 for males relative to females. This means that, other things being equal, the percent time employed for male respondents was on average 0.02 higher than that of female respondents.
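The interpretation of a coefficient on a 0/1 indicator can be illustrated with a minimal Python sketch. The data below are made up for illustration and are not the report's data; the names share_employed, male, and fit_ols are hypothetical. The sketch also assumes a model with no other explanatory variables, which is simpler than the models estimated in this report. Under that assumption, the coefficient on the male indicator equals the difference between the male and female group means.

```python
import numpy as np

def fit_ols(X, y):
    """Return the least-squares coefficients for the model y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic values for illustration only (assumption, not the report's data).
# male = 1 for male respondents, 0 for female respondents.
male = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
# Hypothetical share of time employed for each respondent.
share_employed = np.array([0.70, 0.74, 0.78, 0.82, 0.72, 0.76, 0.80, 0.84])

# Design matrix: a column of ones (the intercept) and the male indicator.
X = np.column_stack([np.ones_like(male), male])
intercept, male_coef = fit_ols(X, share_employed)

# With a single 0/1 regressor, the coefficient is the difference in group means.
mean_diff = share_employed[male == 1].mean() - share_employed[male == 0].mean()

print("intercept (female mean):", round(intercept, 4))
print("male coefficient:", round(male_coef, 4))
print("difference in group means:", round(mean_diff, 4))
```

In this sketch the intercept is the mean for female respondents and the male coefficient is how much higher the male mean is. Once other explanatory variables are included, the coefficient is instead a difference that holds those variables constant.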

Most explanatory variables in our analysis are categorical. In estimating outcome models, one category of each categorical variable has to be omitted, because including every category alongside the intercept would create perfect collinearity. The omitted category becomes the reference group, and the estimates for the other categories are interpreted relative to it. For example, age of marijuana initiation has four categories: initiated at ages 11-15, at ages 16-17, at ages 18-19, and at ages older than 19 or never initiated. If the reference group is "initiated at ages 11-15" and the estimated parameter for the group who initiated at ages 16-17 is 0.003 in the OLS regression of percent time employed, then those who initiated at ages 16-17 had, on average, a percent time employed 0.003 higher than those who initiated at ages 11-15.
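The role of the reference group can be sketched in Python as follows. The data are synthetic and chosen only for illustration; the category labels and the names age_group, share_employed, and dummy_design are hypothetical, and no other explanatory variables are included. The sketch first shows that an intercept plus indicators for all four categories is rank deficient, and then that, after the reference category is dropped, each coefficient is the difference between that category's mean and the reference group's mean.

```python
import numpy as np

categories = ["11-15", "16-17", "18-19", "over 19 or never"]
reference = "11-15"  # omitted category, used as the reference group

# Synthetic values for illustration only (assumption, not the report's data).
age_group = np.array(["11-15", "11-15", "16-17", "16-17",
                      "18-19", "18-19", "over 19 or never", "over 19 or never"])
share_employed = np.array([0.700, 0.710, 0.703, 0.713, 0.72, 0.74, 0.75, 0.77])

def dummy_design(groups, categories, reference):
    """Build an intercept plus one 0/1 column per non-reference category."""
    kept = [c for c in categories if c != reference]
    columns = [np.ones(len(groups))]
    columns += [(groups == c).astype(float) for c in kept]
    return np.column_stack(columns), kept

X, kept = dummy_design(age_group, categories, reference)

# Adding the reference category's indicator back in makes the columns
# linearly dependent: the four indicators sum to the intercept column.
X_full = np.column_stack([X, (age_group == reference).astype(float)])
rank_full = np.linalg.matrix_rank(X_full)  # 4, although X_full has 5 columns

# Fit with the reference category omitted.
beta, *_ = np.linalg.lstsq(X, share_employed, rcond=None)
intercept = beta[0]                 # mean of the reference group
coefs = dict(zip(kept, beta[1:]))   # each category's mean minus the reference mean

for name, value in coefs.items():
    print(name, "relative to", reference, ":", round(value, 4))
```

Choosing a different reference group changes the individual coefficients but not the fitted values, because each coefficient is always a contrast with whichever category was omitted.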