This study uses data from the National Longitudinal Survey of Youth, 1979 cohort (NLSY79). The NLSY79 is a large, nationally representative, omnibus survey sponsored by the U.S. Bureau of Labor Statistics. Over 12,000 youths ages 14-22 were first interviewed in 1979. They were re-interviewed annually through 1994 and biennially since then. The original sample included over-samples of blacks, Hispanics, economically disadvantaged non-black non-Hispanics, and youth in the military. Over time, the economically disadvantaged and military over-samples were discontinued due to budget reductions. The remaining sample has seen remarkably low attrition, with over 84 percent interviewed in 1998, the eighteenth round of interviewing.
The NLSY79 focuses on labor market behavior, and it also collects information on aspects of the respondents' lives that are thought to influence, or be influenced by, that behavior. The survey routinely collects information on education, job training, marriage, fertility, household composition, health status, income, and assets. In selected years, funding from various government agencies has supported the collection of additional information on topics such as alcohol consumption and drug use.
The advantage of using the NLSY79 is the availability of measures of long-term adult outcomes in a continuous context. This requires a longitudinal survey, so data sets such as the Youth Risk Behavior Survey (YRBS) or the Current Population Survey (CPS) are inadequate.
Surveys such as the Survey of Income and Program Participation (SIPP) likewise do not follow respondents for long enough. Other longitudinal surveys of adolescents, such as Add Health, the National Survey of Adolescent Males, or the new NLSY97, began more recently and have not yet followed their respondents beyond the earliest years of adulthood. The youngest respondent in the NLSY79, on the other hand, turned 34 in 1998 (the most recent year of data available at the time of this study). Over the years of the survey, we are able to see the respondents (generally) complete their education, develop their careers, and form families. The NLSY79 has also collected information on certain behaviors, such as alcohol use and drug use, at multiple points in time.
The NLSY79 pioneered asking about sensitive activities in a large omnibus survey. However, these questions did not appear until the respondents were mostly out of their adolescent years. Thus contemporaneous measures of frequency and intensity are not available for the years needed for this study. For the most part, we are restricted to retrospective reports of age of initiation. The one exception is reports of delinquent and criminal behaviors, for which questions were included in 1980, when the respondents were 15-23 years old. The weaknesses in this case are that some respondents were no longer adolescents and that the measure was taken only once, so cumulative effects differ across respondents of different ages. While these weaknesses limit the analysis in certain ways, the NLSY79 remains valuable for its inclusion of a wide variety of long-term adult outcomes.
Like all survey data, the NLSY79 relies on self-reports of socially undesirable behaviors, so these behaviors may be under-reported. Mensch and Kandel (1988) find some evidence of under-reporting of drug use in the 1984 NLSY79. Also, since we are limited to reports of age of initiation, the length of retrospective recall may distort the distribution. Individuals may recall inaccurately, placing events nearer or further in time than their actual occurrence. However, contemporaneous reports may include more reporting of less important events (e.g., a single cigarette smoked) and may be more affected by the social desirability of a given response. Thus, retrospective reports of age of initiation may not be worse than contemporaneous reports.
The sample used in this study includes all NLSY79 respondents who are part of the continuing sample. Table 1 shows sample sizes by sex and race/ethnicity.
Note that this is the total sample available for the analysis. Any given behavior, outcome, or other variable may be missing for a particular respondent, so sample sizes will differ depending on the relationship being analyzed.
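As a minimal illustration of why sample sizes vary by analysis, the sketch below counts the observations usable for a given set of variables after dropping any record with a missing value on one of them. The records and the variable names (`age_first_drink`, `income_1998`, `years_education`) are invented for illustration. They are not actual NLSY79 variables or values, and this is not necessarily how missing data were handled in this study.

```python
# Hypothetical records; None marks a missing value. Not actual NLSY79 data.
records = [
    {"age_first_drink": 16, "income_1998": 32000, "years_education": 12},
    {"age_first_drink": None, "income_1998": 41000, "years_education": 16},
    {"age_first_drink": 15, "income_1998": None, "years_education": 12},
    {"age_first_drink": 18, "income_1998": 27000, "years_education": 14},
]


def analysis_sample_size(rows, variables):
    """Count rows with non-missing values on every variable in `variables`."""
    return sum(all(row.get(v) is not None for v in variables) for row in rows)


# The usable sample depends on which variables enter the analysis.
n_income = analysis_sample_size(records, ["age_first_drink", "income_1998"])
n_education = analysis_sample_size(records, ["age_first_drink", "years_education"])

print(f"total: {len(records)}, income analysis: {n_income}, "
      f"education analysis: {n_education}")
```

Under these made-up records, the two analyses draw on different subsets of the same four respondents, which is the pattern described above.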