Extending the Utility of Federal Data Bases. Averaging Over Time


Section 4 of this report discusses the possibility of improving the precision of estimates for some of the subpopulations by combining data for several years. An immediate question is how many years can be combined without seriously affecting the usefulness of the data.

As with so many of the other issues that have been raised, there is no single time period that would be uniformly acceptable for all surveys or, for that matter, for all items within some of the surveys. We suggest that the decision on the number of years to be combined be based on how slowly or quickly the characteristics measured in a survey change over time. For example, it is unlikely that there will be dramatic changes over the course of a few years in most of the health or nutrition items covered in NHANES, such as the prevalence of hypertension, high cholesterol levels, and obesity. This, of course, is the reason that NCHS has been comfortable having previous NHANES data collection extend over a 6-year period. Even though each year of the current NHANES will be based on a random sample, there is no reason why 6 or more years cannot be combined for analyses of data for small population subgroups. Fertility patterns are also likely to change only slowly over time. However, since the NSFG is currently carried out intermittently (about every 5 years), some thought would have to be given to whether combining two cycles of NSFG would stretch the reference period so far that the results no longer describe the current situation. On the other hand, the limited information on fertility collected annually in CPS could probably be combined over a 3- or 4-year period without any harm, as could the data on educational attainment. Economic statistics, however, can fluctuate strongly over a few years, or even less, since they are subject to swings in the economy. It is probably unwise to combine more than 2 or 3 years of data on such items as median income or the poverty rate.
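
The trade-off described above can be illustrated with a rough sketch. It is not taken from any of the surveys discussed, and it rests on simplifying assumptions that are ours: each year contributes an independent simple random sample of the same size, the item is a proportion, and the true value drifts by a constant amount each year. Design effects and overlap between annual samples (as in CPS) are ignored, and the variance is approximated using a single value of the proportion. The function names and the example figures are hypothetical.

```python
import math


def pooled_standard_error(p, n_per_year, years):
    """Approximate standard error of a proportion p estimated from
    `years` independent annual samples of size n_per_year pooled together."""
    return math.sqrt(p * (1.0 - p) / (n_per_year * years))


def pooled_rmse_for_current_year(p, n_per_year, years, annual_change):
    """Approximate root mean squared error when the pooled average is used
    to describe the most recent year, assuming a constant annual drift."""
    # The average over the latest `years` years lags the most recent
    # year by (years - 1) / 2 years, which produces a bias under drift.
    bias = annual_change * (years - 1) / 2.0
    se = pooled_standard_error(p, n_per_year, years)
    return math.sqrt(se ** 2 + bias ** 2)


if __name__ == "__main__":
    # Hypothetical subgroup: 400 respondents per year, true proportion near 0.25.
    for years in (1, 2, 3, 6):
        slow = pooled_rmse_for_current_year(0.25, 400, years, 0.002)
        fast = pooled_rmse_for_current_year(0.25, 400, years, 0.02)
        print(years, round(slow, 4), round(fast, 4))
```

Under these assumptions the sampling error shrinks with the square root of the number of years combined, while the lag bias grows in proportion to both the annual change and the number of years. For a slowly changing item the combined estimate keeps improving as years are added; for a rapidly changing item the bias soon outweighs the gain in precision, which is consistent with the shorter combining periods suggested for economic statistics.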