Pooled analysis of state data versus alternative methods
Based on our prior experience and the findings of this review, we concluded that pooled analysis of state-level data shows greater promise than other approaches to studying the determinants of program participation. Compared with time-series approaches for the nation as a whole, this approach allows examination of state-level programmatic changes, provides a much richer way to study the impact of other factors that vary at the state level, and provides many more opportunities for testing the validity of the model. Individual state time-series models offer some of these advantages, but do not exploit the efficiency gains and testing opportunities afforded by pooled analysis.
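The efficiency argument can be illustrated with a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the panel dimensions, the common coefficient `beta`, the noise levels, and the use of a simple state fixed-effects (within) estimator, which is not necessarily the specification used in this study. It shows that a pooled slope estimated with state-specific intercepts is a weighted average of the slopes from separate state time-series regressions, so it cannot be farther from the true value than the worst individual state estimate.

```python
import numpy as np

# Simulated panel; all values below are assumptions for illustration only.
rng = np.random.default_rng(0)
n_states, n_periods = 20, 40
beta = -0.3                                   # assumed common effect of x on log caseload

alpha = rng.normal(10.0, 1.0, size=(n_states, 1))   # state-specific intercepts
x = rng.normal(size=(n_states, n_periods))          # explanatory variable
y = alpha + beta * x + rng.normal(scale=0.5, size=(n_states, n_periods))

# Within transformation: subtract each state's own mean, which removes
# the state-specific intercepts.
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)

# Separate time-series regression for each state (slope with intercept).
weights = (xd ** 2).sum(axis=1)
state_slopes = (xd * yd).sum(axis=1) / weights

# Pooled regression with state fixed effects and a common slope.
pooled_slope = (xd * yd).sum() / (xd ** 2).sum()

print("pooled slope:", round(pooled_slope, 4))
print("range of state slopes:", state_slopes.min().round(4), state_slopes.max().round(4))
```

The common-slope restriction is itself an assumption; comparing the pooled slope with the individual state slopes is one simple way to examine it.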
Pooled analysis of individual-level data also has considerable appeal, but would require substantially greater effort. An advantage of that approach is its ability to capture the influence of household demographic variables. This becomes a disadvantage, however, if economic factors significantly influence household demographic variables, because the demographic variables would then capture part of the effect of the economic factors. Further, idiosyncratic variation in participation among individuals may mask the effects of state-level variables on participation; such variation is averaged out in aggregate state data.
Caseloads versus Openings and Closings
While some investigators have developed models of openings and closings rather than caseloads, difficulties in the administrative measures of openings and closings led us to conclude that it would be better to follow the direction taken by most investigators and model the size of caseloads directly. Despite these difficulties, it may well be worth developing an openings/closings model in the future.
Logs versus Levels
While some researchers have used caseload levels as their dependent variable, many others have used logarithms. There is an important reason to use logarithms rather than levels in the pooled analysis: in a levels specification the coefficient of an explanatory variable estimates the effect of a unit change in the variable on the level of participation, while in a log specification it represents the approximate proportional effect of a unit change (the percentage effect, when multiplied by 100). Looking across states that vary greatly in population size, the latter is more likely to be constant than the former. A reasonable alternative would be to use a caseload rate, that is, caseloads divided by the number of women in the relevant age group or some other measure of the at-risk population.
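The following sketch illustrates the point with two hypothetical states whose caseloads differ tenfold in size but respond to an explanatory variable with the same proportional effect. The baseline caseloads, the coefficient `beta`, and the values of `x` are all assumed for illustration and are not taken from the data used in this study.

```python
import numpy as np

# All numbers are assumptions for illustration only.
beta = -0.05                                     # each unit of x lowers the caseload by about 5%
x = np.arange(0.0, 5.0)                          # values of the explanatory variable
base = {"small": 10_000.0, "large": 100_000.0}   # baseline caseloads of two states

def slope(y, x):
    """OLS slope of y on x with an intercept."""
    return np.polyfit(x, y, 1)[0]

levels_slopes = {}
log_slopes = {}
for state, b in base.items():
    caseload = b * np.exp(beta * x)              # same proportional response in both states
    levels_slopes[state] = slope(caseload, x)
    log_slopes[state] = slope(np.log(caseload), x)

print("levels coefficients:", levels_slopes)     # differ by a factor of ten
print("log coefficients:   ", log_slopes)        # the same in both states
```

In the levels specification the coefficient for the large state is ten times that for the small state, so no single pooled coefficient fits both; in the log specification the two coefficients coincide.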
Mixing data with differing periodicities
One problem in specifying a quarterly or monthly model using state-level data is that quarterly or monthly data may not be available for some key variables. Other studies have addressed this problem by interpolating between annual data points. We have made substantial efforts to obtain quarterly data for many variables, but have had to resort to interpolation in a number of cases. This is especially unfortunate for variables that have lagged impacts on the caseload, because interpolation does not accurately capture the timing of major changes in the variable. We are necessarily more skeptical about our findings for these variables than for others.
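The timing problem can be seen in a small hypothetical example. The series, the size and date of the jump, and the choice of linear interpolation between annual averages placed at mid-year are all assumptions for illustration; they do not describe the interpolation method or the variables used in this study.

```python
import numpy as np

# Hypothetical true quarterly path: the variable equals 100 until it jumps
# to 120 at quarter index 10 (the third quarter of the third year).
quarters = np.arange(16)                          # 4 years x 4 quarters
true_q = np.where(quarters >= 10, 120.0, 100.0)

# Suppose only annual averages are observed.
annual = true_q.reshape(4, 4).mean(axis=1)        # [100, 100, 110, 120]

# Place each annual value at the midpoint of its year and interpolate linearly.
year_mid = np.arange(4) * 4 + 1.5
interp_q = np.interp(quarters, year_mid, annual)

error = interp_q - true_q
for q in quarters:
    print(q, true_q[q], round(interp_q[q], 2), round(error[q], 2))
```

The interpolated series begins to rise several quarters before the actual change and is still below the true value after it, so the estimated timing of any lagged response to the change would be distorted.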