Four studies have used pooled state data to analyze AFDC participation: Grossman (1985); Cromwell, Hurdle, and Wedig (1986); Moffitt (1986); and Shroder (1995). The first two of these use quarterly data, while the last two use annual data. In the discussion below we give more attention to the first two than the last two, for different reasons. As discussed previously, Grossman compares the findings from her analysis of pooled state data to results from a national time-series analysis. Cromwell et al. adopt an approach that is closer to the approach we recommend for this project than any other study.

Grossman (1985) estimates pooled Basic caseload and average Basic benefits models for states using specifications for each state that are very similar to the specifications she used for her corresponding national time-series models, described earlier. This is the earliest example of a pooled model that we have found. Quarterly data are pooled for 51 states (including DC) over the period from 1974.4 through 1983.4 (37 quarters) -- a total of 1,887 observations.

The caseload variable is the average monthly caseload during the quarter. Explanatory variables in the caseload equation include: a dummy for each state (i.e., a state effect); the national series for female headed households that was used in the national model, but allowing for a different coefficient for each state by interacting the variable with the state dummy; a single dummy for OBRA81, which permanently changes from zero to one in the quarter in which the state adopted a key requirement of OBRA81;^{(10)} four lags of the number of persons unemployed; quarterly dummies; and the state's standard of need for a family of three.

Grossman concludes that her pooled caseload model performs very poorly relative to her national model, as well as relative to time-series models estimated for individual states. This conclusion is largely based on an R-square of only 52 percent, compared to an adjusted R-square of over 97 percent in the national time-series model and similarly high R-squares in the individual state time-series models. We find the low R-square to be very surprising. Our own experience in developing similar models for SSA disability program caseloads suggests that R-squares are typically very high because the state effects (i.e., individual state intercepts) explain all of the very considerable average cross-state variation in the dependent variable. Grossman's findings for the average benefit equation are more consistent with our experience; for that equation she obtains an R-square of 93 percent in the pooled model.

The low R-square in the caseload equation suggests that there is a significant specification problem. It may be that the low R-square is due to the following problem: the coefficient of the OBRA81 dummy is constrained to be the same in all states even though one would expect that the impact of the OBRA81 change on the level of the Basic caseload (i.e., on the dependent variable) would increase with the size of the state. There is tremendous variation in size across states, and one would expect comparable variation in the size of OBRA81's impact. The same criticism applies to the specification of the quarterly dummy variables and the standard of need.^{(11)} There may be an analogous problem in the average benefit specification, but it is presumably much less severe because relative variation in average benefits across states is much smaller than relative variation in caseload size. The lesson from this experience is that state models should be specified in such a way that it is reasonable to expect coefficients to be constant across states. Later pooled models, discussed below, solve this problem by specifying logarithmic participation variables, so that each coefficient represents (approximately) the percentage change in participation associated with a unit change in the corresponding explanatory variable.
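The logarithmic point can be illustrated with a small numerical sketch (hypothetical caseloads of our own devising, not data from any of the studies reviewed): a policy that cuts every state's caseload by the same percentage produces wildly different effects in levels, but a single common effect in logs, so a single pooled dummy coefficient is reasonable only in the log specification.

```python
import math

# Hypothetical caseloads (thousands) for a small and a large state,
# before and after a policy that cuts each state's caseload by 10 percent.
before = {"small_state": 20.0, "large_state": 800.0}
after = {s: c * 0.90 for s, c in before.items()}

# In LEVELS, the policy "effect" differs enormously across states,
# so one pooled dummy coefficient cannot fit both states:
level_effects = {s: after[s] - before[s] for s in before}
# small_state: -2.0 thousand; large_state: -80.0 thousand

# In LOGS, the effect is identical across states, so one pooled
# coefficient suffices and is read as a percentage change:
log_effects = {s: math.log(after[s]) - math.log(before[s]) for s in before}
# both approximately log(0.90) = -0.105, i.e., about a 10 percent drop
```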

The Cromwell et al. (1986) study is unique among the studies we have examined because the focus of the study is Medicaid enrollment; AFDC participation is examined because most AFDC participants are Medicaid eligible. The authors measure AFDC participation as Medicaid enrollees who are also AFDC recipients; no distinction is made between Basic and UP households.^{(12)}

The authors pool quarterly data from 44 states for the period from 1976 to 1982 (28 quarters and 44 x 28 = 1,232 observations). They implicitly use fixed state effects by specifying their model in quarterly changes. They do not specify fixed time effects, however.
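The logic of specifying the model in changes can be sketched as follows (our illustration with made-up numbers, not the authors' code): differencing a model of the form y_it = a_i + b*x_it eliminates the time-invariant state intercept a_i entirely, which is why estimating in quarterly changes implicitly imposes fixed state effects.

```python
# Hypothetical fixed state effects and a common slope.
a = {"state_A": 5.0, "state_B": -3.0}
b = 2.0
x = {"state_A": [1.0, 1.5, 2.5], "state_B": [4.0, 4.2, 5.0]}

for s in a:
    # Levels depend on the state intercept a_i ...
    levels = [a[s] + b * xv for xv in x[s]]
    # ... but period-to-period changes do not: a_i cancels out, leaving
    # (y_it - y_{i,t-1}) = b * (x_it - x_{i,t-1}).
    changes = [levels[t] - levels[t - 1] for t in range(1, len(levels))]
    dx = [x[s][t] - x[s][t - 1] for t in range(1, len(x[s]))]
    assert all(abs(c - b * d) < 1e-12 for c, d in zip(changes, dx))
```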

The model's dependent variable is the natural log of average monthly AFDC Medicaid enrollment per capita for the quarter. All continuous explanatory variables are also in logs: the unemployment rate (current period plus three lags); monthly manufacturing earnings per capita; the maximum monthly AFDC payment level for a family of unspecified size, deflated; a "tax capacity" index; and a political "liberalism" index. In addition to these continuous variables, they include dummy variables for: whether or not a state has an UP program; whether or not the state has a Medicaid eligibility option for independent children between the ages of 19 and 21; and an OBRA81 dummy that is equal to one in the third quarter of 1981 and thereafter. They also include an interaction between the UP indicator and each of the unemployment variables. They hypothesize that participation will be more sensitive to business cycles in UP states than in non-UP states because the UP program is designed to help two-parent families when both parents are unemployed. Their findings provide strong support for this hypothesis.

The estimates of the impact of increases in the unemployment rate on the UP caseload obtained by Cromwell et al. are several times larger than those obtained in the national time-series models we have reviewed. Their estimates imply that a one percentage point increase in the unemployment rate increases the caseload in states without an UP program by 1.8 percent after three quarters -- almost identical to CBO's estimate for Basic programs. In states with an UP program, the estimated effect of the same change is a 3.0 percent increase after three quarters. Assuming that five percent of the caseload in the latter states is in the UP program, which is approximately correct, and that the sensitivity of the Basic program in those states is the same as in states with no UP program, the results imply that a one percentage point increase in the unemployment rate increases the UP caseload by 25.8 percent after three quarters -- more than 2.5 times the estimate obtained by CBO for the national UP caseload. One reason for the stronger result may be that the pooled methodology is better able to separate the effects of the recession in the early 1980s from the effects of OBRA81.
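The 25.8 percent figure follows from a simple weighted-average decomposition, which can be verified directly (the five percent UP share and the 1.8 percent Basic response are the assumptions stated above):

```python
# The overall response in UP states is a caseload-share-weighted average
# of the Basic and UP responses:
#   up_share * up_effect + (1 - up_share) * basic_effect = overall_up_state
up_share = 0.05          # assumed UP share of the caseload in UP states
basic_effect = 1.8       # percent increase, Basic caseload
overall_up_state = 3.0   # percent increase, all AFDC in UP states

# Solving for the implied UP response:
up_effect = (overall_up_state - (1 - up_share) * basic_effect) / up_share
print(round(up_effect, 1))  # 25.8
```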

Moffitt (1986) uses the pooled state-level methodology to investigate whether there was a positive shift in AFDC participation among female-headed households from 1967 through 1982 that cannot be explained by labor market or programmatic changes, as many have alleged. Moffitt's analysis uses annual data for nine of the 16 years during the period of interest. The reason that seven years are excluded is that the participation variable in his model is estimated using data from the March Current Population Survey (CPS) and the biennial AFDC Characteristic Surveys (AS). Many states are excluded from the analysis because of missing data. The number of states included varies across years; the average number is just over 27; the total sample size (average number of states times years) is 245.

The dependent variable in Moffitt's model is the rate of AFDC participation among female-headed households; the numerator of the rate is based on AS tabulations and the denominator is based on CPS tabulations. Explanatory variables include the AFDC guarantee level, the benefit reduction rate (BRR), a dummy variable for southern states, the unemployment rate, and each of the following for female-headed households: mean age of head, mean education of head, percent of heads who are white, mean number of children, mean hourly wage rate, and mean unearned income.

Moffitt estimates equations for each year (i.e., using the cross-state data for the year alone), and also estimates three versions of pooled models: a model with fixed time effects but random state effects, a model with fixed time and state effects, and a "between" estimator.^{(13)} The between estimator incorporates fixed time effects also, but ignores the possible existence of an error component that varies across states but not over time -- either random or fixed. It is obtained by jointly estimating cross-state regressions for each time period, constraining all coefficients to be the same except the intercept. As a result, the coefficients depend solely on cross-state relationships between the levels of the model's variables.

Moffitt's findings, taken alone, would be discouraging for those interested in examining AFDC participation through pooled analysis of state data. Very few of his explanatory variables, except the time dummies, are statistically significant using any of the three pooled estimators he considers, and those that are significant are not significant for all three estimators. The model with fixed state and time effects is especially disappointing, with no statistically significant coefficients other than for the fixed effects themselves. Further, the unemployment rate has an insignificant coefficient in all three specifications. The most significant findings are for the guarantee and BRR variables, in both the model with random state effects and the between model. The fact that these variables are not significant in the model with fixed state effects means that their significance in the other two models relies heavily on the cross-state relationship between these variables and participation, rather than on the relationships between changes in these variables and changes in participation over time. It is possible that the significant coefficients are substantially biased because important factors that both vary across states and are correlated with the explanatory variables are not included in the equation.

There are many possible reasons why there are so few statistically significant coefficients in Moffitt's results, especially when compared to the findings of Grossman and Cromwell et al. The use of biennial data and the relatively limited number of observations are obvious ones. Moffitt also documents a number of problems with the construction of the dependent variable, as well as with some explanatory variables. Note, too, that Moffitt controls for a key demographic variable -- the number of female-headed households -- by using it as the denominator for the dependent variable. Hence, his estimates implicitly control for the effect that any of the explanatory variables would have on AFDC participation via their effect on the number of such households.

Of course, one reason the estimates are insignificant may be that the explanatory variables are not very important determinants of participation. Moffitt explores this further with analysis of individual data for three of the nine years, and does find more significant effects. Nonetheless, these estimates, like the estimates using state-level data, imply that most of the increase in AFDC participation rates for female-headed households during the 1967-82 period was due to factors not accounted for by the models.

Shroder (1995) estimates a pooled state model using annual data from 1982 to 1988 for all 50 states and the District of Columbia (357 observations). His research focuses on the issue of endogeneity of AFDC benefits in participation equations. Specifically, he investigates the hypothesis that increases in AFDC participation reduce benefit levels because they increase the cost per taxpayer of each marginal dollar spent on benefits; as a larger share of the population participates in AFDC, there are fewer taxpayers per recipient to fund the program. For this reason he develops a two-equation model in which AFDC participation and benefits are jointly determined. Another unique feature of Shroder's model is the use of variables representing benefits and economic conditions in neighboring states.

The dependent variable in his participation equation is the log of the *recipiency ratio* (the ratio of AFDC recipients to non-recipients). The dependent variable in his AFDC benefit equation is the log of the maximum benefit for a 3-person household, including both the AFDC income guarantee and the value of Food Stamps.

Explanatory variables in the benefit equation include: the log of the recipiency ratio; an index of "Republican power;" the log of the share of AFDC recipients who are non-Hispanic whites; the log of the share of recipient households in which the mother of the youngest child is not married; the log of per capita disposable income; and the log of the state's share of AFDC benefit payments. Explanatory variables in the participation equation are (all in logs): the AFDC benefit variable, the average annual wages for laundry, cleaning and garment services (SIC 721), the unemployment rate, the ratio of women age 15-65 to employed men (F/EM; "it proxies for the availability of the marriage option"), a wage variable for the state's "composite neighbor" (see below), the composite neighbor's unemployment rate, F/EM for the composite neighbor, and the composite neighbor's AFDC benefit.

Economic conditions and AFDC benefits in neighboring states are potentially important determinants of AFDC participation in one's own state because state residents may work in other states or migrate to other states to get work or obtain better AFDC benefits. The neighbor variables are weighted averages of variables in other states, with weights based on migration patterns from the 1980 census. Technically, the weight for state "k" as a neighbor for state "i" is defined as W_{ik} = M_{ik}/Σ_{k}M_{ik}, where M_{ik} is the sum of migration from i to k and from k to i.
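Under this definition the weights for each state sum to one, so each neighbor variable is simply a migration-share-weighted average across the other states. A minimal sketch with hypothetical states and flows (the numbers are ours, not Shroder's):

```python
# Hypothetical two-way migration flows between New York and three neighbors.
migration = {"NJ": 50.0, "CT": 30.0, "PA": 20.0}

# Weight for each neighbor k: its share of total two-way migration with i.
total = sum(migration.values())
weights = {k: m / total for k, m in migration.items()}  # weights sum to 1

# Composite-neighbor value of a variable (here a hypothetical
# unemployment rate) is the weighted average across neighbors.
unemployment = {"NJ": 6.0, "CT": 5.0, "PA": 7.0}
composite_neighbor_rate = sum(weights[k] * unemployment[k] for k in weights)
print(round(composite_neighbor_rate, 2))  # 5.9
```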

Shroder uses instrumental variables for the benefit variable in the participation equation and the participation variable in the benefit equation to avoid simultaneity bias. He also estimates each model in two ways. In the first, he specifies fixed state effects (but not time effects), and in the second he averages the state data over the seven sample years and estimates a cross-section model using the average state data. As discussed previously, the explanatory variable coefficients from the first estimator are based on the relationship between changes in the variables within states, and do not depend on the cross-state relationship between the levels of the variables. The second estimator is the antithesis of the first, relying on just the cross-state relationships in the average levels. Shroder's rationale for considering these estimators is worth examining:

"The fixed-effects model is particularly appealing in analyzing the recipiency ratio. Welfare recipiency across states may be affected by differential rates of divorce, abortion, teen pregnancy, size and social isolation of minorities, level of stigma attached to recipiency, structure of the economy, school quality, and so on. Many factors are difficult to measure well; the inclusion or exclusion of numerous imperfect measures will be controversial.

Assume these factors are time-invariant characteristics of the state. With the fixed-effects model, the choice problem of the representative agent can be conceived in terms of variables that do change over time, like the AFDC benefit paid in that state or in a neighboring state, and independent of the invariant factors.

However, two problems may arise with the fixed-effects model. First, the response to a change in the explanatory variables might occur with a lag of some unknown form. Second, the change in the explanatory variables from one period to another may consist of two shocks, one permanent and one transitory; the agents may be able to distinguish between them even if the econometrician cannot, and may respond only to the permanent component. In either case, the fixed-effects estimator will then be biased toward zero (Griliches and Hausman, 1986)." (Shroder, 1995, p. 186)

He goes on to argue that the estimator based on state averages, although possibly biased due to fixed effects, will mitigate problems arising from delayed responses and permanent versus transitory "shocks."

We concur with Shroder's rationale for using state fixed effects. The rationale for the state average estimator deserves closer examination. First, although the state-average estimator is likely to mitigate problems associated with delayed responses, another way to accomplish this and still include fixed state effects is to include lagged explanatory variables. In modeling SSA disability program participation using annual data, we have successfully used lags of as long as three years for unemployment with as few as seven years of data. The lagged variable strategy is likely to be even more successful with quarterly data, as the results reported by Cromwell et al. seem to indicate.

Second, use of lagged values will also help mitigate the problem of permanent versus transitory shocks to the extent that "agent" expectations are based on past experience rather than on other information that might portend a different future.

There is a third problem that Shroder does not mention, but which is mathematically identical to the permanent versus transitory shock problem: measurement error associated with the explanatory variables. As is well known to econometricians, random measurement error on an explanatory variable generally biases the variable's coefficient toward zero (i.e., the estimate understates the magnitude of the true coefficient), and the size of the bias is positively related to the share of the variation in the variable that is due to measurement error. Measurement errors in state average data tend to cancel each other out, implying that the share of variation in state averages for a variable that is due to measurement error is *lower* than the corresponding share for levels of the same variable in any given year. When year-to-year changes in a variable are examined, however, the share of variation due to random measurement error is *higher* than the share for the level of the same variable in a given year because some permanent variation across states has been removed and the variance of the random change in the measurement error is twice as large as the variance of the measurement error itself.
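The doubling claim at the end of this argument is easy to verify by simulation: if the measurement errors are serially uncorrelated with variance s^2, then Var(e_t - e_{t-1}) = Var(e_t) + Var(e_{t-1}) = 2s^2. A minimal sketch (our illustration, using simulated errors rather than any of the studies' data):

```python
import random

# Simulated serially independent measurement errors with variance 1.
random.seed(0)
n = 200_000
e = [random.gauss(0.0, 1.0) for _ in range(n)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Year-to-year changes in the measurement error.
diffs = [e[t] - e[t - 1] for t in range(1, n)]

# The variance of the changes is twice the variance of the errors,
# so differencing raises the noise share of total variation.
print(round(variance(e), 1), round(variance(diffs), 1))  # roughly 1.0 and 2.0
```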

While we concur that delayed behavioral responses and permanent versus transitory shocks (including the measurement error problem) are problematic for the fixed state effects estimator and are mitigated by the state average estimator, it is preferable to address these problems as directly as possible within the fixed state effects framework (e.g., by using lagged explanatory variables) than to resort to the state average estimator. Although the latter mitigates these problems, it does so at the expense of accepting bias due to omitted state fixed effects. There are compelling reasons to believe that state fixed effects are important, and the bias from ignoring them would likely be very large.

Based on Shroder's fixed effects estimates, the relationship between the recipiency ratio and the level of AFDC benefits is dominated by the effect of benefits on recipiency, although a significant reverse effect is identified. The fixed effect estimate of the coefficient of the benefit variable in the recipiency equation is both significant and large; a 10 percent increase in the benefit measure is associated with an increase in participation of almost 17 percent. The fixed effect estimate of the recipiency ratio coefficient in the benefit equation is negative and significant, but not very large; a 10 percent increase in the recipiency ratio is associated with just a one percent reduction in benefits.

The fixed effects estimates of the own wage and unemployment coefficients in the participation equation are also statistically significant; a one percentage point increase in the unemployment rate is associated with a 3.5 percent increase in the recipiency ratio. Although his dependent variable differs from those used by CBO and Cromwell et al., his estimated unemployment effect is of the same magnitude as their estimates for states with combined programs.

Only one of the four neighbor variables in Shroder's model has a statistically significant coefficient in the fixed effect estimates of the participation equation: a 10 percent increase in the neighbor wage index is associated with a 10 percent decline in the recipiency ratio. The neighbor AFDC benefit, unemployment and marriage variables are not statistically significant.

The state average estimates differ in many ways from the fixed effects estimates, indicating that the cross-state relationships in average levels and the within-state relationships in changes captured by the two estimators are quite different from one another.