The randomized design, large samples, and straightforward estimation methodology eliminate the major reasons for questioning the evaluation results. However, despite these strengths a number of methodological issues arose that could cast doubt on the validity of the estimates obtained. The issues include:
 Actual equivalence of the treatment and control groups
 Comparability of the baseline data for treatment and control groups
 Sample attrition
 Validity of pooling observations across sites and models
 Differences between early and late cohorts of sample members
 Inappropriateness of regression procedure for some outcomes
 Effect of use of proxy respondents on impact estimates
Each of these issues was examined early in the analysis to determine whether it would lead to biased or distorted estimates of channeling impacts, and, when necessary, procedures were developed to avoid such distortions. For the major issues (comparability of the baseline data and attrition bias), separate reports were prepared (Brown and Mossel, 1984; Brown et al., 1986) that describe the analyses in detail. For the other issues, internal memoranda document the analyses. The results from these investigations are summarized below. The documents from which they were drawn are available from the author upon request.

A. The Equivalence of Treatment and Control Groups at Randomization

Due to the random assignment of eligible channeling applicants, the control and treatment groups should be composed of individuals that on average were very similar at the time of application on any observed or unobserved characteristic. Hence, the control group should yield reliable estimates of what would have happened to clients in the absence of channeling, and comparison of outcomes for the treatment and control groups therefore should yield reliable estimates of channeling impacts.
Only two factors (other than measurement error) could cause the mean values of the preapplication characteristics of the full treatment and control groups to differ: deviation from the randomization procedures and normal sampling variability. Deviations from the carefully developed randomization procedures could be either deliberate (e.g., intake workers purposely misrecording as treatments some applicants who were randomly assigned to the control group, but who had especially pressing needs for assistance) or accidental. The dedication and professionalism of the channeling program staff at each site and the safeguards built into the assignment procedure made either occurrence very unlikely. Site staff were extremely cooperative in faithfully executing the procedures. (See Phillips et al., 1986, for details of the randomization procedures.)
Sampling variability, on the other hand, is the difference between the two groups that occurs simply by chance. For the sample sizes available at the model level, such differences between the two groups should be very small, and are expected to be statistically insignificant.
Although only small chance differences between the two groups were expected, the implications of large chance differences for estimates of program impacts were so great that it was necessary to verify that the two groups were in fact comparable. This assessment was carried out by comparing mean values of screen characteristics for the treatment and control groups in each model, adjusting for the unequal distribution of the two groups across sites. The following screen characteristics were examined:

Demographic: age, sex, ethnic background.

Financial Resources: monthly income, types of insurance coverage.

Living Arrangement: proportion in long-term care institution; proportion living alone, with spouse, with others, or with spouse and others.

Health and Functioning (see below): activities of daily living (ADL) index, cognitive impairments affecting functioning, unmet needs for service.

Help Received: whether help was received in the areas of meal preparation, housework or shopping, taking medicine, medical treatments at home, and personal care; expected lack of sufficient support from family and friends in coming months (fragile informal supports).

Referral Source: whether referred to channeling by family, by a hospital, by a home health agency, etc.

Nursing Home Application: whether had applied for admission to nursing home or were on a nursing home waiting list at screen.
Estimates of the differences between the treatment and control groups were obtained by regressing the screen characteristics on two binary variables representing treatment status (one each for basic and financial control models) and 10 binary site variables. The coefficients on the treatment variables provided the estimates of the treatment/control differences in means, controlling for the different distribution of the two groups across sites. Estimates of the treatment/control differences in means of these variables at each site were also examined. Both the model and site level differences were tested to determine whether they were larger than could reasonably be expected to occur because of chance sample variation.
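The site-adjusted comparison described above can be sketched as follows. The data, variable names, and site/model split below are synthetic stand-ins for illustration, not the evaluation's actual data or code:

```python
# Hypothetical sketch of the site-adjusted treatment/control balance check:
# regress a screen characteristic on two treatment dummies (one per model)
# plus binary site variables; the treatment coefficients estimate the
# treatment/control difference in means net of site composition.
import numpy as np

rng = np.random.default_rng(0)
n = 6000
site = rng.integers(0, 10, n)            # 10 demonstration sites
treat = rng.integers(0, 2, n)            # randomized treatment status
basic_treat = treat * (site < 5)         # treatment dummy, basic-model sites
fin_treat = treat * (site >= 5)          # treatment dummy, financial-control sites
age = rng.normal(79.0, 7.0, n)           # one screen characteristic

# Design matrix: constant, two treatment dummies, 9 site dummies
# (one site omitted as the reference category).
site_dummies = np.stack([(site == s).astype(float) for s in range(1, 10)], axis=1)
X = np.column_stack([np.ones(n), basic_treat, fin_treat, site_dummies])
beta, *_ = np.linalg.lstsq(X, age, rcond=None)

# t-statistic for the basic-model difference via the usual OLS formulas.
resid = age - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
cov = sigma2 * np.linalg.inv(X.T @ X)
t_basic = beta[1] / np.sqrt(cov[1, 1])
print(f"basic-model T/C difference: {beta[1]:.3f} (t = {t_basic:.2f})")
```

Under randomization, such differences should be small relative to their standard errors; a t-statistic beyond conventional critical values would flag a variable for closer scrutiny.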
This analysis, presented in detail in Brown and Harrigan (1983), showed that there were very few variables for which treatment/control differences were statistically significant. Of the 53 screen variables examined for each model, there was only one characteristic for which differences were statistically significant in the basic model and four in the financial control model. Furthermore, even the significant differences were small in magnitude (three percentage points or less for binary variables) and (with one exception) occurred for characteristics possessed by less than seven percent of the sample. Treatment/control comparisons at the site level yielded similar conclusions: the number of statistically significant differences was no larger than would be expected by chance and no patterns of differences were found to indicate that noncomparable groups were obtained in any site.
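The interpretation of those counts rests on a simple multiple-testing calculation. Assuming tests at the 5 percent level (the report does not state the test size, so that figure is an assumption), roughly two to three of 53 independent tests would come out "significant" by chance alone:

```python
# Expected number of chance "significant" results among 53 independent
# tests at an assumed 5 percent level, and the probability of observing
# four or fewer such results when no true differences exist.
from scipy.stats import binom

expected = 53 * 0.05
p_at_most_4 = binom.cdf(4, 53, 0.05)
print(f"expected by chance: {expected:.2f}; P(4 or fewer significant) = {p_at_most_4:.2f}")
```

On this reckoning, the one significant difference in the basic model and four in the financial control model are well within the range expected under pure chance.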
Thus, although there may be unobserved differences between the two groups, the comparisons on observed characteristics provided no evidence of either systematic deviations from the random assignment procedures or important treatment/control differences arising by chance. We concluded that the control group provided a reliable measure of what would have happened to the treatment group in the absence of channeling, and therefore, simple comparisons of outcomes for treatment and control groups (controlling for differences in distribution across sites) should yield unbiased estimates of channeling impacts.



B. The Comparability of Baseline Data for Treatment and Control Groups

Another aspect of the evaluation design that could have raised questions about the accuracy of the estimates of channeling impacts was that the baseline data were collected by different types of interviewers for the treatment and control groups. The combination of several factors (conflicts between research needs and good case management practices, data collection costs, and the desire to minimize the burden on sample members) led to the decision that baseline data would be collected by channeling staff for members of the treatment group, and by research interviewers for the control group. For a variety of reasons, this difference in data collection could result in differences between the two groups on observed data for some characteristics, when in fact no real differences exist between the two groups on these baseline characteristics. Estimates of channeling impacts that are obtained from regression models which use these baseline data as explanatory variables could then be distorted, because these artificial differences between the two groups are treated as real pretreatment differences that must be accounted for (netted out) by the regression.
Brown and Mossel (1984) conducted an extensive analysis to determine whether the baseline data for treatments and controls were comparable and, if not, what needed to be done to ensure that regression estimates of channeling impacts would not be biased by such differences. Reasons why baseline data may differ for the two groups were identified, including:
 True differences at randomization due to chance
 True differences due to different patterns of attrition between randomization and baseline
 Spurious differences due to differences in the length of time between randomization and baseline for the two groups
 Spurious differences due to incentives of clients or their proxy respondents to overreport needs and impairments to channeling staff (who used the baseline to prepare a care plan for the client), and to underreport ability to pay for needed services
 Spurious differences due to differences between research interviewers and channeling staff in how questions were asked (including clarifications and probing), and how answers were recorded
 Treatment-induced differences due to anticipated or actual effects of channeling on the treatment group prior to baseline (and known lack of assistance from channeling for the control group)
 Spurious differences due to the differential usage of proxy respondents
As indicated in the previous section, comparison of treatment and control groups on screen variables for the full sample indicated virtually no differences outside the range of normal chance variation. Comparison of the screen characteristics of treatments and controls for baseline respondents indicated that attrition at baseline had led to very few differences between the remaining treatment and control groups. A model of baseline attrition confirmed that only for a few screen variables was the relationship between sample member characteristics and the probability of response significantly different between treatment and control groups.
Despite the overwhelming evidence, based on screen characteristics, that there were essentially no true treatment/control differences at randomization due to chance, and only minor differences due to differential attrition, Brown and Mossel (1984) found a substantial number of large and statistically significant differences between the two groups on baseline variables, including some of the same variables for which no differences were found on the screen. Although real differences between the two groups (either due to differential attrition, or to preexisting differences not detected by screen measures) could not be ruled out entirely, they concluded that differential measurement was largely responsible for the observed baseline differences between treatments and controls. This conclusion was based on several pieces of evidence:
 The finding that very few screen variables exhibited statistically significant differences between treatments and controls among baseline respondents
 The finding that few screen variables exhibited a significantly different impact on the probability of baseline response for treatments than for controls
 The many statistically significant and occasionally very large treatment/control differences found on baseline variables, including some for which no difference was found on the screen version of the same variable
 The general correspondence of results with a priori expectations about which variables were likely to be affected by noncomparable measurement and the direction of the treatment/control differences
 The timing and proxy use differences that were known to exist at baseline and which were obviously responsible for the observed differences on some of the baseline variables and probably responsible for the differences on some others
 The general correspondence of treatment/control differences at baseline with baseline-reinterview differences observed for a subsample of treatment group members who were given a second baseline by research interviewers
Brown and Mossel then showed how regression estimates of channeling impacts would be affected by the use of noncomparable data items as explanatory variables in the regression. The expressions for bias induced by noncomparable data suggested two types of tests of baseline variables, to determine whether the baseline differences were so large that they were unlikely to represent true treatment/control differences and therefore might cause significant bias in estimates of channeling impacts, or small enough that they may well be due to chance and were unlikely to affect impact estimates. The two tests (one for baseline variables for which comparable measures were available on the screen, and one for variables that had no such screen counterparts) made use of all of the available information.
For baseline variables with screen counterparts, the test was for whether the treatment/control differences at baseline were significantly different from the treatment/control differences in the screen version of the variable for the same individuals. For those variables for which the hypothesis of no differential was rejected, the baseline version of the variable was considered noncomparable, and only the screen version was used in future analyses. Variables for which no significant differential was found were considered to be comparably measured at baseline and therefore the baseline version could be included as a control variable in later analyses. The conclusions based on this procedure were then compared to the results obtained from the reinterview sample, which were based on comparison of baseline and reinterview responses on these same questions. The two sets of results were found to be broadly consistent in terms of which variables appeared to be noncomparable, and the direction of the differences.
For baseline variables that had no screen counterpart, the procedure used was to regress these variables on treatment status, site, and the variables selected from the group with screen counterparts, and test whether the coefficients on the two treatment status variables (for basic and financial control models) were significantly different from zero. This was a test of whether there were treatment/control differences in these baseline variables beyond what could be explained by the small observed differences at screen in a set of other variables. Variables for which this hypothesis was rejected were then considered noncomparable, under the assumption (based on the evidence cited above) that any such remaining differences were more likely to be due to noncomparable data rather than real differences. Again, the results obtained were found to be broadly consistent with the reinterview sample comparisons of baseline and reinterview responses.
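As a rough illustration of the first test (for variables with screen counterparts), the sketch below regresses the baseline-minus-screen change in a variable on treatment status and site dummies, and flags the variable as noncomparable when the treatment coefficient is significant. All data and names are simulated; in the simulation, treatments overreport impairment at baseline by construction:

```python
# Illustrative comparability test for a baseline variable with a screen
# counterpart. Synthetic data: treatments (interviewed by channeling
# staff) overreport ADL impairment at baseline by a constant amount.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 5000
site = rng.integers(0, 10, n)
treat = rng.integers(0, 2, n)
screen_adl = rng.normal(3.0, 1.0, n)                       # ADL measured at screen
baseline_adl = screen_adl + 0.4 * treat + rng.normal(0, 1.0, n)

# Regress the baseline-minus-screen change on treatment and site dummies;
# a significant treatment coefficient flags differential measurement.
diff = baseline_adl - screen_adl
site_dummies = np.stack([(site == s).astype(float) for s in range(1, 10)], axis=1)
X = np.column_stack([np.ones(n), treat, site_dummies])
beta, *_ = np.linalg.lstsq(X, diff, rcond=None)
resid = diff - X @ beta
cov = (resid @ resid / (n - X.shape[1])) * np.linalg.inv(X.T @ X)
t_stat = beta[1] / np.sqrt(cov[1, 1])
p_val = 2 * stats.t.sf(abs(t_stat), n - X.shape[1])
noncomparable = p_val < 0.05
print(f"T/C differential at baseline: {beta[1]:.2f}, p = {p_val:.4f}")
```

The second test (for variables with no screen counterpart) follows the same pattern, but regresses the baseline variable itself on treatment status, site, and the comparably measured screen variables, and tests the treatment coefficients.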
The two sets of tests yielded the following conclusions regarding the comparability of the baseline variables that were used as control variables in a preliminary analysis of channeling impacts at 6 months (Kemper et al., 1985) and were then being considered for use in the final analyses:
Comparable Baseline Variables                Noncomparable Baseline Variables
Age (*)                                      Ethnicity (*)
Sex (*)                                      Income (*)
Insurance (*)                                ADL (*)
Living arrangement (*)                       IADL
Nursing home waiting list (*)                Unmet needs (*)
Home ownership                               Attitude toward nursing home
Stressful events                             Education
Hours of informal care received (per week)   Assets
Hours of formal care received (per week)     SPMSQ
Number of physician visits                   Medical conditions
Global life satisfaction                     Self-rating of health
                                             Restricted days last 2 months
                                             Hospital days last 2 months
                                             Nursing home days last 2 months

(*) indicates that a screen version of the variable exists.

Only those baseline variables found to be comparable were included as control variables in the final channeling analyses. For noncomparable baseline variables with screen counterparts, the screen version was used as a control variable in its place. The other noncomparable baseline variables were excluded from the set of control variables, with the exception of hospital and nursing home days, which were replaced with information from the screen on whether the sample member was in a hospital or nursing home at screen or referred to channeling by hospital or nursing home staff.
The exclusion of the noncomparable variables is not likely to have caused serious problems for the analysis. Estimates of channeling impacts obtained from regressions with control variables drawn only from the screen differed for some outcome measures from those obtained from regressions using the (comparable and noncomparable) baseline control variables, as expected, but the standard errors of these impact estimates were virtually unaffected by this difference in regressors. Thus, the argument that increased precision would be obtained if the more complete baseline data were used as control variables was not borne out in this case. It is the case, however, that any attrition-induced differences between treatments and controls on excluded characteristics were not controlled for in estimating channeling impacts. The evidence in Brown and Mossel suggests that real differences between the two groups are likely to be considerably smaller than the observed differences in the data. Thus, failure to control for such real differences, if they exist, is likely to have caused less bias than attempting to account for them by using control variables that were not comparably measured for the two groups. However, the inability to examine impacts for subgroups defined on potentially important but noncomparable variables such as SPMSQ, medical conditions, attitude toward nursing home, and IADL weakens that analysis.


C. The Effects of Sample Attrition

The experimental design of the channeling evaluation was chosen to ensure that the experience of the control group would provide a reliable estimate of what would have occurred to treatment group members in the absence of the demonstration. However, as noted above, attrition from the carefully drawn channeling sample could thwart these intentions if the sample available for analysis after attrition were not comparable for the two groups. Regression models were used in the evaluation to control for observable differences between the treatment and control groups that could arise because of attrition, but estimates may still be biased if the two groups differ on unobservable characteristics. Bias occurs if (1) those sample members for whom data are available differ on unobservable characteristics from those for whom data are not available, (2) those unobservable factors also affect outcomes of interest, and (3) rates or patterns of attrition differ for treatment and control groups.
For each of the major areas of analysis in the evaluation, an analysis sample was defined which included those observations in the research sample for which the data necessary for analysis were available. Thus, the following analysis samples were defined:
 6/12 and 18 month Medicare samples (for hospital outcomes)
 6, 12, and 18 month nursing home samples (nursing home outcomes)
 6, 12, and 18 month followup samples (wellbeing outcomes)
 6, 12, and 18 month in-community samples (formal and informal care outcomes)
As shown in Table IV.1, the percent of the full sample included in most of these analysis samples was somewhat greater (about 6 to 14 percentage points) for treatments than for controls, especially in the financial control model. Thus, one of the conditions that, in combination with the other two, could lead to bias was present. These differences were due primarily to treatment/control differences in response rates at the baseline interview. However, despite this difference in rates of attrition, the analysis samples exhibited only minor treatment/control differences on initial screen characteristics.
TABLE IV.1. Percent of Full Sample Included in Analysis Samples

                                          Basic Model                Financial Control Model              Full Sample
                                 Treatments  Controls   Total    Treatments  Controls   Total    Treatments  Controls   Total
Number of Observations
  in Full Sample                     1,779     1,345    3,124        1,923     1,279    3,202        3,702     2,624    6,326

6 Month Outcomes
Percent of Full Sample Included in:
  Medicare sample                     90.4      82.1     86.8         93.3      81.9     88.8         91.9      82.0     87.8
  Nursing home sample                 72.0      67.1     69.9         80.5      67.3     75.2         76.4      67.2     72.6
  Followup sample                     66.4      62.0     64.5         73.1      59.2     67.5         69.9      60.6     66.0
  In-community sample                 54.8      51.5     53.3         62.3      48.9     56.9         58.7      50.2     55.2

12 Month Outcomes
Percent of Full Sample Included in:
  Medicare sample                     90.4      82.1     86.8         93.3      81.9     88.8         91.9      82.0     87.8
  Nursing home sample                 76.4      69.5     73.4         82.0      68.9     76.8         79.3      69.2     75.1
  Followup sample                     59.1      52.1     56.1         63.0      51.4     58.4         61.2      51.8     57.3
  In-community sample                 47.1      41.0     44.5         50.7      40.7     46.7         49.0      40.9     45.6

18 Month Outcomes
Number of Observations
  in 18-month Cohort                   922       697    1,619          926       620    1,546        1,848     1,317    3,165
Percent of Cohort Included in:
  Medicare sample                     89.3      84.9     87.4         94.1      80.8     88.8         91.7      83.0     88.1
  Nursing home sample                 69.8      68.1     69.1         78.8      64.4     73.0         74.4      66.4     71.0
  Followup sample                     43.8      40.3     42.3         50.9      40.2     46.6         47.4      40.2     44.4
  In-community sample                 33.6      31.3     32.6         38.8      31.5     35.8         36.2      31.4     34.2

To investigate whether impact estimates based on these analysis samples were likely to be biased because of attrition, two types of analyses were performed during the evaluation and reported on in Brown et al. (1986): a heuristic approach and a statistical modeling approach. Under the heuristic approach, Medicare data, which were available for virtually the entire research sample, were used to construct several variables measuring the amount of Medicare-covered services used, including hospital days and expenditures, nursing home days and expenditures, and several types of formal community-based and physician services.
Channeling impacts on these Medicareonly variables were then estimated on the full sample, and again on the various analysis samples. These two sets of estimates were then compared to determine whether limiting the analysis to observations in the analysis samples produced different estimates than the full sample.
For the variables examined, the impact estimates obtained on the analysis samples rarely differed substantively from those for the full sample. This was especially true for the Medicare sample. Since over 98 percent of all hospital use by sample members was covered by Medicare, it was clear that attrition led to no bias in estimated impacts on hospital outcomes. For other outcomes and samples, however, this type of comparison was less compelling: although there were few instances of noteworthy differences between the full and analysis samples on the Medicare-covered variables examined, the Medicare data covered only a fraction of the total use of nursing homes and formal services and contained no information at all on other key outcomes, including wellbeing and informal care. Thus, it was possible that estimated impacts on these other outcomes would be biased by attrition, even though the estimates on Medicare-covered outcomes were not. Alternative procedures were required to determine whether attrition bias for these outcomes was present.
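The logic of the heuristic check can be illustrated with a toy simulation. The data and names are synthetic; the actual analysis used Medicare claims variables and regression-adjusted impact estimates rather than the simple difference in means shown here:

```python
# Toy version of the heuristic attrition check: estimate a treatment
# "impact" on a Medicare-covered outcome for the full sample, then for
# the subsample retained after attrition, and compare the two estimates.
import numpy as np

rng = np.random.default_rng(2)
n = 6000
treat = rng.integers(0, 2, n)
hosp_days = rng.poisson(4.0 - 0.5 * treat)       # outcome with a small true impact
# Attrition differs by treatment status but (here) is unrelated to the outcome.
in_analysis = rng.random(n) < np.where(treat == 1, 0.70, 0.60)

def impact(y, t):
    """Unadjusted impact estimate: treatment minus control mean."""
    return y[t == 1].mean() - y[t == 0].mean()

full_est = impact(hosp_days, treat)
sub_est = impact(hosp_days[in_analysis], treat[in_analysis])
print(f"full sample: {full_est:.2f}; analysis sample: {sub_est:.2f}")
```

Because attrition in this simulation is independent of the outcome given treatment status, the two estimates agree up to sampling noise; a substantive divergence between them is the warning sign the evaluation looked for.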
A statistical model developed by Heckman (1979) to control for the nonrandom selection of an analysis sample was used for this purpose. For each analysis sample, a model was estimated to predict which of the full sample observations were retained in the analysis, as a function of personal characteristics measured on the screening interview. Each estimated "sample inclusion" model was then used to construct for each member of the corresponding analysis sample a new variable that, when included as an additional explanatory variable in the regression equation used to estimate channeling impacts, controls for the effects of attrition. The coefficient on the constructed attrition bias term was then tested for statistical significance to determine whether the condition necessary for regression estimates to be biased by sample attrition was met.
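A minimal two-step sketch of this type of correction is given below, on synthetic data. The evaluation's own inclusion models used screen characteristics as predictors; everything here (variables, coefficients, sample) is illustrative only:

```python
# Two-step Heckman-style correction: (1) fit a probit for inclusion in the
# analysis sample; (2) form the inverse Mills ratio for retained
# observations and add it as an extra regressor in the impact equation.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 6000
treat = rng.integers(0, 2, n).astype(float)
frailty = rng.normal(0.0, 1.0, n)                 # a screen characteristic
# Inclusion depends on observables plus an independent unobserved shock.
z = 0.3 + 0.4 * treat - 0.3 * frailty + rng.normal(0.0, 1.0, n)
included = z > 0
outcome = 2.0 - 0.4 * treat + 0.5 * frailty + rng.normal(0.0, 1.0, n)

# Step 1: probit of sample inclusion, by maximum likelihood.
X = np.column_stack([np.ones(n), treat, frailty])

def neg_loglik(g):
    cdf = np.clip(norm.cdf(X @ g), 1e-10, 1 - 1e-10)
    return -(np.log(cdf[included]).sum() + np.log(1.0 - cdf[~included]).sum())

g = minimize(neg_loglik, np.zeros(3), method="BFGS").x

# Step 2: inverse Mills ratio for included observations, added to the
# outcome regression; its coefficient is the test for attrition bias.
xb = X[included] @ g
mills = norm.pdf(xb) / norm.cdf(xb)
Xo = np.column_stack([np.ones(included.sum()), treat[included],
                      frailty[included], mills])
beta, *_ = np.linalg.lstsq(Xo, outcome[included], rcond=None)
print(f"impact estimate with correction term: {beta[1]:.2f}")
```

In this simulation the unobservables driving inclusion are independent of the outcome, so the coefficient on the Mills term should be close to zero and the corrected and uncorrected impact estimates should be similar, mirroring the null findings reported below.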
This procedure was implemented for the 6, 12, and 18month measures of the following key outcomes:
 Nursing home outcomes (nursing home samples)
 whether admitted
 number of days in nursing homes
 nursing home expenditures
 Wellbeing outcomes (followup samples)
 number of unmet needs
 number of impairments on activities of daily living
 whether dissatisfied with life
 Formal and informal care outcomes (in-community samples)
 whether received care from visiting formal caregivers
 hours of formal inhome care received
 number of visits from formal caregivers
 whether received care from visiting informal caregivers
 hours of care received from visiting informal caregivers
 number of visits from informal caregivers
In general, this procedure yielded very little evidence of attrition bias. The estimated correlations between unobserved factors affecting attrition and those affecting a given outcome variable were typically small and rarely significantly different from zero. Impact estimates obtained from the regressions which included the control variable for the effects of attrition were very similar to the impact estimates obtained without this correction term.
Finally, to ensure that the results obtained from the statistical correction procedure were not distorted by overly restrictive assumptions, Brown et al. (1986) developed a somewhat more general model that would take into account two possible differences between treatments and controls and between models: differences in the relationship between observed (screen) characteristics and attrition, and differences in the covariance between unobserved factors affecting attrition and those affecting the outcome variable under examination. Use of this more general procedure showed (1) that the attrition models were not very different for treatments and controls or for basic and financial control models, and (2) that although there were some substantive differences between the four treatment/model groups in the correlations between unobserved factors, controlling for them separately yielded no convincing evidence that the unadjusted estimates were biased by attrition.
Although both the heuristic and statistical approaches led us ultimately to conclude that attrition bias was not a major problem, there were a number of isolated results that, if viewed alone, would have caused greater concern about attrition. To further ensure that no important evidence of attrition bias was being overlooked, the results from the heuristic Medicare data analysis were compared to those obtained from the statistical analyses for each outcome area to see if the alternative approaches both indicated that attrition bias might be a problem for any given set of outcomes. The specific patterns of attrition implied by the two approaches were also compared for consistency.
Estimates of impacts on hospital outcomes were shown conclusively to be unaffected by attrition, based on Medicare data alone. For nursing home outcomes, the Medicare comparison showed no evidence of bias in the estimates, and the only evidence to the contrary from the statistical procedure was two cases in which impact estimates changed in statistical significance. However, in both of these instances, the impact estimates changed only marginally after controlling for the effects of attrition, going from slightly below the critical value for statistical significance to slightly above it (and vice versa). Furthermore, the results that ostensibly controlled for the effects of attrition had the implausible implication that the bias was in one direction at 6 months and in the opposite direction at 12 months, and occurred only in the basic model. Finally, some sensitivity tests were performed which showed that estimates of channeling impacts changed only slightly under a variety of different assumptions about the use of nursing homes by those with missing data. Thus, it seemed clear that estimates of impacts on nursing home outcomes were not biased by attrition, and it was virtually certain that conclusions about the lack of channeling impacts on nursing home use would not change even if some bias did exist.
For wellbeing outcomes, the Medicare data could provide no direct evidence concerning attrition bias, but comparison of the full and followup sample estimates of impacts on Medicare-covered services suggested that bias was potentially a problem only for the basic model, and only at six months. However, the results from the statistical procedure to measure attrition bias implied that there was no bias in any of the wellbeing outcome measures examined in any time period for either model.
For formal care outcomes, the in-community sample estimates of impacts on use of Medicare-covered services were very similar to the estimates obtained on the full sample in all three time periods for the financial control model, and at 12 and 18 months in the basic model. However, at 6 months in the basic model, estimated impacts on skilled nursing visits and reimbursements were statistically significant for the analysis sample but not for the full sample. This suggested that the in-community sample estimates of impacts on use of formal care might be overstated in this time period for the basic model because of attrition. However, the statistical significance of the impact estimates did not differ between the two samples for several other outcomes even in this period, nor was the magnitude of the difference that great even for skilled nursing (13 percent of the control group mean for the full sample estimate compared to about 24 percent of the control group mean for the analysis sample estimate). The lack of evidence of bias at 12 months and in the other model led us to doubt further that attrition bias was a major problem for the estimates of impacts of formal care. This conclusion was further supported by the results from the statistical analyses, which indicated an absence of the conditions necessary for attrition bias and strong similarity between impact estimates obtained using the procedure to control for the possible effects of attrition and estimates obtained without such control.
For informal care outcomes the evidence was less clear cut. The above comparison of estimated impacts on Medicare-covered services for the full and in-community samples suggested that attrition from the in-community sample used in the informal care analysis was not systematic. However, because the Medicare claims lack data on informal care outcomes, this analysis provided only weak evidence that no bias occurred in estimates of impacts on informal care. The results from the initial statistical procedure showed no evidence of bias, but the other, less restrictive statistical approach of controlling for attrition effects led to results that implied serious bias in the estimates for both channeling models. Whereas the unadjusted results implied no effect of channeling on informal care in the basic model, and (at most) modest reductions in the financial control model, the adjusted estimates showed large, statistically significant reductions in informal care in the basic model and no reductions in the financial control model. Also, both the Medicare and the more general statistical approaches implied similar patterns of attrition, i.e., that the systematic attrition occurred mainly for the treatment group in the basic model. However, a number of factors identified by Brown et al. suggested that this result was a statistical anomaly rather than credible evidence of severe attrition bias. Hence, we concluded that informal care impact estimates were probably not biased by attrition either.
The two approaches used in this analysis of attrition each have their flaws. The heuristic approach of seeing how estimated impacts on some variables changed when the analysis was restricted to a subset of the full sample is appealing because it provides a direct measure of attrition bias, albeit for variables other than those in which we are most interested. Reliance on these results as proof that there is no attrition bias in the estimated impacts on those outcomes in which we are interested requires belief that any unobserved factors affecting both attrition and the outcomes of interest also affected the Medicare outcomes. Although this assumption may be plausible, it obviously cannot be verified.
The statistical approach is also appealing, but for different reasons: it pertains to precisely the outcome variables of interest, provides a direct test of whether there is bias in the estimates obtained on the analysis sample, and also offers a way to obtain unbiased estimates of impacts on any outcome. The more general model developed and used in Brown et al. adds to the attractiveness of this approach by making the results sensitive to potentially different observed and unobserved patterns of attrition for treatment and control groups. However, in either statistical model the estimates may be quite sensitive to the assumptions of the model (bivariate normal disturbance terms in the outcome and sample inclusion equations), may reflect other nonlinear relationships between the outcome and control variables that have nothing to do with attrition, and are sensitive to collinearity between the correction term and the control variables in the outcome equations.
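The kind of two-step selection correction described here can be illustrated with a minimal sketch on simulated data. This is not the specification actually used by Brown et al.; the variables, coefficients, and the use of the true inclusion index (where a probit would be estimated in practice) are invented for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 2000

# Simulated data: x affects the outcome; z affects inclusion in the sample.
# The outcome and inclusion disturbances (u, v) are correlated (rho = 0.5),
# which is what creates attrition bias.
x = rng.normal(size=n)
z = rng.normal(size=n)
u, v = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n).T

y = 1.0 + 2.0 * x + u                  # outcome equation (true slope = 2)
included = (0.5 + 1.0 * z + v) > 0     # sample-inclusion equation

# Step 1: model inclusion (a probit would be estimated here; for brevity we
# use the true index) and form the inverse Mills ratio correction term.
index = 0.5 + 1.0 * z
lam = norm.pdf(index) / norm.cdf(index)

# Step 2: regress the outcome on x plus the correction term, using only the
# included observations.  The coefficient on lam estimates the covariance of
# the two disturbances (0.5 here); if it is near zero, there is no evidence
# of attrition bias.
X = np.column_stack([np.ones(n), x, lam])[included]
coef, *_ = np.linalg.lstsq(X, y[included], rcond=None)
print(coef)
```

The sensitivity noted in the text arises in step 2: lam is a smooth function of the inclusion index, so it can be nearly collinear with the control variables when the same variables drive both inclusion and outcomes.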
Despite these flaws, the analyses of attrition from the channeling sample greatly exceed what is normally done, or is possible to do, to examine attrition bias, because the data available from the screen and Medicare claims on nonrespondents greatly exceed what is usually available on sample dropouts. By definition, it is never possible to know with certainty what results would have been obtained had no sample attrition occurred. The heuristic and statistical approaches were the best methods available to assess the effects of attrition on our impact estimates, and both provided convincing evidence that the inferences drawn from the analysis samples about the existence and magnitude of channeling impacts were no different from those that would have been drawn had the full sample been available for analysis.


D. The Validity of Pooling Observations

In selecting a regression model to estimate channeling impacts, a key issue was pooling: whether channeling impacts for each model could be accurately estimated by a single parameter in a single regression equation estimated on the full sample, or whether segments of the sample were so different from each other that a single equation or parameter would not accurately or adequately reflect the real relationships and would produce distorted impact estimates. Three pooling issues were examined:

Can valid estimates of channeling impacts at the model level be obtained by treating observations from any site implementing the model as if they were all from the same site, or must separate impact estimates be obtained for each site and then explicitly averaged to obtain model impacts?

Can a single regression equation be used to estimate channeling impacts at the model level, or are separate equations necessary for each site and/or treatment group in order to obtain valid estimates?

Can valid estimates of impacts at the site level be obtained from a single regression equation, or are separate equations necessary for each site?
The regression model specified in Chapter III is based on the assumption that the above types of pooling are appropriate. That is, a single equation was estimated using all observations, with impacts for each model represented by a single parameter. The advantage of pooling is that if the restrictions on regression estimates implied by pooling are true, much more precise estimates (i.e., estimates with smaller variances) can be obtained because only one estimate is being made for each model rather than one for each site. The possible disadvantage of pooling is that if the implied restrictions are not true, pooling observations could produce biased and misleading estimates of the model or site level impacts. The analysis described below was conducted to determine whether the smaller variances produced by pooling observations could be obtained without distorting estimates of channeling impacts.
1. Were Separate Impact Estimates for Each Site Necessary to Accurately Estimate Model Impacts?
The type of pooling of greatest concern for this evaluation was whether a single parameter would be sufficient to estimate the effects of a channeling model or whether impacts were so different across sites that separate impact estimates were required for each site. In the latter case, model impacts would be obtained by computing a weighted average of the estimated impacts for the five sites implementing the model.^{26}
The restriction implicit in using a single parameter is that impacts are the same in all sites implementing a given model. This restriction was tested by estimating an unrestricted version of the equation (with 10 site*treatment interaction terms in place of the two binary treatment status variables used in equation 1) and testing whether the coefficients on the 5 site*treatment terms involving a given channeling model were equal to each other.
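The F-test of this restriction can be sketched as follows. This is a hypothetical simulation for the five sites of a single model, not the evaluation's data; the true impact is set equal across sites, so the restriction holds by construction:

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(1)
n_per = 200
sites = np.repeat(np.arange(5), n_per)
treat = rng.integers(0, 2, size=sites.size)
# Outcome with the same treatment impact (0.5) in every site.
y = 1.0 + 0.5 * treat + rng.normal(size=sites.size)

def rss(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

site_dummies = (sites[:, None] == np.arange(5)).astype(float)

# Restricted model: one treatment parameter for the whole model.
X_r = np.column_stack([site_dummies, treat])
# Unrestricted model: a separate site*treatment interaction for each site.
X_u = np.column_stack([site_dummies, site_dummies * treat[:, None]])

q = 4                                  # 5 site impacts collapse to 1 parameter
df = y.size - X_u.shape[1]
F = ((rss(X_r, y) - rss(X_u, y)) / q) / (rss(X_u, y) / df)
p = f_dist.sf(F, q, df)
print(F, p)   # with equal true impacts, rejection should be rare
```

Because the restricted model is nested in the unrestricted one, the F statistic is nonnegative and compares the loss of fit from imposing equal impacts against residual noise.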
This test was conducted for a set of 14 key outcome variables at 6, 12, and 18 months, including hospital and nursing home use (whether admitted, number of days), receipt of case management,^{27} receipt of formal and informal care (whether received, hours received),^{28} sample member wellbeing (number of unmet needs, number of impairments in activities of daily living, degree of global life satisfaction), and sample members' living arrangement (in community, hospital, nursing home, or deceased). Of the 82 tests (41 for each model), the hypothesis that impacts were equal across sites was rejected in eight cases. The eight cases included whether case management was received, for both the 6- and 12-month measures in both models, and four scattered outcome measures at 18 months (for which the sample sizes were smaller by half). This is a relatively small proportion of the tests, and the fact that results were strongest for case management outcomes made it less troubling, since impacts on that outcome were large and statistically significant in all sites. Even more compelling was the finding that even for those eight outcomes, impacts at the model level computed from the equation yielding separate site impact estimates tended to differ little from model impacts computed from the equation without site-specific impacts. Thus, even if channeling impacts differed across sites, model-level impact estimates were not distorted by the implicit assumption to the contrary in the pooled specification (equation 1). The smaller standard errors led us to prefer the pooled specification.
2. Were Separate Equations for Each Site and/or Treatment Group Necessary to Estimate Channeling Model Impacts?
Estimating a single equation on all of the observations combined implicitly constrains the estimated relationship between client characteristics and outcomes to be the same in all sites. However, if this assumption were not true, the estimated impacts at the model level from the pooled data could be distorted. The test described in the previous section addressed only the issue of whether separate impacts for each site were required, and was based on the assumption that a single equation for all sites was appropriate. If the assumption were incorrect, the results from the above tests could be erroneous as well.
To test the implicit constraints implied by pooling observations from all of the sites, separate equations were estimated for each site and the sum of squared residuals from these regressions was compared to the sum of squared residuals from the single equation. The F-tests constructed from these two sums for each outcome variable showed that in 10 of the 41 instances, the constraints on the regression coefficients implied by pooling were rejected. However, since our concern was only with whether estimates of channeling impacts were distorted by estimating a single equation rather than separate equations, we used the site-specific equations to construct an estimate of channeling impacts at the model level^{29} and then compared this estimate to the impact obtained from a single equation, for each of the key outcome measures. For each of the 82 comparisons the difference between the two alternative estimates was slight. Thus, despite the greater-than-chance incidence of formal rejection of the constraints on regression coefficients implied by pooling, the primary estimates of interest for the evaluation (channeling model impacts) were unaffected by estimation of a single equation rather than site-specific equations.
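The comparison of pooled and site-by-site sums of squared residuals is a Chow-type test, and can be sketched as follows. The data here are simulated under equal coefficients in all sites; the number of sites, observations, and regressors are invented for illustration:

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(2)
n_sites, n_per, k = 5, 300, 3          # k regressors: intercept, treatment, covariate

def rss(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2)

site_rss, Xs, ys = 0.0, [], []
for s in range(n_sites):
    xcov = rng.normal(size=n_per)
    t = rng.integers(0, 2, size=n_per)
    # Same coefficients in every site, so pooling is valid by construction.
    y = 0.5 + 0.3 * t + 0.8 * xcov + rng.normal(size=n_per)
    X = np.column_stack([np.ones(n_per), t, xcov])
    Xs.append(X)
    ys.append(y)
    site_rss += rss(X, y)              # sum of RSS from the separate equations

X_all, y_all = np.vstack(Xs), np.concatenate(ys)
pooled_rss = rss(X_all, y_all)         # RSS from the single pooled equation

n = n_sites * n_per
q = (n_sites - 1) * k                  # restrictions imposed by pooling
df = n - n_sites * k
F = ((pooled_rss - site_rss) / q) / (site_rss / df)
p = f_dist.sf(F, q, df)
print(F, p)
```

The separate equations always fit at least as well as the pooled one, so the numerator is nonnegative; a large F indicates the site-specific coefficient vectors differ by more than sampling error.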
We also tested another set of restrictions that are implicit in the use of a single equation: that the relationship between outcomes and sample member characteristics were the same for treatments and controls. As always, the concern was with whether these implicit constraints, if not appropriate, would lead to different estimates of channeling impacts. Performing statistical tests of these restrictions indicated that for only 3 of the 41 outcomes examined were the implied restrictions rejected. Again, even for the 3 outcomes for which the constraints on the coefficients on explanatory variables were formally rejected, the impact estimates obtained from the separate equations were very similar to those obtained from the single equation.
Based on the above findings, we concluded that use of a single equation provided the best estimates of channeling impacts at the model level. The single equation yielded very similar impact estimates with considerably (up to 20 percent) smaller standard errors, thereby reducing the probability of erroneous inferences of the types discussed in Chapter III.
3. Can Valid Estimates of Site-Specific Impacts Be Obtained from a Single Equation?
Despite the widespread findings that impacts at the model level did not seem to be distorted by pooling, there was still some concern that the site-specific impact estimates to be computed (see Applebaum et al., 1986) might be distorted if they were obtained from a single equation (with site*treatment interaction terms) rather than from separate equations for each site. Comparison of the two alternative estimates showed that of the 530 impact estimates,^{30} 438 were not significantly different from zero whether the single- or multiple-equation variant was used. Of the remaining 92 estimates, 65 were statistically significant under both procedures, and in all but 2 of these cases the estimates were quite similar in magnitude. There were 19 cases in which the single-equation estimate was statistically significant but the separate-equation estimate was not. In over half of these cases, however, the estimates were quite close in magnitude, and the insignificant estimate had a t-value very close to the critical value. The reduction in standard errors achieved by pooling was the primary reason for these differences in significance. Finally, there were 8 instances in which the separate equations produced statistically significant impact estimates at the site level but the single equation did not. In most of these cases the two estimates differed substantially in size as well as significance.
We concluded that estimates of impacts at the site level obtained from a single regression equation would only rarely yield different conclusions about channeling impacts than would the estimates obtained from the unpooled model. Furthermore, even when the estimates differed, the pooled estimate might well be preferred because its standard errors would be smaller.



E. Differences Between Early and Late Cohorts of Sample Members

From the outset of the demonstration it was recognized that the impacts of channeling might vary with the length of time since the client entered the program, as clients' needs and health status change and as case managers and clients become more familiar with each other. However, comparing estimates of channeling impacts at 18 months to those obtained at 12 months could result in misleading inferences about such changes because, as pointed out in Chapter II, only half of the sample was followed up at 18 months, and time constraints led to defining this group as the half who entered the sample earliest. Erroneous inferences would occur if channeling's effectiveness changed with calendar time (because of specific changes in the environment in which channeling operates or in the program itself) rather than with the length of time the sample member was in the program. Alternatively, program effectiveness could change if the type of clients served by channeling changed over time. Since the 18-month cohort consists of those enrolling earliest, we must ensure that any differences between 12- and 18-month results are due to the length of time spent in channeling rather than to differences in the calendar period covered by the early and late cohorts or to differences between the cohorts themselves.
To distinguish changes in impacts due to length of time in the program from those due to cohort effects such as those just described, estimated impacts on a set of 14 key outcomes (those used in the attrition and pooling analyses) at 6 and 12 months for the early cohort were compared to the corresponding estimates for the late cohort. Equivalence of the impacts at these earlier points would suggest that differences between 18-month estimates obtained on only the early cohort and estimated impacts at 12 months based on the full sample could be interpreted as effects of the length of time in channeling. A finding of statistically significant differences between cohorts in impacts during the 1-6 and 7-12 month periods would indicate that 18-month results should be compared to 6- and 12-month results estimated on only the early cohort.^{31} While such cohort differences for the early periods would not necessarily imply that any differences in estimated impacts between 12 and 18 months would be due to cohort effects rather than to the length of time in channeling, they would suggest that possibility.
To investigate this issue, the standard regression model shown in equation 1 was modified in order to estimate separate impacts of channeling for each cohort on the key outcome variables listed in Section D above. The modification was to replace each of the binary treatment status variables in equation 1 with two new binary variables, the first equal to 1 only for treatment group members in that model in the early cohort and the second equal to 1 for treatments in the late cohort for that model. Two additional binary variables were also added to the regression equation, one for each channeling model, indicating whether the sample member was in the late cohort. The coefficients on the four new treatment variables provided estimates of channeling impacts for the two cohorts for each channeling model. The coefficients on the cohort indicator variables provided estimates of the differences in mean outcomes between cohorts for the control group in each model, controlling for possible differences between the cohorts on other explanatory variables.
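The cohort-specific specification just described can be sketched as follows. This is a stylized one-model version with simulated data (the coefficients and sample sizes are invented); the evaluation's equation 1 would carry its full set of screen/baseline controls as well:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
treat = rng.integers(0, 2, size=n)
late = rng.integers(0, 2, size=n)     # 1 = late cohort, 0 = early cohort

# Outcome with the same treatment impact (0.4) in both cohorts, plus a
# cohort shift (0.2) affecting treatments and controls alike.
y = 1.0 + 0.4 * treat + 0.2 * late + rng.normal(size=n)

X = np.column_stack([
    np.ones(n),
    treat * (1 - late),               # treatment impact, early cohort
    treat * late,                     # treatment impact, late cohort
    late,                             # cohort indicator (control-group shift)
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef[1], coef[2])               # both should be near the common impact of 0.4
```

An F-test of the hypothesis coef[1] = coef[2], as in the sketch under Section D.1, then tests whether impacts differ by cohort, while the coefficient on the cohort indicator captures differences in control-group outcomes between cohorts.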
For each key outcome measure, the revised regression equation was estimated and an F-test was performed (separately for basic and financial control models) to test for significant differences between the impact of channeling for the early cohort and the impact for the late cohort. In addition, multivariate tests were conducted on groups of related outcome measures to determine whether, jointly across the set of outcomes, impacts for the early cohort differed from those estimated for the late cohort.
The tests indicated that channeling impacts differed very little between cohorts at 6 and 12 months after randomization. Of the five instances of significantly different estimates (out of 72 tests), two were for receipt of case management at 6 months, for which the impact estimates were large, positive, and highly significant for both cohorts. Thus, even though the estimates were statistically different, the inferences to be drawn from the case management results were the same for both cohorts. The fact that it was changes in the control group which were responsible for the observed differences between cohorts in impacts on case management suggests that channeling may have changed relatively little, but the availability of nonchanneling case management may have changed over time.
The remaining three instances of significant differences by cohort were isolated, and two of them occurred at 6 months. This is important, since it is the comparison of impacts at 12 and 18 months that we were most concerned about being distorted by cohort effects. "Whether formal care was received" was the only 12-month outcome for which a statistically significant difference across cohorts was found, and only for the basic model (although the cohort differential in the financial control model was nearly as large and had a test statistic only slightly smaller than the critical value for significance at the .05 level). The difference in impacts was due entirely to the significant difference (a decline) between the early and late cohorts in the proportion of the control group receiving formal care. Whether this drop was due to differential attrition of controls for the two cohorts, to changes in the types of clients attracted, or to changes in the local availability of formal services is not clear. However, the first two explanations do not seem likely given that the proportion of controls receiving formal care at baseline was very similar for the two cohorts in the basic model: 57 and 55 percent for the early and late cohorts, respectively. The fact that estimated impacts on hours of formal care did not differ significantly across cohorts further increased our confidence that cohort differences did not distort the comparison of 12- and 18-month impacts in general.
We concluded that estimates at 18 months on the early cohort could be compared to those at 12 months for the full sample with little concern that the comparison would be distorted by differences between the cohorts. The exception to this conclusion was that if such comparisons for formal care outcomes suggested sizeable changes in impacts between 12 and 18 months, it would be important to interpret these changes in light of the cohort difference identified here. In the final analysis of channeling impacts on use of formal community services, Corson et al. (1986) did in fact find a marked decline in impacts between 12 and 18 months, which was attributed to this cohort effect.


F. Potential Problems with Regression Analysis

Under certain statistical assumptions, the regression procedure described in Chapter III will provide unbiased estimates of channeling impacts. The assumption on which unbiasedness depends is that the disturbance term representing the unobserved factors affecting outcomes be uncorrelated with the screen/baseline control variables and treatment status. This condition is not definitely verifiable, but the fact that sample members were randomly assigned to treatment and control groups makes it unlikely that the disturbance term is correlated with treatment status; hence, estimates of channeling impacts obtained by regression are expected to be unbiased.
Unbiasedness is not the only desirable property of the estimates, however. When outcome variables are not normally distributed, regression estimates lose some of their other desirable properties and may exhibit other characteristics that are undesirable. Two types of channeling outcome variables that had nonnormal distributions were those that were binary or truncated at zero, and those that were skewed (i.e., that had extremely large values for a small number of observations). Analyses were conducted to determine whether the regression estimates of impacts on these two types of outcomes were distorted or less reliable in some way than alternative estimates.
1. The Validity of Regression Estimates of Channeling Impacts for Binary and Truncated Dependent Variables
Estimates that are unbiased are known to be accurate on average; however, we also want impact estimates that in any particular instance are unlikely to deviate greatly from true impacts. The smaller the variance of the estimates, the narrower the confidence intervals around the estimates and the lower the probability of failing to detect important channeling impacts. However, the requirement for regression estimates to have minimum variance (homoscedasticity of the disturbance terms) will not be met for many of the dependent variables examined in the channeling evaluation because they are binary (e.g., whether admitted to a nursing home) or bounded at zero (e.g., number of days spent in the hospital). Furthermore, if the disturbance term is not homoscedastic, the test statistics calculated by the regression program will not be strictly correct. Finally, the predicted value for some observations may be less than zero when regression is used for binary or bounded dependent variables, which is obviously inappropriate. (Predicted values may also be greater than one, which is equally inappropriate for binary variables.)
For cases such as these, econometric procedures have been developed to provide estimates with desirable properties (under certain assumptions). Probit and logit models are the estimation procedures most widely used for binary dependent variables and Tobit analysis is used by economists for bounded variables. (See Maddala, 1983, for a discussion of these procedures, their statistical properties, and the assumptions on which they are based.) In practice, however, these more complex and expensive estimation procedures typically provide estimates of the effects of explanatory variables on dependent variables which closely resemble in size and significance the estimated effects obtained from least squares regression. This result has been demonstrated in several previous applied studies (Corson et al., 1985; Grossman et al., 1986; Hollister et al., 1985; and others) as well as in the recent econometric literature (Greene, 1981, 1983). Furthermore, all of the statistical properties of the probit and Tobit estimators, including unbiasedness, depend on the assumption that the disturbance term is normally distributed, a condition not required by regression.
The much greater ease with which statistical tests can be performed with least squares regression and the much lower computational cost compared to probit, logit, and Tobit (which require iterative maximum likelihood estimation) led us to strongly prefer least squares as an estimation strategy. However, to ensure that computational ease and cost savings were not achieved at the cost of seriously distorted impact estimates or test statistics, we compared estimates of channeling impacts obtained from regression to estimates obtained from the more complex procedures, using key outcome variables that were binary or truncated at zero.^{32}
Comparison of the probit model estimates to least squares estimates for binary dependent variables.^{33} The probit model is based on the assumptions that individuals will take a given action (e.g., enter a nursing home) when a certain unobserved threshold is reached, that this threshold is determined by observed and unobserved factors, and that the threshold differs across individuals. Consider, for example, the decision to enter a nursing home. The probit model for this outcome is written as:
Y* = a_{o} + a_{B}T_{B} + a_{F}T_{F} + a_{s}S + a_{x}X - e

Y = 1 if Y* > 0
Y = 0 otherwise

where Y* is the unobserved indicator of the propensity to enter a nursing home, which depends on the set of variables specified as explanatory variables in the standard regression equation given in Chapter III. The disturbance term e is the unobserved individual-specific threshold, for example, the individual's unwillingness to enter nursing homes.^{34} Sample members whose unmet need for services is so great that it outweighs their distaste for nursing homes are assumed to enter such institutions (given the availability of beds). The observed binary dependent variable (Y) is equal to 1 for those who enter nursing homes and 0 for those who do not. The parameters of this probit model (the a_{i}'s) are estimated by maximum likelihood, i.e., by choosing the values that maximize the product of predicted probabilities of entering a nursing home (for actual entrants) or not entering (for nonentrants). Predicted probabilities from this model will always be between zero and one, and if the assumed model is correct, the resulting estimates have the minimum variance possible. The estimated impacts of channeling are obtained by computing the predicted probability of entering for a treatment group member, with all of the other characteristics X set at the sample mean, and subtracting the predicted probability for controls computed at the same values of X.
TABLE IV.2. Impact Estimates from Least Squares Regression and from Probit for Selected Binary Outcome Measures
(In percentage points; t-statistics in parentheses)

                                               Basic Model                      Financial Control              Sample
                                          Regression      Probit^{a}       Regression      Probit^{a}            Size
Whether Received Any Formal Care
  6 months                                6.96** (3.49)   7.35** (3.49)    16.31** (8.09)  17.23** (8.12)       4,974
Whether Had Any Visiting Informal Caregiver
  6 months                                2.33 (1.22)     2.34 (1.18)      2.57 (1.33)     2.77 (1.33)          4,899
Whether Received Any Informal Care
  6 months                                2.97 (1.50)     3.12 (1.44)      2.64 (1.32)     2.92 (1.38)          4,899
Whether Received Comprehensive Case Management
  months 1-6                              51.17** (26.33) 52.67** (26.44)  56.34** (28.93) 58.35** (29.36)      3,955
Whether Admitted to Hospital
  months 1-6                              2.80 (1.44)     2.93 (1.47)      2.04 (1.04)     2.12 (1.07)          5,554
  months 7-12                             0.36 (0.20)     0.43 (0.23)      0.37 (0.20)     0.48 (0.26)          5,554
Whether Admitted to Nursing Home
  months 1-6                              0.52 (0.37)     0.20 (0.15)      0.37 (0.27)     0.16 (0.12)          4,593
  months 7-12                             2.23 (1.88)     2.22 (1.93)      0.29 (0.25)     0.40 (0.36)          4,752

NOTE: Regression estimates and sample sizes do not in all cases correspond exactly with those presented in final channeling reports, because some changes may have taken place between the time that this analysis was conducted and the final analyses were completed.

a. Estimates of channeling impacts were obtained from the probit coefficients by computing the predicted probability of the dependent variable for treatments and for controls (with all of the explanatory variables set at their overall sample means) and subtracting. Thus, impact = F(Xb + a) - F(Xb), where F is the cumulative normal distribution function, X is the mean of the explanatory variables for treatments and controls combined, b is the vector of estimated probit coefficients on the explanatory variables, and "a" is the estimated probit coefficient on the treatment status indicator. The standard error of this difference was then calculated using the usual formula for approximating the variance of a nonlinear combination of estimators (Kmenta, 1971, p. 444).
The t-statistic is simply the ratio of the estimated impact to the estimated standard error of the impact.
** Significantly different from zero at the .01 level (2tailed test).
The least squares and probit estimates of channeling impacts on a set of key binary outcome variables are compared in Table IV.2. The impact estimates and tstatistics were very similar for all six of the variables examined, for both models. For no outcome was there a change in the statistical significance when probit was used. Even estimates that were statistically insignificant exhibited only small changes in magnitude.
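The footnote formula impact = F(Xb + a) - F(Xb) is straightforward to evaluate. A minimal sketch with invented probit coefficients and sample means (not the estimates underlying Table IV.2):

```python
import numpy as np
from scipy.stats import norm

# Invented probit estimates: intercept, age, impairment score.
b = np.array([-3.0, 0.03, 0.3])
a = 0.15                              # treatment-status coefficient
x_mean = np.array([1.0, 80.0, 2.0])   # sample means (1.0 for the intercept)

index = x_mean @ b                    # Xb evaluated at the sample means
impact = norm.cdf(index + a) - norm.cdf(index)
print(round(impact, 4))               # -> 0.0596, about a 6 percentage point impact
```

Because F is nonlinear, this difference in predicted probabilities, rather than the raw probit coefficient a, is what is comparable to a least squares impact estimate.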
Comparison of Tobit estimates to least squares regression estimates. When the dependent variable is truncated at zero but not binary, such as nursing home expenditures or days, regression estimates lose some of their desirable properties. The Tobit procedure, which is closely related to the probit procedure, was designed to overcome these weaknesses. A Tobit model of the number of days spent in nursing homes, for example, would be written as:
Y* = a_{o} + a_{B}T_{B} + a_{F}T_{F} + a_{s}S + a_{x}X - e

Y = Y* if Y* > 0
Y = 0 otherwise

where observed nursing home days (Y) is equal to the expression given for Y* for individuals whose need for nursing home care outweighs their unobserved unwillingness to enter nursing homes (e), and equal to zero for others. Again, maximum likelihood methods are used to estimate the coefficients and the standard error of e. The effects of channeling are estimated by computing the expected value of the outcome Y for treatments and for controls, both at the point of means of the other explanatory variables, and taking the difference. (See Moffitt and McDonald, 1980, for the correct expression for obtaining predicted outcomes from Tobit models.)
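The Moffitt-McDonald expected value, E[Y] = Xb*F(Xb/s) + s*f(Xb/s), and the implied impact can be sketched with invented tobit estimates (not those underlying Table IV.3):

```python
from scipy.stats import norm

def tobit_mean(index, s):
    """Expected value of a Tobit outcome censored at zero (Moffitt-McDonald)."""
    return index * norm.cdf(index / s) + s * norm.pdf(index / s)

# Invented tobit estimates for an outcome such as nursing home days.
xb = 5.0      # Xb: index evaluated at the sample means, control group
a = -2.0      # treatment-status coefficient
s = 20.0      # estimated standard error of the disturbance term

impact = tobit_mean(xb + a, s) - tobit_mean(xb, s)
print(impact)  # smaller in magnitude than the raw coefficient a
```

Note that the expected value blends the probability of any use, F(Xb/s), with the extent of use, which is one reason (discussed below) that Tobit impact estimates can be dominated by the probability of use when outliers inflate s.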
The regression and Tobit estimates of channeling impacts on a set of key outcome variables that are bounded at zero are contained in Table IV.3. For most of the 24 comparisons, the differences between the two alternative estimates were quite small (though somewhat greater than the differences observed between probit and regression). However, in 3 instances, the differences were fairly large and resulted in a change in the statistical significance of the impact estimates: hours of formal care at 6 and 12 months in the basic model and nursing home expenditures at 6 months in the basic model. The impact of channeling on formal care in the basic model went from essentially zero using the regression model to nearly 1 hour per week at 6 months (about 15 percent of the control group mean) using the Tobit model, with the latter being statistically significant at the .05 level. The same change in statistical significance occurred at 12 months for this outcome in the basic model, although the two estimates were not that different in magnitude. The effect on nursing home expenditures went in the opposite direction. The regression estimate was a reduction of 165 dollars (about 25 percent of the control group mean), which dropped to 47 dollars when Tobit was used.
TABLE IV.3. Impact Estimates from Least Squares Regression and from Tobit for Selected Truncated Outcome Measures
(t-statistics in parentheses)

                                Basic Model                     Financial Control              Sample
                            Regression     Tobit^{a}        Regression     Tobit^{a}            Size
Hours of Formal Care
  6 Months:  impact         0.14 (0.22)    0.92* (2.00)     5.35** (8.15)  5.09** (9.99)       4,974
        control mean^{b}    6.4            6.2              4.8            6.3
  12 Months: impact         1.14 (1.78)    1.46** (3.38)    3.58** (5.56)  3.39** (6.62)       5,040
        control mean        5.2            4.8              4.5            6.0
Hours of Informal Care
  6 Months:  impact         0.98 (1.29)    0.74 (1.42)      0.31 (0.41)    0.59 (1.02)         4,899
        control mean        6.02           6.27             6.31           7.08
  12 Months: impact         0.03 (0.04)    0.08 (0.21)      0.07 (0.12)    0.29 (0.62)         4,998
        control mean        3.69           3.96             4.56           5.15
Hospital Days
  6 Months:  impact         0.35 (0.41)    0.59 (0.83)      0.71 (0.83)    0.00 (0.01)         5,554
        control mean        11.5           12.8             16.2           14.3
  12 Months: impact         0.18 (0.25)    0.20 (0.33)      0.56 (0.75)    0.20 (0.33)         5,554
        control mean        7.0            8.1              9.0            8.6
Nursing Home Days
  6 Months:  impact         2.36 (1.93)    0.59 (0.67)      1.14 (0.94)    0.27 (0.33)         4,593
        control mean        12.2           6.4              9.6            5.6
  12 Months: impact         1.19 (0.63)    2.56 (1.59)      2.19 (1.15)    0.02 (0.02)         4,752
        control mean        16.3           12.8             16.7           10.1
Hospital Expenditures
  6 Months:  impact         119 (0.45)     206 (0.94)       68 (0.25)      89 (0.36)           5,554
        control mean        3,412          3,869            4,899          4,643
  12 Months: impact         59 (0.29)      11 (0.06)        161 (0.79)     63 (0.34)           5,554
        control mean        2,015          2,307            2,706          2,641
Nursing Home Expenditures
  6 Months:  impact         165* (2.15)    47 (0.92)        8 (0.11)       6 (0.12)            4,593
        control mean        666            369              560            332
  12 Months: impact         58 (0.56)      120 (1.42)       103 (0.99)     1 (0.01)            4,752
        control mean        819            657              894            546

NOTE: Regression estimates and sample sizes do not in all cases correspond exactly with those presented in final channeling reports, because some changes may have taken place between the time that this analysis was conducted and the final analyses were completed.
Estimates of channeling impacts were obtained from the tobit coefficients by computing the predicted value of the outcome variable for treatments and controls (with all of the explanatory variables set at their overall sample means) and subtracting. Using the expression given by Moffitt and McDonald (1980) for the expected value of the dependent variable in a tobit model, the estimated impact was:
Impact = [(X̄b + a) * F((X̄b + a)/s) + s * f((X̄b + a)/s)] - [X̄b * F(X̄b/s) + s * f(X̄b/s)],
where X̄ is the mean of the explanatory variables for the treatment and control groups combined; b and a are the estimated tobit coefficients on the explanatory variables and the treatment status indicator, respectively; s is the estimated standard error of the disturbance term in the tobit model; f(.) is the standard normal density function; and F(.) is the cumulative distribution function of the standard normal (the predicted probability that the dependent variable is greater than zero). The standard error of the estimated impact was calculated using the usual formula for approximating the variance of a nonlinear combination of estimators (Kmenta, 1971, p. 444). The t-statistic (in parentheses) is simply the ratio of the estimated impact to the estimated standard error of the impact.
* Significantly different from zero at the .05 level.
** Significantly different from zero at the .01 level.

Despite these differences, it was not clear that the Tobit procedure produced better estimates than regression, even in these two instances. The predicted nursing home expenditures for controls were far below the actual mean, suggesting that Tobit may not have provided reliable estimates. Furthermore, for both variables for which least squares and Tobit produced substantially different estimates, there was evidence that the Tobit estimates reflected the probability of any use of these services more strongly than the extent of use. Both of these problems were due to outliers (cases with extremely large values of the outcome variable), which affect Tobit estimates somewhat differently than least squares estimates. Although less sensitivity to outliers would be a desirable feature, the distorting effects of outliers on Tobit estimates may be even greater than their effects on least squares estimates, especially if there are treatment/control differences in the number of outliers. These potential problems, combined with the greater expense and difficulty of hypothesis testing with the Tobit model, again led us to prefer least squares regression as the estimation procedure and to analyze the effects of outliers on these estimates directly.
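For concreteness, the impact expression in the note above can be evaluated directly from the estimated tobit coefficients. A minimal sketch in Python, using the F(.), f(.), b, a, and s defined in the note; the numeric inputs below are illustrative values, not channeling estimates:

```python
import math

def std_normal_pdf(z):
    # f(.): standard normal density function
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    # F(.): standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_y(xb, s):
    # Expected value of the dependent variable in a tobit model
    # censored at zero: E[Y] = xb * F(xb/s) + s * f(xb/s)
    return xb * std_normal_cdf(xb / s) + s * std_normal_pdf(xb / s)

def tobit_impact(xb, a, s):
    # Predicted treatment/control difference in E[Y], with the
    # explanatory variables held at their overall means (xb = X-bar * b)
    return expected_y(xb + a, s) - expected_y(xb, s)

# Illustrative values only -- not estimates from the channeling data
print(tobit_impact(xb=2.0, a=1.5, s=4.0))
```

Note that for values of X̄b far above zero the censoring has negligible effect and the expression collapses toward the usual linear prediction, which is why Tobit and least squares differed mainly for heavily censored outcomes.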
2. The Effects of Outliers on Regression Estimates of Channeling Impacts
The effects of outliers (i.e., extremely large values of the outcome variable that are not simply data errors) on estimates of population means and regression coefficients are well-known, but there is much less guidance about what should be done when confronted with such problems. A common "solution," discarding the outliers, may distort estimates of program impacts more than leaving them in, since one of the effects of the program may be to reduce extreme use of or expenditures on services. This effect would be missed entirely if outliers were discarded. On the other hand, differences between the two groups in the very small proportion of outliers could arise strictly by chance, yet affect the estimated treatment/control difference so greatly that it no longer provides a reliable estimate of channeling impacts.
Duan et al. (1983) cite examples of how even estimates which are unbiased can yield very misleading inferences about program impacts in cases where the outcome variable is zero for a substantial fraction of the sample but has extremely large values for a small fraction of the remaining cases. They then propose an alternative estimator for such situations. This procedure seemed potentially appropriate for the channeling evaluation, since several of the key outcome variables exhibit these characteristics, especially hospital and nursing home days and expenses.
The procedure advocated by Duan et al. is to break such service use variables (measured either in physical units or expenditures) into two separate variables: whether the service is used at all and, for those who use it, the amount of such services. The expected value of use is the product of the probability of use and the expected amount of use given that some occurred. Thus, a probit model is estimated first for whether any use occurred, as a function of treatment status and other explanatory variables. Then, using only observations that had some service use, a regression model is estimated to predict the amount of use (again dependent on treatment status and control variables), with the amount being expressed in logarithmic form to reduce the influence of outliers on the estimates. These two equations are then used to obtain predicted probabilities of use and amounts of use by service users for treatments and for controls with the same characteristics. These estimates in turn are used to compute overall expected use for the treatment and control groups and the difference between them.
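The decomposition at the heart of this method (overall expected use equals the probability of any use times expected use among users) can be verified on simulated data. A short sketch with purely synthetic numbers (the use rate and lognormal amounts are invented for illustration, not channeling data):

```python
import random

random.seed(0)

# Synthetic service-use outcome: many zeros plus a skewed positive tail
n = 5000
outcomes = []
for _ in range(n):
    if random.random() < 0.35:
        outcomes.append(random.lognormvariate(2.0, 1.0))  # amount among users
    else:
        outcomes.append(0.0)  # no service use

users = [y for y in outcomes if y > 0]
p_use = len(users) / n                 # part 1: probability of any use
mean_users = sum(users) / len(users)   # part 2: mean use among users
overall_mean = sum(outcomes) / n

# The overall mean equals the product of the two parts
print(p_use * mean_users, overall_mean)
```

In the evaluation the two parts were modeled separately (a probit for any use, a log regression for the amount among users) rather than computed as raw means, but the recombination logic is the same.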
This procedure was used on a set of key hospital and nursing home outcome variables with skewed distributions. Table IV.4 contains a comparison of the 2-part, least squares, and Tobit estimates of channeling impacts. The 2-part method yielded estimates that differed somewhat from the regression estimates, but not by enough to change the inference about whether channeling affected hospital and nursing home outcomes. The 2-part estimates were also generally closer to the least squares estimates than to the Tobit estimates, especially for the outcomes exhibiting the largest discrepancy between least squares and Tobit.
These results suggested that the more cumbersome two-part method was not necessary, at least for the hospital and nursing home outcomes where outliers were most likely to occur. However, the results from the Tobit analysis suggested that estimates of channeling impacts on hours of formal care received at 6 months were also affected by outliers. To investigate this, the 2-part method was used for this outcome variable as well. In the financial control model, the estimated impacts from least squares and the 2-part method were both large and statistically significant. In the basic model, however, the estimated impact from regression was small (0.14 hours) and not statistically significant, while the 2-part estimate was much larger (2.5 hours), and the impacts on both the probability of receiving care and the amount of care received by service recipients were statistically significant.
The nonsignificant effect on hours was unexpected, because other estimates indicated that the basic model led to an increased proportion of sample members receiving any services. Thus, to have no effect on hours, channeling would have had to decrease the average amount of services received by those who would have received some services even in channeling's absence. Further examination of the data showed that the small regression estimate of the treatment/control difference was heavily influenced by the receipt of continuous (24 hours per day) formal care by 7 control group members (representing 20 percent of total use by the 1,000 controls in the sample) but only 2 treatment group members. Use of the 2-part method dampened the effect of these outliers on the estimated treatment/control difference and completely reversed the inference about channeling's effects on the average amount of care received by recipients. The estimate in column 7 of Table IV.4 indicates that treatment group recipients received significantly more hours of care (2.8) than recipients in the control group.
TABLE IV.4. Comparison of Least Squares, Tobit, and 2-Part Estimates of Channeling Impacts for Skewed Outcome Variables

                             Alternative Estimates of Impacts            Components of 2-Part Method Estimate
                                      Least     2-Part    Control    Probability of Use     Quantity for Users    Sample
Outcome                     Tobit     Squares   Method    Group Mean Impact   Control Mean  Impact   Control Mean  Size

6 Month Outcomes
Hospital Days
  Basic                     0.59      0.35      0.74      11.5       0.024    0.539         0.4      22.19        5,554
  Financial Control         0.00      0.71      0.77      16.2       0.018    0.546         2.3      29.03
Hospital Expenditures
  Basic                     206       119       227       $3,412     0.024    0.539         131      6,632        5,554
  Financial Control         89        68        178       $4,889     0.018    0.546         596      8,813
Nursing Home Days
  Basic                     0.59      2.36      2.42      12.2       0.004    0.113         19.2*    81.30        4,593
  Financial Control         0.27      1.14      0.08      9.6        0.001    0.107         1.4      68.37
Nursing Home Expenditures
  Basic                     47        165*      131       $666       0.004    0.113         1035     4,521        4,593
  Financial Control         6         8         30        $560       0.001    0.107         320      4,158
Hours of Formal Care
  Basic                     0.92*     0.14      2.50*     6.50       0.074**  0.400         2.82*    16.24        4,974
  Financial Control         5.09**    5.35**    8.41**    5.02       0.172**  0.474         10.20**  10.60

12 Month Outcomes
Hospital Days
  Basic                     0.20      0.18      0.40      7.0        0.005    0.339         1.5      21.06        5,554
  Financial Control         0.20      0.56      0.44      9.0        0.0003   0.350         1.2      25.17
Hospital Expenditures
  Basic                     11        59        139       $2,015     0.005    0.339         506      6,079        5,554
  Financial Control         63        161       132       $2,706     0.0003   0.350         370      7,597
Nursing Home Days
  Basic                     2.56      1.19      0.78      16.3       0.025    0.129         19.3     111.41       4,752
  Financial Control         0.02      2.19      2.43      16.7       0.004    0.103         27.5     128.66
Nursing Home Expenditures
  Basic                     120       58        4         $819       0.025    0.129         1,345    5,757        4,752
  Financial Control         1         103       124       $894       0.004    0.103         1,420    6,910
The impact estimate obtained from the two-part method was calculated as follows:

Impact = (proportion of control group with Y > 0 + estimated channeling impact on proportion)
         * (average value of Y for control group members with Y > 0 + estimated impact on Y for those with Y > 0)
         - (proportion of controls with Y > 0) * (average Y for controls with Y > 0),

where Y is the value of the outcome variable examined. The impact on the proportion of sample members with Y > 0 was estimated from a probit model. The impact on outcomes for those with Y > 0 was estimated by first regressing the logarithm of the outcome variable on binary treatment indicators and the standard control variables, using only those cases with Y > 0. The coefficients (b) on the treatment status variables from this log regression were then used to calculate impacts:
Impact on those with Y > 0 = (e^{b} - 1) * (control group mean for those with Y > 0).
These four components used to construct the overall impact are presented in columns 5 through 8 of this table.
* Significantly different from zero at the .05 level (2-tailed test).
** Significantly different from zero at the .01 level (2-tailed test).

Given the similarity of the 2-part estimates to the ordinary least squares regression estimates for nursing home and hospital days and expenditures, the final reports on these outcomes relied upon the ordinary regression results. This was done because the standard errors of impacts from the 2-part method are more cumbersome to calculate, and multivariate tests would be especially difficult to conduct. Even for hours of formal care, we chose in the final reports to rely on least squares estimates (computed both with and without the outliers), despite the fact that the 2-part method did yield estimates that were less sensitive to outliers than the ordinary least squares estimates. The reason for this decision was that if channeling did in fact reduce the service use of a small number of cases who would otherwise have used large amounts of services, the savings from such effects could be very substantial. The two-part method may understate the importance of such cases.
Thus, the 2-part method may not give the most appropriate estimates in either situation. If important channeling effects occur for outliers, the 2-part method may mask them; on the other hand, if treatment/control differences in outliers were due strictly to chance, the optimal approach is to drop the outliers rather than merely reduce their influence. Throughout the evaluation, therefore, least squares regression was used to estimate channeling impacts. As shown in Table IV.4, this yields the same inferences about impacts on hospital and nursing home outcomes as the 2-part method. For formal care at 6 months, impacts were estimated in the final report with outliers included and then with them excluded, and evidence was presented indicating which estimates provided the most accurate indication of channeling impacts. (See Corson et al., 1986, for further discussion of those results.) No other outcome measures appeared to have skewed distributions; hence, no other analyses of the effects of outliers were conducted.
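As an arithmetic check, the combination formula in the note to Table IV.4 can be reproduced in a few lines from the table's own published components for hours of formal care (basic model, 6 months):

```python
# Components taken from Table IV.4, hours of formal care, basic model, 6 months
p_control = 0.400           # proportion of controls with any formal care
impact_p = 0.074            # estimated impact on the proportion using care
mean_users_control = 16.24  # mean hours among control group users
impact_users = 2.82         # estimated impact on hours among users

# Overall impact = (p_c + dp) * (m_c + dm) - p_c * m_c, per the table note
impact = (p_control + impact_p) * (mean_users_control + impact_users) \
         - p_control * mean_users_control
print(round(impact, 2))  # close to the 2.50 reported in the table
```

The small gap between this figure and the published 2.50 reflects rounding in the published components.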


G. The Effects on Impact Estimates of Using Proxy Respondents

Because of the frailty of the sample members, many required the help of others (a family member, friend, nurse, or other caregiver) to complete the interview. However, proxies' responses to questions may differ considerably from those that the sample members would have given, especially for questions about attitudes or feelings. This raised concerns from the beginning of the evaluation about whether the use of proxies at follow-up would distort our estimates of channeling impacts.
In order for proxy use at follow-up to bias impact estimates, proxies for the treatment group, the control group, or both must respond differently than sample members would. There are three ways in which proxy use at follow-up could affect impact estimates:

If proxies over- or underreported (relative to sample members) to the same extent for the treatment and control groups, but rates of proxy use differed for treatments and controls.

If proxies for the treatment group over- or underreported more or less than did proxies for control group members (whether or not rates of proxy use differed).

If proxies over- or underreported to the same extent for both groups and rates of proxy use were the same. (In this case, the bias will be proportional: if the dependent variable mean is, say, overstated by a certain proportion for both treatments and controls, then the treatment/control difference is overstated by the same proportion.)
Of these, the first was considered the most likely to occur, and the second the least likely. The third situation would clearly be less serious than the other two, since proportional misreporting for both treatments and controls implies that impacts expressed as a percentage of the control group mean will be unaffected. Therefore, we first compared rates of proxy use for the treatment and control groups, and then compared impact estimates for self-respondents and proxy respondents.
Rates of proxy use were remarkably similar for the two groups at all three follow-up interviews, both in answering specific questions and in overall response to the interview. Overall, about 40 to 45 percent of the interviews were completed without any assistance from proxies, while another 40 to 45 percent were completed entirely by proxies. For 45 to 50 percent of sample members, a proxy answered the specific interview questions about the sample member's satisfaction and contentment with life and with service arrangements.
The similarity of rates for the two groups made it less likely that proxy use distorted estimates of channeling impacts. However, distortion was still possible unless proxies responded no differently from sample members on average. To examine this question, the mean responses of proxies and self-respondents to several key questions at follow-up were compared. These comparisons showed that sample members with proxy respondents were recorded as being more impaired (on ADL and IADL tasks), less satisfied with life, and lonelier than sample members who responded themselves. However, examination of records data showed that sample members requiring proxies also had many more hospital and nursing home days, which suggests that the reported differences on interview items between those with and without proxies may be real differences rather than the result of differential reporting by proxies. This conclusion could not be drawn, however, without directly investigating the effect of using proxy respondents on impact estimates.
To provide an indication of whether impact estimates were affected by proxy use, we estimated impacts on key outcomes separately for sample members with proxy respondents and those who responded themselves. We did this by modifying the standard regression model, replacing the binary treatment variables (T) with interaction terms (T x respondent type), and then testing whether the impacts (the coefficients on the interaction terms) were equal.
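A minimal sketch of this interaction approach on simulated data (the outcome, impact sizes, and proxy rate below are invented for illustration, not channeling values); with no other covariates, the interaction coefficients reduce to within-group treatment/control mean differences:

```python
import random

random.seed(1)

# Simulated data: true impact is 2.0 for proxy cases, 0.5 for self-respondents
data = []
for _ in range(4000):
    treat = random.random() < 0.5   # randomized treatment status (T)
    proxy = random.random() < 0.4   # respondent type (proxy vs. self)
    true_impact = 2.0 if proxy else 0.5
    y = 10.0 + (true_impact if treat else 0.0) + random.gauss(0.0, 1.0)
    data.append((treat, proxy, y))

def mean(values):
    return sum(values) / len(values)

# Coefficient on T x respondent-type in a regression with only these
# indicators equals the treatment/control difference within that group
def impact_for(group_is_proxy):
    t = [y for treat, proxy, y in data if treat and proxy == group_is_proxy]
    c = [y for treat, proxy, y in data if not treat and proxy == group_is_proxy]
    return mean(t) - mean(c)

print(impact_for(True), impact_for(False))  # near 2.0 and 0.5
```

In the actual analysis the model also included the standard control variables, and equality of the two coefficients was tested formally rather than by inspection.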
We found relatively few significant differences in impacts (16 out of 90) between these two groups, but more than would be expected by chance. For impairment/health status outcomes (ADL, IADL, hospital days, nursing home days) we found a few significant differences but no systematic pattern. Among the formal and informal care measures, we found statistically significant differences in impacts across types of respondents only for the outcome variable indicating whether any informal care was received. The treatment group had a significantly lower proportion receiving informal care (from visiting caregivers or from anyone) than the control group among self-respondents, but not among proxy respondents. However, it was unclear whether this difference was due to differences in physical or cognitive impairment between the types of clients who required proxy respondents and those who did not, or to responses by proxies that did not accurately reflect what the sample members would have said themselves.
Six variables measuring sample members' attitudes were also examined, including their loneliness, overall satisfaction with life, confidence about receipt of care, contentment, self-rating of health, and degree of concern about receiving needed care. Again we found relatively little difference in impacts across respondent types, except for the global life satisfaction variable. Among sample members with proxy respondents, the proportion reporting low satisfaction at 6 and 12 months was significantly smaller for the treatment group than for controls in both models, but no such pattern occurred for self-respondents. Again, the relevant question was whether these results were due to differential reporting by proxies, or whether they perhaps reflected the fact that proxy users were the most impaired (and presumably least satisfied initially) and that channeling may have had its biggest impact on the morale of those who were originally the most impaired and least satisfied (perhaps because they were not receiving needed services).
To distinguish between these two alternative explanations for the differences in impacts between self- and proxy-respondent cases, the regression model used to estimate impacts for the two groups was modified by including additional interaction terms involving treatment status and baseline measures of other factors that could affect channeling impacts. These factors were those used in the analysis of channeling impacts on particular subgroups (see Chapter III): ADL, continence, unmet needs, referral source, Medicaid eligibility, living arrangement, whether on a nursing home waiting list, cognitive impairment, and site. Respondent type was added to this model as an additional set of subgroups. If the apparent differences in impacts across proxy use categories observed for informal care and global life satisfaction were in fact due to differences in impacts across impairment levels, impacts estimated from the revised subgroup regression for these two outcomes should no longer differ significantly across proxy use categories, because the differences in channeling's effects across impairment subgroups would now be controlled for.
Once these other interactions were entered, impacts on informal care were no longer significantly different across types of respondents. Thus, it appeared that for informal care, proxy use did not affect impact estimates.
For the outcome variable representing sample members' satisfaction with life, however, the difference in impacts by respondent type remained statistically significant. Without controlling for subgroup effects, differences were statistically significant for both models at 12 months and for the financial control model at 6 months. After controlling for other subgroup effects, only the 6-month basic model results indicated significantly different impacts by respondent type. However, it was clear that in all three cases the overall significant improvement in life satisfaction was driven by the treatment/control difference for those with proxy respondents. Thus, for this outcome the difference in impacts across types of respondents was not merely reflecting impact differentials across baseline impairment or unmet need categories.
From the set of analyses conducted, we concluded that, with one possible exception, the use of proxy respondents did not result in distorted estimates of channeling impacts. The potential exception was the result for life satisfaction, for which it was difficult to distinguish between two plausible alternatives. It is possible that, as caregivers, proxies for treatment group members were so pleased with the additional help channeling provided that their responses reflected the proxies' own satisfaction more than that of the sample members. On the other hand, it may have been that sample members requiring proxies at follow-up were those most dissatisfied with life at baseline, and it was this dissatisfied group for which channeling had the biggest effect on reported life satisfaction. Yet another possible explanation is that those who required proxies at follow-up but were not highly impaired at baseline may be the group whose health or ability to function deteriorated the most over the six months; channeling impacts on satisfaction could be greatest for this group. In any case, channeling appears to have had an impact on satisfaction. Whether these impacts were for a certain set of sample members or for the caregivers of those sample members is unclear.

