The number of observations in each of these analysis samples is given in Table II.2. Observations were roughly evenly split between the models, but included more treatments than controls. This imbalance arose because client intake was slower than expected; restricting the sample to only the subset of applicants initially intended for the research sample would not have yielded enough observations to meet the desired precision standards.11 The in-community sample at 12 months included only 40 to 50 percent of the full research sample for either model or experimental group, due in large part to the high mortality rate for the sample (nearly 30 percent by 12 months). Sample sizes at 18 months were much smaller, of course, since follow-up data were sought for only half of the original sample.
The samples were originally designed to meet a certain precision standard. The precision standard specified (see Kemper et al., 1982) was based on the accuracy of the estimates of channeling impacts on nursing home use, since that was the single most important outcome measure and the source of most of the cost savings channeling was expected to generate. The sample was to be large enough so that if 50 percent of controls entered a nursing home, we would be able to identify channeling impacts of 6 percentage points or larger with 90 percent power and 90 or 95 percent confidence (for a two-tailed or one-tailed test, respectively). This means that the sample was to be large enough that if channeling actually reduced the probability of nursing home admissions by 6 percentage points or more, there would be at least a 90 percent a priori probability that conventional statistical tests conducted on the sample would reject the hypothesis that channeling had no impact. With such samples, the probability of erroneously concluding that channeling had no impact when in truth it had affected nursing home use (type II errors) and the probability of erroneously concluding that channeling had reduced nursing home use when in fact it had not (type I errors) would both be small.
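This design standard can be reproduced with the standard normal-approximation formula for two-group sample sizes. The sketch below is illustrative: the 50 percent control rate, 6-point impact, 90 percent power, and one-tailed 95 percent confidence level come from the text, but the formula is the conventional approximation and not necessarily the exact procedure used in Kemper et al. (1982).

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p_control, delta, alpha_one_tailed=0.05, power=0.90):
    """Normal-approximation sample size per group needed to detect a
    difference of `delta` in proportions when the control rate is p_control."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha_one_tailed)  # 1.645 for a one-tailed 5% test
    z_beta = z.inv_cdf(power)                  # 1.282 for 90 percent power
    # Using the control rate for both arms; p = 0.5 maximizes p * (1 - p).
    var = 2 * p_control * (1 - p_control)
    return ceil((z_alpha + z_beta) ** 2 * var / delta ** 2)

print(n_per_group(0.50, 0.06))  # 1190, consistent with a design of roughly
                                # 1,200 treatments and 1,200 controls
```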
The actual precision of our tests differed from this standard for three reasons. First, we actually conducted tests at the 95 rather than the 90 percent confidence level, using two-tailed tests, which meant that differences had to be as large as 6.6 percentage points in order to have 90 percent power of detecting them. Second, the actual sample sizes differed from the 1,200 treatments/1,200 controls that were required to produce the desired precision (see Table II.2), for the reasons cited above. If half of the control group had entered nursing homes, we would have been able to detect impacts of 6.9 percentage points or larger with 90 percent power and 95 percent confidence using the actual sample and two-tailed tests.
In fact, however, the control group's use of nursing homes was much smaller than the assumed 50 percent, which had been used in the calculations because it resulted in the largest possible variance for a binary variable. This is the third and by far the most important reason why the precision of our estimates differed from what was originally planned. Given that only 13 percent of controls were admitted to nursing homes in the first 6 months, the variance was smaller than assumed. Thus, the sample was sufficiently large to detect impacts as small as 4.6 percentage points. Proportionately, however, 4.6 percentage points is over one-third of total actual use. Thus, unless channeling's impact on nursing home use was proportionately quite large, we cannot be highly confident that the treatment/control difference observed in our sample will be statistically significantly different from zero. Had control group use been equal to the assumed 50 percent, reductions due to channeling as small as 14 percent (6.9 percentage points) would have been detectable.
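The variance argument can be checked with the corresponding minimum-detectable-effect formula. The sketch below uses the rates from the text and the standard symmetric two-group approximation; the report's exact figures may reflect further adjustments, so only the pattern (not the precise values) should be read from it.

```python
from math import sqrt
from statistics import NormalDist

def mde(p, n_treat, n_control, alpha_two_tailed=0.05, power=0.90):
    """Minimum detectable difference in proportions (normal approximation)
    for a two-tailed test, assuming underlying rate p in both groups."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha_two_tailed / 2)  # 1.960
    z_beta = z.inv_cdf(power)                      # 1.282
    return (z_alpha + z_beta) * sqrt(p * (1 - p) * (1 / n_treat + 1 / n_control))

# With the planned 1,200/1,200 sample and a 50 percent control rate:
print(round(mde(0.50, 1200, 1200), 3))  # 0.066, i.e. the 6.6 points in the text

# The MDE scales with sqrt(p * (1 - p)), so dropping from p = 0.50 to the
# observed p = 0.13 shrinks any given MDE by about one third:
print(round(sqrt(0.13 * 0.87) / sqrt(0.25), 3))  # 0.673 (and 4.6 / 6.9 is about 0.67)
```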
Despite this fact, the sample sizes are sufficiently large that it is very unlikely that channeling impacts large enough to make channeling a cost-effective program would go undetected by the statistical tests. Thus, the sample sizes used in the evaluation were large enough to ensure a low probability of either seriously overstating or understating channeling impacts.
|TABLE II.2. Sample Sizes Used in the Evaluation|
||Basic Model|||Financial Control Model|||Full Sample|||
||Treatments|Controls|Total|Treatments|Controls|Total|Treatments|Controls|Total|
|Number of Observations in Full Sample|1,779|1,345|3,124|1,923|1,279|3,202|3,702|2,624|6,326|
|6-Month Outcomes|
|Nursing home sample|1,281|903|2,184|1,548|861|2,409|2,829|1,764|4,593|
|12-Month Outcomes|
|Nursing home sample|1,359|935|2,294|1,577|881|2,458|2,936|1,836|4,752|
|18-Month Outcomes|
|Number of Observations in 18-Month Cohort|922|697|1,619|926|620|1,546|1,848|1,317|3,165|
|Nursing home sample|644|475|1,119|730|399|1,129|1,374|874|2,248|
|NOTE: Sample sizes used in analyses were actually slightly smaller than these figures in some cases due to missing data on specific outcomes. Thus, these sample sizes differ slightly from those reported elsewhere. Some analyses based on the Medicare and nursing home samples were further restricted to sample members alive at the beginning of the analysis period. See Wooldridge and Schore (1986) for these sample sizes.|