Evaluation of Family Preservation and Reunification Programs: Final Report - Volume Two

3.12 Summary of Outcome Data


Information from the caretaker interviews, the caseworker interviews, and the administrative data was analyzed for indications of differences between the experimental and control groups subsequent to the referral to the family preservation program. Tables 3-15 and 3-16 contain a summary of those outcomes on which we found significant differences between the experimental and control groups in any state for the primary analyses (p < .05). Items in bold are those on which the experimental group had better outcomes; items in italics are those on which the control group had better outcomes.

In none of the three states were there significant differences between the experimental and control groups on family level rates of placement or case closings. Subsequent maltreatment was generally not related to experimental group membership, except for one subgroup in Tennessee. In Tennessee, in those families with an allegation within 30 days prior to random assignment, the experimental group children experienced fewer substantiated allegations than children in the control group.

In Tables 3-17 and 3-18 there are a number of child and family functioning items in which the experimental group displayed better outcomes than the control group in one of the states. It should be noted that the results have not been adjusted for the multiplicity of significance tests performed. That is, these significant items surfaced out of a large number of items and scales examined. In such a situation it is to be expected that some items will show significant differences simply by chance, so the appearance of a few significant differences should not be taken as an indication of superiority of one group over another, particularly when the results are not confirmed in more than one state. On only two items were differences found in two states: caretakers' assessment of whether goals had been accomplished and their assessment of overall change. We are inclined to believe that family preservation programs as represented in these states do result in higher assessments by clients of the extent to which goals have been accomplished and of overall change, since differences on those items were found in both states. Beyond that, we are unable to claim consistent evidence of positive effects of family preservation services. (56)
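The point about multiplicity can be illustrated with a small calculation. The following is a minimal sketch, not part of the report's analysis: the number of tests (100) is a hypothetical value chosen for illustration, the function names are ours, and the second function assumes the tests are independent, which is unlikely to hold exactly for related interview items.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of tests significant by chance when no true differences exist."""
    return n_tests * alpha


def prob_at_least_one(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one spurious significant result, assuming independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that holds the overall error rate at alpha (Bonferroni)."""
    return alpha / n_tests


if __name__ == "__main__":
    n = 100  # hypothetical number of items and scales examined
    print(expected_false_positives(n))   # about 5 chance findings expected
    print(prob_at_least_one(n))          # close to 1
    print(bonferroni_threshold(n))       # a much stricter per-test cutoff
```

Under these assumptions, a handful of significant items in a single state is about what chance alone would produce, which is why replication across states carries more weight than any one p-value.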

Table 3-17
Summary Of Outcomes, Post-Treatment Interview
Caretaker Interview: Proportion of affirmative answers to yes/no questions

| Item | Kentucky: Control | Kentucky: Experimental | Kentucky: p | New Jersey: Control | New Jersey: Experimental | New Jersey: p | Tennessee: Control | Tennessee: Experimental | Tennessee: p |
|---|---|---|---|---|---|---|---|---|---|
| Is apartment/house rented (vs. owned) | 75 | 89 | 0.005 | 70 | 68 | | 69 | 75 | |
| Got together with anyone to have fun | 64 | 64 | | 65 | 59 | | 38 | 75 | 0.001 |
| Felt had few or no friends | 14 | 18 | | 20 | 18 | | 38 | 19 | 0.03 |
| Had difficulty buying clothes | 17 | 21 | | 47 | 33 | 0.008 | 27 | 24 | |
| Out of control when punishing child | 24 | 24 | | 40 | 30 | 0.05 | 11 | 12 | |
| Punished for not finishing food | 7 | 1 | 0.02 | 6 | 5 | | 0 | 0 | |
| Unable to find someone to watch child | 9 | 12 | | 21 | 12 | 0.04 | 20 | 27 | |
| Encouraged child to read a book | 92 | 90 | | 82 | 91 | 0.02 | 94 | 96 | |
| Have goals been accomplished | 63 | 77 | 0.02 | 52 | 71 | 0.001 | 81 | 84 | |
| Assessment of overall change: | | | 0.02 | | | 0.001 | | | |
| Great improvement | 16 | 22 | | 9 | 16 | | 32 | 32 | |
| Some improvement | 31 | 42 | | 41 | 52 | | 32 | 42 | |
| Same | 42 | 29 | | 34 | 20 | | 22 | 14 | |
| Somewhat or a great deal worse | 12 | 6 | | 16 | 12 | | 14 | 13 | |
| Caretaker Scales: | | | | | | | | | |
| Difficulty paying bills (proportion of 4 items) | 0.17 | 0.22 | | 0.34 | 0.25 | 0.02 | 0.25 | 0.18 | |
| Negative child care practices (proportion of 10 items) | 0.14 | 0.13 | | 0.18 | 0.14 | 0.02 | 0.09 | 0.09 | |
| Punishment (proportion of 5 items) | 0.16 | 0.17 | | 0.25 | 0.20 | 0.04 | 0.13 | 0.13 | |
| Negative child behaviors (proportion of 21 items) | 0.34 | 0.34 | | 0.33 | 0.28 | 0.04 | 0.21 | 0.21 | |
| Change in proportion of punishment items from initial to post-treatment interviews | -0.04 | -0.09 | 0.05 | -0.05 | -0.07 | | -0.07 | -0.13 | |
| Change in proportion of negative child care practices from initial to post-treatment interviews | -0.02 | -0.06 | 0.04 | -0.04 | -0.05 | | -0.01 | -0.08 | 0.02 |
| Ability giving affection (higher = more adequate) | 2.83 | 2.83 | | 2.93 | 2.70 | 0.04 | 2.73 | 2.95 | |
| Providing learning opportunities for child (higher = more adequate) | 2.38 | 2.42 | | 2.89 | 2.60 | 0.008 | 2.64 | 2.64 | |
| Respecting child's opinions (higher = more adequate) | 2.58 | 2.45 | | 2.55 | 2.42 | | 2.35 | 2.84 | 0.01 |
| Responding patiently to child's questions (higher = more adequate) | 2.43 | 2.34 | | 2.44 | 2.27 | | 2.26 | 2.67 | 0.04 |
| Adequate supervision / Responsible child care (higher = more adequate) | 2.50 | 2.59 | | 2.80 | 2.71 | | 2.52 | 2.93 | 0.04 |
| Household condition (proportion of 13 items, higher = worse condition) | 0.10 | 0.13 | 0.01 | 0.09 | 0.11 | 0.02 | 0.12 | 0.12 | |
| Caretaker problems (proportion of 21 items, higher = more problems) | 0.25 | 0.31 | 0.0005 | 0.21 | 0.23 | | 0.21 | 0.18 | |
| Caretaker functioning (higher = better) | 2.56 | 2.55 | | 2.79 | 2.66 | 0.10 | 2.51 | 2.82 | 0.04 |
| Respecting child's opinions (change in average ratings from Time 1 to Time 2)** | 0.19 | -0.06 | 0.05 | 0.27 | 0.04 | 0.05 | 0.06 | 0.14 | |
| Setting firm/consistent limits/rules (change in average ratings from Time 1 to Time 2)** | 0.35 | 0.22 | | 0.33 | 0.25 | | -0.29 | 0.29 | 0.01 |
| Caretaker Problems (Change in proportion of 21 items; lower = less at Time 2) | -0.06 | -0.04 | | -0.05 | -0.04 | | -0.03 | -0.08 | 0.05 |

NOTE: This table only includes items with a primary analysis p-value less than .05 in at least one of the states; p-values greater than .10 are not reported.
Items in bold indicate significant findings in favor of the experimental group whereas italicized items indicate significant findings in favor of the control group.
** Scale for change in ratings: -4 = ability decreased greatly over time, 0 = no change in ability over time, +4 = ability increased greatly over time

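Table 3-18
Summary Of Outcomes, Caretaker Followup Interview
Proportion of affirmative answers to yes/no questions

| Item | Kentucky: Control | Kentucky: Experimental | Kentucky: p | New Jersey: Control | New Jersey: Experimental | New Jersey: p | Tennessee: Control | Tennessee: Experimental | Tennessee: p |
|---|---|---|---|---|---|---|---|---|---|
| Has spouse held full time job | 81 | 78 | | 86 | 68 | .05 | 100 | 85 | |
| Had difficulty paying rent | 20 | 20 | | 34 | 27 | | 39 | 20 | .04 |
| Have children handled household chores | 75 | 75 | | 70 | 83 | .02 | 94 | 89 | |
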
NOTE: This table only includes items with a primary analysis p-value less than .05 in at least one of the states; p-values greater than .10 are not reported.
Items in bold indicate significant findings in favor of the experimental group whereas italicized items indicate significant findings in favor of the control group.

There are a few items on which the control group had better outcomes, nearly all of them on measures provided by caseworkers. We are not inclined to read too much into these results, since experimental group caseworkers generally knew the families better and there may well have been significant differences in the ways that workers serving the two groups saw families and judged their functioning.


23. The full list of New Jersey service codes that were included is: public institution, teaching family placement, para-foster care income maintenance, juvenile-family crisis shelter placement, relative placement, foster care placement, residential treatment placement, finalized adoption placement, selected adoption placement - pending, maternity home care, group home placement, independent living, and shelter care placement. Four of these categories did not actually occur in the data: teaching family placement, para-foster care income maintenance, finalized adoption placement, and selected adoption placement - pending. In Kentucky, placement (as reflected in the variable FACTYPE) included: adoption, foster care, private institution/boarding schools, family treatment home, unmarried parent, other, children's psychiatric hospital, and foster care medically fragile. The data did not include adoption, family treatment home, and unmarried parent. In Tennessee, placements included: foster care, relative home, trial home, residential care, continuum contract, non-relative home, adoptive home, runaway, shelter, independent living, and detention.

24. Cases entered the study at varying points in time. In Kentucky, cases entered between May 7, 1996 and February 13, 1998; in New Jersey, cases entered between November 6, 1996 and February 26, 1998; and in Tennessee, between November 19, 1996 and May 26, 1998.

25. There are two reasons for focusing on family-level analyses. First, we are not confident that the administrative data allow for accurate identification of children to be included in the risk pool (what would be the denominator in a rate of placement calculation). Children are identified as belonging to a family through a case number. The analysis requires that we identify children who are in the home at the time of random assignment (or who are born or return to the home subsequently). In these states, children apparently often retain a family case number even when they are not in the home, and the administrative data do not allow us to verify the location of the child at the time of random assignment (or even sometimes at the time of an event such as placement). This problem is alleviated in analyses at the family level, since we know that the family is at risk of having a child placed (as long as there are any children in the family).

As to the accuracy of the "numerator" in our analyses, we focus on the first event (e.g., placement) in the family, subsequent to random assignment. It is possible that the first event occurs with regard to a child identified with a family but not living in that family at the time of the event. We judge the likelihood of that occurring to be small (the effects of this source of error would be similar in a family and child level analysis). In addition, subsequent events involving other children identified with the family but not in the family at the time of the event would not affect the family level analysis, while they would create inaccuracies in a child level analysis.

The second reason for focusing on the family level has to do with a "clustering" effect in the child level analysis. Clustering refers to the lack of independence among observations (of such things as placement) on children within the same family. If one child is removed from the home, the remaining children are more likely to experience placement. The "clustering effect" leads to p-values that are too small (that is, to overstated significance) when analyses are conducted at the child level. Conducting the analyses at the family level is one approach to resolving this dilemma.

We did conduct a few analyses at the child level, when we wanted to take into account child characteristics, but it should be remembered that significance levels in those analyses are downwardly biased.
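The size of the clustering problem can be gauged with the standard design-effect approximation, 1 + (m - 1) * ICC, where m is the average number of children per family and ICC is the intraclass correlation of the outcome within families. The sketch below is illustrative only: the report does not present this calculation, and the values used (600 children, 2.5 children per family, ICC of 0.4) are hypothetical.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish design effect for clusters of roughly equal size."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def effective_sample_size(n_children: int, avg_cluster_size: float, icc: float) -> float:
    """Number of independent observations the clustered sample is worth."""
    return n_children / design_effect(avg_cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the direction of the bias.
    deff = design_effect(avg_cluster_size=2.5, icc=0.4)
    n_eff = effective_sample_size(n_children=600, avg_cluster_size=2.5, icc=0.4)
    print(deff, n_eff)
```

When the design effect exceeds 1, a child level test that treats every child as independent behaves as though it had more information than it does, so its p-values come out too small. With one child per family the design effect is exactly 1, which is the situation the family level analysis approximates.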

26. In Kentucky, the ratio of assignment to experimental and control groups was 50-50.

27. In New Jersey, approximately 60 percent of the cases were assigned to the experimental group.

28. In Tennessee, approximately two-thirds of the cases were assigned to the experimental group.

29. Kentucky policy specifies that imminent risk includes children who are at risk of commitment as dependent, abused, or neglected; who are identified through the Regional Interagency Council, an interdepartmental unit, as severely emotionally disturbed; or whose families are in conflict such that they are unable to exercise reasonable control of the child. Both the referring worker and family members shall believe that without immediate intensive intervention, out-of-home placement is imminent. At the time of this study, New Jersey targeted family preservation services for families at imminent risk of having at least one child enter placement. The referring worker must have based the assessment of imminent risk on a face-to-face interview with the family no more than 5 days prior to the referral. The family must need services immediately and the worker must determine that other, less intensive, services have been used, are not appropriate, or are not available. In Tennessee, CPS intake workers complete a risk assessment form to identify high, intermediate, low, or no risk. High risk cases are identified as cases where "the child or children in the home are at imminent risk of serious harm if there is no intervention in the situation." A typical high risk case might involve such factors as: 1) a vulnerable child; 2) a history of previous maltreatment; 3) an active perpetrator who has continued access to the child; and 4) no available support or family strengths to offset the stated risks.

30. Analyses were also done on all allegations, whether substantiated or not. The results were very similar, although, of course, rates for all allegations were higher.

31. The six months analyses and survival analyses are obviously not independent.

32. Often we used average responses or proportions of positive responses rather than sums of responses to items. This was done in order to have scores for individuals when there were a few missing items on the scales. If an individual had too many missing items (usually one-third or more), the score was declared missing. Rules for the calculation of all scales are given in Appendix J.
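The scoring rule in note 32 can be sketched as follows. This is an assumption-laden illustration rather than the Appendix J rules themselves: the function name and the representation of a missing answer as None are ours, and the one-third cutoff is the "usual" threshold mentioned in the note, which may differ for particular scales.

```python
from fractions import Fraction
from typing import Optional, Sequence


def proportion_score(
    items: Sequence[Optional[bool]],
    max_missing: Fraction = Fraction(1, 3),
) -> Optional[float]:
    """Proportion of affirmative answers among answered items.

    Returns None (score declared missing) when the share of missing
    items is at or above max_missing, or when there are no items.
    """
    if not items:
        return None
    answered = [x for x in items if x is not None]
    n_missing = len(items) - len(answered)
    # Exact fraction comparison avoids floating-point edge cases at the cutoff.
    if Fraction(n_missing, len(items)) >= max_missing:
        return None
    return sum(answered) / len(answered)


if __name__ == "__main__":
    print(proportion_score([True, False, True, None]))  # one of four missing: scored
    print(proportion_score([True, False, None]))        # one of three missing: None
```

Using a proportion rather than a sum keeps scores comparable between respondents who answered all items and those who skipped a few.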

33. In multivariate repeated measures analysis, three main hypotheses are tested: first, that the scores for the experimental group, averaged over the three points in time, are equal to those of the control group (the "group" hypothesis); second, that the averages of the groups at each point in time are the same (the "time" hypothesis); and third, that there is no interaction between time and group. The third hypothesis is central, since it indicates whether the groups change in different ways.
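The three hypotheses in note 33 can be written compactly. The notation here is ours and is offered only as a sketch: \mu_{E,t} and \mu_{C,t} denote the population mean score of the experimental and control groups at time t (t = 1, 2, 3), and the time hypothesis is shown with the two groups weighted equally, which is an assumption.

```latex
\begin{align*}
H_{\text{group}} &: \tfrac{1}{3}\sum_{t=1}^{3}\mu_{E,t} = \tfrac{1}{3}\sum_{t=1}^{3}\mu_{C,t} \\
H_{\text{time}} &: \bar{\mu}_{\cdot 1} = \bar{\mu}_{\cdot 2} = \bar{\mu}_{\cdot 3},
  \qquad \bar{\mu}_{\cdot t} = \tfrac{1}{2}\left(\mu_{E,t} + \mu_{C,t}\right) \\
H_{\text{interaction}} &: \mu_{E,1} - \mu_{C,1} = \mu_{E,2} - \mu_{C,2} = \mu_{E,3} - \mu_{C,3}
\end{align*}
```

Rejecting the interaction hypothesis means the gap between the groups is not constant over time, that is, the groups follow different trajectories.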

34. Variables in Tables 3-3, 3-4 and Figure 3-4 are described in Vol. 3, Appendix J.

35. This difference was slightly greater and statistically significant in the secondary analysis (48% vs. 35%, p = .04).

36. In the secondary analysis, fewer experimental group respondents reported health problems (12% vs. 21% for the control group, p = .04).

37. The control group had a slightly lower average proportion of affirmative responses to these items at post-treatment (.17 vs. .22, p = .16).

38. In the primary analysis, at post-treatment, a greater proportion of the experimental group reported difficulties paying rent (20% vs 13%, p = .13) and electric or heat bills (28% vs. 20%, p = .11). In the secondary analysis, differences were smaller and p-values for both items were above .20.

39. This difference was maintained but not significant in the secondary analysis (5% vs. 1%, Fisher's exact p-value = .078).

40. In the secondary analysis, there was again a .09 reduction in the average proportion of punishment items endorsed by the experimental group and a .04 reduction for the control group (p = .03).

41. Derogatis, L. R., Lipman, R. S., & Covi, L. (1973). SCL-90: An outpatient psychiatric rating scale -- preliminary report. Psychopharmacology Bulletin, 9(1), 13-28.

42. Reliability analysis yielded a Cronbach's alpha of .92 at initial, .93 at post-treatment, and .92 at follow-up in Kentucky; .95 at initial, .94 at post-treatment, and .95 at follow-up in New Jersey; and .91 at both initial and post-treatment, and .90 at follow-up in Tennessee.
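For readers unfamiliar with the statistic in note 42, Cronbach's alpha is k/(k - 1) times (1 minus the sum of the item variances divided by the variance of the total score), where k is the number of items. The following is a minimal sketch of that formula with made-up data. It is not the software or data used in the study, and it assumes complete responses (no missing items).

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up responses: three respondents, two items.
    print(cronbach_alpha([(1, 2), (2, 1), (3, 3)]))
```

Values near 1, such as those reported in the note, indicate that the items of the scale vary together closely.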

43. This difference was also significant for the secondary analysis (28% vs. 33%, p = .006).

44. In the secondary analysis, the difference was maintained and remained significant (31% vs. 24%, p = .0004).

45. In the secondary analysis, the average percentages were 24 percent for the experimental group and 21 percent for the control group (p = .06).

46. In the secondary analysis, however, the difference increased and approached significance with 29 percent for the experimental group and 24 percent for the control group, p = .06.

47. The difference for the secondary analysis was also not significant (25% vs. 28%, p = .12).

48. The significant interactions with experimental group were as follows. For depression at post-treatment in New Jersey, there was an interaction of experimental group with single motherhood: for single mothers, there was no relationship between experimental group and depression, whereas for other caretakers, the control group had higher depression scores. Also for depression at post-treatment in New Jersey, there was an interaction with employment: for those employed at the initial interview, there was no difference between the experimental and control groups, whereas for those unemployed, the control group had higher depression scores. For negative life events at post-treatment in Tennessee, there was an interaction with income support: for those not receiving income support, the control group had more negative life events, whereas for those receiving income support, there was no difference between the experimental and control groups in negative life events. For household condition at follow-up in Tennessee, there was an interaction between age of caretaker and experimental group: in the control group there was no relationship between age and household condition, while in the experimental group, older caretakers had worse household conditions.

49. Brown, D., Ahmed, F., Gary, L., & Milburn, N. (1995). Major depression in a community sample of African Americans. American Journal of Psychiatry, 152(3), 373-378.

50. Humke, C., & Schaefer, C. (1995). Relocation: A review of the effects of residential mobility on children and adolescents. Psychology: A Quarterly Journal of Human Behavior, 32(1), 16-24.

51. Honig, A., & Pfannestiel, A. (1991). Difficulties in reaching low-income new fathers: Issues and cases. Early Child Development & Care, 77, 115-125.

52. Baxter, A., & Kahn, J. (1999). Social support, needs and stress in urban families with children enrolled in an early intervention program. Infant-Toddler Intervention, 9(3), 239-257.

53. The differing results for the uncontrolled analysis and the regression analysis may be due to the significant interaction in the regression equation of experimental group and income support.

54. The questions were: have you lost your temper when your children got on your nerves; have you found that hitting your child was a good way to get him/her to listen; have you sometimes found yourself hitting your child harder than you meant to; have things sometimes gotten out of control when you punished your child; have you punished your child by tying him/her up with a rope, cord, string, or belt; have you sometimes punished your child by not letting him/her into the house; and have you punished your child for not finishing the food on his/her plate.

55. Because the dependent variable was dichotomous, the logit link function was used, transforming the outcome into log-odds. Hence, the analysis actually used a hierarchical non-linear model.
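The transformation mentioned in note 55 is the logit, log(p / (1 - p)), which maps a probability p to log-odds. Its inverse maps a model's linear predictor back to a probability. The sketch below shows only this pair of functions. It does not reproduce the hierarchical model itself, and the function names are ours.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(x: float) -> float:
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    print(logit(0.5))                  # even odds correspond to log-odds of 0
    print(inverse_logit(logit(0.2)))   # round trip returns the probability
```

Because the model is linear in the log-odds rather than in the probability itself, a given coefficient implies a different change in probability depending on the starting level, which is why the note describes the model as non-linear.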

56. The reader is reminded of the findings reported in Chapter 7 indicating that experimental group caretakers generally had more positive views of service and of their relationships with workers than control group caretakers.

