In addition to the good news about positive youth development programs, we present some concerns related to specific findings, and considerations for the future.
A little more than half of the well-evaluated programs measured outcomes only at the end of the program; in other words, no further follow-up was done or was available at the time of this review. Whether those programs will continue to show positive results remains an open question. This is of particular concern because, in two instances, programs that reported long-term results were unable to sustain their initial positive findings. It is clearly most desirable - and most compelling as evidence - when programs can demonstrate positive long-term outcomes. In the case of the two studies that could not demonstrate long-term results after initial positive effects, the reasons for these findings need additional study and should be shared with the positive youth development community.
Evaluators of positive youth development programs are encouraged to take action to expand the knowledge gained from evaluations. Consensus on the use of standardized youth outcome measures needs to be reached. Studies should measure changes in both positive and problem behaviors, because doing so is truly representative of the "whole child." Although academic achievement, engagement in the workforce, and income are widely accepted as positive outcome measures, there is little consensus on what constitutes a complete set of positive youth development outcomes.
Standardized measures of positive youth development constructs need to be developed and used. While the positive youth development constructs are typically seen as important mediating variables, the field is just beginning to grapple with defining outcomes of positive developmental experiences. Further, measurement of a comprehensive set of predictors of positive and problem outcomes will allow for a better understanding of the processes through which the intervention has an impact on youth outcomes. A complete measurement package (positive and problem behaviors, appropriate and relevant positive youth development constructs, and risk and protective factors) common across promotion and prevention studies would increase our understanding of the processes leading to positive youth development. This will help to establish a shared language and framework.
We call for consensus on the use of structured comparisons in evaluation designs. While there are many innovative ways to evaluate programs, so far nothing has come close to substituting for the credibility of a strong structured comparison. Admittedly, the rigors of experimental designs, with the complexities of random assignment, are beyond the reach of many programs, but evaluations can be only as credible as the framework they use. A good quasi-experimental design with well-balanced comparison groups can provide acceptable evidence of effectiveness.
Finally, we call on all investigators who submit articles to peer-reviewed journals to move toward consensus on which information they will report, particularly the quantitative data, and in what forms they will report it. Program reports, particularly those in peer-reviewed journals but also unpublished evaluation studies, must include both sufficient narrative description and sufficient quantitative and statistical detail to enable an independent assessment of what the program accomplished. Program descriptions should specify which youth constructs they address, and they should specify the relationship between these constructs and the outcomes that the evaluation measures. As a field of youth development specialists, we show surprisingly little agreement on the issue of a common statistical metric in published reports. As long as some studies report such key information as group means and standard deviations, and others do not, we will not give each other the tools to create a viable basis for comparison between studies. Consistency in the presentation of the evidence will truly advance our understanding of program effectiveness.
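To illustrate why group means and standard deviations matter for comparison between studies, the sketch below shows one common way a reader might convert those reported values into a standardized mean difference (Cohen's d with a pooled standard deviation). This review does not prescribe a particular metric; the choice of Cohen's d, the function names, and the numbers in the example are assumptions made only for illustration.

```python
import math


def pooled_sd(sd_program, n_program, sd_comparison, n_comparison):
    """Pooled standard deviation of two groups, weighted by degrees of freedom."""
    numerator = (n_program - 1) * sd_program ** 2 + (n_comparison - 1) * sd_comparison ** 2
    return math.sqrt(numerator / (n_program + n_comparison - 2))


def cohens_d(mean_program, sd_program, n_program,
             mean_comparison, sd_comparison, n_comparison):
    """Standardized mean difference between a program group and a comparison group.

    Requires only the group means, standard deviations, and sample sizes,
    which is the kind of information the paragraph above asks studies to report.
    """
    sd = pooled_sd(sd_program, n_program, sd_comparison, n_comparison)
    return (mean_program - mean_comparison) / sd


# Hypothetical values, not taken from any study in this review.
d = cohens_d(mean_program=10.0, sd_program=2.0, n_program=50,
             mean_comparison=9.0, sd_comparison=2.0, n_comparison=50)
print(f"Standardized mean difference: {d:.2f}")
```

Because the result is expressed in standard deviation units, it can be set alongside results from studies that used different outcome scales. A study that omits the standard deviations or group sizes cannot be included in this kind of comparison without contacting the authors.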