To begin to understand the measure's validity, we calculated its sensitivity and specificity. We compared clinician and client scores to the supervisor scores, which we treated as the gold standard for the purposes of these analyses. For this investigation, sensitivity is defined as the proportion of supervisor-identified high performers in the delivery of evidence-based psychotherapy who were also identified as high performers by clients or by the clinicians themselves. Specificity, in contrast, is the proportion of supervisor-identified low performers who were also identified as low performers by clients or by the clinicians themselves. We examined the implications for the measure's sensitivity and specificity using two thresholds, the median (P50) and the 75th percentile (P75), to distinguish high from low delivery of evidence-based psychotherapy.
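The calculation described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for the report: the function names, the example scores, the interpolation method for the percentile, the rule that "high" means strictly above the cutoff, and the choice to dichotomize each respondent type at its own percentile are all assumptions made for the example.

```python
from statistics import quantiles


def percentile_cutoff(scores, pct):
    """Return the pct-th percentile (1 to 99) of scores, with linear interpolation."""
    return quantiles(scores, n=100, method="inclusive")[pct - 1]


def is_high(scores, pct):
    """Flag each score as a high performer if it lies strictly above the cutoff.

    Assumption: each respondent type is dichotomized at its own percentile.
    """
    cutoff = percentile_cutoff(scores, pct)
    return [s > cutoff for s in scores]


def sensitivity_specificity(gold_scores, rater_scores, pct):
    """Compare rater classifications with the gold standard at one threshold."""
    gold = is_high(gold_scores, pct)
    rated = is_high(rater_scores, pct)
    pairs = list(zip(gold, rated))
    tp = sum(g and r for g, r in pairs)          # high by both
    fn = sum(g and not r for g, r in pairs)      # high by gold standard only
    tn = sum(not g and not r for g, r in pairs)  # low by both
    fp = sum(not g and r for g, r in pairs)      # high by rater only
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical factor scores for eight clinicians (not data from the study).
supervisor_scores = [1, 2, 3, 4, 5, 6, 7, 8]   # treated as the gold standard
clinician_scores = [2, 5, 4, 3, 1, 6, 8, 7]    # self-ratings

sens, spec = sensitivity_specificity(supervisor_scores, clinician_scores, 50)
print(sens, spec)  # prints: 0.75 0.75
```

Calling the same function with `pct=75` applies the more stringent threshold, so both thresholds can be compared on identical inputs.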
Table V.6 summarizes the sensitivity and specificity results. In comparisons of supervisor and clinician scores, sensitivity ranged from 0.32 to 0.78 across the factors and thresholds, and specificity ranged from 0.51 to 0.88. In comparisons of supervisor and client scores, sensitivity ranged from 0.22 to 0.61 and specificity from 0.49 to 0.81.
Based on these preliminary findings, the P50 (median) threshold appears to discriminate performance better than the more stringent P75 threshold. In supervisor-clinician pairings, the P50 threshold yielded consistently higher sensitivity than the P75 threshold, with sensitivity and specificity values that were nearly equal for each factor.
In both supervisor-clinician and supervisor-client pairings, the P75 threshold demonstrated higher specificity than the P50 threshold. However, in supervisor-client pairings, sensitivity at the P75 threshold was quite low compared with that observed among the clinicians at the same threshold, suggesting that the instrument performs differently across respondent types. These differences across pairings suggest a need to further evaluate the instrument to identify the optimal threshold for each respondent type.
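One way to see the trade-off between the two thresholds is to average each column of Table V.6 across the five factors. The sketch below does only that arithmetic. The averaging is an illustrative summary introduced here, not a statistic reported in the analysis, and the dictionary layout and names are assumptions; the values are transcribed from Table V.6 in factor order.

```python
from statistics import mean

# Values transcribed from Table V.6, ordered Factor 1 through Factor 5.
# Keys are (respondent compared with supervisor, threshold, metric).
TABLE_V6 = {
    ("clinician", "P50", "specificity"): [0.78, 0.51, 0.63, 0.61, 0.63],
    ("clinician", "P50", "sensitivity"): [0.78, 0.50, 0.64, 0.63, 0.63],
    ("clinician", "P75", "specificity"): [0.81, 0.78, 0.82, 0.85, 0.80],
    ("clinician", "P75", "sensitivity"): [0.45, 0.32, 0.48, 0.57, 0.42],
    ("client", "P50", "specificity"): [0.49, 0.56, 0.50, 0.56, 0.62],
    ("client", "P50", "sensitivity"): [0.50, 0.56, 0.51, 0.58, 0.61],
    ("client", "P75", "specificity"): [0.81, 0.76, 0.75, 0.79, 0.76],
    ("client", "P75", "sensitivity"): [0.32, 0.26, 0.22, 0.37, 0.32],
}


def column_means(table):
    """Average each column across factors, rounded to three decimals."""
    return {key: round(mean(values), 3) for key, values in table.items()}


means = column_means(TABLE_V6)
for (respondent, threshold, metric), value in means.items():
    print(f"supervisor vs. {respondent}, {threshold}, {metric}: {value:.3f}")
```

With these values, mean sensitivity in supervisor-client pairings falls from about 0.55 at P50 to about 0.30 at P75, while mean specificity rises from about 0.55 to about 0.77. In supervisor-clinician pairings, mean sensitivity falls from about 0.64 to about 0.45, while mean specificity rises from about 0.63 to about 0.81.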
When considering implementation of the measure, it is important to note that there may be instances where a supervisor is not an appropriate gold standard. For example, supervisors may treat too few patients to serve as experts in the delivery of evidence-based psychotherapy, or they may not be trained in cognitive behavioral approaches (which the measure largely draws upon) and therefore may not be best positioned to identify a clinician's use of these techniques. In Chapter VI, we discuss next steps for further assessing the measure's validity.
TABLE V.6. Results of Sensitivity and Specificity Analyses

| Factor | Supervisor vs. clinician: specificity, P50 threshold | Supervisor vs. clinician: sensitivity, P50 threshold | Supervisor vs. clinician: specificity, P75 threshold | Supervisor vs. clinician: sensitivity, P75 threshold | Supervisor vs. client: specificity, P50 threshold | Supervisor vs. client: sensitivity, P50 threshold | Supervisor vs. client: specificity, P75 threshold | Supervisor vs. client: sensitivity, P75 threshold |
|---|---|---|---|---|---|---|---|---|
| Factor 1: Structuring and conducting the session | 0.78 | 0.78 | 0.81 | 0.45 | 0.49 | 0.50 | 0.81 | 0.32 |
| Factor 2: Psychoeducation and therapeutic techniques | 0.51 | 0.50 | 0.78 | 0.32 | 0.56 | 0.56 | 0.76 | 0.26 |
| Factor 3: Therapeutic alliance | 0.63 | 0.64 | 0.82 | 0.48 | 0.50 | 0.51 | 0.75 | 0.22 |
| Factor 4: Suicide assessment | 0.61 | 0.63 | 0.85 | 0.57 | 0.56 | 0.58 | 0.79 | 0.37 |
| Factor 5: Homework | 0.63 | 0.63 | 0.80 | 0.42 | 0.62 | 0.61 | 0.76 | 0.32 |