Evaluating the impact of teleconsultations requires some basis of comparison. Consistent with remarks in the literature, several of our expert interviewees indicated that a recurrent weakness in telemedicine evaluations has been the lack of a clearly defined control group. In general, a comparator should be the standard or level of care that would be provided in the absence of the experimental intervention.
The design of a teleconsultation evaluation should specify, and justify, the comparator. For an evaluation of teleconsultations for patients in a local site that is remote from desired care (e.g., from a physician specialist), possible comparators include:
- no care;
- inadequate or underspecialized in-person care locally;
- in-person care remotely (requiring patient travel);
- delayed in-person care remotely (requiring patient travel); and
- delayed in-person care locally (requiring physician specialist travel).
Identifying an appropriate control group also depends on whether a telemedicine application substitutes for care provided by on-site personnel or is additive to existing care.
In order to establish a realistic basis for determining the true size of the effect of the teleconsultation, the selected comparator should reflect as nearly as possible the usual care that would be available in the absence of the teleconsultations. For any given population, this may include any or all of the five possibilities listed above. Therefore, the most realistic comparison may be achieved by randomizing patients between teleconsultations and usual care, where usual care encompasses whatever mode of care patients would seek in the absence of teleconsultations.
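As an illustration of the randomization step described above, the following sketch assigns patients to a usual-care arm or a teleconsultation arm using permuted blocks, which keep the two arms balanced even if enrollment stops partway through a study. The patient identifiers, block size, and seed are hypothetical; this is a minimal sketch, not a prescription for any particular trial.

```python
import random

def randomize(patient_ids, block_size=4, seed=42):
    """Assign each patient to 'usual care' or 'teleconsultation' using
    permuted-block randomization. Block size and seed are illustrative."""
    rng = random.Random(seed)
    assignments = {}
    block = []
    for pid in patient_ids:
        if not block:
            # Half of each block goes to each arm, in shuffled order,
            # so the arms stay balanced throughout enrollment.
            block = (["usual care"] * (block_size // 2) +
                     ["teleconsultation"] * (block_size // 2))
            rng.shuffle(block)
        assignments[pid] = block.pop()
    return assignments

# Hypothetical enrollment of eight patients.
groups = randomize([f"P{i:03d}" for i in range(8)])
```

Because each block contributes equally to both arms, eight enrolled patients yield exactly four per arm; a simple coin flip per patient would not guarantee this balance in small samples.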
Experimental designs with contemporaneous controls (i.e., current, parallel control groups) are generally stronger than those with historical controls. Using historical controls fails to account for the confounding effects of the passage of time, i.e., the different prevailing conditions that may exist between the time of data collection for the historical control group and the time of data collection for the current intervention group. That is, changes may have occurred in the study population or in aspects of health care delivery or administration during this interval that would confound the study results concerning the causal relationship between the type of care and the outcomes of interest. Historical controls can be sufficient if there is strong reason to assume that prevailing conditions have not changed over time, and that the relationship between usual (or no) care and the outcomes of interest has remained virtually constant. This confounding effect of time also pertains where data are collected for a population before an intervention (e.g., the establishment of a teleconsultation program) and again following the intervention; this is sometimes referred to as a "pre-test post-test" design. Another methodological weakness of historical controls is the opportunity for selection bias, i.e., bias in selecting the basis for the historical control. Comparative studies of telemedicine have too often relied on historical controls.
The reliance on historical controls stems from a variety of factors, including the practical difficulties of assigning patients (randomly or not) to intervention and control groups. In some instances, once the telemedicine intervention was in place, it was impractical to keep patients from using it, thereby losing the basis for a contemporaneous control group. In other instances, the number of participating patients has been so small (e.g., in low-density rural areas) that dividing them into intervention and control groups would yield too little data upon which to base any statistically meaningful findings. Where sample sizes are small, it may be desirable to conduct multicenter studies, though this typically requires greater funding. Another approach is to use meta-analysis or similar statistical techniques to combine the results of multiple small studies (each of which may not have a sufficient sample size to yield statistically significant findings) into a larger pooled analysis that can. However, doing so requires making assumptions about the comparability of the populations and interventions in the smaller individual studies. As noted above, a recent comprehensive attempt to conduct such a meta-analysis of research reports on telemedicine costs was unable to identify a sufficient number of studies meeting minimal criteria for combining the study findings (Whitten et al. 2000).
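The pooling idea above can be sketched with an inverse-variance (fixed-effect) combination of per-study estimates. The effect sizes and standard errors below are hypothetical, and the method rests on exactly the comparability assumption the text notes: that the studies estimate a common underlying effect.

```python
import math

def pool_fixed_effect(studies):
    """Combine (effect, standard_error) pairs into one pooled estimate.
    Each study is weighted by the inverse of its variance, so more
    precise studies count for more. Assumes a common true effect."""
    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * eff for (eff, _), w in zip(studies, weights)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Three hypothetical small studies: (effect estimate, standard error).
studies = [(0.30, 0.20), (0.10, 0.25), (0.25, 0.15)]
effect, se = pool_fixed_effect(studies)
low, high = effect - 1.96 * se, effect + 1.96 * se  # approximate 95% CI
```

In this hypothetical example, each study's own 95% confidence interval includes zero, but the pooled interval does not, illustrating how combining small studies can yield statistically significant findings that no individual study could. A real meta-analysis would also need to assess heterogeneity across studies before accepting the fixed-effect assumption.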
Another approach used in teleconsultation evaluations is to compare matched populations served by distinct but comparable health delivery sites. In these instances, one community retains usual care while the other receives the teleconsultation intervention. The validity of this type of design rests on assumptions about the similarity of the two populations, their respective health delivery sites, and other circumstances that might affect study results.
Among our site visits and expert interviews, the single most often cited aspect of disparity between telemedicine interventions and usual care was third-party reimbursement. Several experts asserted that reimbursement drives utilization of telemedicine, and this theme was confirmed during our site visits. As several experts noted, the basis for comparison may be undermined when reimbursement is available for usual care but not for teleconsultations. Reimbursement differences might not affect certain telemedicine evaluations, e.g., of the technical performance of a system, ease of use, or operating costs. However, reimbursement differences may confound findings about clinician or hospital acceptance, access, utilization, health outcomes (if dependent on utilization), and other evaluation measures. Thus, for a valid evaluation of a teleconsultation program to be conducted, it may be necessary for the program to operate in the same payment environment as usual health care. Even to the extent that a teleconsultation program is shown to be effective and cost-effective, inadequate reimbursement could stand as a barrier to its use.
Reimbursement anomalies can have other unintended effects on telemedicine services. For example, the availability of reimbursement for telemedicine using video technology may prompt the medically unnecessary substitution of video-based encounters for simple telephone calls, which are not reimbursed.
The potential for differences in reimbursement status to confound comparative studies should be considered for demonstration projects whose results will be used to inform decisions about deploying or modifying telemedicine programs or establishing policies about telemedicine delivery or payment. In areas where reimbursement is not available for teleconsultations, a demonstration project should consider including funding for payment for teleconsultations that is comparable to payment for corresponding health care services. This funding could come from regular payers (e.g., Medicare, Medicaid, managed care organizations) on a special basis for the purposes of the demonstration, or it could be part of the demonstration budget itself. In either case, the process for providers to secure such reimbursement should not entail any greater administrative burden, or any different process, than is entailed in reimbursement for usual care.