The distinction between efficacy and effectiveness poses a challenge to telemedicine evaluation. Efficacy refers to the benefit of using a technology for a particular health problem in ideal conditions of use, for example, in a strict protocol of a randomized controlled trial conducted at a "center of excellence." Effectiveness is the benefit of using a technology for a particular health problem in general or routine conditions of use, for example, in a community setting. In most health care applications, efficacy and effectiveness comparisons present tradeoffs between internal and external validity.

The carefully controlled, ideal circumstances of an efficacy trial tend to provide findings with stronger internal validity concerning the causal relationship between a health care intervention and outcomes of interest. However, the findings of an efficacy trial may have only limited external validity, or generalizability, to other settings. On the other hand, the less controlled, routine circumstances of an effectiveness trial may provide more generalizable findings, but such a trial may be less able to account for factors that confound the causal relationship between an intervention and outcomes of interest. For many types of technologies, efficacy trials are conducted initially. If the technology is shown to be efficacious, it is then tried in other circumstances (different settings, patient groups, or providers) to determine whether it is effective more broadly.

Reports of the findings of telemedicine demonstrations or other studies are often made by "champions" or "early adopters" who tend to be advocates using the telemedicine applications in carefully chosen settings. As such, it may be difficult to generalize findings of individual telemedicine studies or demonstrations to general or routine circumstances. That is, while efficacy may be established in these studies, it may be more difficult or impractical to demonstrate effectiveness.

Getting a "fix" on effectiveness is complicated by the "moving target" nature of the field. Even as initial experience is gained with a telemedicine application, its component technologies, their configurations, or other aspects of the application are evolving. As a result, the findings of a study may be outdated by the time a report appears in the literature.

To evaluate the effectiveness of telemedicine, the specific application of the technology needs to be considered. By focusing on a specific application of the technology (and/or on a specific setting and condition treated), an efficacy evaluation may achieve greater internal validity. Within the field of teleconsultation, applications include a number of activities (Grigsby et al. 1994):

  • Supervision and consultation for primary care encounters in sites where a physician is not available.
  • Routine diagnostic evaluations based on history, physical exam findings, and available test data.
  • Extended diagnostic work-ups or short-term management of self-limited conditions.
  • Medical and surgical follow-up and medication checks.
  • Management of chronic diseases and conditions requiring a specialist not available locally.
  • Initial urgent evaluation of patients, triage decisions, and pretransfer arrangements.

An all-encompassing evaluation of "telemedicine," per se, is not necessary to demonstrate an application's effectiveness. If an application is consistently effective across a representative set of indications or applications, it is not necessary to evaluate it for all indications. The case of antibiotics illustrates this point. It is commonly understood that antibiotics are effective as a treatment class; they do not need to be evaluated every time they are used. It remains necessary, however, to demonstrate that a particular antibiotic is effective against a particular infection.

Similarly, Grigsby et al. (1994) suggest narrowing the scope of evaluation by selecting certain conditions to serve as indicators of the effectiveness of telemedicine. The accuracy of diagnosis (sensitivity and specificity) for these conditions would demonstrate the effectiveness of this mode of health care delivery. The degree of accuracy required for a given condition depends not only on the seriousness of the condition, but also on the nature of its progression. Grigsby et al. illustrate this point by comparing the diagnostic process for chronic obstructive pulmonary disease (COPD) with that for hantavirus pulmonary syndrome. A missed diagnosis in the early stages of a progressive, chronic disease like COPD may not result in adverse health outcomes for the patient. On the other hand, for a condition such as hantavirus pulmonary syndrome that becomes life-threatening very quickly, accuracy in the initial diagnosis is critical.
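The diagnostic accuracy measures named above can be made concrete with a small calculation. The following is a minimal sketch, assuming that teleconsultation diagnoses for one indicator condition are compared against a reference standard such as an in-person examination. The class name, function names, and all counts are hypothetical and are not drawn from Grigsby et al. (1994).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing teleconsultation diagnoses to a reference standard."""
    true_pos: int   # condition present, teleconsultation detected it
    false_neg: int  # condition present, teleconsultation missed it
    true_neg: int   # condition absent, teleconsultation correctly ruled it out
    false_pos: int  # condition absent, teleconsultation flagged it anyway


def sensitivity(c: ConfusionCounts) -> float:
    """Share of patients with the condition who were correctly identified."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of patients without the condition who were correctly ruled out."""
    return c.true_neg / (c.true_neg + c.false_pos)


# Hypothetical counts for a single indicator condition (illustrative only).
example = ConfusionCounts(true_pos=45, false_neg=5, true_neg=90, false_pos=10)

print(f"sensitivity = {sensitivity(example):.2f}")
print(f"specificity = {specificity(example):.2f}")
```

For a rapidly progressing condition, an evaluator would likely set a higher acceptable threshold for sensitivity than for a slowly progressing one, since a missed diagnosis carries greater consequences.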

A specific example of measuring effectiveness is provided by the clinical evaluation of Parkinsonian tremor via teleconsultation. If the patient were evaluated over a telemedicine connection that allowed too few frames per second, the clinician could not appropriately evaluate the tremor, and the technology would be ineffective for this particular indication.
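The frame-rate constraint in the tremor example can be sketched with the Nyquist sampling criterion, under which a periodic motion must be sampled at more than twice its frequency to be captured without aliasing. The sketch below assumes a tremor frequency of 6 Hz, the upper end of the roughly 4 to 6 Hz range commonly cited for Parkinsonian rest tremor; that figure, the function names, and the candidate frame rates are assumptions for illustration and do not come from the source. The criterion gives only a lower bound, and a clinically adequate connection may need a considerably higher rate.

```python
def min_frame_rate_hz(tremor_hz: float) -> float:
    """Nyquist lower bound: the frame rate must exceed twice the motion frequency."""
    return 2.0 * tremor_hz


def can_resolve_tremor(frame_rate_fps: float, tremor_hz: float) -> bool:
    """True if the frame rate is strictly above the Nyquist lower bound."""
    return frame_rate_fps > min_frame_rate_hz(tremor_hz)


# Assumption: upper end of the commonly cited 4-6 Hz range for rest tremor.
ASSUMED_TREMOR_HZ = 6.0

# Hypothetical connection frame rates to compare against the bound.
for fps in (5, 10, 15, 30):
    verdict = "above" if can_resolve_tremor(fps, ASSUMED_TREMOR_HZ) else "not above"
    print(f"{fps} frames/s is {verdict} the Nyquist bound for a {ASSUMED_TREMOR_HZ} Hz tremor")
```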

Currently, health outcomes data for telemedicine applications are limited, and small sample sizes limit the ability to derive meaningful results from an evaluation. Our interviewees emphasize that the dearth of outcomes data stems, in part, from the limited funding for effectiveness studies of telemedicine.
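One way to see why small samples limit what an evaluation can show is to compare the uncertainty around the same observed rate at different sample sizes. The sketch below uses the Wilson score interval for a proportion, chosen here only as one reasonable method; the function name and the counts (18 of 20 versus 180 of 200 correct diagnoses) are hypothetical and are not results from any study.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical studies with the same observed accuracy (90%) but different sizes.
for correct, total in ((18, 20), (180, 200)):
    low, high = wilson_interval(correct, total)
    print(f"n={total}: observed {correct / total:.0%}, interval {low:.3f} to {high:.3f}")
```

With the same observed rate, the smaller study yields a much wider interval, so it supports weaker conclusions about how the application performs.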

