Privacy and Health Research. Research to Know Patterns of Health, Disease, and Disability


All over the world, health and disease are monitored. Health-related measurements and observations accumulate throughout life, starting with prenatal observations and birth data. Analyses are made to portray the "natural history" of diseases and disabilities: how they start, progress in a person or spread to others, and run their courses. Also analyzed are risk factors and the effects of preventions and interventions. Genetic patterns in populations are now being analyzed much more actively, and in far greater detail, than ever before.

Public-health surveillance, the recording of the occurrence of events in populations, is one of the longest-established public-health functions. A representative definition is this one by Stephen Thacker (39):

Public health surveillance is the ongoing systematic collection, analysis, and interpretation of outcome-specific data for use in the planning, implementation, and evaluation of public health practice. A surveillance system includes the functional capacity for data collection and analysis as well as the timely dissemination of these data to persons who can undertake effective prevention and control activities.

The tasks of surveillance include assembling vital statistics (on births and deaths, and sometimes other events); profiling health status within populations; analyzing patterns of illness and disability, and health risks and risk factors; and studying how people interact with healthcare systems.

Practitioners of surveillance sometimes protest that what they do is not "research." Dr. Thacker insists: "The boundary of surveillance practice excludes actual research and implementation of delivery programs. Because of this separation, epidemiologic cannot accurately be used to modify surveillance."

This is controversial. Perhaps part of the problem is that much surveillance is of necessity based on less-than-fully-standardized, non-validated reports from local physicians and laboratories. Also, in emergencies, such as when surveillance is quickly mounted to trace a contagious disease outbreak, the observations may lack scientific elegance. And generally, surveillance does not itself test a hypothesis (about cause, for instance), but rather passively collects data (although the data generated by surveillance may be used to test a hypothesis). Surveillance may indicate that something is happening, but not necessarily why, or what the factors are. But it does perform highly structured searches for data that, among several purposes, become input for research. The question of whether surveillance counts as research deserves continued discussion. It may matter for privacy because the answer can affect how the activity is treated under human-subjects protection regulations (40).

Notifiable disease reporting is a standard public-health activity everywhere. Under the National Notifiable Diseases Surveillance System in the U.S., local and State health departments routinely forward case reports, including data on age, gender, and race, on around 50 diseases (measles, mumps, tuberculosis, hepatitis A and B, syphilis...) to the Centers for Disease Control and Prevention (CDC). The CDC receives the case reports with the identifiers removed. It then quickly publishes analyses, which help public-health experts and authorities discern patterns of occurrence, and intervene (41). Many other such surveillance programs are in operation all over the world. The World Health Organization publishes summaries.

A spirit of openness and reassurance can encourage a community to cooperate. Robert Hahn has proposed this "Ethical checklist for public health surveillance" (42):

  1. Justify the surveillance system in terms of maximizing potential public health benefits and minimizing public and individual harm.
  2. Justify use of identifiers and the maintenance of records with identifiers.
  3. Have surveillance protocols and analytic research reviewed by colleagues, and share data and findings with colleagues and the public health community at large.
  4. Elicit informed consent from potential surveillance subjects.
  5. Assure the protection of the confidentiality of subjects.
  6. Inform health-care providers of conditions germane to their patients.
  7. Inform the public, the public health community, and clinicians of findings of surveillance.

Health statistics programs collect a very large variety and volume of facts, to provide the descriptive backdrop against which society can decide how to optimize interventions, use resources most effectively, and cope with change.

The U.S. National Center for Health Statistics (NCHS), a component of the Centers for Disease Control and Prevention, analyzes data from existing records, and it gathers data itself via interviews and examinations. The NCHS's National Health Interview Survey periodically collects data on a very wide range of health-status measures and illnesses, and on hospital use, dental care, hearing impairment, nursing home experience, and many other matters; the most recent survey interviewed 120,000 people. To provide data on infant-death risks, NCHS maintains linked files of live births and infant deaths. It also gathers data such as birthweight, which is a reliable index to both maternal and infant health, on a sampled national basis. To help epidemiologists identify subjects for in-depth causal analyses, NCHS assembles selected mortality data from the States into the National Death Index.
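As an illustration only, the idea behind a linked file of live births and infant deaths can be sketched in a few lines of Python. This is not a description of how NCHS builds or analyzes its files. The field names (birth_cert_no, birthweight_group) and the assumption that every death record carries the matching birth certificate number are hypothetical, chosen to keep the example small.

```python
def infant_mortality_by_group(births, infant_deaths, group_field):
    """Link infant-death records to birth records on a shared certificate
    number, then return infant deaths per 1,000 live births for each
    value of group_field (for example, a birthweight category)."""
    # Certificate numbers of births that appear in the death file.
    died = {d["birth_cert_no"] for d in infant_deaths}

    totals, deaths = {}, {}
    for b in births:
        group = b[group_field]
        totals[group] = totals.get(group, 0) + 1
        if b["birth_cert_no"] in died:
            deaths[group] = deaths.get(group, 0) + 1

    return {g: 1000 * deaths.get(g, 0) / n for g, n in totals.items()}


# Tiny made-up example: four births, one linked infant death.
births = [
    {"birth_cert_no": 1, "birthweight_group": "low"},
    {"birth_cert_no": 2, "birthweight_group": "low"},
    {"birth_cert_no": 3, "birthweight_group": "normal"},
    {"birth_cert_no": 4, "birthweight_group": "normal"},
]
infant_deaths = [{"birth_cert_no": 2}]
rates = infant_mortality_by_group(births, infant_deaths, "birthweight_group")
```

Note that the linkage key in this sketch is itself an identifier, which is one reason linked files of this kind call for the confidentiality protections discussed in this section.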

In the next round (IV) of its famous National Health and Nutrition Examination Survey (NHANES), the NCHS will examine some 30,000 carefully sampled people to determine health trends. Special coverage will be given to such subgroups as Blacks, Mexican-Americans, low-income persons, preschool children, and the elderly. Like earlier rounds, the Survey will be based on extensive confidential interviews, physical examinations, and laboratory tests. It will amass about 8,000 pieces of data on each subject. These NHANES surveys are data quarries, from which insights derived from data on relatively few people help improve health for countless others in the larger society, including people outside the U.S.

Many of the data collected by NCHS are personally identifiable. The Center's statute stipulates that personally identifiable data must be carefully protected, and that they may not be used for any purpose other than that for which they were collected unless the data-subject gives new informed consent to the new use (43). NCHS shares identifiable data with researchers in other U.S. government agencies only if the data-subjects have been informed of and consented to such sharing, and then only under highly restrictive interagency agreements. NCHS never releases identifiable data to anyone else. It does release data for public use, but only after all personal identifiers, and all information that might allow deductive identification of the subjects, have been removed.
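The two-step release rule described above (remove personal identifiers, then remove information that might allow deductive identification) can be illustrated with a minimal Python sketch. It is not NCHS's actual procedure. The field names, the list of direct identifiers, and the minimum group size of 5 are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical field names; a real file would define its own.
DIRECT_IDENTIFIERS = {"name", "address", "ssn"}
QUASI_IDENTIFIERS = ("age_group", "county")
MIN_GROUP_SIZE = 5  # assumed threshold, for illustration only


def strip_identifiers(record):
    """Return a copy of the record without direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def public_use_file(records):
    """Drop direct identifiers, then withhold every record whose combination
    of quasi-identifiers is shared by fewer than MIN_GROUP_SIZE records."""
    stripped = [strip_identifiers(r) for r in records]

    def key(record):
        return tuple(record[q] for q in QUASI_IDENTIFIERS)

    counts = Counter(key(r) for r in stripped)
    return [r for r in stripped if counts[key(r)] >= MIN_GROUP_SIZE]
```

In this sketch a person who is the only 80-to-89-year-old recorded in a small county would be withheld from the public file even after the name is removed, because age group and county together could single that person out. Real disclosure-limitation practice uses a wider range of techniques than this simple suppression rule.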

Registries. Public-health agencies and other organizations maintain many registries in addition to those for notifiable diseases. Registries usually collect data on individuals' or populations' experience over time, perhaps linking data from several sources (occupational hazard exposure + disease incidence...), and may be cumulated so that the progression of events can be studied. Registries may cover locally important diseases (Lyme disease...), for instance, or occupational illnesses (carpal tunnel syndrome...), or consequences of disasters (Chernobyl...).

A crucial function of registries and other databases can be the identifying and monitoring of health problems of minority and underserved groups, and the effectiveness of interventions (44). Special precautions may need to be taken in order to protect the identities and rights of minority data-subjects (45).

(39) Stephen B. Thacker, p. 3 in "Historical development," pp. 3–17 of Steven M. Teutsch and R. Elliott Churchill, editors, Principles and Practice of Public Health Surveillance (Oxford University Press, New York and Oxford, 1994). This book, written mostly by experts at the U.S. Centers for Disease Control and Prevention, is an excellent overview.

(40) A recent "viewpoint" essay on this from the CDC was Dixie E. Snider, Jr. and Donna F. Stroup, "Defining research when it comes to public health," Public Health Reports 112, 29–32 (1997); a "counterpoint" essay was Wendy K. Mariner, "Public confidence in public health research ethics," Public Health Reports 112, 33–36 (1997).

(41) The CDC's Morbidity and Mortality Weekly Report and much related CDC information are available on the Internet at < >.

(42) Page 188 of "Ethical issues," in Teutsch and Churchill, as cited in endnote (39).

(43) Public Health Service Act § 308(d); 42 United States Code 242m(d).

(44) U.S. Department of Health and Human Services, "Directory of Minority Health and Human Services Data Resources," prepared under U.S. Agency for Health Care Policy and Research contract No. 282-90-0031 by Moshman Associates, Inc., of Bethesda, Maryland (October 1995); published only on the Internet at < >.

(45) For research examples and ethical context, see Jonathan R. Sugarman, Martha Holliday, Andrew Ross, and Doni Wilder, "Improving health data among American Indians and Alaska Natives: An approach from the Pacific Northwest" and other chapters in Audrey R. Chapman, editor, Health Care and Information Ethics: Protecting Fundamental Human Rights (Sheed & Ward, Kansas City, Missouri, April 1997).