Privacy and Health Research

From:
William W. Lowrance, Ph.D.

Preface

In September 1996 the U.S. Secretary of Health and Human Services, Dr. Donna E. Shalala, requested this study as background for policy decisions that her Department and American society, along with their counterparts in other countries, urgently must confront.

The study was conducted by Dr. William W. Lowrance, an external consultant, who for administrative purposes was appointed an interim government employee during the project. The project was supported by the Office of the Assistant Secretary for Planning and Evaluation.

The purposes of the study were to:

The author interviewed several hundred leaders in the U.S. and Europe, in government, academic, and private-sector research institutions; in government regulatory and public-health agencies; and in intergovernmental organizations. He also met with patient advocates, public policy experts, legal analysts, privacy advocates, and privacy commissioners. And he reviewed the relevant literature.

Everywhere, the author found deep interest in the issues, and unease about the present situation—concern that the very concept, "privacy," needs recasting; concern that health data are handled with too little respect for the people whose frailties they describe; and concern that research, disease prevention, and health care all will suffer if the current privacy, confidentiality, and security issues are not handled properly.

The author deeply thanks the many people who met with him or otherwise contributed to the study.

He is especially grateful to Mr. John P. Fanning, who as the Privacy Advocate coordinated this study within the Department of Health and Human Services and made many substantive contributions. It is now clear why Mr. Fanning's name has appeared in so many "acknowledgments" sections of reports over the years.

The author thanks, for various reasons known to them, Dr. Lawrence Bell, Mr. David Garrison, Mr. Frank Gladney, Dr. Arnold J. Gordon, Prof. Samuel Gorovitz, Dr. Betsy L. Humphreys, Ms. Doris L. Mandel, Ms. Candace Somers de Matteis, Dr. Nancy Mattison, Dr. Peggy McCardle, Dr. Hugh H. Tilson, Prof. Alan F. Westin, and Dr. Paul Williams.

He also thanks Ms. Karen K. Norrell and her associates for their administrative support from Washington, and Mme. Anne-Marie Togni and Mr. Leo Verhoeven for theirs from Geneva.


Executive Summary

This Report examines how society can best pursue two very important goods simultaneously: Protect individuals' privacy; and at the same time, preserve justified research access to personal health data, to gain health benefits for society.

As the fundamental nature of health care, and of health data and their uses, is changing dramatically, society must—now—examine and re-decide how much it cares about protecting health privacy. Health researchers must be certain that they are taking all reasonable measures to safeguard the data they collect and use, and to maintain the respect for privacy that is embodied in the very compact with society under which they work. And society must reformulate and update some of the rationales and criteria under which the health experience of individuals may be studied to benefit society.

Health research, compared with all the other potential avenues for intrusion, hardly threatens privacy. Many effective protections are in place. But possibilities for harm always exist. The challenge is to transpose and translate the traditional ethical and technical practices, which have served society reasonably well, to meet the contemporary demands.

CURRENT STRAINS

New approaches are being taken in providing health care, and these are posing new research questions and changing the setting in which much research is conducted. For instance, much more research is now being performed on data from private-sector managed-care organizations.

Also, new approaches are being taken in research, such as elaborate computerized analysis of large multipurpose health databases. And health factors such as genetics are being explored as never before.

For many reasons, the public are rightly apprehensive about the erosion of privacy of information about their health, generally. Among other matters, the security of computerized health records and electronically-transmitted health data is not fully assured. And potentially many harms can be suffered from unwarranted disclosure.

Respect for individuals will be best served not by insisting on absolute privacy, which is unattainable in modern life anyway, but by seeking informed consent to reasonable use of health information under strictly delimited conditions; by safeguarding personal data carefully; by genuinely affording fair-information-use rights to data-subjects; and by enforcing sanctions against improper use.

THE DIVERSITY OF HEALTH DATA

Health data of concern for ethics and policy, and for this Report, include not only primary medical and hospital records, but also pharmacy and laboratory data; vital records; administrative and financial data; data from surveys, clinical trials, adverse-drug-event reports, and outcomes and health-economics studies; registries organized by diseases, by treatment regimens, by demographics, or by other categories; and many other compilations. The data may be personally identified, or key-coded (pseudonymized), or fully anonymized.

DATA FROM RESEARCH, RESEARCH ON DATA

Contemporary health research is generating a multitude of benefits for humankind, and the future benefits look at least as promising. As the above heading indicates, health research generates new data by observation and experiment, but also—in part because its questions are of such an "applied," practical nature—it often proceeds by analyzing data that were originally collected for another purpose. The two approaches can have different implications for privacy.

The purposes of research are many, and they overlap. Research is conducted:

The Report discusses these purposes, and the approaches, the character of the data, and the privacy-protection problems involved.

IDENTIFIABLE---KEY-CODED---ANONYMIZED

From a privacy-protection perspective, there is a very wide distinction between personally identifiable data and truly anonymized data. But in practice the demarcation between these extremes is not sharp. Attending assiduously to where particular data lie on the spectrum between them, and especially to data that are somewhere in the middle, is a crucial protection strategy.

At present, large amounts of data lie in between: they are not completely anonymized, but they are not readily identified, either. The power of computers to perform elaborate, rapid searches, and the pressures for access, mean that merely assigning simple pseudonyms affords little protection.

For data whose identifiability has, up to now, been only lightly obscured, greater efforts must now be made either: (a) to much more effectively remove personally identifying information, or to aggregate, and thus anonymize, the data; or (b) to seek the data-subjects' informed consent and hold the data under a suitably protective regimen if identifiability is retained.

For key-coded data—that is, data for which personal identifiers are removed and secreted but which are still potentially traceable via a matching code, held separately—a variety of measures must be taken to mask the identifiability near the source, separate and lock up the identifiers, safeguard the linking codes, and carefully manage linking-back to the data-subject when it is required.
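The key-coding arrangement described above can be illustrated with a minimal sketch. It is not a method prescribed by this Report; the field names (`name`, `ssn`), the function names, and the simple `authorized` flag are hypothetical stand-ins for whatever identifiers and access procedures an institution actually uses. The sketch removes the identifiers, stores them in a separate key table under a random code, and permits linking back only when authorization is given.

```python
import secrets

# Hypothetical identifying fields; a real record system would define its own.
IDENTIFYING_FIELDS = ("name", "ssn")


def key_code(records, identifying_fields=IDENTIFYING_FIELDS):
    """Split records into key-coded research rows and a separate key table."""
    research_rows = []
    key_table = {}
    for record in records:
        # The code is random, so it cannot be derived from the identifiers.
        code = secrets.token_hex(8)
        while code in key_table:
            code = secrets.token_hex(8)
        key_table[code] = {f: record[f] for f in identifying_fields}
        row = {k: v for k, v in record.items() if k not in identifying_fields}
        row["code"] = code
        research_rows.append(row)
    return research_rows, key_table


def link_back(code, key_table, authorized):
    """Return the identifiers for a code only when linking back is authorized."""
    if not authorized:
        raise PermissionError("linking back to the data-subject is not authorized")
    return key_table[code]


# Invented example records, for illustration only.
records = [
    {"name": "Pat Example", "ssn": "000-00-0000", "diagnosis": "asthma"},
    {"name": "Lee Sample", "ssn": "000-00-0001", "diagnosis": "diabetes"},
]
research_rows, key_table = key_code(records)
```

In such an arrangement the researcher would receive only `research_rows`, while `key_table` would be held separately, for example by the data source or an intermediary.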

REASONS FOR RETAINING IDENTIFIABILITY

For many purposes researchers must potentially be able to trace back, even if through intermediaries, to the data-subject. Irreversible anonymization is not necessarily desirable. There are a number of important reasons why retaining personal identifiability—either openly labelled or via key-coding—may be essential:

CONSENT, AND IRB REVIEW

The Federal Common Rule and other laws and regulations require many protections for human subjects of research. The main social instruments are informed consent of the data-subject, and Institutional Review Board (IRB) supervision. Both of these mechanisms have served society well. But both now need to be renewed.

For formal clinical trials and some other research, informed consent is routinely sought, Institutional Review Boards supervise the research, and other protections are enforced. But for many other kinds of research, for a variety of reasons notice is not routinely given nor explicit consent sought, and indeed these may be practically impossible to seek. The policy and pragmatic questions are obvious.

Retrospective studies, such as epidemiological reviews initiated years after the medical events, pose both special research opportunities and special ethical problems. So do secondary studies in databases. How should identifiability and consent be dealt with when such reviews are undertaken? And in general, when data are collected, how broad a consent should be sought for future studies that perhaps cannot be anticipated?

There is no doubt that IRBs enhance research-subject protections and provide much public reassurance. They are an integral part of biomedical research. But it is less clear that IRBs have been attending as vigorously to privacy risks as they have to physical and emotional risks. For many IRBs the workload already is heavy. Now they may well have to be asked to become more deeply engaged with the privacy and confidentiality aspects of subject protection than they have been, in database research as well as in direct experimentation, and with genetic privacy. Whether they are able and willing to do so should be assessed.

PRINCIPLES

The following principles are recommended for organizations that conduct, sponsor, or regulate health research involving personally identifiable data. They can be transposed into professional guidelines, standard operating principles, regulations, or laws. Criteria and procedures should be established that are specific to the context.

MAJOR CURRENT ISSUE CLUSTERS

The Report identifies many problem areas. The following are four large groups of issues that, while not entirely new, are growing rapidly in scale and complexity, and must urgently be attended to:

Issue cluster: Secondary research use of data, and data linking

Secondary use is, as it sounds, use of data subsequent to the original use. Much highly beneficial health research depends on it.

As databases are maturing and increasing in size and quality, their appeal as research resources also is growing. Thus the databases of healthcare finance systems and managed-care organizations, among others, are much in demand. The data hunger of managed care, and of national healthcare systems, is insatiable. Ultimately the public will benefit from research studying these systems themselves as systems, as well as from research that uses data in the systems for external purposes.

If it is decided that personally identifiable data must be used, then the most difficult issue is consent. The Report proposes a scheme of "Consent scenarios in secondary research" in the hope that it will attract discussion and development.

Related to secondary research is data linking, in which associations (links) are made between data on the same data-subject(s) in more than one data collection. Beyond such considerations as consent, the concern about particular linking studies usually is whether they might assemble "too much" information about data-subjects or the social groups of which they are representative, even if personal identities are not revealed, and/or whether the linking can lead to data-subjects' becoming identifiable by deduction.
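The concern about identifiability by deduction can be made concrete with a small sketch. The assumptions are labeled here because none of them comes from the Report: the two collections share a key code named `code`, the fields `zip3` and `birth_year` stand in for quasi-identifying attributes, and the threshold `k` is an arbitrary choice. The check, which flags attribute combinations shared by fewer than `k` linked rows, is a k-anonymity-style test substituted here for illustration.

```python
from collections import Counter


def link(collection_a, collection_b, key="code"):
    """Join rows from two collections that carry the same key code."""
    index_b = {row[key]: row for row in collection_b}
    linked = []
    for row in collection_a:
        match = index_b.get(row[key])
        if match is not None:
            linked.append({**row, **match})
    return linked


def rare_combinations(rows, quasi_identifiers, k):
    """Return attribute combinations that fewer than k rows share."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Invented example data, for illustration only.
pharmacy = [
    {"code": "a1", "zip3": "021", "drug": "insulin"},
    {"code": "b2", "zip3": "021", "drug": "insulin"},
    {"code": "c3", "zip3": "946", "drug": "albuterol"},
]
registry = [
    {"code": "a1", "birth_year": 1950},
    {"code": "b2", "birth_year": 1950},
    {"code": "c3", "birth_year": 1931},
]

linked = link(pharmacy, registry)
flagged = rare_combinations(linked, ("zip3", "birth_year"), k=2)
```

Here the third data-subject is the only one with that combination of area and birth year, so the linked row is flagged even though no name appears in either collection. A flagged combination might be aggregated further or withheld before release.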

Issue cluster: Research on private-sector health data

Immense volumes of personally identifiable data and lightly masked key-coded data, as well as effectively key-coded or anonymized data, are handled by managed-care organizations, pharmaceutical and related companies, and other private-sector institutions. Some State legal controls apply, as do the Privacy Act and other Federal laws where there is Federal involvement.

But for many health data held in the private sector, few legal controls apply in theory or are enforced in practice regarding such matters as data-subject consent, public notification, Institutional Review Board supervision, or transfer of the data for secondary study. Effective privacy, confidentiality, and security safeguards may well be in place, but this may not be fully evident.

The status of private-sector health data deserves to be reviewed. Probably it should be brought under a uniform Federal regimen.

Issue cluster: Cybersecurity

It is not an exaggeration to say that all over the world, the protection of the confidentiality and security of health data, especially data that are stored, processed, and transferred electronically, is under review. Until the several intersecting (and perhaps conflicting) goals are clarified and these problems are resolved, the envisioned future of lifetime electronic medical databases, elaborate health-data networks, and the like, will not be realized.

These issues are very different from those surrounding the security of paper records and physical filing cabinets (although these are involved, too). Thus the rubric, "cybersecurity," is used here to connote the new character of the problems.

For research, how are various consents and differential access conditions to be trailed along with various data as the data are moved around, combined with other data, linked to other data, split apart and reassorted, and processed by different users for different purposes?
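One possible way to think about this question, offered only as a sketch and not as an answer the Report gives, is to attach the consent conditions to each record so that they travel with it. The `Record` type, the purpose labels, and the rule that a linked record carries only the purposes permitted for both of its sources are all assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    code: str             # key code for the data-subject
    fields: tuple         # (name, value) pairs of data
    purposes: frozenset   # purposes the data-subject has consented to


def link_records(a, b):
    """Link two records on one data-subject; keep only shared permitted purposes."""
    if a.code != b.code:
        raise ValueError("records belong to different data-subjects")
    return Record(a.code, a.fields + b.fields, a.purposes & b.purposes)


def release(records, purpose):
    """Return only the records whose consent covers the requested purpose."""
    return [r for r in records if purpose in r.purposes]


# Invented example records and purpose labels, for illustration only.
clinic = Record("a1", (("diagnosis", "asthma"),),
                frozenset({"treatment", "epidemiology"}))
survey = Record("a1", (("smoker", False),),
                frozenset({"epidemiology", "marketing"}))

linked = link_records(clinic, survey)
for_epidemiology = release([linked], "epidemiology")
for_marketing = release([linked], "marketing")
```

Taking the intersection of purposes is a conservative design choice: the linked record can be used for epidemiology, which both sources permit, but not for marketing, which only one permits. Other rules are conceivable, and which one is appropriate is a policy question of the kind this Report raises.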

Issue cluster: Genetic privacy

As the news media are constantly reminding us, the world has entered an entirely new era in genetics: The human genome is being mapped, extremely sensitive and precise genetic tests have been developed, genetic screening has become commonplace, and an almost incredible array of genetic interventions is being explored. As an area of medicine and public-health practice, so much of the new genetics work is so innovative that for many purposes it must be considered "research."

What is changing rapidly is that we are becoming able to identify genetic factors that increase disease risk but are not uniquely the determinants of disease.

Ethical and policy solutions are being sought that will protect against using genetic data prejudicially against people's interests, such as eligibility for employment, financial credit, or health or life insurance.

Research on stored tissue samples, such as blood samples, biopsied tumor or other pathology materials, semen, and other human tissues that contain nucleated cells involves special questions. Identifiability, consent, and disclosure are the core issues.

Developing ethical guidance over genetic privacy is crucial to the future of both genetic research and applied genetics. Because genetic science is becoming more deeply integrated with other kinds of biomedical knowledge, genetic ethics must be integrated with basic biomedical ethics and not developed entirely separately.

THE INTERNATIONAL FLOW OF DATA

Personally identifiable health-research data are exchanged internationally every day, by governments, pharmaceutical firms, and others, and this will inevitably increase. Data on Americans are transferred, and American-based institutions do much transferring. Uniform international standards for protecting privacy, confidentiality, and security urgently must be developed.

NEW LAWS IN EUROPE

In October 1995 the European Union (E.U.) adopted a "Directive on the Protection of Individuals with regard to the Processing of Personal Data and on the Free Movement of Such Data." By October 1998 all fifteen E.U. Member States must bring their national laws into congruence with the Directive.

In February 1997 the Council of Europe adopted a "Recommendation on the Protection of Medical Data," the principles of which the 39 Members (which include all the E.U. countries) are urged to transpose into their national laws.

Thus the European countries are revising both their general privacy laws and their laws covering health data. The resulting changes in European laws, regulations, codes, guidelines, and practices will have important implications for international health research, and for movement of data from Europe to the U.S. and other non-European countries.

NEW HEALTH INSURANCE LAW IN THE U.S.

A new "Health Insurance Portability and Accountability Act," which became law in August 1996, established several provisions relating to confidentiality of medical records as they are handled in health insurance, billing and payment data, and the like. How these are worked out will have implications for how data are accessed and processed in health research.

PROPOSED NEW U.S. LAWS

Versions of an omnibus "Medical Records Confidentiality Act" are being considered by the U.S. Congress, as is a "Genetic Confidentiality and Discrimination Act." Some States are revising their medical-privacy laws covering information on mental health, HIV–AIDS status, or genetics. All of these will have implications for research.

DIALOGUE BETWEEN THE U.S. AND EUROPE

For the U.S., it will be very important over the next few years to engage in high-level, broadly based dialogue with European leaders over the implementation of the E.U. Directive and the Council of Europe Recommendation. Discussions will have to be held with national governments and with intergovernmental organizations. Health care and health research must be addressed specifically; they simply cannot be dealt with in the same way as banking, credit, tax, education, transport, or criminal data. Private-sector organizations involved with health research should participate fully. So should regulatory agencies that require international transfer of health data.

Focal issues regarding health research will be:

In all of this, the U.S. government and other American organizations should not only be asking for concessions and exemptions, but also taking the opportunity of this period of reform to improve the ways they themselves handle these matters, and exerting international leadership.



Last updated 5/26/97.