Privacy and Health Research. Data linking


Related to secondary research is data linking, in which associations (links) are made between data on the same data-subject(s) in more than one data collection.111

(In everyday life we do this when we match a name on one list, say a school's student list, with the names in the telephone directory to make a best guess at the student's address and parents' names, and then, because it "rings a bell," find ourselves associating the mother's name with a name on a list of local attorneys, and so on.)

Linking may occur within a data set, or between data sets. It may occur within an organization, or between organizations. It may involve health data only, or health data and other data (such as lifestyle, socioeconomic, or police data).

How secondary analysis with data linking can be useful is indicated by this typical example:112

To examine health status of nursing home residents, cost issues, quality of care concerns (e.g., pressure ulcers, methicillin-resistant Staphylococcus aureus, or [hospital incurred] infections), outcomes (mortality, readmissions), and prevention for residents eligible for both Medicare and Medicaid, data are needed from two sources: Medicaid data to identify nursing home residents and their characteristics, and Medicare Part A data to assess hospitalization episodes.
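The kind of linking this example describes can be illustrated with a minimal sketch. It assumes, purely for illustration, that both sources carry a shared identifier; the records, the field names (person_id, nursing_home, admission), and the function name are all hypothetical and do not reflect the actual layout of Medicaid or Medicare files.

```python
# Made-up records; every field name here is an illustrative assumption.
medicaid = [
    {"person_id": "A1", "nursing_home": True},
    {"person_id": "B2", "nursing_home": True},
    {"person_id": "C3", "nursing_home": False},
]
medicare_part_a = [
    {"person_id": "A1", "admission": "2001-03-04"},
    {"person_id": "A1", "admission": "2001-09-17"},
    {"person_id": "C3", "admission": "2001-05-30"},
]


def link_hospitalizations(medicaid_rows, medicare_rows):
    """Map each nursing home resident to that person's hospital admissions."""
    # Step 1: the first source identifies who is a nursing home resident.
    residents = {r["person_id"] for r in medicaid_rows if r["nursing_home"]}
    linked = {pid: [] for pid in residents}
    # Step 2: the second source supplies hospitalization episodes,
    # kept only where the identifier matches a resident.
    for row in medicare_rows:
        if row["person_id"] in linked:
            linked[row["person_id"]].append(row["admission"])
    return linked


linked = link_hospitalizations(medicaid, medicare_part_a)
```

In this sketch, resident A1 is linked to two admissions, resident B2 to none, and C3 is excluded because the first source does not list that person as a resident. Real linkage often lacks a clean shared identifier and must match on names, dates, and similar fields instead.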

Beyond such considerations as consent, the usual concern about a particular linking study is whether it might assemble "too much" information about data-subjects or the social groups they represent, even if personal identities are not revealed, or whether the linking could make data-subjects identifiable by deduction.
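Identifiability by deduction can be made concrete with a rough sketch: once linked records carry several descriptive fields, some combinations of values may belong to only one person, even with names removed. The records, the field names (age_band, zip3, sex), and the function name below are hypothetical, and counting unique combinations is only one simple way to gauge the risk.

```python
from collections import Counter

# Made-up linked records with names already removed.
linked_records = [
    {"age_band": "80-84", "zip3": "606", "sex": "F"},
    {"age_band": "80-84", "zip3": "606", "sex": "F"},
    {"age_band": "85-89", "zip3": "606", "sex": "M"},
    {"age_band": "90-94", "zip3": "607", "sex": "F"},
]


def unique_combinations(records, fields):
    """Return the value combinations that occur in exactly one record."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return sorted(combo for combo, n in counts.items() if n == 1)


# With one field, few records stand alone; with three, more do.
by_sex = unique_combinations(linked_records, ["sex"])
by_all = unique_combinations(linked_records, ["age_band", "zip3", "sex"])
```

Here one record is singled out by sex alone, and two are singled out once age band and partial postal code are added, which is the sense in which linking more data can raise the risk of deduction.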

