Minimizing Disclosure Risk in HHS Open Data Initiatives

A. Disclosure Risk


In a seminal paper on protecting the confidentiality of data released to the public, Dalenius (1977) described the problem in the following terms: “access to a statistical database should not enable one to learn anything about an individual that could not be learned without access” (cited in Dwork and Naor 2010). The literature on disclosure distinguishes between identity disclosure and attribute disclosure (Duncan and Lambert 1989). An identity disclosure assigns a name to a record in a database, while an attribute disclosure assigns a characteristic to an individual or small group of individuals. Identity disclosure implies attribute disclosure, but attribute disclosure can occur without identity disclosure. For example, a database may reveal that all of the members of a particular subpopulation share a specific characteristic. If an individual is known to belong to this subpopulation, something is learned about the individual even though no record in the database can be assigned unambiguously to that individual. Research on protecting confidentiality in tabular data has recognized the risk of attribute disclosure and devoted considerable attention to it, but research on protecting confidentiality in microdata has focused on identity disclosure. For federal agencies in particular, the goal in protecting the confidentiality of microdata is to prevent the re-identification of records that have been released as anonymous.
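
The way attribute disclosure can arise without identity disclosure can be illustrated with a minimal sketch. The Python example below is an illustration only, not a method taken from the report: the records, the field names (`age_band`, `zip3`, `diagnosis`), and the function `homogeneous_groups` are all hypothetical. It looks for subpopulations in which every record shares the same value of a sensitive characteristic, which is the situation described above.

```python
from collections import defaultdict


def homogeneous_groups(records, group_keys, sensitive_key):
    """Find subpopulations whose records all share one sensitive value.

    Returns a dict mapping each such group to (shared value, group size).
    Illustrative helper only; the name and signature are assumptions.
    """
    values_by_group = defaultdict(set)
    size_by_group = defaultdict(int)
    for rec in records:
        group = tuple(rec[k] for k in group_keys)
        values_by_group[group].add(rec[sensitive_key])
        size_by_group[group] += 1
    return {
        group: (next(iter(values)), size_by_group[group])
        for group, values in values_by_group.items()
        if len(values) == 1  # every member has the same sensitive value
    }


# Hypothetical released records: no names, only coarse attributes.
records = [
    {"age_band": "30-39", "zip3": "200", "diagnosis": "diabetes"},
    {"age_band": "30-39", "zip3": "200", "diagnosis": "diabetes"},
    {"age_band": "30-39", "zip3": "200", "diagnosis": "diabetes"},
    {"age_band": "40-49", "zip3": "200", "diagnosis": "asthma"},
    {"age_band": "40-49", "zip3": "200", "diagnosis": "diabetes"},
]

at_risk = homogeneous_groups(records, ["age_band", "zip3"], "diagnosis")
print(at_risk)  # {('30-39', '200'): ('diabetes', 3)}
```

In this made-up data, the three records in the first group cannot be told apart, so no record can be assigned to a named person. Even so, anyone known to be aged 30-39 and living in that area is revealed to have the shared diagnosis. The second group contains two different diagnoses, so it is not flagged.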

Some confidentiality provisions in federal legislation interpret any disclosure of an individual identity, regardless of how it is accomplished, as a violation of the law. Other confidentiality provisions (for example, those in HIPAA) acknowledge that it is impossible to release data for which the risk of disclosure is zero. Under such provisions, if the effort to prevent disclosure produced a very low risk of re-identification, that effort would satisfy the legal requirement for protecting the data, even if a breach of confidentiality occurred. The United Kingdom’s Office for National Statistics does not consider a re-identification to be a disclosure by the agency if the breach required more than a reasonable amount of effort (Duncan et al. 2011, p. 28).
