In § 164.506(d) of the NPRM, we proposed that the privacy standards would apply to "individually identifiable health information," and not to information that does not identify the subject individual. The statute defines individually identifiable health information as certain health information:
(i) Which identifies the individual, or
(ii) With respect to which there is a reasonable basis to believe that the information can be used to identify the individual.
As we pointed out in the NPRM, difficulties arise because, even after removing obvious identifiers (e.g., name, social security number, address), there is always some probability or risk that any information about an individual can be attributed to that individual.
The NPRM proposed two alternative methods for determining when sufficient identifying information has been removed from a record to render the information de-identified and thus not subject to the rule. First, the NPRM proposed the establishment of a "safe harbor": if all of a list of 19 specified items of information had been removed, and the covered entity had no reason to believe that the remaining information could be used to identify the subject of the information (alone or in combination with other information), the covered entity would have been presumed to have created de-identified information. Second, the NPRM proposed an alternative method so that covered entities with sufficient statistical experience and expertise could remove or encrypt a combination of information different from the enumerated list, using commonly accepted scientific and statistical standards for disclosure avoidance. Such covered entities would have been able to include information from the enumerated list of 19 items if they (1) believed that the probability of re-identification was very low, and (2) removed additional information if they had a reasonable basis to believe that the resulting information could be used to re-identify someone.
We proposed that covered entities and their business partners be permitted to use protected health information to create de-identified health information using either of these two methods. Covered entities would have been permitted to further use and disclose such de-identified information in any way, provided that they did not disclose the key or other mechanism that would have enabled the information to be re-identified, and provided that they reasonably believed that such use or disclosure of de-identified information would not have resulted in the use or disclosure of protected health information.
A number of examples were provided of how valuable such de-identified information would be for various purposes. We expressed the hope that covered entities, their business partners, and others would make greater use of de-identified health information than they do today, when it is sufficient for the purpose, and that such practice would reduce the burden and the confidentiality concerns that result from the use of individually identifiable health information for some of these purposes.
In §§ 164.514(a)-(c) of this final rule, we make several modifications to the provisions for de-identification. First, we explicitly adopt the statutory standard as the basic regulatory standard for whether health information is individually identifiable health information under this rule. Information is not individually identifiable under this rule if it does not identify the individual, or if the covered entity has no reasonable basis to believe it can be used to identify the individual. Second, in the implementation specifications we reformulate the two ways in which a covered entity can demonstrate that it has met the standard.
One way a covered entity may demonstrate that it has met the standard is if a person with appropriate knowledge and experience applying generally accepted statistical and scientific principles and methods for rendering information not individually identifiable makes a determination that the risk is very small that the information could be used, either by itself or in combination with other available information, by anticipated recipients to identify a subject of the information. The covered entity must also document the analysis and results that justify the determination. We provide guidance regarding this standard in our responses to the comments we received on this provision.
We also include an alternate, safe harbor, method by which covered entities can demonstrate compliance with the standard. Under the safe harbor, a covered entity is considered to have met the standard if it has removed all of a list of enumerated identifiers, and if the covered entity has no actual knowledge that the information could be used alone or in combination to identify a subject of the information. We note that in the NPRM, we had proposed that to meet the safe harbor, a covered entity must have "no reason to believe" that the information remained identifiable after the enumerated identifiers were removed. In the final rule, we have changed the standard to one of actual knowledge in order to provide greater certainty to covered entities using the safe harbor approach.
In the safe harbor, we explicitly allow age and some geographic location information to be included in the de-identified information, but all dates directly related to the subject of the information must be removed or limited to the year, and zip codes must be removed or aggregated (in the form of most 3-digit zip codes) to include at least 20,000 people. Extreme ages of 90 and over must be aggregated to a category of 90+ to avoid identification of very old individuals. Other demographic information, such as gender, race, ethnicity, and marital status are not included in the list of identifiers that must be removed.
The intent of the safe harbor is to provide a means to produce some de-identified information that could be used for many purposes with a very small risk of privacy violation. The safe harbor is intended to involve a minimum of burden and convey a maximum of certainty that the rules have been met by interpreting the statutory "reasonable basis to believe that the information can be used to identify the individual" to produce an easily followed, cook book approach.
Covered entities may use codes and similar means of marking records so that they may be linked or later re-identified, if the code does not contain information about the subject of the information (for example, the code may not be a derivative of the individual's social security number), and if the covered entity does not use or disclose the code for any other purpose. The covered entity is also prohibited from disclosing the mechanism for re-identification, such as tables, algorithms, or other tools that could be used to link the code with the subject of the information.
Language to clarify that covered entities may contract with business associates to perform the de-identification has been added to the section on business associates.