Core Health Data Elements: Report of the National Committee on Vital and Health Statistics. Progress, issues and problems raised


Several major issues were raised that were broader than the discussions of specific data elements. Virtually all participants saw the need for uniform data items and definitions, and the issue of a unique identifier was a frequent topic. This issue represents more than just what item or set of items the identifier will include; it opens up the whole issue of data linkage, privacy, and data confidentiality with its relevant benefits and risks. Another issue was the role of the National Committee itself as the source of information on common data elements. Most participants eagerly supported an independent committee, such as this one, to gather input and advise the public health and health care communities. However, the activities envisioned by many participants go much further than an advisory committee can handle. These discussions led to the issue of needing DHHS staff dedicated to participating in the meetings of numerous data standards committees, advising the Department, and producing further iterations of data elements as future agreement is reached. Currently, such a staff does not exist.

Some states and organizations are on the cutting edge of multiple use of standardized data. For example, the State of California, in testimony to the NCVHS, described its efforts to improve health and health care delivery by linking data collected through medical facilities, school-based health and educational data bases, and need-based data bases such as eligibility listings for the Special Supplemental Nutrition Program for Women, Infants and Children (WIC) or reduced-price school lunch programs. This project has brought together efforts from several state agencies, including education (for the school data), agriculture (the source of WIC data in some states), and health departments. Consensus building on data elements and definitions was, as always, a complex issue.

Data quality is a perennial issue. Although the UHDDS has been in the field for two decades and its data items are widely used by government and private organizations, issues of quality and comparability remain. A presentation by AHCPR reported on a study of 10 state data organizations and two statewide hospital associations participating in the Healthcare Cost and Utilization Project (HCUP-3). (Currently approximately 40 states collect health data on inpatient hospital stays.) AHCPR compared the 12 systems with the UB-92 and classified deviations at three levels: problems that are easy, moderately difficult, or difficult to correct.

A detailed report of these findings is in the process of publication by AHCPR, but findings have shown that even well-recognized standards are not consistently followed. Any new data items, as well as the old, must be produced with clear instruction on data collection and coding.

Confidentiality of identifiable records is another critical issue. Currently, data are often shared within a facility in an identifiable format. However, identifiers are commonly removed when a data set is provided outside of a facility, such as to a state health data organization. Now, with the movement toward HMOs, PPOs, and other types of managed care, there may be a greater need to share identifiable data. States have varying laws to protect the confidentiality of these data, and often the laws do not protect data that have crossed state lines. Sufficient penalties for breach of confidentiality either do not exist or are not enforced. There have been several proposals for Federal legislation in recent years; however, to date, no Federal legislation protecting the confidentiality of health records exists.

Several states, including California, Oklahoma, and New York, presented findings on using a combination of key data items to perform probabilistic matches. Using items such as first name of mother, first digits of last name, date of birth, and place of birth, matches could be obtained without identifying the individual. It appeared that some types of data linkage could be achieved in states with smaller populations, but might not work nationwide. New York, using the last 4 digits of the Social Security Number together with other characteristics (such as date of birth), indicated a match rate exceeding 99 percent.
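As a rough illustration of how a probabilistic match on a combination of data items can work, the following is a minimal sketch of agreement-weight scoring in the general style of Fellegi-Sunter record linkage. It is not the method used by any of the states named above. The field names, the m and u probabilities, and the threshold are all hypothetical values chosen only for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not from the report):
#   m = probability the field agrees when two records belong to the same person
#   u = probability the field agrees when two records belong to different people
FIELD_PARAMS = {
    "mother_first_name": (0.95, 0.02),
    "last_name_prefix": (0.97, 0.01),
    "date_of_birth": (0.98, 0.003),
    "place_of_birth": (0.90, 0.05),
}


def normalize(value):
    """Compare values case-insensitively and without surrounding blanks."""
    return str(value).strip().lower()


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum log2 agreement/disagreement weights over the fields present in both records."""
    total = 0.0
    for field, (m, u) in params.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if normalize(a) == normalize(b):
            total += math.log2(m / u)  # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def is_match(rec_a, rec_b, threshold=10.0):
    """Declare a link when the total weight reaches the (hypothetical) threshold."""
    return match_weight(rec_a, rec_b) >= threshold
```

In a sketch like this, fields that rarely agree by chance (such as date of birth) carry more weight than fields that often do (such as place of birth). That is one way to see why a combination of items that links records well in a smaller population might not discriminate as well nationwide, where chance agreement is more common.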

Problems could arise from adding and modifying data items and definitions too frequently. James Cooney, Ph.D., former member, NCVHS, described the burden to organizations from the addition of a single data item. Each item that is recommended must be considered carefully. Additionally, modifying items or definitions too frequently will cause confusion, produce overlapping data definitions in a single data year, and add to the burden of the facility or organization.

In addition to the presentations at the meetings, more than 100 written responses to the solicitation letter were reviewed and considered. Of these, approximately 70 percent provided information about their data elements. A chart in appendix D shows the distribution of all respondents by type of organization. Approximately 30 percent of respondents were from state and local governments, followed by professional associations and the Federal Government with 18 percent and 17 percent, respectively. Providers, insurers, and universities represented about 7 percent each. A listing of all participants in the two meetings, as well as those who provided written responses at any point in the process, is found in appendix E.

The Committee reviewed all of the input received from the hearings, meetings, letters and other communications. In addition, the historical knowledge of the NCVHS and its earlier decisions in the area of data standardization played a role in the preparation of a listing of core data elements and, where possible, recommended definitions. The draft listing was disseminated in early April 1996 (see appendix F) to the original mailing list and especially to those who had provided earlier assistance. To assure the widest possible distribution, the document was also placed on the DHHS and NCHS Home Pages in an electronic format. More than 150 responses to this second request were received, including responses from leaders in the health care and health care information fields. A chart in appendix G shows the distribution of all respondents to this second mailing by type of organization.