Despite the challenges of meeting the conditions needed to use EHR and other electronic health data for research, our interviews and literature review show that innovative solutions are being developed through a variety of publicly supported and private efforts. In particular, a number of large delivery systems and research networks have taken substantial steps forward in developing the infrastructure and methods needed to conduct this type of research.
Experts have suggested ways to advance research using EHR and other electronic health data, both in general and for the study of specific small or minority populations. These suggestions can be grouped into potential studies aimed at data validation, new tools and methods for mining and extracting data, descriptive studies of specific populations, and outcomes research. There were also recommendations to explore the types of research for which EHR data are best suited, as well as ways that they can be combined with other data sources, including survey data. In addition to potential studies, there have been recommendations for efforts to engage clinicians in order to improve the quality of data available for EHR research. Providing education about the importance of the data may motivate physicians to enter data into structured fields rather than free text. Opportunities also exist to update the current legal framework that regulates use of electronic health data for research, both to promote patients' ability to make meaningful choices and to minimize the burden on patients and researchers.
In order for research using EHR and other electronic health data to reach its full potential both in general and with small populations, engagement of key stakeholders must continue. Many of these stakeholders are working to identify critical next steps and promising pilots through an effort led by the Assistant Secretary for Planning and Evaluation (ASPE), including the development of this report with the input of technical experts. Other key stakeholders include government agencies, EHR vendors, health plans, providers, researchers, and consumer/patient groups, which all play an important role in achieving the conditions needed for research using EHR and other electronic health data.
Table ES.1. Major Conditions Required for Research Using EHR and Other Electronic Health Data on Small Populations
|Condition||Challenges||Solutions Being Tested|
|Data extraction||Requires IT skills, data storage, vendor cooperation, identification of desired records and variables||Central data warehouse within an organization, software to extract data from distributed data systems|
|Processing unstructured data||Highly heterogeneous, use of acronyms and abbreviations, may include typing and spelling errors||Tools for natural language processing|
|High-quality, complete data||Errors of omission and commission; data limited to the population receiving care from the organization, who may also receive care elsewhere that is not captured; limited generalizability||Careful interpretation of results, linkage to other data sources, use of data from integrated delivery systems and research networks|
|Privacy and Security|
|Protection of patient privacy||Informed consent as required for traditional research is too burdensome for EHR-based research and may result in biased samples when only consenters are included; information needed to identify small populations may threaten the privacy of individuals||Obtaining general consent from patients for research using EHR data, use of de-identified data, classifying analysis as quality improvement rather than research|
|Governance||Resource investment and cooperation needed for infrastructure specifying who owns, controls, and regulates the data for research use||HIPAA provides some guidance, some organizations have developed a separate institute or company to conduct research|
|Combining Multiple Data Sources|
|Data sharing||Creating central warehouse for multiple organizations is resource intensive to build, maintain, and govern, privacy and data ownership concerns||Virtual/distributed data warehouses, practice-based research networks, regional health information exchange|
|EHR interoperability||Large variety of EHR systems and vendors, lack of standards||Federal incentives, voluntary consensus standards, efforts across organizations and vendors to standardize|
Table ES.2. Ability of Federal Survey and EHR/Other Electronic Health Data to Address Challenges in Studying Small Populations
|Challenge||Survey Data||EHR and Other Electronic Health Data|
|Small size of population||Difficult to obtain an adequate sample when sampled randomly||Larger sample (although not random) increases the potential to obtain enough records from a small population|
|Uneven distribution across the country of some small populations||Difficult to obtain an adequate sample when randomly sampled||Can use data from providers where the targeted subpopulation is concentrated|
|Ability to identify members of small populations||Lack of consistent categories used to classify members makes this challenging. Also, at times categories are not granular enough to identify specific small populations||Same, although natural language processing and the use of multiple electronic data sources have shown some promise in helping to identify certain small populations. Challenges exist in training providers and staff to collect needed information|
|Detail available to understand health and health care needs||Limits to survey length and self-reported information make level of detail low||Large volume of detailed information available, documented by providers, registration staff, and patient|
|Validity of data||Relatively strong, although there are weaknesses with self-reported information||Varies by type of electronic health data as providers document information for non-research purposes|
|Ability to study small populations over time||Cross-sectional nature of most surveys does not allow this||Longitudinal nature of electronic health records is well suited to following populations over time|
|Need for different types of research||Data collection designed for generalization across the broader population and for hypothesis testing||Better suited to studying unique populations than to generalization, and to descriptive or hypothesis-generating research|
|Privacy||Access to information needed to identify small populations may risk ability to identify individuals||Secondary use of EHR and other electronic health data for research is challenging in the current legal framework|