Within the Microsoft Access relational database, each filing corresponds to a single data entry form. Analysts used this database of filings to capture the analytic variables from each PDF filing. As in the data extraction phase, NORC used pre-populated drop-down menus to reduce transcription error as a quality control and quality assurance measure. Variables captured in this phase of the project included:
- Unique identifier
- Market classification
- Insurer name
- Insurer NAIC code
- Date filing filed
- Date filing approved/filed
- Date effective
- Filing status
- Business status
- Increase period
- Proposed rate
- Approved rate
- Approved minimum and maximum rate increase
- Product type
- Reported member months
- Number of group contracts
- Number of covered members
For analytic purposes, NORC assigned the product types listed in the filings to one of three categories in order to simplify and standardize classifications across states. Each product type identified in a filing was recorded as “HMO, EPO, POS”; “PPO, High Deductible”; or “Indemnity, Fee-for-Service, Conventional.” Analysts researched product types with other labels and assigned them based on benefit structure, particularly whether the plan has a provider network and, if so, how out-of-network care is covered. NORC coded filings that included two or more types of insurance product as belonging to both or all three categories, as appropriate. NORC recorded values for plan enrollment – member months, group contracts, or covered members – by product type.
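As an illustration only, the category assignment described above could be sketched in Python as follows. The keyword lists are taken directly from the three category labels; the function name, the whole-word matching rule, and the "needs research" list are assumptions made for this sketch and do not describe the Access form NORC actually used.

```python
import re

# Keywords come from the three category labels named in the text.
# Matching on whole words is an assumption made for this sketch.
CATEGORY_KEYWORDS = {
    "HMO, EPO, POS": ("hmo", "epo", "pos"),
    "PPO, High Deductible": ("ppo", "high deductible"),
    "Indemnity, Fee-for-Service, Conventional": (
        "indemnity",
        "fee-for-service",
        "conventional",
    ),
}


def categorize_product_types(labels):
    """Map a filing's product labels to the three analytic categories.

    Returns (categories, needs_research): the matched categories in a
    fixed order, and any labels that matched nothing and would need to
    be researched by an analyst based on benefit structure.
    """
    matched = set()
    needs_research = []
    for label in labels:
        text = label.lower()
        hits = [
            category
            for category, keywords in CATEGORY_KEYWORDS.items()
            if any(
                re.search(r"\b" + re.escape(kw) + r"\b", text)
                for kw in keywords
            )
        ]
        if hits:
            matched.update(hits)
        else:
            needs_research.append(label)
    # A filing with several product types is coded with every category it matches.
    categories = [c for c in CATEGORY_KEYWORDS if c in matched]
    return categories, needs_research
```

A filing listing both an HMO and a PPO product would receive both categories, while a label such as "Managed Choice" (a made-up example) would be returned for manual research rather than guessed.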
Research analysts flagged any questions about a filing during data entry for review. They also manually checked each filing during data entry to ensure that it fell within the parameters of the study. Although the queries were designed to return only “in-scope” data, NORC research staff found that many of the extracted filings fell outside the study’s area of focus. This suggests continuing inconsistency in the quality of the state portals used. The analysts excluded filings that did not meet the study’s criteria for market class, type of insurance, membership, and effective date (see Table 3 for additional details).
Table 3: Reasons to Exclude Filings during the Data Entry Phase
After data entry for a state was finished, NORC conducted a multi-step review process, using a combination of randomized and purposive techniques to ensure data consistency, completeness, and quality. In the first review, a reviewer examined all filings that the original research analyst had flagged during data entry. The reviewer commented on each of these filings and instructed the original research analyst either to exclude the filing or to address the specific issue.
The second review process examined 100 randomly selected filings to determine an overall error rate for the filings. Our lead analyst selected these 100 filings randomly from a pool of all possible valid state filings. A research analyst other than the one responsible for the original data entry examined each of the 100 filings for errors in each of the data entry variables captured for the filing; any errors encountered were manually corrected and tallied. The error rate for the data entry phase of this project was approximately one percent among analytic variables, and many of the issues found were errors of omission resulting from interpretation of idiosyncratically reported data. Subsequent investigations and corrections likely reduced this rate further.
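A minimal Python sketch of this second review step is shown below. It assumes the error rate is computed as errors found divided by the number of variable fields checked; the report does not state the exact denominator, so this is an assumption, as are the function names and the optional seed.

```python
import random


def draw_review_sample(filing_ids, sample_size=100, seed=None):
    """Randomly select filings for second review, without replacement.

    `seed` is optional and only makes the draw reproducible; it is an
    assumption of this sketch, not something the report describes.
    """
    ids = list(filing_ids)
    rng = random.Random(seed)
    return rng.sample(ids, min(sample_size, len(ids)))


def error_rate(errors_found, filings_reviewed, variables_per_filing):
    """Errors per variable field checked (assumed denominator)."""
    fields_checked = filings_reviewed * variables_per_filing
    if fields_checked == 0:
        raise ValueError("no fields were checked")
    return errors_found / fields_checked
```

With purely illustrative numbers, 10 errors found across 100 filings with 10 variables each would give a rate of 10 / 1,000, or one percent.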
NORC analysts cleaned the data in two stages. First, we reviewed filings marked as valid for errors such as erroneously checked boxes (e.g., a filing not checked as invalid but with no information entered), clear typographic mistakes (e.g., an incorrectly typed date), and outlier values for analytic variables. Second, we examined filings with data that tripped automated “flags” programmed using SAS. These flags identified several types of anomalies, such as records with no year assigned to the filing, no rate change proposed or approved, or a minimum rate change larger than the maximum rate change. Analysts reviewed the flagged variables, made corrections as needed, and summarized the changes in an open “Notes” field for another reviewer to examine.
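The flags themselves were programmed in SAS; the sketch below uses Python only to illustrate the three anomaly checks named above. The field names and flag labels are hypothetical, and the "no rate change" check assumes that a record is flagged only when both the proposed and the approved rate are missing.

```python
def flag_filing(record):
    """Return the anomaly flags raised by one filing record.

    `record` is a dict with hypothetical keys; missing values are None.
    """
    flags = []

    # No year assigned to the filing.
    if record.get("filing_year") is None:
        flags.append("no_year")

    # No rate change proposed or approved (assumed: both are missing).
    if record.get("proposed_rate") is None and record.get("approved_rate") is None:
        flags.append("no_rate_change")

    # Minimum rate change larger than the maximum rate change.
    low = record.get("approved_min_rate")
    high = record.get("approved_max_rate")
    if low is not None and high is not None and low > high:
        flags.append("min_exceeds_max")

    return flags
```

A record that raises any flag would then go to an analyst for review and correction, with the change summarized in the "Notes" field.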
28 If the effective date was missing, filings excluded on the basis of date were evaluated using the following rule: the approval date must fall between 10/1/2010 and 12/31/2012 for the filing to be included.
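The fallback date rule in this note can be expressed as a short check. The sketch below assumes that both endpoints of the window are inclusive and that a filing with neither date fails the check; the note states neither point, and the function name is hypothetical.

```python
from datetime import date

# Window taken from the note; inclusive endpoints are an assumption.
WINDOW_START = date(2010, 10, 1)
WINDOW_END = date(2012, 12, 31)


def include_when_effective_date_missing(approval_date):
    """Apply the fallback rule for filings that lack an effective date.

    Returns True when the approval date falls inside the window.
    A missing approval date returns False (an assumption of this sketch).
    """
    if approval_date is None:
        return False
    return WINDOW_START <= approval_date <= WINDOW_END
```

This check applies only when the effective date is missing; filings with an effective date are screened on that date instead.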