The MarketScan database provides linked claims data on over 5 million enrollees from 52 employers and 80 different health insurance carriers. The data include individuals with private insurance from across the United States. The data, obtained directly from large employers, include comprehensive claims information (inpatient, outpatient, pharmaceutical and behavioral carve-out information) on all employees who work for a firm, regardless of health plan or whether medical benefits are received from the same carrier as behavioral health benefits. MarketScan includes plans offering very generous health benefits (e.g., large employers and union health and benefit plans), as well as more traditional plans and consumer-directed health plans. Thus the database covers enrollees whose access to behavioral health services ranges from unlimited to very limited.
For a small subset (10%) of the claims and encounters databases (110 health plans in 2008), Thomson Reuters has added benefit plan design information, which it created from plan booklets obtained from the employers providing the data. The booklets vary in their level of detail and depth, so Thomson Reuters codes as much information as possible. Due to the variability in the quality and specificity of information, however, the health plan benefit data are not always complete; nor is it guaranteed that the same specific constructs are being measured precisely across plans. Despite these limitations, we believed useful information could be obtained with respect to general cost sharing requirements (deductibles, co-payments, co-insurance rates), limits, exclusions, and other plan aspects important for understanding the average cost of providing coverage for a plan.
We used the 2008 linked benefits claims and encounters databases to generate a plan-level database for conducting descriptive analyses of current coverage of behavioral health spending and for assessing the feasibility of estimating an econometric model of the average medical cost (PMPM cost), which would form the backbone of an actuarial model. Although the 2008 database listed identifiers for 110 plans, two plans in the benefits database had no actual enrollees, four plans consistently reported missing information for all plan benefit design measures, and another lacked information on key benefit variables (co-payments and deductibles) relevant for examining PMPM costs (which, when combined with an administrative loading factor, determine premiums). Thus our starting analytic sample consisted of general plan benefit information for 103 plans.
Limited project resources and the high cost of the data precluded us from obtaining additional years of data to augment the sample. Because medical costs and medical practices are known to vary substantially across geographic regions, additional information regarding the cost of providing particular services can be gleaned by disaggregating the 103 plans down to the region level. Four principal regions are specified in the data (Northeast, Southeast, Midwest and West), but a “national” option was also provided, generating five possible values for this region indicator and a maximum of 432 plan-by-region observations (before missing values are considered). This relatively large number of plan-by-region observations emerges because the overwhelming majority of the 103 original plans (87.4%, n = 90) operated in more than one region.7
A problem with disaggregating plans, however, is that it can artificially generate “small” plans out of what are actually large plans. By that we mean that a relatively small share of a plan’s enrollees might be serviced in one region, while the bulk of the plan’s enrollees are covered in one or two other regions, and yet calculations of average cost are based on the number of enrollees in a given region and not in the overall plan. If an intermediate service used infrequently, such as residential treatment, is used by an enrollee in the artificially generated “small” plan, then it would appear to have a much higher impact on total spending than what was truly experienced by the health plan. To ensure our analysis was not affected by the disaggregation of plans across regions, we used as our final analytic sample a version of the data that removed plan-by-region observations with fewer than 50 people in one region if 85% or more of the plan’s enrollees were located in another region. This sample contained 290 plan-by-region observations. Although some person-level data were not used in creating the analytic sample, all 103 plans are represented.
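The exclusion rule described above can be illustrated with a minimal sketch. It is not the code used for the analysis: the function name, the tuple layout, and the example enrollment counts are hypothetical, and the sketch assumes one reading of the rule, namely that a plan-by-region observation is dropped when it has fewer than 50 enrollees and a single other region holds 85% or more of that plan's enrollees.

```python
from collections import defaultdict

MIN_ENROLLEES = 50      # size threshold stated in the text
DOMINANT_SHARE = 0.85   # share threshold stated in the text


def filter_plan_regions(rows):
    """Apply the assumed exclusion rule.

    rows: list of (plan_id, region, enrollees) tuples (hypothetical layout).
    Returns the plan-by-region rows that are kept.
    """
    # Enrollees per region within each plan, and plan-wide totals.
    by_plan = defaultdict(dict)
    for plan_id, region, n in rows:
        by_plan[plan_id][region] = n
    totals = {plan_id: sum(regions.values()) for plan_id, regions in by_plan.items()}

    kept = []
    for plan_id, region, n in rows:
        total = totals[plan_id]
        # Largest share of the plan held by any single *other* region.
        others = [m for r, m in by_plan[plan_id].items() if r != region]
        max_other_share = max(others) / total if others and total else 0.0
        # Drop only small observations that are dominated by another region.
        if n < MIN_ENROLLEES and max_other_share >= DOMINANT_SHARE:
            continue
        kept.append((plan_id, region, n))
    return kept


# Hypothetical enrollment counts, for illustration only.
example = [
    ("A", "West", 9000),
    ("A", "Midwest", 30),     # small and dominated by West: dropped
    ("B", "Northeast", 40),   # small, but no other region holds 85%: kept
    ("B", "Southeast", 45),
    ("C", "West", 20),        # single-region plan: kept
]
kept = filter_plan_regions(example)
```

Under this reading, a plan's dominant region is never dropped, because no other region can hold 85% of the enrollees when the region in question already holds 85% or more. This is consistent with all plans remaining represented after the filter is applied.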
7. Seven of the 13 plans operating in one region operated only in the West, four operated in the Midwest, and two operated in the South. None of the plans indicating only one region listed that region as national.