After everyone in the sample had been grouped into each of HCC, ADG and ACG categories using the DCG and ACG software, we estimated multiple regression models on part of the sample. In the models the dependent variable is total payments for inpatient and outpatient treatment in 1995. Total payments were based on submitted claims, including both the portion of costs paid for by the health plan and the portion paid for by the individual (e.g., cost-sharing and deductibles). Total payments excluded pharmaceutical expenditures because pharmacy data were not available for all observations.
The estimated coefficients from these models were then applied to each person in the sample to predict that person's future costs relative to a base average expenditure. In effect, this applies relative weights across the groups. The four regression models used to predict expenditures included the following independent variables:
- Baseline: age, sex, hospital wage index
- Baseline+HCC: age, sex, hospital wage index, HCCs
- Baseline+ACG: age, sex, hospital wage index, ACGs
- Baseline+ADG: age, sex, hospital wage index, ADGs
We also predicted expenditures by a fifth method, termed HCC-SW (for HCC-Supplied Weights). These weights are provided with the DCG software and reflect a regression model performed by the authors of the DCG system. Our method for using them is described below.
The unit of observation in each model was the individual. Each variable was defined over the course of a year based on an individual's claims and enrollment data. The models were estimated on people who were continuously enrolled for two years, 1994 and 1995.
The independent variables in the first model were the person's age and sex, and the hospital area wage index for the patient's locale. The hospital wage index was created by the Health Care Financing Administration to measure hospital labor costs. Each MSA (metropolitan statistical area) in the country is assigned a hospital wage index. Areas outside an MSA are assigned a state wage index. Because labor expenditures make up the majority of hospital expenditures and hospital expenditures make up a large portion of total expenditures, the hospital wage index is often used to control for differences in the price of health care across different areas of the country.
The independent variables in the second model were the person's age, sex, hospital wage index, and HCCs. The HCCs entered the model as binary variables (excluding one reference HCC). The independent variables in the third model were the person's age, sex, hospital wage index, and ACG. As with the HCC model, the ACGs entered the model as binary variables. The fourth model included the person's age, sex, hospital wage index, and binary indicators for the ADG categories.
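The model specifications above can be illustrated with a minimal sketch. It assumes, for simplicity, that each person falls into exactly one category (as in a mutually exclusive grouping) and that one category is dropped as the reference. The function names, the synthetic data, and the coefficient values are hypothetical and are not taken from the study.

```python
import numpy as np

def build_design_matrix(age, sex, wage_index, categories, reference):
    """Intercept, baseline covariates, and one 0/1 indicator per
    non-reference category (hypothetical helper for illustration)."""
    cats = np.array([str(c) for c in categories])
    levels = sorted(set(cats.tolist()) - {reference})
    columns = [
        np.ones(len(cats)),
        np.asarray(age, dtype=float),
        np.asarray(sex, dtype=float),
        np.asarray(wage_index, dtype=float),
    ]
    names = ["intercept", "age", "female", "wage_index"]
    for level in levels:
        columns.append((cats == level).astype(float))
        names.append(f"cat_{level}")
    return np.column_stack(columns), names

def fit_ols(X, y):
    """Ordinary least squares via a least-squares solve."""
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta

# Synthetic, noise-free data (assumed values, for illustration only).
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 65, n)
sex = rng.integers(0, 2, n)
wage_index = rng.uniform(0.8, 1.2, n)
categories = rng.choice(["A", "B", "C"], n)

X, names = build_design_matrix(age, sex, wage_index, categories, reference="A")
true_beta = np.array([500.0, 20.0, 100.0, 1000.0, 2000.0, 5000.0])
y = X @ true_beta
beta = fit_ols(X, y)
```

Each category coefficient is interpreted relative to the omitted reference category, which is why one indicator must be left out when an intercept is included.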
We estimated the models through an iterative method. We first drew a random 70 percent sample and used it to estimate weights for the independent variables. These weights were used to predict total expenditures for the remaining 30 percent of people. Because predictive accuracy can vary across random samples, we estimated each model 50 times, each time drawing a different 70 percent random sample. This yielded 50 estimates of total expenditures for each person using the remaining 30 percent sample. The predicted expenditures we report are the averages of the 50 predictions.
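The repeated split-sample procedure can be sketched as follows. This is an illustration under stated assumptions rather than the exact implementation used here: it assumes each person's reported prediction is the average over the rounds in which that person fell into the held-out 30 percent, and the function name, seed, and synthetic data are hypothetical.

```python
import numpy as np

def repeated_holdout_predictions(X, y, n_rounds=50, train_frac=0.7, seed=0):
    """Fit OLS on a random training share and predict the held-out share,
    repeated n_rounds times; average each person's held-out predictions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    pred_sum = np.zeros(n)
    pred_count = np.zeros(n, dtype=int)
    for _ in range(n_rounds):
        order = rng.permutation(n)
        train, test = order[:n_train], order[n_train:]
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        pred_sum[test] += X[test] @ beta
        pred_count[test] += 1
    # People never drawn into a held-out sample keep a missing value.
    mean_pred = np.full(n, np.nan)
    seen = pred_count > 0
    mean_pred[seen] = pred_sum[seen] / pred_count[seen]
    return mean_pred, pred_count

# Synthetic, noise-free example (assumed values, for illustration only).
rng_demo = np.random.default_rng(1)
n_demo = 100
X_demo = np.column_stack([
    np.ones(n_demo),
    rng_demo.integers(18, 65, n_demo).astype(float),
    rng_demo.uniform(0.8, 1.2, n_demo),
])
y_demo = X_demo @ np.array([500.0, 20.0, 1000.0])
mean_pred, pred_count = repeated_holdout_predictions(X_demo, y_demo)
```

Because the estimation and prediction samples never overlap within a round, every prediction is out of sample, and averaging across rounds reduces the influence of any single random draw.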
The models were estimated using ordinary least squares (OLS) regression, a linear model, for several reasons. First, although average annual total expenditures are highly skewed by very high expenditures for a few cases, OLS is quite robust to asymmetric and highly skewed errors. Second, we expected that in our sample of people with chronic conditions, few individuals would have zero values for health care expenditures in the following year. Third, the most common nonlinear alternatives to OLS (such as the two-part model of Duan (1983)) have been found to be sensitive to the transformation problem. In these models the dependent variable is not expenditures but rather a nonlinear transformation of expenditures, such as the natural logarithm or the square root. The factor used to convert the transformed expenditure estimates back to dollars may depend on the levels of the independent variables, a complicated problem that many researchers have ignored (Mullahy, 1998; Manning, 1998). Finally, we chose the linear model over common nonlinear models because we anticipate that this approach will be most widely used due to its relative ease of implementation.
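The transformation problem can be made concrete for the logarithmic case. The notation below ($y_i$ for expenditures, $x_i$ for the independent variables, $\beta$ for the coefficients, $\varepsilon_i$ for the error term) is introduced here only for illustration.

```latex
\ln y_i = x_i'\beta + \varepsilon_i, \qquad E[\varepsilon_i \mid x_i] = 0
\;\Longrightarrow\;
E[y_i \mid x_i] = \exp(x_i'\beta)\, E[\exp(\varepsilon_i) \mid x_i].
```

If the distribution of $\varepsilon_i$ does not depend on $x_i$, the second term is a single constant that rescales every prediction. If it does depend on $x_i$ (for example, when the error variance differs across groups), the conversion factor varies with the independent variables, and applying one constant gives biased dollar predictions for some groups.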
In addition to estimating new risk-adjustment weights, in the fifth model we tested the predictive ability of the HCC weights that accompany the DCG software. We did this by normalizing the weights to have a mean value of 1.0 across our sample and then multiplying the weight for each condition by the average expenditures for all people in the plan having that condition. For example, suppose the normalized weight for rheumatoid arthritis is 1.5. This would indicate that people with rheumatoid arthritis are expected to have expenditures 50 percent higher than the average person in the population. We then compared the resulting predicted expenditures for each condition against the actual expenditures.
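The normalization step can be sketched as below. This is one possible reading, not a description of the DCG software: following the rheumatoid arthritis example, it assumes that each normalized weight is multiplied by a single base average expenditure, so that a weight of 1.5 means expenditures 50 percent above that average. The function names and all numbers are hypothetical.

```python
def normalize_weights(weights):
    """Rescale weights so that their mean across the sample is 1.0."""
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]

def predict_from_weights(weights, base_mean_expenditure):
    """Predicted expenditure = normalized weight x base average expenditure
    (the choice of base is an assumption of this sketch)."""
    return [w * base_mean_expenditure for w in normalize_weights(weights)]

# Hypothetical supplied weights for four people and a hypothetical base.
raw_weights = [1.0, 2.0, 3.0, 6.0]
normalized = normalize_weights(raw_weights)
predicted = predict_from_weights(raw_weights, base_mean_expenditure=3000.0)
```

A useful check on the normalization is that the predictions average back to the base expenditure, so the weights redistribute expected costs across people without changing the total.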
To measure how well the models predict for individuals with potentially disabling chronic conditions, we calculated the average predicted costs for the 30 percent samples of people with chronic conditions and compared the predictions to the actual costs. We also simulated how much money the plans would have lost or gained by providing care to people with various chronic and potentially disabling conditions under various systems. We did this by subtracting the mean predicted expenditures from the mean actual expenditures across the sample and then multiplying the difference by the number of people in the selected plans.
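The gain-or-loss calculation amounts to simple arithmetic, sketched here with hypothetical numbers. The sign interpretation is an assumption of this sketch: if a plan were paid the predicted amount, a positive result would represent a loss and a negative result a gain.

```python
def simulated_shortfall(actual, predicted, n_people):
    """(mean actual - mean predicted) x number of people.
    Positive values mean actual costs exceeded predictions."""
    mean_actual = sum(actual) / len(actual)
    mean_predicted = sum(predicted) / len(predicted)
    return (mean_actual - mean_predicted) * n_people

# Hypothetical expenditures for three people with a given condition.
actual_costs = [12000.0, 8000.0, 10000.0]
predicted_costs = [7000.0, 9000.0, 8000.0]
shortfall = simulated_shortfall(actual_costs, predicted_costs, n_people=500)
```

In this example the mean actual cost is 10,000 and the mean predicted cost is 8,000, so a plan covering 500 such people would face a simulated shortfall of 2,000 per person.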