Alternative Risk-Adjustment Approaches to Assessing the Quality of Home Health Care: Final Report. Preliminary Analyses


Currently, different subsets of home care patients are assessed when determining an agency’s performance on each OBQI quality indicator. The three utilization outcomes are computed for all episodes except those ending in death (i.e., approximately 98% of episodes are included). For all other outcomes, two additional criteria are used to determine whether or not a given episode will be included. First, the episode must end in discharge to the community (approximately 70% of episodes), because the endpoint measures used to calculate improvement or stabilization on the non-utilization outcomes are collected only on the more comprehensive assessment made for those patients discharged to the community. Second, the start of care (SOC) assessment item for the outcome must permit the patient to have the potential to have the outcome. OBQI health status improvement measures are binary indicators of whether the patient’s status at discharge is better than at baseline. Individuals who cannot improve because they do not have any deficit in the quality indicator at baseline are excluded from estimates of improvement. OBQI health status stabilization measures are binary indicators of whether the patient’s status at discharge is the same or better than at baseline. Individuals who cannot deteriorate because they are in the worst category of the quality indicator at baseline are excluded from stabilization estimates.
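The inclusion rules for the improvement and stabilization measures described above can be sketched in a few lines. This is a minimal illustration, not the OBQI implementation: the function names are hypothetical, and it assumes each indicator is an integer scale on which 0 means no deficit and higher values mean a sicker state.

```python
from typing import Optional


def improvement_indicator(baseline: int, discharge: int, best: int = 0) -> Optional[int]:
    """Return 1 if status at discharge is better than at baseline, else 0.

    Returns None when the episode is excluded because the patient has no
    deficit at baseline and therefore cannot improve.
    Assumption: lower values are healthier and `best` is the no-deficit level.
    """
    if baseline == best:
        return None  # excluded from improvement estimates
    return 1 if discharge < baseline else 0


def stabilization_indicator(baseline: int, discharge: int, worst: int) -> Optional[int]:
    """Return 1 if status at discharge is the same as or better than baseline, else 0.

    Returns None when the episode is excluded because the patient is already
    in the worst category at baseline and therefore cannot deteriorate.
    """
    if baseline == worst:
        return None  # excluded from stabilization estimates
    return 1 if discharge <= baseline else 0
```

Under these assumptions, a patient who moves from level 2 to level 1 counts as improved and stabilized, while a patient who starts at level 0 is left out of the improvement denominator altogether.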

The initial developmental sample from which the University of Colorado identified individuals with the potential to have an outcome is 125,000 episodes. However, the developmental sample was supplemented by the University of Colorado for four of the 11 HHQI outcomes due to low numbers of episodes where patients had the potential to have the outcome. The developmental sample was 250,000 episodes for Improvement in Upper Body Dressing, Improvement in Transferring, and Improvement in Oral Medications, and approximately 350,000 episodes for Improvement in Confusion.

Respecification of Core Risk-Adjusters

After replicating the risk-adjustment models developed by the University of Colorado, alternative models were estimated using exactly the same coding of risk-adjusters as in the current models, with two exceptions where theory or prior evidence suggested that other codings were likely to be more meaningful. First, instead of a continuous measure of the age of the home care patient, four categories were specified: <65; 65 to <75 (reference category); 75 to <85; and 85 or older. Second, a single numeric scale was created from the individual OASIS ADL and IADL measures at baseline. Spector and Fleishman (1998) examined the psychometric properties of ADLs and IADLs and concluded that they represent a single construct. We approximated the scale developed by Spector and Fleishman by classifying persons as either independent or dependent on human help to complete each ADL and IADL. The scale is a simple count of the number of ADLs and IADLs that the patient needs human help to complete. It ranges from 0 to 14.
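The two recodings can be illustrated as follows. This is a sketch under stated assumptions: the function and variable names are hypothetical, and the dependency flags are supplied as a plain list of 14 booleans because the specific OASIS items behind the scale are not listed here.

```python
from typing import Dict, Sequence

# Age categories from the text; "65 to <75" is the reference category.
AGE_REFERENCE = "65 to <75"
AGE_NON_REFERENCE = ("<65", "75 to <85", "85 or older")


def age_category(age: float) -> str:
    """Assign a patient's age to one of the four categories."""
    if age < 65:
        return "<65"
    if age < 75:
        return AGE_REFERENCE
    if age < 85:
        return "75 to <85"
    return "85 or older"


def age_dummies(age: float) -> Dict[str, int]:
    """Dummy-code age, omitting the reference category."""
    category = age_category(age)
    return {label: int(category == label) for label in AGE_NON_REFERENCE}


def dependency_count(needs_human_help: Sequence[bool]) -> int:
    """Count the ADLs and IADLs the patient needs human help to complete.

    Assumption: one boolean per ADL/IADL item, 14 items in total,
    so the result ranges from 0 to 14.
    """
    if len(needs_human_help) != 14:
        raise ValueError("expected 14 ADL/IADL dependency flags")
    return sum(1 for flag in needs_human_help if flag)
```

A patient in the reference age group gets zeros on all three age dummies, and a patient dependent on human help for every item scores 14.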

After initial models were estimated, we examined the direction and consistency of the effect of the core risk-adjusters across the 11 HHQI quality indicator outcome models. A number of the original risk-adjusters were integer scales that did not appear to be linearly related to the HHQI quality indicators, or whose effect on the outcome measures was the opposite of what would be expected. These were respecified as follows:

  • Hearing impairment was dropped from the core set of measures because of inconsistent effects and limited conceptual importance.

  • Vision impairment was respecified into two dummy variables with a reference category of no impairment.

  • Speech impairment was grouped into four categories with no speech impairment as the reference category and a top category that combined levels 3, 4 and 5.

  • The original depression measure is a count of depressive symptoms, ranging from 0 to 5, which is highly skewed toward no symptoms; it was respecified as two dummy variables (i.e., 1 symptom only, 2 or more symptoms) with a reference category of no symptoms.

  • A set of mutually exclusive indicators was created to measure frequency of urinary incontinence (“during the night,” “during the day,” “night and day,” and “urinary catheter present”) with a reference category of no incontinence.

  • A set of mutually exclusive categorical variables was created for bowel incontinence similar to those created for urinary incontinence.

  • A set of mutually exclusive categorical variables was created to indicate the type of help provided by the primary caregiver (i.e., the primary caregiver provides “help with ADLs (with or without providing help with IADLs),” “help with IADLs only,” or “some other type of help”) with a reference category of no primary caregiver.
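Two of the respecifications listed above state their cut-points explicitly and can be sketched directly. The names below are hypothetical, and the speech scale is assumed to run from 0 (no impairment) to 5, as implied by the combined top category.

```python
from typing import Dict


def depression_dummies(symptom_count: int) -> Dict[str, int]:
    """Recode the 0-5 depressive symptom count into two dummy variables.

    Reference category: no symptoms (both dummies are 0).
    """
    if not 0 <= symptom_count <= 5:
        raise ValueError("symptom count must be between 0 and 5")
    return {
        "dep_1_symptom_only": int(symptom_count == 1),
        "dep_2_or_more_symptoms": int(symptom_count >= 2),
    }


def speech_dummies(level: int) -> Dict[str, int]:
    """Group speech impairment into four categories (three dummies).

    Reference category: level 0, no speech impairment.
    Levels 3, 4 and 5 are combined into a single top category.
    """
    if not 0 <= level <= 5:
        raise ValueError("speech level must be between 0 and 5")
    return {
        "speech_level_1": int(level == 1),
        "speech_level_2": int(level == 2),
        "speech_level_3_to_5": int(level >= 3),
    }
```

Because the categories are mutually exclusive, at most one dummy in each set equals 1 for a given patient, and the reference category is the case where all of them are 0.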

We also categorized dyspnea, which was included in the risk-adjustment models of the ADL outcomes. The original integer scale was not linearly related to these outcomes. In some models of ADL outcomes, the direction of the effect of dyspnea was positive, suggesting improvement in ADL outcomes as the level of impairment increased (although the effect generally decreased in magnitude as the impairment level increased). In other models, the effect of higher levels of impairment on ADL outcomes was negative, although never statistically significant. Despite its unexpected and inconsistent effects, we left dyspnea in the preliminary alternative risk-adjustment models for ADLs because of its conceptual importance. Dyspnea did have the expected effect on the utilization outcomes, with the probability of Emergent Care and Acute Care Hospitalization rising as the severity of dyspnea increased.

Respecification of Baseline and Prior Values of Outcome Indicators

In our initial analyses, the baseline and “prior” values of the outcome indicators were treated as continuous variables, following the approach of the University of Colorado. Higher values always represent a “sicker” state. Subsequently, these indicators were respecified as categorical variables to test the assumption that baseline and prior variables are linearly related to the outcome indicators. The respecification improved the explanatory power of the risk-adjustment models, in a few cases substantially.

Summary of Preliminary Modeling Results

Six models were estimated for each outcome. We began with a model limited to the core set of clinical, demographic and payment risk-adjusters, including the baseline value of the outcome measure if it was not already among the core variables. Outcome-specific risk-adjusters were added at subsequent steps: Model 2 included other clinical characteristics at baseline that might plausibly affect the outcome, and Model 3 included measures of clinical status prior to home health admission. Four clinical therapies at baseline (i.e., oxygen therapy, IV/infusion therapy, enteral/parenteral nutrition, and ventilator) were then added to the risk-adjustment models for all 11 outcomes (Model 4). The living arrangements and social support indicators were subsequently added to all models (Model 5). Finally, length of stay (LOS) was added solely to allow comparison of current and alternative model statistics and parameter estimates.
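The stepwise build-up of the six specifications can be written out as a small data structure. This is only an organizational sketch: the block descriptions paraphrase the text, the labels "Model 1" and "Model 6" are assumed from the count of six models, and the helper name is hypothetical.

```python
from typing import Dict, List, Tuple

# Each step adds one block of risk-adjusters to everything added before it.
MODEL_STEPS: List[Tuple[str, str]] = [
    ("Model 1", "core clinical, demographic and payment risk-adjusters"),
    ("Model 2", "other clinical characteristics at baseline"),
    ("Model 3", "clinical status prior to home health admission"),
    ("Model 4", "clinical therapies at baseline"),
    ("Model 5", "living arrangements and social support indicators"),
    ("Model 6", "length of stay (added for comparison only)"),
]


def nested_specifications(steps: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return, for each model, the cumulative list of variable blocks it contains."""
    specifications: Dict[str, List[str]] = {}
    included: List[str] = []
    for name, block in steps:
        included.append(block)
        specifications[name] = list(included)  # copy so later steps do not alter it
    return specifications
```

Calling `nested_specifications(MODEL_STEPS)["Model 3"]` returns the first three blocks, which is the specification whose explanatory power is compared with the current models in the discussion that follows.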

By Model 3 (i.e., after the addition of the prior health status measures) the risk-adjustment models developed in the preliminary analyses generally approached but did not exceed the explanatory power of the HHQI risk-adjustment models developed by the University of Colorado. The effect of the measures of health status prior to admission on the explanatory power of the risk-adjustment models varied depending on the outcome indicator. They had a modest effect in the improvement in ADL models as well as the one improvement in an IADL model (i.e., Improvement in Management of Oral Medication). Prior health status risk-adjusters had virtually no effect in the remaining models of health status outcomes and were not included in the risk-adjustment models of the two utilization outcomes.

The social support indicators, while conceptually important, added almost nothing to the explanatory power of risk-adjustment models that already included clinically relevant variables. The one exception was the Improvement in Oral Medication risk-adjustment model, where the addition of the core social support measures produced a one percentage point increase in the R-squared statistic and a statistically significant improvement in the fit of the model (p < 0.001).

The generally lower explanatory power of the preliminary alternative models is not surprising since the “stepwise” logistic regression technique used to develop the current models is likely to result in models with close to the best explanatory power possible for the data set analyzed. In addition, the exclusion of LOS from the alternative models, because it can be affected by the quality of care provided and therefore is not an appropriate risk-adjuster, results in a reduced R-squared value for the alternative utilization outcome models relative to the current models.

Whether the alternative models are more parsimonious than the University of Colorado models depends on whether the models are considered individually or all 11 are considered together. Only two of the preliminary risk-adjustment models were more parsimonious than the corresponding models developed by the University of Colorado to risk-adjust the 11 initial HHQI outcome indicators. The total number of OASIS items and elements used to risk-adjust all 11 HHQI outcome indicators, however, was smaller.
