Alternative Risk-Adjustment Approaches to Assessing the Quality of Home Health Care: Final Report


Christopher M. Murtaugh, Ph.D., Timothy R. Peng, Ph.D., Gil A. Maduro, Ph.D., Elisabeth Simantov, Ph.D., and Thomas E. Bow, M.A., M.S.W.

Center for Home Care Policy and Research, Visiting Nurse Service of New York

July 2006

This report was prepared under contract #HHS-100-03-0011 between the U.S. Department of Health and Human Services (HHS), Office of Disability, Aging and Long-Term Care Policy (DALTCP) and the Urban Institute. For additional information about this subject, you can visit the DALTCP home page or contact the ASPE Project Officers, William Marton and Hakan Aykan, at HHS/ASPE/DALTCP, Room 424E, H.H. Humphrey Building, 200 Independence Avenue, S.W., Washington, D.C. 20201.

The opinions and views expressed in this report are those of the authors. They do not necessarily reflect the views of the Department of Health and Human Services, the contractor or any other funding organization.



One of the central goals of the U.S. Department of Health and Human Services is to improve the quality of health care received by all Americans. In the home health care area, the Department has two key initiatives developed and implemented by the Centers for Medicare and Medicaid Services (CMS) to assess, improve, and report quality. The Outcome-Based Quality Improvement (OBQI) program provides reports to all Medicare-certified home health agencies so that they can identify potential quality problems and devise appropriate strategies to address them. The Home Health Quality Initiative (HHQI) uses a subset of the OBQI quality measures for public reporting.

There are 41 home health quality measures in the OBQI framework, including functional, physiologic, emotional/behavioral, cognitive, and health care utilization outcomes. The source of the data used in OBQI and HHQI is the Outcome and Assessment Information Set (OASIS). Since July 1999, home health agencies participating in the Medicare or Medicaid programs have been required to collect OASIS on all patients age 18 or older admitted to Certified Home Health Agencies. The two exceptions are persons receiving pre- or postpartum maternity services and those receiving only personal care, chore or housekeeping services.

Thirty of the 41 OBQI quality indicators now are risk-adjusted when comparing outcomes for patients from one agency with outcomes for patients from all agencies in OBQI reports. An additional OBQI patient outcome indicator (Improvement in Pain Interfering with Activity) is risk-adjusted for public reporting in HHQI but not in OBQI reports sent to agencies. A data-driven “stepwise” approach currently is used to risk-adjust the OBQI indicators with a separate set of risk factors included in the risk-adjustment model for each outcome.

The purpose of this project was to use a theory and evidence-based approach to develop and test alternative risk-adjustment models for the OBQI quality indicators within the frame of the existing OASIS instrument. Specifically, instead of using a separate set of risk-adjusters for each OBQI quality indicator where risk-adjusters are primarily determined based on their statistical fit to the model, this project used a core set of risk-adjusters in all models that theory and prior research suggest are important determinants of home health quality. Advantages of a theory and evidence-based approach include simplicity, understandability, stability of the risk-adjustment models over time, conceptual meaningfulness, and the potential for greater parsimony in data elements when a large number of outcome indicators are being risk-adjusted, as is the case in the OBQI program. Findings from the project will contribute to CMS’s future plans for continued refinement of risk-adjustment and outcome measures, and support the Department’s efforts to reduce regulatory burden by streamlining OASIS.



Analyses were conducted in two major phases: preliminary data analyses and final data analyses. Preliminary data analyses included replication of the CMS risk-adjustment models for the first set of 11 outcomes reported in HHQI, and development of alternative models for these outcomes. A Technical Advisory Group (TAG) meeting then was conducted with experts in home health care and risk-adjustment as well as policymakers and provider representatives. Based on the results of the preliminary data analyses, the TAG provided input on our initial approach. After the TAG meeting, final data analyses were conducted. The project team replicated the current models for the remaining 20 quality measures that are currently risk-adjusted in OBQI or HHQI. A final set of alternative risk-adjustment models then was developed for all 31 OBQI quality indicators, followed by an examination of the impact of alternative risk-adjustment models on agency quality ratings.

The data analyzed in this project were obtained from the CMS contractor at the University of Colorado. They drew the data from the OASIS National Repository at CMS to create discrete episodes of home health care during calendar year 2001. The file includes episodes of care beginning and ending within the calendar year. Approximately 1,500,000 OASIS episodes are present in the overall data set. The University of Colorado randomly assigned about a third of the episodes to the developmental sample for initial estimation of risk-adjustment models for most outcomes. The remaining 1,000,000 were used to validate the final models derived from analysis of the developmental sample.
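The random assignment of episodes to developmental and validation samples can be illustrated with a minimal sketch. The function name, the seed, and the use of integer episode identifiers are assumptions made for illustration; the procedure actually used by the University of Colorado is not described in this summary.

```python
import random


def split_episodes(episode_ids, dev_fraction=1 / 3, seed=2001):
    """Randomly assign episodes to a developmental and a validation sample.

    Hypothetical helper: the one-third fraction follows the text, but the
    seed and the shuffling method are illustrative assumptions.
    """
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(episode_ids)
    rng.shuffle(shuffled)
    n_dev = round(len(shuffled) * dev_fraction)
    return shuffled[:n_dev], shuffled[n_dev:]


# Toy example: 1,500 episode identifiers instead of roughly 1,500,000.
developmental, validation = split_episodes(range(1500))
print(len(developmental), len(validation))
```

Models would be estimated on the developmental sample only, with the validation sample held back to check how well the final models perform on episodes not used in estimation.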

In the preliminary data analyses, six alternative models were estimated for each of the 11 initial HHQI outcomes. We began with a model limited to the core set of clinical, demographic and payment risk-adjusters, including the baseline value of the outcome measure if it was not already among the core variables. Outcome-specific risk-adjusters were added at subsequent steps: Model 2 included other clinical characteristics at baseline that might plausibly affect the outcome, and Model 3 included measures of clinical status prior to home health admission. Four clinical therapies at baseline (i.e., oxygen therapy, IV/infusion therapy, enteral/parenteral nutrition, and ventilator) then were added to the risk-adjustment models of the 11 HHQI outcomes. Living arrangements and social support indicators were added next. Finally, home health episode length of stay (LOS) was added solely to allow comparison of current and alternative model statistics and parameter estimates.

Only three alternative models were estimated for each of the 31 outcome indicators in the final data analyses.

  • Model 1 was limited to the admission (or baseline) value of the outcome indicator and a core set of primarily clinical risk-adjusters drawn from the domains covered by the OASIS start of care instrument.

  • Model 2 added to Model 1 other clinically relevant admission characteristics plausibly influencing the specific outcome.

  • Model 3 added to Model 2 indicators of patient functioning prior to home health admission.

Prior health status variables were examined separately from clinical measures on admission because of questions regarding their reliability and their possible elimination from the OASIS instrument.
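The nesting of the three sequential models can be sketched as follows. All variable names are hypothetical placeholders, not actual OASIS item names, and the short core list shown is not the core set used in the project; the sketch only shows how each model extends the previous one.

```python
def build_nested_models(baseline, core, outcome_specific, prior):
    """Return the risk-adjuster lists for Models 1-3 of a single outcome.

    The baseline value of the outcome is added to Model 1 only if it is
    not already among the core risk-adjusters.
    """
    model1 = list(core)
    if baseline not in model1:
        model1.append(baseline)
    # Model 2 adds outcome-specific admission characteristics.
    model2 = model1 + [v for v in outcome_specific if v not in model1]
    # Model 3 adds measures of functioning prior to home health admission.
    model3 = model2 + [v for v in prior if v not in model2]
    return {"Model 1": model1, "Model 2": model2, "Model 3": model3}


# Hypothetical variable names, for illustration only.
models = build_nested_models(
    baseline="ambulation_at_admission",
    core=["age_group", "primary_diagnosis_group", "cognitive_functioning"],
    outcome_specific=["pressure_ulcer_present"],
    prior=["prior_ambulation"],
)
for name, risk_adjusters in models.items():
    print(name, len(risk_adjusters), risk_adjusters)
```

Keeping the "prior" items in a separate final step makes it straightforward to see how much explanatory power would be lost if those items were dropped from the instrument.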

The decision to estimate only three sequential models, as opposed to the six estimated in the preliminary analyses, was based on the advice of the TAG and further analysis of the living situation and informal support/assistance measures following the TAG meeting. The analysis confirmed that these factors contributed relatively little to the explanatory power of risk-adjustment models with the exception that they very modestly improved the explanatory power of the Improvement in Medication Management risk-adjustment model. Following this analysis, the living situation and informal support/assistance measures were excluded from all alternative models.

Four sets of statistics were estimated for each current and alternative risk-adjustment model:

  • Number of OASIS items included in the risk-adjustment model.

  • Number of OASIS elements (some OASIS items have multiple elements) included in the risk-adjustment model.

  • R-squared statistic (technically, a pseudo R-squared statistic that measures the extent of the agreement between observed and predicted values).

  • c statistic (a measure of how well the risk-adjusters in the model correctly classify whether an episode will result in the outcome being examined).
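The two fit statistics listed above can be illustrated with a small sketch. The c statistic is computed as the share of pairs (one episode with the outcome, one without) in which the episode with the outcome has the higher predicted probability, with ties counted as one half. For the pseudo R-squared, the sketch assumes the squared correlation between observed outcomes and predicted probabilities; this is only one of several pseudo R-squared definitions, and this summary does not state which one was used.

```python
def c_statistic(observed, predicted):
    """Share of (outcome, no-outcome) pairs ranked correctly; ties count 0.5."""
    with_outcome = [p for y, p in zip(observed, predicted) if y == 1]
    without_outcome = [p for y, p in zip(observed, predicted) if y == 0]
    if not with_outcome or not without_outcome:
        raise ValueError("need at least one episode with and one without the outcome")
    concordant = 0.0
    for p_yes in with_outcome:
        for p_no in without_outcome:
            if p_yes > p_no:
                concordant += 1.0
            elif p_yes == p_no:
                concordant += 0.5
    return concordant / (len(with_outcome) * len(without_outcome))


def squared_correlation_r2(observed, predicted):
    """Assumed pseudo R-squared: squared Pearson correlation of y and p."""
    n = len(observed)
    mean_y = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(observed, predicted))
    var_y = sum((y - mean_y) ** 2 for y in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_y * var_p)


# Toy data: four episodes with observed outcomes and model predictions.
observed = [1, 0, 1, 0]
predicted = [0.8, 0.3, 0.6, 0.6]
print(c_statistic(observed, predicted))            # 3.5 of 4 pairs -> 0.875
print(squared_correlation_r2(observed, predicted))
```

The pairwise loop is quadratic in the number of episodes, which is acceptable for illustration but would need a rank-based formulation for samples of the size analyzed here.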

The total number of OASIS items and elements used to risk-adjust all OBQI quality indicators also was compared.

An agency-level analysis was conducted following development of a final set of alternative risk-adjustment models. The purpose was to determine how the different approaches to risk-adjustment affect an agency’s quality ratings. Approximately 5,000 agencies were included on the calendar year 2001 files provided to the project team by the University of Colorado. Two “adjusted” agency outcome rates were calculated for each of the 31 outcomes currently risk-adjusted in OBQI or HHQI. One of the adjusted rates was estimated using the current risk-adjustment model and the other was estimated using the “full” alternative model (i.e., Model 3 which includes outcome-specific and “prior” OASIS items, or Model 2 where there were no relevant prior items).
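A minimal sketch of an adjusted agency outcome rate follows. This summary does not give the formula used, so the sketch assumes one common form: the agency's observed rate plus the difference between the average predicted probability across all episodes and the agency's own average predicted probability. The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def adjusted_agency_rates(episodes):
    """Compute an adjusted outcome rate per agency.

    episodes: iterable of (agency_id, outcome 0/1, predicted probability).
    ASSUMED formula (not taken from the report):
        adjusted = observed + (overall mean predicted - agency mean predicted)
    """
    episodes = list(episodes)
    overall_predicted = sum(p for _, _, p in episodes) / len(episodes)

    by_agency = defaultdict(list)
    for agency_id, outcome, predicted in episodes:
        by_agency[agency_id].append((outcome, predicted))

    rates = {}
    for agency_id, rows in by_agency.items():
        observed = sum(y for y, _ in rows) / len(rows)
        expected = sum(p for _, p in rows) / len(rows)
        rates[agency_id] = observed + (overall_predicted - expected)
    return rates


# Toy data: agency B serves patients with better predicted outcomes.
toy_episodes = [
    ("A", 1, 0.50), ("A", 0, 0.50),
    ("B", 1, 0.75), ("B", 1, 0.75),
]
rates = adjusted_agency_rates(toy_episodes)
print(rates)  # {'A': 0.625, 'B': 0.875}
```

Under this assumption, the function would be run twice for each outcome, once with predictions from the current model and once with predictions from the full alternative model, and the two adjusted rates compared agency by agency.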



The preliminary set of theory and evidence-based core risk-adjusters in the first phase of the project, where we focused on the original 11 HHQI outcomes, was drawn from a number of domains covered by the OASIS instrument. The selection of the final set of core risk-adjusters was based on findings from the preliminary analyses, comments of TAG members, and examination of a small number of additional OASIS items provided by the University of Colorado following the TAG meeting. In addition to the core, approximately 2-3 outcome-specific risk-adjusters were included in the final, “full” risk-adjustment model developed for each of the 31 OBQI outcomes currently risk-adjusted by CMS. In addition, 1-3 directly related, conceptually important “prior” health status measures were included in the full risk-adjustment models of most of the health status outcomes. The great majority of core as well as supplemental risk factors are clinical measures at baseline suitable for inclusion in electronic health records. All risk-adjusters were constructed from routinely collected OASIS data elements.

Comparison of Current and Alternative Models

Overall results from the comparison of the current and alternative risk-adjustment models are described first, followed by results for specific domains (e.g., Activity of Daily Living (ADL) measures, physiologic indicators). In general, the “full” alternative models have slightly lower explanatory power than the current risk-adjustment models. Specifically, the R-squared statistic for the full model tends to be within 1-2 percentage points of the R-squared statistic for the model developed by the University of Colorado. There is a similar pattern for the c statistic. While the number of OASIS items and elements used to risk-adjust a given outcome is sometimes larger and sometimes smaller in the alternative model than in the respective current model, the overall number of OASIS items and elements employed when risk-adjusting all 31 OBQI outcome indicators is considerably smaller for the full alternative models (64 versus 88 OASIS items, and 93 versus 135 OASIS elements).
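The reason a shared core can lower the overall item count, even when individual models are larger, can be seen in a small sketch. The item labels and model contents below are invented for illustration and do not correspond to actual OASIS items or to the models estimated in this project.

```python
def distinct_items(models):
    """Return the set of items used by at least one risk-adjustment model."""
    items = set()
    for risk_adjusters in models.values():
        items.update(risk_adjusters)
    return items


# Hypothetical stepwise-style models: each outcome selects its own items.
stepwise_models = {
    "outcome_a": {"i1", "i2", "i3"},
    "outcome_b": {"i4", "i5", "i6"},
    "outcome_c": {"i7", "i8", "i2"},
}

# Hypothetical core-based models: a shared core plus a few additions.
core = {"i1", "i2", "i3", "i4"}
core_models = {
    "outcome_a": core | {"i5"},
    "outcome_b": core | {"i6"},
    "outcome_c": set(core),
}

print(len(distinct_items(stepwise_models)))  # 8 distinct items overall
print(len(distinct_items(core_models)))      # 6 distinct items overall
```

In this toy case every core-based model uses more items than its stepwise counterpart, yet the total number of distinct items needed across all outcomes is smaller because most items are reused.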

The ADL and IADL outcomes represent 23 of the 41 OBQI quality indicators and over two-thirds of the 31 outcome indicators currently risk-adjusted by the University of Colorado.

  • Most of the full alternative risk-adjustment models for the ADL and Instrumental Activity of Daily Living (IADL) outcomes have slightly lower explanatory power than the current models; an exception is the risk-adjustment model for Improvement in Ambulation where the alternative model performs significantly better than the current risk-adjustment model.

  • “Prior” OASIS items contribute substantially to the explanatory power (roughly two percentage points to the R-squared statistic) of almost all of the risk-adjustment models of improvement in ADLs and IADLs, but not stabilization in ADLs and IADLs.

  • The ADL and IADL stabilization outcomes all are skewed (i.e., a very large share of those potentially able to stabilize do stabilize) which may explain the relatively low R-squared and relatively high c statistics for the stabilization risk-adjustment models.

“Prior” OASIS items contribute little to the explanatory power of the risk-adjustment models for the remaining health status outcomes. The one exception is the risk-adjustment model for Improvement in Urinary Incontinence, a physiologic outcome in the OBQI framework. Among physiologic outcomes, the alternative risk-adjustment model for Improvement in Urinary Tract Infection (UTI) performs considerably worse than the current UTI risk-adjustment model. The R-squared statistic for the full model is 5.9% compared to 12.1% for the current model, and corresponding c statistics are 0.665 and 0.740. The main reason for this difference is the exclusion of home health episode LOS from the alternative model.

No “prior” OASIS items were included in the alternative models for the utilization outcomes (i.e., Acute Care Hospitalization, Discharged to the Community, and Emergent Care). As was the case with the UTI risk-adjustment model, the exclusion of LOS reduces the explanatory power of the alternative models for the three utilization outcomes relative to current models.

Agency Analyses

Regardless of whether the current or “full” alternative model was used to risk-adjust outcomes, the quality ratings for most agencies on most outcomes are similar. In particular, the difference between the current and alternative risk-adjusted percent of an agency’s patients with each outcome is within one to two percentage points for most agencies on most outcomes. It is the ranking of each agency relative to others, however, that is likely to be of greatest concern to providers. Our analysis found that the ranking of agencies using current risk-adjustment models and the ranking using the full alternative risk-adjustment models are in close agreement for most outcomes.
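Agreement between two agency rankings can be quantified in several ways. This summary does not state which measure was used; the sketch below assumes Spearman's rank correlation, computed from average ranks so that tied rates are handled. The adjusted rates shown are invented for illustration.

```python
def average_ranks(values):
    """Rank values from lowest (1) to highest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical adjusted rates for four agencies under each approach.
current_rates = [0.62, 0.55, 0.70, 0.48]
alternative_rates = [0.61, 0.56, 0.69, 0.50]
print(spearman(current_rates, alternative_rates))
```

A value near 1 indicates that agencies are ordered almost identically under the two approaches, even if the adjusted rates themselves differ by a percentage point or two.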

The agency-level analyses then were repeated using only the core risk-adjusters in the alternative risk-adjustment models. This was done in order to better understand the contribution of the outcome-specific and OASIS “prior” items to the finding of similar quality ratings regardless of risk-adjustment approach. The basic results hold. However, as would be expected, the quality ratings are not as close when outcome-specific and OASIS prior items are dropped from the alternative risk-adjustment models of the OBQI indicators.



There are important tradeoffs and differences between the current and alternative approaches to risk-adjusting OBQI quality indicators. The first is the generally higher explanatory power of the current models versus the simplicity of the alternative models and their overall reliance on a smaller number of OASIS items and elements. That current models generally have slightly better explanatory power than the alternative models is not surprising since the “stepwise” approach is likely to result in models with close to the best explanatory power possible for the data set analyzed. At the same time, however, it leads to the selection of a large number of risk factors when all outcome measures are considered. In addition, because the stepwise approach “fits” models to the data on which they are developed, the explanatory power of these models is likely to decline when they are applied to new data sets.

A second tradeoff is between the full alternative models that include the outcome-specific risk-adjusters and alternative models with only the core set of risk-adjusters. The latter tend not to predict outcomes as well as the full models. Measures of physical functioning prior to home health admission are particularly significant in the risk-adjustment models of ADL and IADL improvement. The “prior” OASIS items, however, are more difficult than many other items for home health agencies to collect and are thought to be less reliable than other clinical measures. Should they be dropped from the OASIS instrument, the explanatory power of most ADL and IADL improvement risk-adjustment models would be reduced by roughly two percentage points.

The decision to exclude home health LOS from the alternative models, in addition, has a significant impact on a small but important subset of risk-adjustment models (i.e., the utilization outcomes). LOS was excluded because it can be affected by problems in the care process that also affect outcomes (i.e., low quality care can cause a longer stay as well as worse outcomes). If LOS is included in risk-adjustment models, conclusions about the quality of agency care could be erroneous due to quality problems being risk-adjusted away. The TAG convened to review preliminary models developed by the project team strongly supported the decision to exclude LOS from risk-adjustment models. The consequence, however, is reduced explanatory power for some outcomes. A possible methodological solution, which has data burden and simplicity implications, is to collect information on the timing of all of the utilization outcomes (e.g., hospitalization) and estimate hazard models which take into account the time to the outcome of interest.

An agency-level analysis was conducted to examine how alternative approaches to risk-adjustment of the OBQI indicators affect an agency’s quality ratings, with two main findings. First, for most agencies and most outcomes, the adjusted proportion of patients with an outcome is similar regardless of whether the current or the full alternative model is used to risk-adjust outcomes. Second, the relative ranking of agencies using current risk-adjustment models and the ranking using the “full” alternative risk-adjustment models are in close agreement for most outcomes. One limitation of the agency analysis is that for some outcomes a relatively large number of agencies were excluded because too few patients at each of these agencies had the potential to have the outcome (i.e., fewer than 20 in the study sample).

The results suggest that the relatively small reduction in explanatory power of most of the alternative risk-adjustment models for the OBQI indicators is unlikely to have a substantial effect on the quality ratings of the majority of agencies. A theory and evidence-based modeling approach, then, has the potential to simplify risk-adjustment and provide a consistent and stable basis for risk-adjustment relative to the current approach. This should make it more understandable to providers and encourage individual agencies to risk-adjust their own outcomes. The reliance on a smaller number of OASIS data elements, in addition, would contribute to the Department’s efforts to streamline the OASIS instrument and potentially facilitate the identification of a parsimonious set of clinical measures appropriate for data exchange in an electronic health record environment.