
Estimating Risk

Our analysis estimates individuals' risk for nursing home residence from their observed characteristics. Thus, we turn first to the problem of specifying a discrete-time transition probability model of nursing home use, and estimating it from the NLTCCD data files. A discrete-time transition probability is the likelihood that an individual in a given state at the beginning of an interval of time will leave that state for some other during that interval. In the case at hand, we refer to: (1) the probability that an individual in the community at the beginning of a month will enter a nursing home during that month; and (2) the probability that an individual in a nursing home at the beginning of a month will return to the community during that month (we define the community as any survival state other than being in a nursing home; adjustment of results for hospital use is explained below). For aggregate analyses, we assume that persons who die or otherwise permanently exit the service system will be quickly replaced by very similar individuals, constrained by the system's capacity. Hence the system case mix itself is in stationary equilibrium and is not affected by mortality or other permanent ("absorbing") states that may befall individuals within it.
A transition probability function (TPF) is one that expresses the probability of transition for an individual as a function of that individual's characteristics and environment. For our purposes, these are the levels of use of the principal types of community services, as well as a variety of individual and environmental characteristics that previous research has indicated are related to nursing home use.
Unlike ordinary regression models, the unit of analysis for a discrete-time transition probability function is the person-interval (here, the person-month) rather than the individual person as such (see Allison, 1984, for an introductory treatment). Every person in the NLTCCD sample faced a probability of transition from their then-current state during each of the 12 months of the demonstration for which a continuous nursing home use history for the full sample was recorded. The study sample used here, which will be described in more detail below, consisted of 3,446 individuals, who could in principle have been at risk of transition during 41,352 (3,446 x 12) person-months. In practice, mortality and other attrition reduce the risk set to 36,199 person-months. Because we are interested specifically in analyzing the transitions of living individuals into and out of nursing homes, the risks estimated here condition explicitly on survival. Thus we have structured our transition probability analysis so that risk for transition at each month is measured relative to the survivor pool at that point in time (see Ingram and Kleinman, 1989). In light of the finding of Garber and MaCurdy (1989) that factors which predict institutionalization in the NLTCCD population are largely orthogonal to those which predict death, conditioning on survival can be expected to make little difference in any event relative to an unconditional analysis.
The observed outcome variable in estimating a discrete-time transition probability function for leaving a current state is a dummy variable indicating whether a transition from that state has occurred during the period. At every person-month, this variable will be either one or zero: one if a transition has occurred during that month, zero if not. To determine whether a transition has occurred in a given month, a surviving individual's state (nursing home or community) in that month is compared to their state in the previous month. If the state is the same, no transition is taken to have occurred (a plausible assumption in that it seems unlikely that many will have more than one transition in a single month). If the state in the previous month is different, a transition during the current month is assumed to have occurred. Since the NLTCCD public use data do not permit an unequivocal determination of the precise timing of admissions and exits during a month, we assume that an individual who spends any part of a month in a nursing home spends the entire month there, and that transitions occur at the beginning of a month. The NLTCCD data sets contain indicator variables for each of the first 12 months of the project, plus an initial baseline variable that indicates whether or not an individual was initially in a nursing home (a Skilled Nursing or Intermediate Care Facility). These data permit construction of the transition dummy variables in the manner described above.
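The construction of the transition dummies can be sketched as follows. This is a minimal illustration, not the authors' code: the state labels "C" and "N", the use of None for death or attrition, and the function name person_months are all assumptions introduced here, as is the convention that each person-month is classified by its origin state (the state in the previous month).

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: "C" = community, "N" = nursing home,
# None = died or lost to follow-up (no longer in the risk set).
State = Optional[str]

def person_months(baseline: str, monthly: List[State]) -> List[Tuple[int, str, int]]:
    """Return (month, origin_state, transition) rows for one person.

    The origin state for month t is the state in month t-1 (the baseline
    state for month 1). Rows stop at the first unobserved month, so the
    risk set is conditional on survival.
    """
    rows = []
    previous = baseline
    for month, current in enumerate(monthly, start=1):
        if current is None:  # death or attrition ends the person's record
            break
        rows.append((month, previous, int(current != previous)))
        previous = current
    return rows

# One person: in the community at baseline, admitted in month 3,
# discharged in month 5, then lost to follow-up.
example = person_months("C", ["C", "C", "N", "N", "C", None])
```

Each row is one person-month; the third element is the dichotomous outcome used in the logit estimation.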
There are two distinct TPFs involved, one for transitions from community to nursing home (which we call the CN function) and the other for transitions from nursing home to community (the NC function). Note further that each of the 3,446 individuals can in principle contribute to the person-months for both functions, provided they changed state at least once. Of the 36,199 person-months in the risk set, 2,551 were spent in nursing homes, with 283 transitions to the community, while 33,648 were spent in the community, with 573 transitions to nursing homes.
The two TPFs can be estimated by forming two distinct data sets depending on transition origin state. One consists of person-months during which individuals were in a nursing home, and hence were at risk of transition to the community; the other of person-months during which individuals were in the community and at risk of entering a nursing home. The first is used to estimate the parameters of the NC function, the second to estimate the CN function.
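As a quick check on these counts, the unadjusted monthly transition rates implied by the reported person-months can be computed directly. The rates are simple ratios offered for orientation only; they are not the covariate-adjusted probabilities estimated in the logit models, and the variable names are illustrative.

```python
# Person-month counts as reported in the text.
nh_months, nh_exits = 2_551, 283          # nursing home person-months, NC transitions
comm_months, comm_entries = 33_648, 573   # community person-months, CN transitions

# The two origin-state risk sets should partition the full risk set.
total_months = nh_months + comm_months
assert total_months == 36_199

# Unadjusted monthly transition rates (transitions per person-month at risk).
raw_exit_rate = nh_exits / nh_months
raw_entry_rate = comm_entries / comm_months

print(f"raw NC (exit) rate:  {raw_exit_rate:.4f}")
print(f"raw CN (entry) rate: {raw_entry_rate:.4f}")
```

The raw rates come to about 11 percent per month for exits and about 1.7 percent per month for admissions.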
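Forming the two estimation data sets amounts to partitioning person-month rows by origin state. The sketch below assumes a hypothetical row layout of (person_id, month, origin_state, transition) with "C" and "N" as state labels; none of these names come from the NLTCCD files.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, int, str, int]  # (person_id, month, origin_state, transition)

def split_risk_sets(rows: List[Row]) -> Dict[str, List[Row]]:
    """Partition person-months into the CN and NC estimation samples."""
    risk_sets: Dict[str, List[Row]] = {"CN": [], "NC": []}
    for row in rows:
        origin = row[2]
        if origin == "C":      # in the community: at risk of admission
            risk_sets["CN"].append(row)
        elif origin == "N":    # in a nursing home: at risk of discharge
            risk_sets["NC"].append(row)
        else:
            raise ValueError(f"unknown origin state: {origin!r}")
    return risk_sets

# Person 1 changes state and so contributes to both samples; person 2 does not.
rows = [
    (1, 1, "C", 0), (1, 2, "C", 1), (1, 3, "N", 0), (1, 4, "N", 1),
    (2, 1, "C", 0), (2, 2, "C", 0),
]
samples = split_risk_sets(rows)
```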
Because the observed outcome is a dichotomous (dummy) variable indicating whether a transition did or did not occur in that month, familiar models for dichotomous choice can be used to specify and estimate the transition probability models. As in most previous studies, we have specified the probability models as logistic:
P^{i}_{n}(t) = 1 / (1 + exp(-β'Z_{i}(t))),    P^{i}_{c}(t) = 1 / (1 + exp(-γ'Z_{i}(t)))    (1)
where P^{i}_{n}(t) and P^{i}_{c}(t) are, respectively, the probability of nursing home admission and exit during month t for individual i. Vectors of logistic regression coefficients for admissions and exits are given by β and γ, respectively, while Z_{i}(t) is a data vector of regressors, including service use, for individual i at time t. Expression (1) defines a pair of binary person-time logistic regression functions which may be estimated by standard maximum likelihood methods (Ingram and Kleinman, 1989). Once the coefficients in the expressions in (1) have been estimated, individual transition probabilities may be estimated by the usual expedient of substituting the observed regressor values for that individual, including service use, into the right-hand side of the equations and obtaining the predicted probabilities.
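The substitution step can be illustrated with a short sketch. It assumes the standard logistic form, in which the probability is 1 / (1 + exp(-index)) and the index is the inner product of the coefficient vector and the regressor vector. The coefficient and regressor values below are hypothetical and chosen only to show the mechanics; they are not estimates from this study.

```python
import math
from typing import Sequence

def logistic_probability(coefs: Sequence[float], z: Sequence[float]) -> float:
    """Predicted transition probability for one person-month.

    coefs: estimated logit coefficients (intercept first).
    z:     regressor values in the same order, with a leading 1.0.
    """
    if len(coefs) != len(z):
        raise ValueError("coefficient and regressor vectors differ in length")
    index = sum(b * x for b, x in zip(coefs, z))
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical admission logit: intercept, one service-hours term, one control.
admission_coefs = [-4.0, 0.02, -0.01]
regressors = [1.0, 10.0, 20.0]
p_admission = logistic_probability(admission_coefs, regressors)
```

The same function applies to the exit logit with its own coefficient vector.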
Given a two-state system such as ours, the steady-state probability of being found in a nursing home for an individual, conditional on survival, may be expressed in terms of the transition probabilities as follows (Kemeny and Snell, 1960):
α_{i} = P^{i}_{n} / (P^{i}_{n} + P^{i}_{c})
Here α_{i} is the probability of individual i being in nursing home care, while the P terms are the transition probabilities defined in expression (1). Observe that α_{i} will be high for those who have both a high probability of admission and a low probability of exit once admitted; that is, those who will tend to be heavy users of nursing home care if they do not die. Alpha may be thought of equivalently as the long-run proportion of time expected to be spent in nursing home residence by individual i, conditional on survival.
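For a two-state Markov chain with fixed transition probabilities, the standard stationary result is that the long-run share of time in the nursing home state equals the admission probability divided by the sum of the admission and exit probabilities. The sketch below assumes this is the expression intended here and checks the closed form against repeated application of the monthly transition rule. Function names and the numerical inputs are illustrative.

```python
def steady_state_nh_share(p_enter: float, p_exit: float) -> float:
    """Closed-form stationary probability of the nursing home state."""
    if not (0.0 <= p_enter <= 1.0 and 0.0 <= p_exit <= 1.0):
        raise ValueError("transition probabilities must lie in [0, 1]")
    if p_enter + p_exit == 0.0:
        raise ValueError("no transitions possible; steady state is undefined")
    return p_enter / (p_enter + p_exit)

def iterate_nh_share(p_enter: float, p_exit: float,
                     months: int = 600, start: float = 0.0) -> float:
    """Probability of nursing home residence after `months` monthly steps."""
    share = start
    for _ in range(months):
        # Stay in the nursing home, or enter from the community.
        share = share * (1.0 - p_exit) + (1.0 - share) * p_enter
    return share

# Hypothetical monthly probabilities: 2% admission, 10% exit.
alpha = steady_state_nh_share(0.02, 0.10)
alpha_iterated = iterate_nh_share(0.02, 0.10)
```

A higher admission probability or a lower exit probability raises the steady-state share, consistent with the interpretation given in the text.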
The transition probabilities are not here indexed by time (t), as in expression (1), because in order to satisfy the usual Markov assumption of fixed individual probabilities of transition, we assume in the optimization analysis that the transition probabilities remain fixed at their values 6 months into the demonstration. By this time we assume the component of services due to the NLTCCD to have reached normal operating levels but not yet to have been affected by the anticipated end of the demonstration. We also make the usual Markov assumption that transition probabilities at any point in time are independent of the previous transition history (Kemeny and Snell, 1960). This is probably a questionable assumption in the case of nursing home use, but data limitations preclude a time-inhomogeneous specification of the Markov chains.


Variables in the Model

The dependent variables in the analysis (transition indicators for movement into and out of nursing home care) have been described above. Independent variables in the model consist of community services which are central to the analysis here, and a variety of factors which previous research has indicated are significant predictors of nursing home use (Greene, Lovely and Ondrich, 1992).
The service variables consist of measures, in hours per month, of formal community services received by each individual in the sample. We measure four categories of community service: home nursing, home health aide, personal care aide, and housekeeper. These represent the vast majority (over 90 percent) of total in-home services consumed by the sample. Other categories (e.g., physical therapy) were not used with sufficient frequency to permit reliable statistical estimation of their effects.
In the regressions, services are entered interacted with client impairment indicators found to be substantial predictors of differential impact for that service on nursing home risk. Interactions are paired symmetrically with both the presence and absence of the indicated functional impairment. In two cases (the interaction of home health aide services with the absence of severe cognitive impairment and the interaction of personal care aide services with the absence of severe ADL impairment) the interaction terms were small and statistically insignificant, but had the "wrong" sign in at least one of the logits. Because we believe the anomalous signs are due to sampling error, and because such sign inversions would severely hinder convergence in the optimization algorithm, these two coefficients were set to zero in the analysis. Obviously, with multiple services which may themselves be interactive, both among themselves and with other client characteristics, more elaborate specifications are possible and reasonable. But our experience with these was that collinearity problems become so prevalent in the estimations that it seems likely that experimental rather than observational studies will be necessary to pursue these more subtle and complex specifications.
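The symmetric pairing of a service with the presence and absence of an impairment can be sketched as below. The function and regressor names are hypothetical; only the pairing logic follows the description in the text.

```python
from typing import Dict

def paired_interactions(service: str, hours: float,
                        impairment: str, impaired: bool) -> Dict[str, float]:
    """Split monthly service hours into a presence/absence pair of regressors.

    Exactly one member of the pair carries the hours; the other is zero.
    """
    return {
        f"{service}_x_{impairment}": hours if impaired else 0.0,
        f"{service}_x_no_{impairment}": 0.0 if impaired else hours,
    }

# A client with severe cognitive impairment receiving 20 hours per month
# of home health aide services (illustrative values).
pair = paired_interactions("home_health_aide", 20.0, "severe_cognitive", impaired=True)
```

Setting a coefficient to zero, as was done for two of the absence terms, is equivalent to dropping that member of the pair from the index.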
Information on formally supplied services in the NLTCCD dataset is drawn from surveys administered at the sixth month and again at the twelfth month of the demonstration. The survey instrument used retrospective questioning that required participants to recall the total hours of services, by type, received in the previous week of community residence from all sources. Because of the retrospective nature of the questioning, we assume that the “snapshot” of service hours reported at the time of the six-month survey (rescaled from weeks to months) is representative of the actual hours received in months 1 through 6. Similarly, information reported at the twelve-month survey is imputed to months 7 through 12. The longitudinal measurement of community service use in the NLTCCD involved a number of complications, our treatment of which is detailed in Appendix A, which also explains the price data used.
The other regressors fall roughly into three categories: (1) personal and demographic characteristics, (2) indicators of health and impairment status, and (3) location and demonstration-specific factors. In our models, these variables are set to their baseline values: only service levels are permitted to vary over time. Hence the perspective taken is one of a prospective or predictive planning model with services presumed to be potentially subject to manipulation during the planning period. While a time-varying regressor approach would be of greater theoretical interest, such a specification would introduce endogeneity and identification problems that could not be resolved with available data.
Personal and demographic characteristics as measured included dummy variables indicating whether the individual was African-American, Hispanic-American, female, a homeowner, or lived alone. Included also are quantitative variables for monthly household income, age (years), and number of surviving children (as a proxy for availability of family support).
Impairment and health-related variables include indicators of whether the individual was severely impaired in ADL, IADL or cognition, or used a wheelchair. Self-rated health is measured as a continuous variable.
Site and demonstration-related variables included the nursing home bed supply in the client site (beds per thousand persons over age 65) and whether the individual was in the demonstration treatment or control group in a "basic" or "financial control" demonstration site [these were the two different intervention modes for the NLTCCD; see Carcagno and Kemper (1988) for details]. In general, "financial control" sites were established in areas with more extensive service systems than "basic" sites, and permitted case managers to authorize purchase of additional services, in contrast to "basic" sites, where case managers worked with existing services. The nursing home bed rate, as a proxy for availability, may be expected to influence nursing home use.
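The imputation rule for service hours can be sketched as follows. The weeks-to-months factor of 52/12 is an assumption made for illustration; the text does not state the rescaling factor actually used, and Appendix A describes further adjustments not reproduced here.

```python
from typing import List

# Assumed conversion factor; the source does not report the one it used.
WEEKS_PER_MONTH = 52 / 12

def monthly_service_hours(week_hours_6mo: float, week_hours_12mo: float) -> List[float]:
    """Impute monthly service hours for months 1-12 from two weekly snapshots.

    The six-month survey value is carried to months 1 through 6 and the
    twelve-month survey value to months 7 through 12.
    """
    first_half = week_hours_6mo * WEEKS_PER_MONTH
    second_half = week_hours_12mo * WEEKS_PER_MONTH
    return [first_half] * 6 + [second_half] * 6

# A client reporting 3 hours per week at month 6 and 6 hours per week at month 12.
hours_by_month = monthly_service_hours(3.0, 6.0)
```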
Because the non-service regressors serve only as control variables in this study, because many have been considered in other studies using similar methodology (e.g., Branch and Jette, 1982; Weissert and Cready, 1989), and because an extensive treatment of their measurement and the rationale for their inclusion in models predicting nursing home use using the NLTCCD data has been provided elsewhere (Greene and Ondrich, 1990; Garber and MaCurdy, 1989), we will for the sake of brevity not repeat these discussions here. Before presenting results of the logit analysis, we outline the optimization problem they will be used to solve.
