National Invitational Conference on Long-Term Care Data Bases: Conference Proceedings. VII. Overview of Long Term Care Data Base Applications


William Scanlon, Ph.D., Georgetown University
Joan F. Van Nostrand, National Center for Health Statistics
Korbin Liu, Sc.D., Urban Institute
Kenneth Manton, Ph.D., Duke University
Judith Wooldridge, Mathematica Policy Research
Thomas Grannemann, Ph.D., Mathematica Policy Research
Evelyn S. Mathis, National Center for Health Statistics
Gerry E. Hendershot, Ph.D., National Center for Health Statistics

WILLIAM SCANLON: This is labeled the producer panel, and its primary objective is to give you the opportunity to ask the questions that you have been developing to this point, particularly since the material that has been presented on the data bases under discussion is incredibly vast and complex.

I think calling this a producer panel in some ways is a misnomer because the people on this panel are not just producers of the data but they have been extensive users of the data over the years and concerned with long term care issues. You are really going to get a very broad perspective on the use of these data sets.

The more information that you can draw out of these people about how their data sets might be used is going to be of value to both yourselves and to others in the audience.

We also had hoped to learn more about the relative strengths and weaknesses of the various data sets. We would be very interested when you have questions that can draw out these kinds of comparisons.

As moderator and as a potential user, I am going to take the prerogative and use the power of the microphone to ask some questions to begin with, to get the discussion going.

Basically, the first question is whether a data base provides adequate coverage of an issue and, in terms of its content, what kinds of significant omissions are there with respect to an issue. Also I am concerned about what ability there is to examine the validity of the data that I may have from a particular data set. Are there other sources which I can use to test validity or do I have to rely only on the information from that data set?

Another concern is how the administration of surveys, or the collection of the data, and the particular wording of questions may affect the results that I am going to see from those data sets. When those results conflict with other statistics that are published elsewhere, what do I make of those conflicts?

Finally, I think a point that was brought home very well with respect to Survey of Income and Program Participation (SIPP) is how hard is it going to be for me to get information out of any of these data sets. It is very easy to send our checks to the National Technical Information Service (NTIS) and to get back a public use file. Then what is it going to cost us in terms of hiring programmers and others to extract information from the data sets?

In terms of starting off this discussion, I think there are two major application areas that we would like to discuss.

The first involves service use, both institutional service use and community service use. As Aurora indicated, ideally what we would know about service use and long term care is that we would have a continuous measure on individuals and basically be able to observe every time a service was used, what type of person that was, and under what circumstances did they use the service.

That is not the kind of data that we have, but what we have is data that to a greater or lesser extent tries to approximate that through histories and through recall questions. What we would like to get a sense of is the limits of this picture of service use that we have, particularly in light of the new methodologies that are being used in these data sets. The notion of collecting a history is something that has not been done previously in long term care surveys to this extent.

To what extent do we need to be concerned, and what means for validating the information are available?

Along with knowing the quantities of service use that we observe, it is key to understand the circumstances under which people use services. Particularly for those of you who are considering analysis of these data with respect to changing the financing arrangements for services, we would like to know to what extent we can relate service use to financing or to the price that consumers might face.

We need to know whether we should start a quest through these data sets for that kind of information or whether that is just missing altogether.

The other area which I think is important to address is the area of measuring the prevalence of various dependencies. We clearly want to know the type of population that we are dealing with and the numbers of that population. It is critical to both the public and private sector in their development of long term care policies.

As we look through the data sets, what we see is that there is rich detail about activities of daily living (ADL) and instrumental activities of daily living (IADL) dependencies. At the same time what we see is that there are often subtle wording changes from survey to survey and sometimes even within a survey.

One of the things I think that you are going to face if you buy one of these public use samples and go to use it is what to make of these differences. How significant are they? Does this group know from prior research whether a given difference is very significant, or whether it is insignificant and not of great concern?

Without any more ado, what I would like to do is turn it over to the panel to maybe start with the service use question and start in the area of institutional use.

JOAN VAN NOSTRAND: I think from a policy point of view one of the most crucial questions, particularly when you are looking at service use and financing, is the lifetime use of nursing home care.

A few years ago when I used to answer that question I would say, "We have not the faintest idea." But now, I am getting better because I am able to say there are various data sources that you can look at that will add some information so that we can answer that question.

Clearly, what one would like is a longitudinal survey of various cohorts that follow people starting around age 50 or so all the way until their death, but it is too expensive. We are just now starting to focus on these issues of longitudinal data. So we do not have that.

So the next question is, "What do we have and how can we try to work at piecing what we now have together and improving it in our future data collections?"

We have asked some questions of people in nursing homes about what their lifetime use of nursing home care is. We have asked for information as to when they were admitted, when they were discharged, and where it was, so that we can try to build this kind of a history, and, in relation to the history, the changes in sources of payment at each admission and at each discharge.

Also, in the Supplement on Aging (SOA) that Gerry Hendershot might be mentioning, that same kind of a question has come out.

I do not want to speak on the National Long Term Care Survey (NLTCS) because there are other people here who are much more knowledgeable about it than I am, except to say that the difference over that 2 year period in who was institutionalized, given that they are all at fairly high risk, is also going to add to the information we have on what the lifetime use of nursing home care is.

I think we are moving closer and closer to being able to answer that question. I hope by the time I retire, when someone calls me and says, "What is the lifetime probability of nursing home use?" I will have an answer, and I will be able to say it is really a good one. We are moving there.

WILLIAM SCANLON: We talked a little bit in the National Nursing Home Survey (NNHS) Breakout Session about the plans for validation of the histories, the fact that in the next-of-kin survey and the current resident and discharge survey you basically collect independent histories, one from the nursing home and one from the next-of-kin. Plans are, I think, underway for a comparison of these data to see how well a history can be collected.

Maybe Ken or Korbin can speak to this with the NLTCS and Medicare claims. Were you able to look at any of these issues of how good recall is versus actual information from a third-party source?

KORBIN LIU: In 1982 there were questions in the survey about nursing home history. Basically, the questions were phrased, "Had you spent any time in a nursing home? How many times did you enter a nursing home? How long were you in those nursing homes when you went in?"

That was essentially the part of the questionnaire that provided information on the history of nursing home use in 1982. The 1982 sample was subsequently followed in 1984 and several things could have happened to them. They could have died, they could have been in the community, or they could have been in a nursing home.

In 1984, for those people who were in the community, questions similar to those asked in 1982 were asked. It was that kind of a history that they were trying to record, which was how many times and how long.

Specific dates were given. Clearly for 1984, because they did not have to go back any further, the question was phrased "in the last 2 years, how many times were you in a nursing home and for how long?"

If those people were found in a nursing home in 1984, a different set of questions was asked. They were asked when they were admitted, and whether there was a prior nursing home stay before the one in which they were found in 1984.

For the people who were deceased or were not able to respond, the question was asked of the next-of-kin or somebody knowledgeable about that person. In that particular questionnaire, they were tracking nursing home use between 1982 and 1984, similarly, number of times and duration.

Basically, for the 1982 sample, you have information on prior use before the survey in 1982.

By virtue of their being in the community in 1982, one would assume that those prior stays were fairly short. That is not necessarily the case, but they could have been fairly short stays. By 1984, when they went back, you had another 2 years of history on the people who were in the 1982 sample.

The NLTCS does not have a built-in validation mechanism for those histories. I mean, there was not another next-of-kin survey, for example, to compare with the recall information from the surveys themselves.

The one thing that is available is that we have those files merged with Medicare records. At least on some nursing home use, if they happen to be Medicare use, then there should be some correspondence between what was reported by the individual and what was recorded by the Medicare bills. That is at least one way to begin to try to validate the recall information.
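The cross-check described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names and the sample values are hypothetical and are not taken from the NLTCS or the Medicare files. Because Medicare pays for only some nursing home stays, the sketch treats the bills as a floor: a respondent who reports fewer stays than the bills show is flagged as a possible recall problem, while reporting more stays than the bills show is not treated as an error.

```python
from dataclasses import dataclass


@dataclass
class PersonRecall:
    person_id: str
    reported_stays: int   # stays the respondent recalled (hypothetical field)
    medicare_stays: int   # stays found in linked Medicare bills (hypothetical field)


def classify(rec: PersonRecall) -> str:
    """Medicare covers only part of nursing home use, so bills are a lower bound."""
    if rec.reported_stays >= rec.medicare_stays:
        return "consistent"
    return "under-reported"


# Made-up example records, for illustration only.
sample = [
    PersonRecall("A", reported_stays=2, medicare_stays=1),
    PersonRecall("B", reported_stays=0, medicare_stays=1),
    PersonRecall("C", reported_stays=1, medicare_stays=1),
]

results = {rec.person_id: classify(rec) for rec in sample}
under_reported = sum(1 for status in results.values() if status == "under-reported")
print(results, under_reported)
```

A check of this kind can only detect under-reporting of Medicare-covered stays; it says nothing about stays paid for by other sources.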

KENNETH MANTON: Some general comments on that, because this is a general thing that comes into the prevalence question that you alluded to. You have an additional sampling factor when you are talking about episodes or trying to get events out of an event history than when you have the survey date and you are asking people about certain events; there are various factors operating, like length-biased sampling.

If you ask people within a certain window, the probability of a person being included is going to be a function of how long that episode was. In any given interval of time, if a person has been in a long time, the probability of pulling him in is greater.

These questions come up with respect to the NNHS when you either have the discharge sample or you have a current resident sample and you have got a question of length-biased sampling. In current residents, the fact that they are still within an episode means that is a truncated episode. You have got to worry about ways of backing that out.
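A small worked example may make the length bias concrete. It is a sketch under simplifying assumptions (steady admissions over time, made-up stay lengths), not an analysis of NNHS data. If the chance that a stay is in progress on the survey date is proportional to its length, the average eventual length of the stays found on that date is the sum of squared lengths divided by the sum of lengths, which can be far above the average over all stays.

```python
def mean_length_all_episodes(lengths):
    """Average length when every episode counts once (an admission-based view)."""
    return sum(lengths) / len(lengths)


def mean_length_seen_on_survey_date(lengths):
    """Average eventual length of episodes in progress on a random date.

    Each episode is weighted by its own length, because a long stay is
    more likely to span the survey date than a short one.
    """
    return sum(length * length for length in lengths) / sum(lengths)


# Hypothetical stay lengths in months: four short stays and one long stay.
lengths = [1, 1, 1, 1, 36]

all_mean = mean_length_all_episodes(lengths)
resident_mean = mean_length_seen_on_survey_date(lengths)
print(all_mean, resident_mean)
```

The sketch covers only the inclusion probability. The truncation issue is separate: for a current resident only the time elapsed so far is observed, not the eventual length of the stay.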

One question came up in looking at certain analyses with NNHS data. Would you really want to longitudinally follow a cohort rather than trying to take some cross-sectional data, say for a current year and doing some analysis to reconstruct the histories? If you spend a lot of time with a long term longitudinal follow-up, the experience in terms of institutionalization 10 years ago for people at a certain age within a cohort may no longer be applicable.

You have got two or three things going on aside from the accuracy of the recall data in terms of actually reconstructing the use of institutions: what you are sampling (episodes), the length of the episode, and the nature of the sampling can all affect the information about the resident within the survey data itself.

On the question of institutionalization, if you go from 1969 or 1973 up through 1977, you see one set of rates. After 1980 you see different things in terms of the growth of the nursing home population. A cohort study might not have been a very cost-effective way to pick up those types of temporal trends in institutionalization. That is an additional factor of the measurement issue.

There are decay functions or windows, if you will, where you think that recall is going to be reasonably reliable and a 6 month window will give you certain types of things that you can get out in greater detail rather than trying to go back a full year.

There are a number of detailed studies on what is an appropriate recall period, and where the accuracy starts to break down.

JUDITH WOOLDRIDGE: I think it is worth mentioning a few things about the purposes of the National Long Term Care Channeling Demonstration data base before commenting on the service use data.

The Channeling data base is not a nationally representative sample, but it is a sample drawn from the population of individuals whom we hoped would be at high risk of institutionalization. The purpose of the evaluation was to see whether a demonstration program could divert people who are heading into nursing homes into a community service program in a cost-effective way. This is a very special population from ten sites.

The thing about this population is that it was not a great user of nursing homes, and so the data we have on nursing homes are for a relatively small number of people. We wanted to get a very complete data set with respect to hospital, nursing home and community service use over the period during which we followed the individuals.

For the most part, the maximum period we followed anyone was 18 months. For some of the sample, we only followed them for 12 months. For some of the sample, we only followed them for 6 months; the follow-up for part of the community service sample was quite short.

However, it is a very complete data sample with respect to hospital use and costs, and costs by source of payment for nursing home and community service as well.

We established samples for the hospital analysis and the nursing home analysis that were designed to exclude people who we thought would have incomplete data.

We were able to get complete hospital data on a very large part of our research sample. Our assumption was that if somebody was covered under Medicare, and virtually everybody in our sample was covered under Medicare Part A, we would get good utilization and reimbursement data from the Medicare claims that we used as a major source of data.

In addition, for people who were not Medicare-covered but who were Medicaid-covered throughout the period that we were analyzing, we assumed that Medicaid would be a complete source of coverage for that individual.

Otherwise, for a sample of individuals who did not fall into either of those categories, if they had a follow-up interview or if their caregiver was able to provide information to us which told us about service use in the previous 6 months and which providers were used, we went to the providers and extracted data at the provider.

With this combination of data sets and with this kind of approach to identifying who would have a complete sample, we have excellent hospital use data for 18 months.

The nursing home sample was a little bit more limited and the reason for that is largely to do with the issue of coverage.

We assumed, again, the only people who could be considered to have complete data would be those who were Medicaid-covered throughout the period, because Medicaid is a payer of nursing home use and anybody who is Medicaid entitled will be covered for their nursing home use, if they have any.

We did not make any assumptions about Medicare coverage being complete, because Medicare is a very limited payer of nursing home services.

For a large part of the sample, again, those who had a follow-up interview or a caregiver follow-up interview, we were able to go to nursing homes and actually extract information about the nursing home use that occurred in the 6 prior months: the use, the reimbursement, and who was paying. As a result we have three types of reimbursement data: Medicare claims, Medicaid claims, and information on private payment that came from the provider records extract whenever that was used.

We merged all of these data sources and created hospital stays and nursing home stays, which meant that we had to match up all the stay information from these diverse sources. I think it is important, with respect to the validity questions at this point, to mention that for those people for whom we started out with the interview data and then went to the providers to try and find what their precise service use was, we were able to find most of them.

In a few instances, we could not find a record at all at that provider. For the most part we could, and the dates of service matched reasonably well.

We then took the stay data, and the data that are available for public use are not in the form of stays, and I think for some of you this is a weakness.

What we did was to create utilization and reimbursement estimates for 6 month periods, for the first 6 months, for the second 6 months and the third 6 months after randomization, so that the kinds of data that are available on hospital and nursing homes for that 6 month block include the total number of days and total payments. It also included whether or not an individual had any stays, and whether they were admitted, which is a slightly different issue from the number of stays, because you could have had a stay in a period but have been admitted in a prior period.

We had reimbursement by source for the 6 month period. We had total days, number of admissions, number of stays and reimbursements. But there is no information in the sense of a continuous history. You cannot see when a person went in and out of the hospital and nursing home. You cannot look there to see what kind of service use occurred in the community immediately before a hospital stay, or immediately after a hospital stay. Also from the public use files, within the 6 month period, you cannot tell whether or not an individual moved from a hospital to a nursing home or vice versa.

What you can do with these data, though, because there are three 6 month blocks, is look over time to see whether an individual had an increasing probability of being in a nursing home at month 6, 12, and 18. That was one of the things we did in our final report.
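The collapse from stay records to 6 month blocks can be sketched as follows. This is a hypothetical reconstruction in Python, not the procedure actually used for the Channeling public use files: the function name, the 182-day block length, the treatment of the discharge date as not counted, and the example dates are all assumptions made for illustration. The sketch shows why "any stay" and "admitted" are different measures: a stay that begins in one block and ends in the next contributes days to both blocks but an admission only to the first.

```python
from datetime import date, timedelta

BLOCK_DAYS = 182  # assumed length of a "6 month" block


def summarize_blocks(randomized, stays, n_blocks=3):
    """Collapse (admit, discharge) date pairs into per-block totals.

    The discharge date is treated as exclusive, which is an assumption.
    """
    blocks = []
    for i in range(n_blocks):
        start = randomized + timedelta(days=i * BLOCK_DAYS)
        end = start + timedelta(days=BLOCK_DAYS)  # exclusive upper bound
        days = 0
        admissions = 0
        for admit, discharge in stays:
            overlap_start = max(admit, start)
            overlap_end = min(discharge, end)
            if overlap_end > overlap_start:
                days += (overlap_end - overlap_start).days
            if start <= admit < end:
                admissions += 1
        blocks.append({"days": days, "any_stay": days > 0, "admissions": admissions})
    return blocks


# One made-up stay that straddles the boundary between the first two blocks.
blocks = summarize_blocks(
    randomized=date(1983, 1, 1),
    stays=[(date(1983, 6, 1), date(1983, 8, 1))],
)
print(blocks)
```

Once the records are in this form, the order of events within a block cannot be recovered, which is the limitation noted for the public use files.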

I should mention that there is data on other medical services paid for by Medicare and Medicaid, such as physician use. We grouped all the other types of Medicare and Medicaid services such as podiatry and everything else, basically, outside of hospital and nursing home use. We have whether or not such services were used, and the reimbursements available.

This excludes a large portion of private payments. For example, if physicians did not accept assignment of benefits, we did not know that and we don't have information on that payment from the individual.

I mentioned that we did not explicitly compare self-report data on the interviews with claims data or provider records, but implicitly we did make a comparison of the interview and the provider information, and found most of the data that we were looking for in the provider records.

It is probably worth mentioning, too, that we were matching Medicare and Medicaid claims for ten different states' Medicaid files, and we found a very good match of Medicaid crossover claims with Medicare claims for hospital stays, and also for nursing home stays.

Another source that we drew on was death records to confirm whether or not we were not finding utilization because an individual was now dead.

All in all I think the data appears to be very consistent across sources, and, therefore, I think probably reasonably valid.

THOMAS GRANNEMANN: I will just talk briefly about the community services data available from the Channeling evaluation.

As Judith mentioned, it is a very select sample. We do have one advantage with this Channeling sample: we have got two groups, a treatment group and a control group. The control group simply represents what the people in the existing system got. If we were interested in what is happening out there now, what are the patterns of service use that are currently in place, we have got a control group.

We also have a treatment group which, particularly for the community services, allows you to make some estimates of what the impact is of providing additional coverage for those services and providing case management.

We were able to look at both those things. In terms of the data sources, we have several sources for the community services. We have Medicare and Medicaid records, but that only covers the things that Medicare and Medicaid cover.

We do have from that source both quantity of services and dollar amounts of services. It is a data set that provides a lot of information on financial aspects.

For the individual interviews on those services, we asked individuals to report for each of their caregivers what they provided. Those records, in terms of being able to validate across data sources, reflect something a little different than what the Medicare records do. The services would not be classified as nursing and homemaker services, as providers classify them, but as the individual actually reported them. It is a little bit difficult, in using the individual interview data, which a large part of the sample has, to go back and validate across data sources there.

For a 20 percent sample, we have provider records extracts for community-based services. These cover not only the services that the Medicare and Medicaid records do, the traditional home health services that are covered, but also personal care, homemaker, transportation, and meal services.

Because of the way those providers were identified, the sample will not match up. Starting with a 20 percent sample, we are only able to go to providers when the individual reported having received the service and was able to identify the provider, and then only when the provider was able to supply that information. We got very good cooperation from providers, but, again, that gives you a little bit different sample than you get from the other sources.

The other thing that you are able to do with this data set, because we have a lot of information on individual characteristics, is look at some subgroups by ADL categories and by economic resources. We have information on assets and income, and information on Medicaid coverage in the ten sites. This does provide a basis for sorting out what some of the determinants of community-based service use are.

That applies both in the existing environment, looking at the control group for what is happening now, and under a situation of expanded coverage for community services. If you are trying to make estimates of what would be the effect of expanding some coverage in a case managed type of system, this can provide that kind of information.

KENNETH MANTON: We have seen a number of surveys that have a complex structure, such as SIPP, and thinking about cross data base analyses becomes even more daunting.

What I think that says is that there are some issues with respect to structure. The information systems could be set up in a more parallel fashion that would facilitate cross study analysis and, for that matter, within study analysis, since we also see a lot of complexity within studies with different record types.

I think what has happened is there is a tendency to try and be efficient in data storage, to go to the different record types and put the effort into linking across records rather than saying, "Well, I will be less efficient in data storage but maximize the ability to do analysis."

I think there is probably a lot of mileage that could be gained, in terms of both cross study and within study validation analyses and substantive analyses, by looking at the question of data structures and the information system. Right now, you get the feeling that, "Well, we are putting this into NTIS," but they are coming in in different data structures. Some are turned into person-based records. Others are left with different types of episode structures. You can encode service use in a number of different ways. One is an aggregated interval type of record. The other is a full timeline type of data. A third is a bill or episode type of record. All different types of records seem to be being used and that is additional complexity. The data is complex enough without having the complexity of different people doing very different things.

I think that is one area that would require somebody to do some coordinating work in terms of what are the most efficient data and information structures, especially for the cross study validation and analytic studies.

WILLIAM SCANLON: I think that point is very well taken. I think that one of the things that we face in this is always the loss of information that comes with simplification and abbreviation. At the same time, there is the accessibility that comes with both of those things, but sometimes that loss of information cannot be tolerated.

KENNETH MANTON: Sometimes it is not even loss of information in the data structure. I mean, if you think of a rectangularized person-based file versus multiple record types encoding the episodes, both have the same information. One wastes some space on a tape, but in some cases I would much rather waste 30-40 percent of a tape than have four or five different record types. In one case I am dealing with a one-dimensional structure, sorting through on a variable format for a person. In the other case, I have got four or five different formats and record types, and I have got to do special purpose programming to link and merge variables together.

It is not necessarily a data loss question so much as a tradeoff. We have technologies for mass data storage; does it matter that a tape costs $10? Do you care if you waste 30-40 percent of it?
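The tradeoff between the two layouts can be illustrated with a short Python sketch. The record types, field names and values below are hypothetical and do not correspond to any of the public use files discussed. The sketch takes a file with two record types, a person record and a stay record, and flattens it into one fixed-width row per person, padding people who have fewer stays with empty cells. The padding is the wasted space; no information is dropped, because the row is made as wide as the largest number of stays observed.

```python
def rectangularize(records):
    """Flatten typed records into one fixed-width row per person."""
    people = {}
    for rec in records:
        row = people.setdefault(rec["person_id"], {"age": None, "stays": []})
        if rec["type"] == "person":
            row["age"] = rec["age"]
        elif rec["type"] == "stay":
            row["stays"].append(rec["days"])

    # Width is set by the person with the most stays, so nothing is truncated.
    width = max((len(p["stays"]) for p in people.values()), default=0)

    flat = []
    for pid in sorted(people):
        stays = people[pid]["stays"]
        out = {"person_id": pid, "age": people[pid]["age"]}
        for i in range(width):
            out[f"stay{i + 1}_days"] = stays[i] if i < len(stays) else None
        flat.append(out)
    return flat


# Made-up input with two record types.
records = [
    {"type": "person", "person_id": "P1", "age": 82},
    {"type": "stay", "person_id": "P1", "days": 30},
    {"type": "stay", "person_id": "P1", "days": 12},
    {"type": "person", "person_id": "P2", "age": 77},
]

flat = rectangularize(records)
empty_cells = sum(1 for row in flat for value in row.values() if value is None)
print(flat, empty_cells)
```

With the rectangular file every analysis reads one record layout; with the typed file the linking step above has to be written by each user.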

There are some implications for various computer systems. I think we are going historically rather than analyzing what the current possibilities are.

WILLIAM SCANLON: I think there are instances, though, where what we do is we aggregate and there is the potential for losing the information.

Let us turn briefly to this issue of measuring dependency, given the diversity that we see in the questions. Does anyone want to comment on what experiences you may have had with looking at these different measures? What kinds of conflicts might we expect, and how might we reconcile some of these things or be comfortable with some of these conflicts, as opposed to being disturbed by them?

KENNETH MANTON: I guess I would say that first of all this is an area where there are no simple answers. You have got to look at the questions and the wording of the questions, for example in comparing some of the prevalence estimates between the 1979 supplement and some of the NLTCS things. I was looking at an article in Milbank by Joan Comone (phonetic) Huntley and other people where they were looking at two or three or four data bases where there are different prevalence estimates, and you have to be a little bit of a lawyer to see what some are talking about with the prevalence estimates. At least as best I can tell from the write-up, and also talking to Bill Lyzak, since he produced some estimates like this, you have to see that they are talking about limitations where the help of another person is needed.

Later he said, when he could not start with some of the prevalence estimates, he was focusing on people who require personal services. There is a class of people who may get by with special equipment. You have got to be very careful about prevalence of what, and sometimes that really means going down to the specific wording of the question, especially for something like functional disability, which can be multi-dimensional and a softer concept.

For functional disability, some people are getting better, some are getting worse. If you freeze the sample or component group at a particular time, over time some of that is going to change on different time schedules.

For example, I am thinking of the NLTCS and, again, the notion of a sampling frame that captures certain types of duration-weighted phenomena. What are you getting a measure of at a given point in time and what can you tell, say, with a 2 year interval versus a 4 year interval versus a 6 month interval?

There are various types of functional or dependency changes that in and of themselves could be substantively very interesting: the hip fracture patient who becomes rehabilitated, the Alzheimer's patient who has a fairly long, regular, slow decline, the stroke patient who might have a more prolonged period of disability, but maybe there is some long term rehabilitation. All those are different types of disease entities, the relationships, the disabilities and different phenomena that are of interest for one purpose or the other.

Certain sample designs are going to pick up aspects of those and will not pick up aspects of others.

KORBIN LIU: I think in comparing national estimates, in terms of the population that is disabled, there is the other question of which variables are used in estimating disability. I think even when we were looking initially at the long term care survey, we had selected a set of ADL's and IADL's, and Candy Macken, who we were working with, had selected a different set.

Immediately we came up with different estimates right there. I think for some of these comparisons, for example with the 1979 home care supplement, the question would be whether we were using the same ADL's. If you see differences in the rates, they may be a function of the ADL's and IADL's which were used by the two groups.
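A toy calculation shows how the choice of items alone can move a prevalence estimate. The respondents and item names below are made up for illustration and are not drawn from any of the surveys discussed; counting a person as disabled if he or she has difficulty with any listed item is also an assumption. Applied to the same four hypothetical respondents, a two-item ADL list and a list that adds one IADL item give different rates.

```python
# Hypothetical respondents: True means difficulty was reported on that item.
respondents = [
    {"bathing": True, "dressing": False, "shopping": False},
    {"bathing": False, "dressing": False, "shopping": True},
    {"bathing": False, "dressing": False, "shopping": False},
    {"bathing": False, "dressing": True, "shopping": True},
]


def prevalence(people, items):
    """Share of people reporting difficulty with at least one listed item."""
    disabled = sum(1 for person in people if any(person[item] for item in items))
    return disabled / len(people)


adl_only = prevalence(respondents, ["bathing", "dressing"])
adl_plus_iadl = prevalence(respondents, ["bathing", "dressing", "shopping"])
print(adl_only, adl_plus_iadl)
```

Neither rate is wrong; they answer different questions, so a comparison across data bases needs the item lists lined up first.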

JOAN VAN NOSTRAND: I just wanted to make the point and kind of expand a bit on what Korbin was saying. We are not only looking at the different ADL's and particularly IADL's, because this whole issue of what you need to be able to do to live independently in the community is a crucial one; there are also other kinds of screening variables that tend to have some important implications.

One of them is the fact that many of these questions start off with saying, "Because of a mental or physical problem do you have difficulty with ..." and then they list the particular ADL or IADL. These problems they are looking at are chronic. Believe it or not, there are two major definitions of "chronic" and they differ. That leads to very different estimates: what you see in the NLTCS is a much higher rate than what you would see from the 1984 SOA. It has to do with the question. In the 1984 SOA, I believe, it is whether this condition is chronic and has lasted for 3 months or more, and there is a certain list of chronic conditions. In the NLTCS, I believe it is that it has lasted 3 months or is going to last for 3 months. That judgmental factor by the person who is responding as to whether or not that is going to last for more than 3 months makes a big difference in what our estimates are.
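The effect of the two screening rules can be sketched in Python. The rules below paraphrase the speaker's recollection ("I believe") and should not be read as the actual questionnaire wording of either survey; the function names and the example conditions are hypothetical. The first rule counts a condition only if it has already lasted 3 months. The second also counts a condition that the respondent expects to last 3 months, so it can never count fewer people than the first.

```python
def chronic_lasted(months_so_far, expects_three_months):
    """Narrower rule: the condition has already lasted 3 months or more."""
    return months_so_far >= 3


def chronic_lasted_or_expected(months_so_far, expects_three_months):
    """Broader rule: it has lasted 3 months, or the respondent expects it to."""
    return months_so_far >= 3 or expects_three_months


# Made-up conditions: (months so far, respondent expects it to last 3 months).
conditions = [(6, True), (1, True), (1, False), (4, False)]

narrow_count = sum(1 for c in conditions if chronic_lasted(*c))
broad_count = sum(1 for c in conditions if chronic_lasted_or_expected(*c))
print(narrow_count, broad_count)
```

The gap between the two counts comes entirely from recent conditions where the respondent's own judgment about the future decides the answer.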

EVELYN MATHIS: I would add that you need to pay attention to the way the questions are worded and look for things like "need" or "help." For those people who are in institutions, you also need to be aware of the policies of the institutions. Some of those policies will cause people to become more dependent, or, even if they are not dependent, they are still going to get the help just because of the policies and rules of the institution.

The other thing that you need to be aware of, too, if you can get at the medical diagnoses a lot of that will help you with some things that could cause dependencies.

WILLIAM SCANLON: We are now going to turn to questions from the audience and, hopefully, we have already dealt with some of the things that were of concern to you. Now let us hear about other issues.

QUESTION: I think the first comment I would like to make is I wish you would all applaud for the generosity of the producers who produce all the data for public use.

As a key consumer of the public use data tapes, I am a little bit puzzled by the degree that all the data being gathered were considered to identify the market attributes. By that I mean the individuals living in the community are not living alone. They are not living in the vacuum system, but they are living in the area that faces the competitive market situation and living in that situation they do not know what kind of providers are giving the proper services.

My question is to what degree the NLTCS's have some identifier to identify the Zip Code, county code or city code, and so on? This is the great distinct possibility to link the individual file with area resources file that many of you probably have utilized. Therefore, you can identify to what degree the market forces impact upon the use of service and to what degree they impact upon the access to care.

That is my first question. My second question is analytically whether we could analyze the data either at an individual level or at an everyday level, that is individual behavior in use of long term care services.

At the individual level we all know that, even including all the variables you can analyze, you may account for roughly 20-30 percent of the variance in the chance of being institutionalized. The question is where are those unknowns? What are those black boxes?

One possibility may be to identify what the contextual variables are that are beyond individual behavior. The individual attributes might impact upon the use of services or impact upon outcomes of care.

At an everyday level, I think that is a very serious one that we keep gathering data at an institutional level, but pay very little attention on quality of care issue, that is the process of care issues.

I think certain elements should be incorporated in the institutional base survey, not just who the patient is, but also what kind of providers are giving service to him, and to what degree the quality of the care within the institution can be rated consistently.

My third question maybe is much more global. To what degree can we identify so-called determinants of institutionalization or determinants of adverse outcomes of care? Is it possible that we can compose a summary index to identify a so-called frailty level?

Susan Hughes mentioned to me that she is trying to formulate a one page instrument that allows you to summarize the risk. If that is the case, to what degree can concurrent validity be addressed? That is, why do we need a 45 page instrument instead of one if one can provide a better result?

I think that it is about time to think about if we can simplify all the data gathering in terms of a much more easy way to identify either frailty level, risk of the institutionalization, risk for hospitalization or risk for dying?

WILLIAM SCANLON: I think there were both some questions there and some real challenges, so I am glad I am a moderator and not a member of the panel.

KORBIN LIU: I guess the one response to the first one is that, obviously, there is no perfect data base for everything. The NLTCS was a nationally representative survey. The sample size was 6,000 people.

I do not think one could ask that type of survey to have geographic specificity and also have cases by geographic areas, like counties, to have any significance.

One of the reasons I think we are here is to look at the multiple data bases, to see that they are available, and to see how we could derive information from each one to help concatenate or whatever to get the best possible estimates.

On the institutional quality, that is a very geography-specific, person-specific, and facility-specific type of survey you are talking about. Incidentally, I think that is very hard to do on a national basis.

There are a number of other options. One is that there are administrative records systems like the Medicare/Medicaid automated certification system that surveys every facility, in theory, every year.

You have facility specificity there. There is some case mix information that was collected in particular data sources. We are seeing an emergence of case mix reimbursement for nursing homes. The data systems that are being derived to establish payment rates have a tremendous amount of patient specific information, because they have got the case mix based system. There are at least four or five states now that have case mix data.

That data is also collected longitudinally. I think there are some plans afoot, sponsored by the Health Care Financing Administration (HCFA), to use that longitudinal data, collected for the case mix payment, to track patients and to look at outliers in terms of, perhaps, decubitus ulcers or whatever, and use that as a tickler system for quality assurance monitoring.

KENNETH MANTON: I think what is relevant in almost any of the national surveys is the local area estimation or looking at the characteristics for identifiable local areas as opposed to using controls for market characteristics in a given set of local areas.

One is can you produce stable rate estimates for a given set of counties. The other is to say that in the areas that you sampled, can you get an initial set of factors as market characteristics that could be put into an explanatory model?

I think the issue there, in terms of national surveys, is these things are generally structured geographically and the question is what is the primary sampling unit structure that is used in a given survey?

A lot of times that is based on a county level thing. If you have provision tied to a county level marker, you could in theory link that. The Zip Code is obviously a lot more specific and obviously when you are going out and surveying the people you have their addresses available. Then there get to be some very tricky confidentiality issues in terms of how far can you decompose the geographic identifiers.

In counties it might be possible for a PSU level type of decomposition. The PSU's are tied either to counties or groups of counties. There would be a basis of linking them there. The Zip Codes are obviously available to the Bureau of the Census, but that probably would not be possible to do in terms of a public use tape.

Then to go to the second question, which is an analytic one in terms of you have a lot of unexplained variance. You do not believe all the rest is simple stochastic random factors. They are systematic unobservables. What important variables are we missing? You can think about that, too, in terms of aggregation effects.

There are various models dealing with the question of aggregation bias as you build up from individuals into county groups or groups of individuals and ask what the unobserved heterogeneous components are in the analytic modeling effort.

There are models which adjust for heterogeneity bias in economics, for unemployment durations, etc. There are similar types of models for health event data, which say that there are systematic components but we cannot identify the variables. We can adjust our model estimates for some of those effects if we conceptualize it in certain ways.

One of the ways to get a handle on those unobserved effects is by looking at the packaging of that variance across different size aggregates or units. There are some analytic strategies for that.
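One simple way to look at how variance is packaged across aggregates is to split the total sum of squares into a between-group and a within-group part. The sketch below does that for made-up county data; it is a basic analysis-of-variance decomposition, offered as an illustration only, and is not the heterogeneity-adjusted modeling referred to above. County labels and outcomes are hypothetical.

```python
# Hypothetical outcomes (1 = institutionalized, 0 = not) grouped by county.
counties = {
    "A": [0, 0, 1, 0],
    "B": [1, 1, 0, 1],
    "C": [0, 1, 0, 1],
}

all_values = [x for values in counties.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Total variation of individuals around the grand mean.
total_ss = sum((x - grand_mean) ** 2 for x in all_values)

# Variation of county means around the grand mean, weighted by county size.
between_ss = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in counties.values()
)

# Variation of individuals around their own county mean.
within_ss = sum(
    (x - sum(values) / len(values)) ** 2
    for values in counties.values()
    for x in values
)

between_share = between_ss / total_ss
print(f"Share of variation lying between counties: {between_share:.1%}")
```

Repeating the calculation with larger or smaller groupings shows how much of the unexplained variation sits at each level of aggregation.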

THOMAS GRANNEMANN: I want to make just a few comments on the third question which was the ability to predict institutionalization.

We found in the Channeling data that it was fairly difficult to have a good predictor of institutionalization, and it was not really entirely clear what the cause of that is.

I think there is certainly some reason to think there are some imperfect measures, in terms of the way we measure ADL and IADL, that do not really capture the critical factors.

There are also potentially unobserved things that perhaps might be observable or might not be observable factors that may cause people to go into a nursing home.

The other factor, which I think there may be some evidence for, is the precipitating event. When we look at people at a point in time, we cannot predict very well who will go in because we do not know what is going to happen to them at a later point. There may be things that you cannot observe at a given point in time that later happen to them and cause them to go in.

I think there is some evidence of this when you compare studies that have looked at predicting in advance who will go in, those are able to explain less of the differences than the cross-sectional studies which look and say who is there already.

There is some evidence from those kinds of studies that say you can identify some of those characteristics after they are in, but by then it is too late to use it as a predictor.

A comment on the question about whether there is a single number or index that could be constructed. One of the things we did was a HCFA funded follow-up study for the Channeling evaluation, to look at ways of predicting institutionalization. One of the things we did was to develop a summary index based on multiple regression analysis that took account of all the various variables we had on the file and tried to construct an index which was a predictor of nursing home use.

We were, in fact, successful in identifying a high risk group, a group that had several times the probability of going in as the whole Channeling sample, which to begin with started with a fairly impaired group.

I think there was some success there. The purpose was to try to identify people who could be targeted to be diverted from institutionalization, and I think on that count we were less successful in being able to identify people who could be diverted by Channeling. I think we have gone a fair way toward identifying risk factors that can allow you to identify high risk and low risk groups for institutionalization, and I think Channeling data is one source that provides a good basis for making those kinds of estimates, although, again, with the qualifications we talked about earlier. That was not a general population we looked at.
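The idea of a summary index that picks out a high risk group can be sketched as follows. This is a simple additive score with made-up weights and made-up records, standing in for the regression-based index described above; in a regression approach the weights would be estimated coefficients. Nothing here is taken from the Channeling data.

```python
# Hypothetical sample: (number of ADL limitations, lives alone, entered a nursing home)
sample = [
    (0, False, False), (1, False, False), (1, True, False), (2, False, False),
    (2, True, True), (3, False, False), (3, True, True), (4, True, True),
    (4, False, True), (0, True, False),
]

# Illustrative weights and cutoff (assumptions, not estimated values).
W_ADL = 1.0
W_ALONE = 1.5
THRESHOLD = 4.0


def risk_score(adl_count, lives_alone):
    """Additive summary index: higher means higher assumed risk."""
    return W_ADL * adl_count + W_ALONE * (1 if lives_alone else 0)


def entry_rate(people):
    """Share of people in the list who entered a nursing home."""
    return sum(1 for p in people if p[2]) / len(people)


high_risk = [p for p in sample if risk_score(p[0], p[1]) >= THRESHOLD]

overall_rate = entry_rate(sample)
high_rate = entry_rate(high_risk)
print(f"Whole sample:    {overall_rate:.0%} entered")
print(f"High risk group: {high_rate:.0%} entered ({len(high_risk)} people)")
```

An index like this identifies who is at high risk; as noted above, it does not by itself show who could be diverted from institutionalization.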

JOAN VAN NOSTRAND: I want to address the second issue which deals with the question about quality of care, and the need for data, particularly on the quality of institutional care.

As you know, HCFA has been recently reviewing their quality of care process and is in the process of instituting an entirely new process which focuses not so much on the paper and pencil review, but on actual information about the clients. I think that is a big step forward.

The Institute of Medicine has put out a very large volume on looking at quality of care and measuring quality of care.

In particular for the NNHS, as we were going through the process of developing a pretest and including in it some questions on quality of care, we had a whole variety of questions. It was clear as we went to the Office of Management and Budget (OMB) for our final clearance that there were more questions than we could pay for, because resources, as always, are limited.

We turned to some of our sister agencies, particularly HCFA and basically made a plea for funding. They have limited resources also. We were not able to get the funding, and the guidance we received from OMB was that the focus was to be more on issues of financing and a deemphasis on issues of quality.

We used that guide to help us decide what we could possibly include in the survey, given the amount of money that was available.

GERRY HENDERSHOT: I wanted to return briefly to the question about the small area data.

It is possible to link data for counties to the National Health Interview Survey (NHIS) data files including the SOA. If you have a good county level data file, it is possible to merge that with the SOA, and there are only two obstacles to that. One is it is not part of our usual work and we do not budget for it, and so somebody has to come up with the money to do it.

It is not that expensive to do. It is not that difficult a task, if you have got the good area data file.

The second problem is confidentiality. We do not release small area identifiers on public use data tapes for reasons of confidentiality, so the actual match would have to be done by some kind of special arrangement between us and the user. Usually we do the match ourselves for the user. We are willing and able to do that kind of matching and have, in fact, done it frequently in the past.
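The kind of match described here can be sketched in a few lines. The record layouts, field names, and values are hypothetical, and the step that drops the county code only gestures at the confidentiality concern; it is not a disclosure review procedure, since attached area values can themselves be identifying.

```python
# Hypothetical survey records carrying a county code (held by the agency,
# not released on the public use tape).
survey = [
    {"person_id": 1, "county": "A", "uses_home_care": True},
    {"person_id": 2, "county": "B", "uses_home_care": False},
    {"person_id": 3, "county": "A", "uses_home_care": False},
    {"person_id": 4, "county": "C", "uses_home_care": True},
]

# Hypothetical county-level area file supplied by the user.
area_file = {
    "A": {"nursing_home_beds_per_1000_elderly": 45.0},
    "B": {"nursing_home_beds_per_1000_elderly": 60.0},
}


def link_area_data(survey_rows, area_rows):
    """Attach county characteristics to each person, then drop the county code."""
    linked = []
    for row in survey_rows:
        merged = dict(row)
        merged.update(area_rows.get(row["county"], {}))
        merged["area_matched"] = row["county"] in area_rows
        del merged["county"]  # the small area identifier does not leave the agency
        linked.append(merged)
    return linked


linked = link_area_data(survey, area_file)
for record in linked:
    print(record)
```

Flagging unmatched records, as `area_matched` does, lets the user see how complete the area file's coverage of the sampled counties is.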

QUESTION: I do appreciate the opportunity to hear from the people closest to the data. The question I have is much easier than the last one you had, but it is in three parts. I will be very quick.

In the No. 131 Advance Data, 1985 NNHS the indications were that the number of beds was up over 1977, the number of homes was up, the occupancy rate was up, discharges were up, but not as much. Admissions were actually down in absolute number, and the admission rate was down about 18 percent, if you look at the rate per bed.

My question is, is this a function perhaps of a change in definition between 1977 and 1985 of what constituted a discharge or an admission, or is there an implication with respect to length of stay?

Finally, was there a cross-check done between the Inventory of Long Term Care Places (ILTCP) number of 1985 admissions to test to see if this was a consistent number, that admissions were actually down?
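The arithmetic behind a rate per bed can be sketched as follows. The counts below are round, made-up numbers chosen only so that the rate falls by roughly 18 percent; they are not the figures from Advance Data No. 131.

```python
# Hypothetical round numbers (NOT the published NNHS figures).
beds_1977, admissions_1977 = 1_000_000, 1_100_000
beds_1985, admissions_1985 = 1_200_000, 1_080_000

# Admissions per bed in each year.
rate_1977 = admissions_1977 / beds_1977
rate_1985 = admissions_1985 / beds_1985

# Relative change in the rate between the two years.
pct_change = (rate_1985 - rate_1977) / rate_1977

print(f"1977: {rate_1977:.2f} admissions per bed")
print(f"1985: {rate_1985:.2f} admissions per bed")
print(f"Change in rate: {pct_change:.1%}")
```

Because beds rise while admissions fall slightly, the rate per bed drops much more than the absolute number of admissions does.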

EVELYN MATHIS: The ILTCP was mainly an updating of the National Master Facility Inventory (NMFI).

In the NMFI, the definition of places in the inventory did not change a great deal. How we got the data changed over the years, but the definition of places in the inventory did not change.

The inventory is made up of three different categories of places. One category is hospitals. Of course, we did not use that in the NNHS other than where we got a list of those nursing homes that were based in hospitals.

The second category of places in the inventory, that category we call "nursing and related care homes."

The third category of places in the inventory is facilities for the mentally retarded.

In that other category in the NMFI, not only do we have places for the mentally retarded, there are also facilities for the physically handicapped and the emotionally disturbed, and a lot of other kinds of places.

For the inventory that you heard earlier it included two categories, the one that was the facilities for the mentally retarded and the nursing and related care home.

In the past, and I think it was up until about 1976, we had all of the facilities in the inventory, we sent out a questionnaire and we classified them based on some common criteria that we applied across all facilities.

Beginning in 1976, up through I think it was 1982, we got some of the data from the states and the states that did not supply us with the data, we surveyed those places ourselves. We did not attempt to apply that common set of criteria across the board to all facilities.

If a state reported something as being a nursing facility, then we accepted it as a nursing facility. If it was reported as being a residential place or personal care home, we accepted it as such.

We have not had a chance yet though to look at the data to see how much of an impact the way the data got reported and collected had on the findings of the NNHS.

One thing to bear in mind, regardless of how we got the data, all of the facilities had to meet some very basic criteria.

Number one, they all had to provide care on an in-patient basis. They all had to have at least three beds or more and they all had to provide some level of care in addition to just room and board. If room and board were the only type level of care provided, they were out of scope for the purposes of not just the NNHS, but the NMFI also.

When the interviewer went to the facility in the NNHS, one of the first questions the interviewers asked as a part of our questionnaire so they would not forget, was to establish whether something else was provided in addition to just room and board.

I do not think that you are going to find a great variation in the kinds of places that you are going to find in the universe.

With the second report, No. 135, there was a very detailed outline of the kinds of places that were placed into the universe for sampling.

We also noticed those differences. We think there are several different reasons why you are observing the things that were pointed out in that report.

QUESTION: I am a little concerned. Evelyn said a minute ago "Bear in mind the medical diagnosis." I have a little trouble with that, especially those of us in mental health and psychosocial illness. I think because of the Institution for Mental Diseases (IMD) rule, the medical diagnosis does not mean a great deal in the nursing home, and I would like to get some explanation or her reflections on that.

By and large if an institution has more than 50 percent of their nursing home patients with a primary psychiatric diagnosis, it is considered an IMD. Then there is a question about the reimbursement from Medicare/Medicaid if it is an IMD. There seems to be a tendency in many areas in nursing homes to give the patient a reimbursable diagnosis rather than the true medical operating diagnosis. I favor talking about functional capabilities instead of diagnosis anyway.

EVELYN MATHIS: The only way that I can respond to your question is that in developing what we call the current resident questionnaire, and that was the questionnaire that was completed for the sample of residents, we have an open-ended question there where we ask the interviewers to list all of the diagnoses that were listed in the record, and they did it two different times. One was at the time of admission and one was the current or the latest.

We also have some questions on the questionnaire that we got from the National Institute of Mental Health (NIMH) and they gave us the questions to try to capture whether or not the nursing home residents had a mental health diagnosis. They gave us the categories to ask.

NIMH will be getting the data tape. Get in touch with me and I will let you know the office to reach.

The other help that we got from NIMH in this, after we had keyed all of the data, was a check list. In addition to the check list, they had written in "and other specified."

We sent all of the "other specified" to the NIMH. They went through the check lists along with what was written, because some things were duplicated, and they went through that to make sure that what was written in was the same as what they had checked.

We depended entirely on NIMH to help us with the mental health diagnosis. I hope what we collected would be responsive to NIMH, at least we tried to be.

JOAN VAN NOSTRAND: There has been a lot of discussion, particularly with diagnosis related groups (DRG's), that there is often an effort to game them and select the diagnosis for which the payment would be the largest, although most of the comments I hear from HCFA are that they feel that there is no gaming on the DRG's.

For the NNHS, the sample was something like five current residents and six discharge residents. There was really no way to tell what was happening with 50 percent of them.

Most of the time the responses were given by several nurses. Generally, we asked to talk to the nurse that was most familiar with the care. It is my judgment that they are not as concerned about answering this in terms of that IMD rule.

The other issue I think, though, that it raises is one that Evelyn brought up. What we are asking them to report are things that are recorded in the record. How that IMD rule affects what was recorded in the record is something else again. I would think one might want to do studies on that.

I think based on what was recorded I do not feel that there was that much of a problem with what was reported by those varying nurses on those very few residents or discharges in the nursing home. It is a good question and a great issue to deal with.

QUESTION: Are there any plans or recommendations that you can make to help us with looking at the DRG effect on the institutionalized elderly? We are finding, at least in New York, that we are bouncing them back and forth to the hospital. They are admitted [to the nursing home] from the hospital, and they are going back to the hospital because they are coming in too unstable.

I need to link that question, though, with the nursing shortage which is crippling us, particularly in the nursing home sector. Are there any places to do studies or to develop data to look at the nursing shortage in terms of patient dependency, morbidity, and mortality.

JUDITH WOOLDRIDGE: I just would like to mention that HCFA is working on a study of aftercare right now with Mathematica Policy Research (MPR). Aftercare is defined as the care that people receive when they leave the hospital. I do not think that this is related to the nursing shortage issue.

QUESTION: The linkage I am trying to make is that we seem to be getting the patients sicker and quicker. That sounds cute, but it is a reality. We do not have registered nurses (RN's) in nursing homes to the degree we had them 3 years ago. If we are getting a more technologically dependent patient in the nursing home, we do not have the staff to give them that care. We are bouncing them back to the hospital, which I was wondering also if that affected your admission data. In other words, the number of admissions in X period of time. Are you carrying that as an admission or as a transfer in and out?

JUDITH WOOLDRIDGE: With respect to the Channeling data, I can mention that the majority of the data were collected before DRG's became effective, so that is not an issue.

To the extent that the bouncing back and forth did begin for people in the Channeling sample, as a result of DRG's, that is simply not something that you can identify in our data set.

EVELYN MATHIS: In the NNHS, the Bureau of Health Professions, sponsored one component to look at nurses in nursing homes. The Bureau is in the process of analyzing the data and reviewing it.

In the facility questionnaire, we asked the administrator, or whoever was designated to respond, to give a count of the numbers of full-time and part-time employees in various categories working in the nursing homes. I know that the Bureau and ourselves will be doing some review of that data from the NNHS, but the sample is limited to just RN's, whom we asked to fill out a questionnaire.

A part of that questionnaire was to ask them questions about retention; what was important to them in looking for a job, working in nursing homes, why they would, why they would not, how important certain things were to them to stay there, and why they would not.

I think you will see in about a year or so, some information about what things are important to RN's to accept positions and to work in nursing homes. I know that they are looking at the data and they are in the process of analyzing it. I know they are also planning some very detailed reports on that issue.

KORBIN LIU: We are doing a study with the 1982 and 1984 NLTCS's with the merged Medicare records. I think that the nature of that study highlights, in many ways, the power of that data base, particularly with the Medicare services involved. The study is directed toward looking at pre/post-PPS changes in hospital readmission and post-acute care use.

The 1982 NLTCS constitutes a pre-Prospective Payment System (PPS) situation. The 1984 is almost a post-PPS. We have got to keep in mind that PPS did come in place October 1983.

On the other hand, the beauty of this merged file is that Medicare records are continuous and so we have got Medicare records looking from 1980 to the present time; and that gets into 1985, because we would expect most of the PPS effects to occur not in 1984 but in 1985. I bring this up in part, to highlight the utility of that particular data base which will be public use very soon.

At the same time, Medicare records, the enrollment records, have information on mortality and status of the patients. With these types of administrative records merged with survey data, you have the ability to look at changes over time. With the survey data, the NLTCS are loaded with ADL information which one would not find in the routinely collected information in the administrative records for Medicare.

We have a combination of case mix information that you can get from the survey at two points in time and this continuous utilization information.
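A minimal sketch of a pre/post comparison on continuous records follows. The hospital stays are made up, the field names are hypothetical, and using a single cutoff date of October 1, 1983 for every stay is a simplifying assumption about how PPS took effect.

```python
from datetime import date
from statistics import mean

# PPS came into place in October 1983; one cutoff date for all stays is an assumption.
PPS_START = date(1983, 10, 1)

# Hypothetical hospital stays drawn from continuous Medicare-style records.
stays = [
    {"admit": date(1982, 3, 10), "los_days": 11},
    {"admit": date(1983, 6, 2), "los_days": 9},
    {"admit": date(1984, 1, 15), "los_days": 8},
    {"admit": date(1985, 5, 20), "los_days": 6},
    {"admit": date(1985, 9, 1), "los_days": 7},
]


def period(stay):
    """Classify a stay as pre- or post-PPS by its admission date."""
    return "post" if stay["admit"] >= PPS_START else "pre"


# Group lengths of stay by period, then average within each period.
los_by_period = {"pre": [], "post": []}
for stay in stays:
    los_by_period[period(stay)].append(stay["los_days"])

mean_los = {name: mean(values) for name, values in los_by_period.items()}
print(f"Mean length of stay pre-PPS:  {mean_los['pre']} days")
print(f"Mean length of stay post-PPS: {mean_los['post']} days")
```

Because the records are continuous, the same classification can be applied to stays in 1985 and later, not just to the two survey years.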

KENNETH MANTON: Korbin summarized that fairly well, but I would just emphasize that with the different types of episodes, for certain windows, one can look at the tradeoffs in terms of both length of stay and rate of admission between the different types of services. With the changes in the hospitalization rates and lengths of stay, you could see corresponding shifts pre and post in terms of home health. You have the detailed survey data on this highly vulnerable, high-service use group, where you can get very extensive information on their functional dependency and examine the tradeoffs there for that particular group, but you also have all the other sample components.

You can look at these service substitution or tradeoffs pre and post both for a highly chronically disabled community-based population. Then for the sub-samples of a non-chronically disabled population, you can sort them out as well as the institutional group in 1982. You do not have the change data for 1984, but you do have a bit more information on the institutionalized and you have people on the 1984 survey.

That gives you an additional dimension, a look at the change effects within a particular interval pre and post in terms of if people are not going into hospitals where might they be going, what other types of services are they using and what is the pattern of that service use, and gives you a systematic effect, if you will, of the impact rather than just saying, "Well, what was the impact on hospital episodes?"

QUESTION: Betty Cornelius at HCFA has some data on staffing in relationship to quality of care, and she is tracking it in five different states, looking at the area of dehydration, decubiti, incontinence and several other areas. In fact, she had some preliminary findings that there is an inverse relationship between decubiti, incontinence, and dehydration and staffing ratios.

WILLIAM SCANLON: I would add that HCFA is interested enough in this problem that, in addition to all the other work that you just heard about, they are funding a study of ours, which is looking at area variations in post-hospital use, pre/post-PPS, and we are looking at the issue of readmissions as well as the use of Medicare skilled nursing facilities (SNF's) on an area basis.

QUESTION: I have the impression that the more research is done the less policy is made.

In Europe and in the country where I live, the research is not on the level as I heard earlier, and that is also why I am here. On the other hand, in my country, and I know also of my neighbor countries, a lot of policy for the elderly is made at the state and federal level on providing good insurance for the elderly people, building nursing homes and so on. I am surprised that there is so much emphasis on collecting data, but the final question is what to do with those data and what is the impact on policy.

KORBIN LIU: I think that the problem is you can not anticipate the policy questions. They are catalyzed by events.

You do not have enough data and you do not have it fast enough to directly address the question, you need to average the data bases. These surveys were conducted in the effort to see what they can each do. The nature of this particular session is to give you a sense for what kind of resources you have when the policy question comes up, and then be able to answer it.

Sometimes it happens that way and sometimes it does not. I mean, clearly we all know situations where policies are made without any empirical support for them.

Let me just talk about a little research that Ken and I did some years ago. We were very interested in the question of length of stay in nursing homes. I think that was 7 years ago when we started that. It was fairly arcane research, and I guess he and I were interested in terms of an intellectual issue.

I think we are fairly gratified now, as private long term care insurance has turned into more than just a notion, that some of the research became probably more relevant now than it was. The length of stay information is probably more relevant now than it was 7 years ago.

It takes maybe 7 years for the data or that type of analysis to become useful for policy. I think maybe that is the nature of the process.

KENNETH MANTON: We had some questions and I think there has been a lot of discussion in terms of change in reimbursement policy and the effect of quality of care. That is an attempt to monitor and fine tune policy in terms of outcomes that are producers, and I think that is probably a very important type of monitoring that one would want to conduct.

Though it can be expensive, research tends to be a lot less expensive than certain types of policy mistakes.

It is important to be efficient. One of the better examples of that is the recent evidence on the changeability of functional status of elderly and the oldest old. The potential for that suggests that there are different types of options for providing services to those individuals. Some of them may be more in a preventative mode of maintaining functional status, and that maybe keeps them from a dependent relationship.

A lot of research at the National Institute on Aging (NIA) is more basic research. Clearly when you look at the projections, the demographic structure of the population, you go out 10, 15, 20 years and you start getting some of these larger birth cohorts coming through. If we keep the traditional patterns at roughly the same rates, we are just going to be overwhelmed with the oldest old population in 20-30 years and the long term care demands.

It almost demands an alternative solution or just an incredible drain on general economic resources within the country.

There may be much more potential for improving the functional status of older people. How do we translate that into action? That seems to me a societal or a humanitarian goal.

The other is if you look at the demographic imperative or population aging, you almost have to develop some alternative responses.

WILLIAM SCANLON: I guess I would add to that, I am usually pessimistic about a lot of things, but one of the things that I feel more optimistic about is the progress that health services research has made, because in some sense health services research is still in its infancy. We are really now getting data bases in long term care that are much better than we have ever had before.

Legitimately, lack of knowledge was a barrier to adopting policies. The federal government has a problem in that once it takes a step, it really does not have a reverse gear to back out of a mistake.

That hesitancy on the part of the federal government is very understandable. All kinds of things have gone on within the states that are much more innovative, much more experimental, and sometimes are abandoned because it is politically easier to do that at a state level or at a local level, so that I think we are learning a lot. It should not be an excuse for too long that we are ignorant and therefore we cannot do anything, but to some extent in the past that approach has had some legitimacy.
