Although the sampling basis of the two samples is comparable, the underlying populations from which the two samples are drawn differ in several respects. Figure 1 shows, for example, that there are major socio-demographic differences between the privately insured and the non-privately insured samples.
Aside from these known and identifiable differences, there are likely to be other, unobserved differences between the two groups that affect service utilization in systematic ways. An underlying assumption of multivariate modeling is that the effects of omitted (unobserved or unmeasured) variables are randomly distributed, that is, unrelated to the variables included in the model. If they are not, excluding them can bias the estimated coefficients of the included variables, including the key indicator variable: insurance status.
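The concern can be illustrated with a small simulation. This is a minimal sketch on invented data, not drawn from either sample. The unobserved trait `u`, the true effect of 1.0, and the coefficient of 2.0 are all hypothetical values chosen for illustration. When the unobserved trait drives both the purchase decision and utilization, the simple insured-versus-uninsured comparison overstates the insurance effect. When the trait is unrelated to purchase, the comparison recovers it.

```python
import random
from statistics import mean


def naive_insurance_effect(correlated, n=20000, true_effect=1.0, seed=0):
    """Difference in mean utilization between insured and uninsured.

    This equals the regression coefficient on the insurance indicator
    when that indicator is the only variable in the model.
    All numbers here are hypothetical and purely illustrative.
    """
    rng = random.Random(seed)
    insured_y, uninsured_y = [], []
    for _ in range(n):
        # Unobserved trait that raises service utilization.
        u = rng.gauss(0, 1)
        # If correlated, the same trait also drives the purchase decision;
        # otherwise purchase depends on an unrelated random draw.
        latent = u if correlated else rng.gauss(0, 1)
        insured = latent > 0
        # Utilization: true insurance effect + omitted trait + noise.
        y = true_effect * insured + 2.0 * u + rng.gauss(0, 1)
        (insured_y if insured else uninsured_y).append(y)
    return mean(insured_y) - mean(uninsured_y)


biased = naive_insurance_effect(correlated=True)
unbiased = naive_insurance_effect(correlated=False)
print(f"omitted trait related to purchase:   {biased:.2f}")
print(f"omitted trait unrelated to purchase: {unbiased:.2f}")
```

In the first case the estimate absorbs the effect of the omitted trait and lands well above the true value of 1.0. In the second it stays close to 1.0.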
To address this problem, subsequent analyses will employ sample selection models where appropriate. These models control for the effect of unobserved variables related to having an insurance policy. Put another way, they allow us to separate the effect of "who purchases insurance" (the "insurance purchase bias") from the effect of "who has insurance" (the "insurance effect"). See Appendix 1 for more detail.
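For orientation, one common textbook formulation of a selection model with an insurance indicator is sketched below. It is offered as an assumption, not as the specification used in this report, which is given in Appendix 1. The symbols $x_i$, $z_i$, $\beta$, $\gamma$, $\delta$, $\rho$ and $\sigma$ are introduced here for illustration only.

```latex
\begin{aligned}
I_i^{*} &= z_i\gamma + u_i, \qquad I_i = \mathbf{1}\,[\,I_i^{*} > 0\,]
  && \text{(insurance purchase)} \\
y_i &= x_i\beta + \delta I_i + \varepsilon_i
  && \text{(service utilization)} \\
(u_i, \varepsilon_i) &\sim N\!\left(0,\;
  \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^{2} \end{pmatrix}\right) \\[4pt]
E[\,y_i \mid I_i = 1\,] &= x_i\beta + \delta
  + \rho\sigma\,\frac{\phi(z_i\gamma)}{\Phi(z_i\gamma)} \\
E[\,y_i \mid I_i = 0\,] &= x_i\beta
  - \rho\sigma\,\frac{\phi(z_i\gamma)}{1 - \Phi(z_i\gamma)}
\end{aligned}
```

In this sketch $\delta$ plays the role of the "insurance effect", while the terms multiplied by $\rho\sigma$ capture the "insurance purchase bias". When $\rho = 0$ the unobserved determinants of purchase are unrelated to utilization, and the correction terms vanish.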
FIGURE 1: Differences in Socio-Demographic Characteristics of Disabled Elders by Insurance Status
SOURCE: 1999 Insured Panel (n=583). For income, n=483. 1994 NLTCS data (n=1357). For education variable, n=1276.
NOTE: Differences are significant at the .05 level.