How Are Immigrants Faring After Welfare Reform? Appendix 2: Editing and Imputation


In any survey, some data are missing, underreported, or misreported. To adjust for these problems, survey researchers typically edit (or clean) their data and impute values to compensate for missing responses, which corrects the problems at least partially. In LANYCIS, we focused on editing and imputation in three areas: income, health insurance, and immigration status.

One common problem in surveys is that respondents know that someone in the household has a certain type of income or health insurance, but might not know how much income or what type of insurance. Sometimes they are unsure whether there is any income or insurance at all. For LANYCIS, John Coder of Sentier Research imputed income and insurance status of individuals using "hot deck" imputation methods, which are also used in surveys like the Current Population Survey (CPS) and National Survey of America's Families (NSAF). These methods fill in missing responses with responses sampled from similar types of people. Imputation may assign erroneous (or questionable) values to individual cases, but it should generally improve the distribution of cases for the overall sample.
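As an illustration of the general idea, the following is a minimal sketch of a random hot deck within imputation cells. It is not the procedure used for LANYCIS, whose details are not given here; the function name `hot_deck_impute`, the record layout, and the choice of cell variables are all assumptions made for the example.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, field, cell_keys, seed=0):
    """Fill missing `field` values with a value drawn at random from a
    donor in the same imputation cell (records sharing all `cell_keys`).

    Hypothetical helper for illustration only. Records are dicts, and a
    missing value is represented by None.
    """
    rng = random.Random(seed)

    # Build the donor pool: observed values grouped by imputation cell.
    donors = defaultdict(list)
    for rec in records:
        if rec.get(field) is not None:
            cell = tuple(rec[k] for k in cell_keys)
            donors[cell].append(rec[field])

    imputed = []
    for rec in records:
        new_rec = dict(rec)          # leave the input records untouched
        new_rec["imputed"] = False   # flag so imputed cases can be tracked
        if rec.get(field) is None:
            cell = tuple(rec[k] for k in cell_keys)
            pool = donors.get(cell)
            if pool:                 # no donor in the cell: value stays missing
                new_rec[field] = rng.choice(pool)
                new_rec["imputed"] = True
        imputed.append(new_rec)
    return imputed
```

Keeping an `imputed` flag on each record makes it possible to compare results with and without imputed cases, which matters because individual imputed values can be wrong even when the overall distribution improves.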

We inquired about 1998 annual income by source (e.g., earnings, self-employment income, pensions, child support, interest or dividends, rent, social security income, welfare income, etc.) and by household member. After imputations, we computed each person's total income and created measures of family and household income. These were then compared with federal poverty guidelines (the Department of Health and Human Services guidelines used for program eligibility) to create income poverty ratios.
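The income steps described above can be sketched roughly as follows. This is an illustrative sketch, not the study's code: the function names are hypothetical, and the guideline schedule is passed in as a parameter, assuming a base amount plus a fixed increment per additional family member. No actual guideline dollar amounts are built in.

```python
def total_income(income_by_source):
    """Sum one person's annual income across sources, skipping sources
    that are still missing (None) after imputation."""
    return sum(v for v in income_by_source.values() if v is not None)


def family_income(members):
    """Sum total income over all members of a family or household."""
    return sum(total_income(m) for m in members)


def linear_guideline(base, per_additional):
    """Return a function mapping family size to a poverty guideline,
    assuming a base amount for one person plus a fixed increment for
    each additional person. The amounts are supplied by the caller."""
    def guideline(size):
        return base + per_additional * (size - 1)
    return guideline


def poverty_ratio(income, family_size, guideline):
    """Income divided by the poverty guideline for that family size.
    A ratio below 1.0 means income is under the guideline."""
    return income / guideline(family_size)
```

A caller would build the schedule once from the published guideline amounts for the relevant year and then apply `poverty_ratio` to each family.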

For insurance status of each focal person, LANYCIS asked whether insurance was employer-sponsored, other private (nongroup), Medicare, Medicaid, other state, or other insurance. As in NSAF, for people with no source of insurance we asked a follow-up question to confirm that the person was uninsured (Rajan, Zuckerman, and Brennan 2000). Anyone who reported receiving TANF or SSI was assigned Medicaid coverage. People could have multiple types of insurance coverage. We also created hierarchical measures of insurance coverage for each individual, in which each person was assigned one type of coverage: Medicaid is ranked at the top of the hierarchy, followed by job-based insurance, other private coverage, and then other public coverage (usually either Medicare or the State Children's Health Insurance Program).
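The hierarchical assignment can be sketched as below. The category labels and the function name are hypothetical, and two mappings are assumptions not stated in the text: that "other state" and "other insurance" responses fall under other public coverage.

```python
# Map survey response categories to the four hierarchy groups.
# The "other_state" and "other" mappings are assumptions for this sketch.
GROUP_OF = {
    "medicaid": "medicaid",
    "employer": "job_based",
    "other_private": "other_private",
    "medicare": "other_public",
    "other_state": "other_public",   # assumption
    "other": "other_public",         # assumption
}

# Order of precedence: Medicaid first, other public coverage last.
HIERARCHY = ["medicaid", "job_based", "other_private", "other_public"]


def hierarchical_coverage(reported, receives_tanf=False, receives_ssi=False):
    """Assign a single coverage type to a person who may report several.

    `reported` is an iterable of survey categories (keys of GROUP_OF).
    Anyone receiving TANF or SSI is treated as having Medicaid.
    """
    groups = {GROUP_OF[r] for r in reported}
    if receives_tanf or receives_ssi:
        groups.add("medicaid")
    for group in HIERARCHY:
        if group in groups:
            return group
    return "uninsured"
```

Under this ordering, a person reporting both employer coverage and Medicare is classified as job-based, while any Medicaid report (or TANF or SSI receipt) takes precedence over everything else.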

Immigration status was sometimes imputed using logical editing processes. For instance, we generally used the status of the respondent and spouse to impute the status of their children. In the event that parents had discordant immigration statuses, we assigned the status of the parent who entered the United States at the same time or from the same country as the child. As is standard in surveys of this type, we did not directly ask whether people were undocumented but instead assigned undocumented status to people who did not otherwise report a legal basis for being in the United States. We edited some responses to modify impossible (or very unlikely) status codes, and a number of people were assigned undocumented alien status. In many cases, we used information from the in-depth qualitative interviews to further verify status.
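A logical edit of this kind for children might look like the sketch below. The field names and the function are hypothetical. The text does not say which criterion takes priority or what happens when neither parent matches, so this sketch assumes that a match on entry year is checked before a match on country of origin, and it leaves the status unresolved (None) when nothing matches.

```python
def impute_child_status(child, respondent, spouse=None):
    """Impute a child's immigration status from the parents' statuses.

    Each argument is a dict; parents carry "status", "entry_year", and
    "origin_country", and the child carries the latter two. Illustrative
    only: the priority of the two criteria is an assumption.
    """
    # Single parent, or both parents agree: use that status directly.
    if spouse is None or respondent["status"] == spouse["status"]:
        return respondent["status"]

    parents = (respondent, spouse)

    # Discordant parents: prefer the parent who entered at the same time.
    for parent in parents:
        if parent["entry_year"] == child["entry_year"]:
            return parent["status"]

    # Otherwise use the parent who came from the same country as the child.
    for parent in parents:
        if parent["origin_country"] == child["origin_country"]:
            return parent["status"]

    return None  # unresolved; would need manual review
```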

The survey asked respondents and their spouses:

  • Were you admitted as a refugee?
  • Were you admitted as a legal permanent resident (LPR)? If yes, what document allows you to remain in the United States permanently?
  • Were you admitted on a temporary basis? If yes, what document allows you to remain in the United States temporarily? (Possible responses: tourist visa, student visa, temporary work permit, or other temporary document.)
  • Are you now a U.S. citizen?
  • Were you naturalized?
  • Were you born a U.S. citizen?
  • Are you now an LPR? If yes, what document allows you to remain in the United States permanently?
  • Have you been granted asylum?
  • Did you apply for U.S. citizenship? If yes, is the application pending, are you waiting to be sworn in, or were you denied citizenship?
  • When did you come to live in the United States? If more than once, when was the last time?

We define immigrants as foreign-born persons permanently residing in the United States. Foreign-born persons not permanently residing in the U.S. (i.e., non-resident aliens) include students, tourists, and temporary workers. Non-resident aliens are excluded from most analyses in this report, unless their spouses or unmarried partners are immigrants permanently residing in the country. Thus, the study includes primarily adults with four types of immigration status: undocumented, legal immigrant (LPR), refugee and naturalized.

Undocumented immigrants are persons who entered the United States without inspection, overstayed temporary visas, or otherwise violated U.S. immigration laws but remain in the country. In some cases, respondents answered that they do not have documents allowing them to remain in the country legally. In other cases, they answered that they have some type of temporary non-resident document (tourist visa, student visa, temporary work permit, or other document). We used a series of steps to impute undocumented status to some temporary document holders (rather than treat them as non-resident aliens), given that their documents were likely invalid or expired, and they were continuing to reside in the country:

1. All tourist visa holders who last entered the United States more than two years before the survey, assuming that their tourist visas had expired by the survey date.

2. All student visa holders who were not enrolled in school or who were working 20 hours or more per week, since student visa holders are not allowed to work more than 20 hours.

3. All temporary work permit and other visa holders who last entered the United States more than five years before the survey.

4. All temporary work permit holders not working in occupations for which work permits are valid.
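The four steps above can be expressed as a single rule function. This is a minimal sketch under stated assumptions: the field names are hypothetical, time since last entry is measured in whole calendar years, and whether an occupation is valid for a work permit is taken as a precomputed flag, since the text does not list the qualifying occupations.

```python
def impute_undocumented(person, survey_year):
    """Return True if a temporary document holder should be treated as
    undocumented rather than as a non-resident alien.

    `person` is a dict with "document" and "last_entry_year", plus
    "enrolled" and "weekly_hours" for student visa holders and
    "occupation_permit_valid" for work permit holders.
    """
    doc = person["document"]
    years_since_entry = survey_year - person["last_entry_year"]

    # Step 1: tourist visas assumed expired after more than two years.
    if doc == "tourist_visa":
        return years_since_entry > 2

    # Step 2: student visa holders not enrolled, or working 20+ hours a week.
    if doc == "student_visa":
        not_enrolled = not person.get("enrolled", False)
        return not_enrolled or person.get("weekly_hours", 0) >= 20

    if doc in ("temporary_work_permit", "other_temporary"):
        # Step 3: last entry more than five years before the survey.
        if years_since_entry > 5:
            return True
        # Step 4: work permit holder in an occupation the permit does not cover.
        if doc == "temporary_work_permit":
            return not person.get("occupation_permit_valid", True)

    return False
```

Anyone for whom the function returns False would remain a non-resident alien and, in most analyses, fall outside the sample.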

Assignment of legal immigrant status was based on respondents' statements that they possess resident alien cards, or "Green Cards." At the time of the survey, some legal immigrants had already applied for citizenship, but if they had not yet been interviewed and sworn in, we considered them legal immigrants.

Legal immigrants admitted to the United States as refugees are grouped with those who were still refugees at the time of the survey. Immigrants who have been granted asylum are included in the refugee group. Asylum applicants who have not yet been granted that status, however, are included in the undocumented group.

Naturalized citizens are foreign-born persons who have been sworn in as U.S. citizens, regardless of their entry status. Refugees who naturalize are included in this group, since their eligibility for benefits is more generous as U.S. citizens than as refugees.

Family immigration status is based on the combined statuses of respondent and spouse or unmarried partner. To begin with, all immigrant families included in the analyses have at least one immigrant family member. We categorize family immigration status in a hierarchy ordered from least to most public benefit eligibility, with undocumented status taking precedence, as follows:

1. Any family where either the respondent or spouse/unmarried partner is undocumented is considered undocumented.

2. Any family without an undocumented respondent or spouse, but where either the respondent or spouse is a legal immigrant, is a legal immigrant family.

3. Any family with neither an undocumented nor a legal immigrant respondent or spouse, but where either the respondent or spouse is a refugee, is a refugee family.

4. Any family where the respondent and spouse are both naturalized, or where a single respondent is naturalized, is considered naturalized.
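The family hierarchy above amounts to picking the first matching status in a fixed order. The sketch below is illustrative, with hypothetical status labels. One case is an assumption: the rules do not say how to classify a naturalized respondent whose spouse is U.S.-born or a non-resident alien, and this sketch treats such a family as naturalized.

```python
# Statuses checked in order; the first one held by either partner wins.
PRECEDENCE = ["undocumented", "legal_immigrant", "refugee", "naturalized"]


def family_status(respondent_status, partner_status=None):
    """Classify a family from the statuses of the respondent and the
    spouse or unmarried partner (None for a single respondent).

    Returns None when no partner has an immigrant status (for example,
    both are U.S.-born or temporary non-resident aliens), meaning the
    family is excluded from the immigrant-family analyses.
    """
    statuses = {s for s in (respondent_status, partner_status) if s is not None}
    for status in PRECEDENCE:
        # Assumption: a naturalized partner paired with a U.S.-born or
        # non-resident partner still yields a naturalized family.
        if status in statuses:
            return status
    return None
```

For example, a family with one undocumented and one naturalized partner is classified as undocumented, the least benefit-eligible of the two statuses.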

Families in which both the respondent and spouse (or a single respondent) are temporary non-resident aliens or U.S.-born citizens are excluded from the analyses in Part II. This exclusion led us to drop 85 of 3,448 families from the data, so for analyses involving all families, the sample size is 3,363. Sample sizes are smaller for subsets of the data (e.g., non-elderly, low-income and/or food-insecure families), or when some variables have missing values.
