Data processing comprised three stages: edits built into the CAPI programming and applied during data collection; data review and editing after data collection; and data file construction. For some questions, hard and soft edit checks were programmed into CAPI based on the expected range of responses and on logical consistency between questions. Hard edits required interviewers to correct an entry before proceeding, whereas soft edits prompted interviewers to verify or correct unlikely, but possible, responses.
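The distinction between hard and soft edits can be illustrated with a minimal sketch. It covers only the range-based part of the checks, not the cross-question consistency checks, and it is not the survey's actual CAPI logic: the `EditCheck` class, the `check_entry` function, and every threshold below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditCheck:
    """Hypothetical range rule for one numeric question."""
    hard_min: int  # values below this cannot be accepted
    hard_max: int  # values above this cannot be accepted
    soft_min: int  # values below this are unlikely but possible
    soft_max: int  # values above this are unlikely but possible


def check_entry(value: int, rule: EditCheck) -> str:
    """Classify an entry as 'hard' (must correct), 'soft' (verify), or 'ok'."""
    if value < rule.hard_min or value > rule.hard_max:
        return "hard"
    if value < rule.soft_min or value > rule.soft_max:
        return "soft"
    return "ok"


# Illustrative thresholds only; the real ranges are not given in this report.
BATHROOMS = EditCheck(hard_min=0, hard_max=500, soft_min=1, soft_max=100)
```

Under these assumed thresholds, an entry of 600 would block the interviewer until corrected, an entry of 150 would trigger a request to verify, and an entry of 20 would pass without a prompt.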
Data review and editing after data collection included a wide range of activities. Responses were reviewed for accuracy, logic, consistency, and completeness. Twenty-six cases with inconsistent counts of rooms, apartments, and bathrooms were contacted again to clarify those responses. Data frequencies were examined to identify missing items and outliers. When missing items, outliers, or extremely implausible responses were identified, reviewers looked for an explanation in the comments that interviewers had attached to specific questions or in the debriefing questionnaire. For example, interviewers attached a comment to a question if the actual response was greater than the highest value CAPI would accept. Interviewer verbatim entries in the debriefing questionnaire were also reviewed to identify other needed corrections. Corrections were also made to the data when an interviewer error was discovered, for example, when an interviewer accidentally opened the wrong case and recorded the answers under the wrong identification number. Problems relating to CAPI skip patterns and recoded variables were addressed. For example, in a few cases, an interviewer went back into CAPI to change a response, which caused a discrepancy with subsequent questions. Labels for “select all that apply” variables had to be reformatted to “yes” or “no” in place of “category 1 selected” or “category 2 not selected.” Out-of-range data values were identified and corrected. In addition, “other–specify” responses were back-coded to existing code categories, where appropriate. Item nonresponse analyses were conducted on the items with the most “don’t know” and “refused” responses. Partially completed cases were identified and assessed for inclusion in the data files. For the facility file, there was only one partially completed case, which had three-fourths of the first section of the facility questionnaire completed; it was included in the final file. For the resident file, none of the 36 partially completed cases were included in the final file.
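Two of the review steps above, relabeling “select all that apply” variables and ranking items by nonresponse, can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the numeric codes for “don’t know” and “refused,” the 1/2 coding of selected and not selected, and the function names are all hypothetical.

```python
# Hypothetical missing-value codes; the survey's actual codes are not stated here.
DONT_KNOW = -8
REFUSED = -7


def recode_select_all(value):
    """Map assumed codes 1 (selected) and 2 (not selected) to yes/no labels.

    Any other value, such as a missing-value code, is returned unchanged.
    """
    return {1: "yes", 2: "no"}.get(value, value)


def nonresponse_counts(records, items):
    """Count 'don't know' and 'refused' answers per item, highest first."""
    counts = {
        item: sum(1 for r in records if r.get(item) in (DONT_KNOW, REFUSED))
        for item in items
    }
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


# Toy records with made-up item names, for illustration only.
records = [
    {"q1": DONT_KNOW, "q2": 3},
    {"q1": REFUSED, "q2": DONT_KNOW},
    {"q1": 5, "q2": 4},
]
ranked = nonresponse_counts(records, ["q1", "q2"])
```

Sorting the counts in descending order puts the items most in need of review at the top of the list, which mirrors the focus on items with the most “don’t know” and “refused” responses.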
Specifications were developed for derived variables, other recoded variables, and the creation of the final facility and resident in-house and public-use data files. In addition, to provide a complete picture of each sampled facility and its disposition, a paradata file was constructed that contained selected variables from the event history, the preload sample file, and the facility, resident selection, and resident questionnaires. This file was used for unit nonresponse analysis and provided a mechanism to evaluate the fieldwork and the effectiveness of strategies used during the survey field period.
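One way to picture the paradata file is as a facility-level table assembled from several sources. The sketch below assumes each source can be keyed by a facility identifier. The function `build_paradata`, the source names, and the variable names are hypothetical, not the survey's actual file layout.

```python
def build_paradata(sources, keep):
    """Combine selected variables from several sources into one row per facility.

    sources: maps a source name to {facility_id: record dict}.
    keep: maps a source name to the list of variables to carry over.
    Output variables are prefixed with the source name to avoid collisions.
    """
    paradata = {}
    for name, table in sources.items():
        for facility_id, record in table.items():
            row = paradata.setdefault(facility_id, {"facility_id": facility_id})
            for var in keep.get(name, []):
                if var in record:
                    row[f"{name}_{var}"] = record[var]
    return paradata


# Toy inputs with made-up variables, for illustration only.
sources = {
    "preload": {"F1": {"stratum": 2, "unused": 1}},
    "events": {"F1": {"n_contacts": 4}, "F2": {"n_contacts": 1}},
}
keep = {"preload": ["stratum"], "events": ["n_contacts"]}
paradata = build_paradata(sources, keep)
```

Because a row is created for every facility that appears in any source, facilities that never completed a questionnaire still appear in the result, which is what a unit nonresponse analysis needs.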