Low-Income and Low-Skilled Workers' Involvement in Nonstandard Employment. Constructing Matched Comparison Groups

After defining our treatment and comparison groups, the next step in our methodology is to construct matched comparison groups. That is, we select persons from the comparison group who most closely resemble members of the treatment group on a number of key factors (e.g., demographic characteristics, work and welfare history, family structure). We also control for the timing of the survey interviews, so that the labor market conditions faced by temporary agency workers and the comparison groups will be roughly similar. Samples are matched separately for those who start temporary work following employment and nonemployment, since the relationships in the model are likely to vary with work status.(70)

The basic approach is to use a non-linear regression model to describe who becomes a temporary worker, and then use the predicted probabilities of temporary work from that model as the basis for matching samples. Separate models of the probability of starting a temporary agency job are estimated for those with and without employment in the previous month, allowing the factors affecting the probability to differ for these groups. A multinomial logit model is used for the estimation, to allow for joint estimation of temporary work as compared with the two alternatives: employment and nonemployment.

We estimate two multinomial logit models. The first compares temporary workers who were employed in nontemporary work in the prior month (Treatment Group 1) to nontemporary workers (Comparison Group 1A) and non-workers (Comparison Group 1B) who were also employed in nontemporary work in the prior month. The second compares temporary workers who were not employed in the prior month (Treatment Group 2) to nontemporary workers (Comparison Group 2A) and non-workers (Comparison Group 2B) who were also not employed in the prior month.
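To illustrate how a fitted multinomial logit yields the predicted probability of temporary work, the following is a minimal sketch in Python. The coefficient values, the three covariates, and the function name `mnl_probabilities` are hypothetical and are not taken from the estimated models; the choice of nontemporary employment as the base category is likewise an assumption.

```python
import math

def mnl_probabilities(x, coefs):
    """Return outcome probabilities for one person under a multinomial logit.

    x     -- list of covariate values (first entry 1.0 for the intercept)
    coefs -- dict mapping outcome name -> coefficient list; the base
             outcome carries all-zero coefficients
    """
    scores = {k: sum(b * v for b, v in zip(beta, x)) for k, beta in coefs.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    expo = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(expo.values())
    return {k: e / total for k, e in expo.items()}

# Hypothetical coefficients on: intercept, age, years of education.
coefs = {
    "nontemporary": [0.0, 0.0, 0.0],    # assumed base category
    "temporary": [-2.0, -0.01, 0.05],
    "nonemployed": [-1.0, 0.02, -0.10],
}
probs = mnl_probabilities([1.0, 30.0, 12.0], coefs)
propensity_score = probs["temporary"]  # predicted probability of temporary work
```

The three probabilities sum to one by construction, so the probability of temporary work is estimated jointly with the two alternatives.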

Independent variables for the logit models include:

  1. Human capital variables, including measures of age, education, consistency of labor market attachment, recentness of time out of the labor market, and recent job training;
  2. Indicators of a need and ability to work an irregular work schedule, such as number of children, age of youngest child, marital status, number of adults in the household, and measures of recent changes in these variables;
  3. Other demographic variables that tend to be linked to quality of job such as sex, race, and ethnicity;
  4. For those employed in the prior month, indicators of employment in a low-wage occupation or industry based on data constructed from the CPS, and a recent wage rate; and
  5. Measures of the wave and panel of the interview on which the data are based.

The specific measures used are somewhat different for those employed and not employed in the month prior to when we measure temporary work. The complete list of variables used is reported in Table B.1.

 

Appendix Table B.1
List of variables used in multinomial logit model for those employed in the previous month:
Human Capital Variables:
  • Age, age-squared;
  • Years of education, completion of High School, completion of at least some college;
  • Received job training in the previous year;
  • Percent of previous 4 months with employment;
  • Percent of past 10 years with more than 6 months of employment; if left high school fewer than 10 years earlier, percent of years since age 18 (or 16 if a dropout) with more than 6 months of employment;
  • Duration of current job and duration of current job squared;
  • Whether had a second job within the previous ten years;
  • Time between jobs for those with a second job

Indicators of need/ability to work flexible work schedule:

  • Married and married female;
  • Number of children;
  • Number of children in household decreased over previous year;
  • Child less than age 1, child less than age 3, and child less than age 5;
  • Number of adults;
  • Number of adults decreased/increased over previous year

Other demographic factors:

  • Female;
  • White;
  • Hispanic

Indicators of previous employment

  • Employed in low-wage occupation in previous month;
  • Employed in low-wage industry in previous month;
  • Wage rate in primary job in previous month

Indicators of low-income status:

  • Indicator of family income between 100 and 200 percent of the poverty line;
  • Indicator of family income above 200 percent of the poverty line (for regressions including persons from higher-income families).

Measures of wave and panel:

  • Wave dummy interacted with panel indicator

Indicators of missing data

  • Separate indicators for missing data in each variable (with distinct missing data).

List of variables used in multinomial logit model for those not employed in the previous month:
Human Capital Variables:

  • Age, age-squared;
  • Years of education, completion of High School, completion of at least some college;
  • Received job training in the previous year;
  • Percent of previous 4 months with employment;
  • Percent of past 10 years with more than 6 months of employment; if left high school fewer than 10 years earlier, percent of years since age 18 (or 16 if a dropout) with more than 6 months of employment;
  • Duration and duration squared of non-employment spell

Indicators of need/ability to work flexible work schedule:

  • Married and married female;
  • Number of children;
  • Number of children in household decreased over previous year;
  • Child less than age 1, child less than age 3, and child less than age 5;
  • Number of adults;
  • Number of adults decreased over previous year; number increased over previous year

Other demographic factors:

  • Female;
  • White

Indicators of low-income status:

  • Indicator of family income between 100 and 200 percent of the poverty line;
  • Indicator of family income above 200 percent of the poverty line (for regressions including persons from higher-income families);
  • Percent of last four months receiving public assistance.

Measures of wave and panel:

  • Wave dummy interacted with panel indicator

Indicators of missing data

  • Separate indicators for missing data in each variable (with distinct missing data).

We then use a two-step matching procedure. First, for each person in the sample, we use the first multinomial logit model estimated above to predict a propensity score: the probability of employment by a temporary agency (Treatment Group 1) as compared with being employed in a nontemporary job (Comparison Group 1A) or not being employed (Comparison Group 1B).

To assess whether the propensity score from the model adequately controls for differences between temporary workers and each of the comparison groups, we compare the mean characteristics of temporary workers (Treatment Group 1), employed (Comparison Group 1A), and nonemployed (Comparison Group 1B) persons with comparable probabilities of temporary work. To do this, we sort the temporary agency cases (Treatment Group 1) by their predicted probability of being a temporary agency worker and find the probabilities associated with each quintile of the distribution. For example, let p80 be the probability associated with the 80th percentile and pmax be the maximum probability for temporary agency cases.
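The cutoffs described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and linear interpolation between order statistics is one of several percentile conventions, not necessarily the one used in the analysis.

```python
def quintile_cutoffs(treatment_scores):
    """Return [pmin, p20, p40, p60, p80, pmax] for the treatment group's scores."""
    s = sorted(treatment_scores)
    n = len(s)
    cutoffs = [s[0]]
    for q in (0.2, 0.4, 0.6, 0.8):
        # Linear interpolation between neighboring order statistics
        # (an assumed convention).
        pos = q * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        cutoffs.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    cutoffs.append(s[-1])
    return cutoffs

def stratum(score, cutoffs):
    """Return 0..4 for the range containing score, or None if outside [pmin, pmax]."""
    if score < cutoffs[0] or score > cutoffs[-1]:
        return None
    for k in range(1, 5):
        if score < cutoffs[k]:
            return k - 1
    return 4  # between p80 and pmax, inclusive of pmax

# Hypothetical predicted probabilities for temporary agency cases.
cuts = quintile_cutoffs([0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30])
```

The same cutoffs are then applied to the comparison groups, so that a comparison case with a score above pmax or below pmin falls in no range.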

Second, we compare the mean characteristics of temporary workers (Treatment Group 1) with probabilities in each quintile to those of employed/not temporary (Comparison Group 1A) and nonemployed (Comparison Group 1B) persons with probabilities in the same ranges. For instance, we compare the means of the variables used in the logit model for the Treatment Group 1, Comparison Group 1A, and Comparison Group 1B cases with probabilities between p80 and pmax. If the model is appropriate for building matched comparison groups, the mean characteristics of the three groups should be similar within each chosen range. If, as occurs in our analysis, some characteristics are not similar, we re-estimate the regression model, including higher order functions of the variables that are not similar across the groupings.
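A balance check of this kind might look like the sketch below. It assumes each record already carries a range index from 0 to 4 (or None when the score falls outside the treatment group's range). The record keys, the group labels, and the `tolerance` threshold are all hypothetical; the text does not state how "similar" is judged.

```python
from collections import defaultdict

def stratum_means(records, variable):
    """Mean of `variable` for each (stratum, group) cell.

    records -- iterable of dicts with keys "group", "stratum", and `variable`
               (key names are assumptions for this sketch)
    """
    totals = defaultdict(lambda: [0.0, 0])
    for r in records:
        if r["stratum"] is None:  # score outside the treatment group's range
            continue
        cell = totals[(r["stratum"], r["group"])]
        cell[0] += r[variable]
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

def imbalanced(means, treatment="T1", tolerance=0.1):
    """List (stratum, group, gap) where a comparison-group mean differs from
    the treatment-group mean in the same stratum by more than `tolerance`."""
    flagged = []
    for (s, g), m in means.items():
        if g != treatment and (s, treatment) in means:
            gap = m - means[(s, treatment)]
            if abs(gap) > tolerance:
                flagged.append((s, g, gap))
    return flagged

# Hypothetical cases with probabilities between p80 and pmax (stratum 4).
records = [
    {"group": "T1", "stratum": 4, "age": 30.0},
    {"group": "T1", "stratum": 4, "age": 32.0},
    {"group": "1A", "stratum": 4, "age": 31.0},
    {"group": "1B", "stratum": 4, "age": 40.0},
]
flags = imbalanced(stratum_means(records, "age"), tolerance=1.0)
```

A variable flagged here would be a candidate for entering the re-estimated model with higher order terms.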

After attempting to make the characteristics of the temporary agency workers and the two comparison groups similar within each range of predicted probabilities (e.g., between p80 and pmax), we use the predicted probabilities to create a matched sample. The goal is to choose cases from the Comparison Groups 1A and 1B with the same distribution of propensities as those who start temporary work. The propensity score literature suggests several approaches. The easiest approach is to reweight the data for the comparison group so that a weighted one-fifth of the comparison group members have propensity scores between the cutoffs for each quintile of scores for the temporary agency workers. That is, we weight so that one-fifth of the cases have propensity scores between pmin and p20; a fifth between p20 and p40; etc.(71)
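The reweighting can be sketched as follows, assuming each comparison-group case has already been assigned a range index from 0 to 4 based on the temporary agency workers' cutoffs (None if its score lies outside pmin to pmax). The function name and the choice to give out-of-range cases zero weight are assumptions of this sketch.

```python
from collections import Counter

def quintile_weights(strata):
    """Weights that give each of the five ranges a weighted share of one-fifth.

    strata -- one entry per comparison-group case: 0..4, or None for a case
              whose score falls outside [pmin, pmax] (weighted zero here)
    """
    counts = Counter(s for s in strata if s is not None)
    missing = [k for k in range(5) if counts[k] == 0]
    if missing:
        # A range with no comparison cases cannot be weighted up to one-fifth.
        raise ValueError(f"no comparison cases in ranges {missing}")
    return [0.0 if s is None else 0.2 / counts[s] for s in strata]

# Hypothetical range assignments for nine comparison-group cases.
strata = [0, 0, 0, 1, 2, 2, 3, 4, None]
weights = quintile_weights(strata)
```

Cases in a crowded range receive small weights and cases in a sparse range receive large ones, so the weighted distribution of propensity scores mirrors that of the temporary agency workers.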

One remaining issue is how to treat data from multiple months for a given case. Multiple observations for the same case are likely to be correlated, and this correlation must be accounted for in calculating the standard errors. Among the temporary worker cases, approximately 15 percent have multiple spells. However, because we have relatively few observations of temporary work, we plan to include all of them in our analysis and adjust the standard errors for correlations among the observations.(72)
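The text does not specify the adjustment used. One common option, shown here purely as an assumed illustration, is a cluster-robust standard error that treats all observations from the same person as one cluster. The sketch below applies it to a simple sample mean; the function name and data are hypothetical.

```python
import math
from collections import defaultdict

def clustered_se_of_mean(values, person_ids):
    """Cluster-robust standard error of a sample mean, clustering on person."""
    n = len(values)
    mean = sum(values) / n
    # Sum the deviations from the mean within each person.
    resid_by_person = defaultdict(float)
    for v, pid in zip(values, person_ids):
        resid_by_person[pid] += v - mean
    g = len(resid_by_person)
    if g < 2:
        raise ValueError("need at least two persons")
    # Small-sample factor g / (g - 1); with one observation per person this
    # reduces to the usual s / sqrt(n).
    variance = (g / (g - 1)) * sum(r * r for r in resid_by_person.values()) / n ** 2
    return math.sqrt(variance)

# Hypothetical outcome values: two persons with two spells each.
se = clustered_se_of_mean([1.0, 1.0, 3.0, 3.0], ["a", "a", "b", "b"])
```

When observations within a person are positively correlated, as in this example, the clustered standard error exceeds the one computed as if all observations were independent.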

The comparison groups allow more flexibility as to whether to include multiple observations from a case. Each comparison group must represent all months of the data for which our temporary agency workers are included, to avoid misattributing the effects of different labor market conditions to temporary work. However, we expect the data for comparison group persons to be highly correlated over time and, as a consequence, expect little gain from including multiple observations for the same person in the analysis. Furthermore, because we are aiming to obtain comparison groups roughly similar in size to our sample of temporary workers, we do not anticipate needing multiple observations per case.

Our solution is to randomly include one month of data for each person in the comparison groups. Each observation is assigned to comparison groups according to its employment status in the sampled and previous month. By randomly choosing the selected months, we maintain the representativeness of our sample while ensuring that individuals do not show up more than once.
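The selection step can be sketched as below. The tuple layout, the status labels, and the function names are hypothetical. The group assignment follows the definitions given earlier: groups 1A and 1B for persons in nontemporary work in the previous month, and groups 2A and 2B for persons not employed in the previous month.

```python
import random

def sample_one_month(person_months, seed=0):
    """Pick one person-month at random for each person.

    person_months -- list of (person_id, month, status_now, status_prev) tuples
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_person = {}
    for row in person_months:
        by_person.setdefault(row[0], []).append(row)
    return [rng.choice(rows) for rows in by_person.values()]

def comparison_group(status_now, status_prev):
    """Map status in the sampled and previous month to a comparison group
    label ("1A", "1B", "2A", "2B"), or None if the observation is not a
    comparison case."""
    prefix = {"nontemporary": "1", "nonemployed": "2"}.get(status_prev)
    suffix = {"nontemporary": "A", "nonemployed": "B"}.get(status_now)
    if prefix is None or suffix is None:
        return None  # temporary work in either month: not a comparison case
    return prefix + suffix

# Hypothetical person-months.
rows = [
    ("p1", 1, "nontemporary", "nontemporary"),
    ("p1", 2, "nonemployed", "nontemporary"),
    ("p2", 1, "nontemporary", "nonemployed"),
]
sampled = sample_one_month(rows)
groups = [comparison_group(r[2], r[3]) for r in sampled]
```

Because each person contributes exactly one randomly chosen month, no individual appears more than once, while all months remain eligible for selection.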