This option for evaluating the LTC RAP relies on RAPIDS administrative data on LTC RAP apprentices combined with employer and employee information from the LEHD database, a Census Bureau data file that includes state-level Unemployment Insurance administrative information on employment and earnings. This quasi-experimental strategy constructs a comparison group for LTC RAP apprentices by matching characteristics of apprentices with those of other long-term care workers in the database who have not participated in the apprenticeship program. Similar matching and comparison is also performed for employer sponsors.
This evaluation design helps answer the following research questions:
What is the impact of registered apprenticeship on apprentices?
- How does participation in a LTC RAP affect job tenure within an employer, compared to tenure and employment stability in the absence of the apprenticeship program?
- How does participation in a LTC RAP affect employment stability within the long-term care industry, compared to employment stability within the long-term care industry in the absence of the apprenticeship program?
- How does participation in a LTC RAP affect earnings growth compared to earnings growth in the absence of the apprenticeship program?
What is the impact of registered apprenticeship on LTC RAP sponsors?
- How does offering a LTC RAP affect overall direct care worker turnover, compared to worker turnover in the absence of the apprenticeship program?
- How does offering a LTC RAP affect the loss of employees to competing long-term care providers, compared to the loss of employees in the absence of the apprenticeship program?
- How does offering a LTC RAP affect employment and revenue growth, compared to employment and revenue growth in the absence of the apprenticeship program?
Overview of Evaluation Design
This quasi-experimental design would use firm- and worker-level LEHD data on apprenticeship sponsors to construct treatment groups of workers who have participated in apprenticeship programs and comparison groups of workers from non-sponsor long-term care providers. The LEHD is a research program directed by the Census Bureau that combines state administrative data on workers and firms from state Unemployment Insurance programs to generate a national matched quarterly employer-employee database. Because these data are required by law, they are unusually complete and accurate on the limited variables they include. Unlike administrative data available from individual states, the LEHD data can identify workers who move across state lines and firms that have workers in multiple states. Since the core of the LEHD is a matched employer file and worker file, it captures labor market dynamics from both the firm's and the worker's perspective. Another feature of the LEHD is that a variety of other Census surveys and Internal Revenue Service data are matched to employers and employees using Social Security numbers and Employer Identification Numbers. These identifiers are not directly available to researchers, but researchers can submit lists of Social Security numbers and Employer Identification Numbers to the Census Bureau to obtain an extract of the LEHD data for analysis.
To answer research questions related to the impact of the LTC RAP on apprentices, the following treatment and comparison groups will be constructed from the LEHD data:
- All workers who have participated in LTC RAPs and all low-wage workers in matched long-term care firms not administering a LTC RAP.
- All workers who have participated in LTC RAPs and all non-apprentice low-wage workers in LTC RAP sponsoring long-term care firms.
- All low-wage workers employed by LTC RAP sponsors and all low-wage workers in matched long-term care firms.
One limitation of the LEHD data in constructing these comparison groups is that low-earning direct care workers employed by long-term care firms cannot be distinguished from other low-earning workers in those firms, such as housekeeping staff and dietary workers. However, if a known apprentice is matched to housekeeping or dietary staff in a sample, it will be because their earnings histories are extremely close, which suggests a certain degree of substitutability between occupations. There may, of course, be unobserved differences in motivations and values across these workers, but the relevant question is whether, had these apprentices not received an apprenticeship, their earnings trajectories would have been comparable to those of someone in the same industry with the same earnings history. This limitation is an issue primarily for nursing homes, residential care facilities, and group homes for people with intellectual disabilities, because they provide room and board and employ substantial numbers of housekeeping and dietary staff; it is less of a problem for home health agencies, which hire relatively fewer non-direct care staff.
To answer research questions related to the impact of the LTC RAP on employers, the treatment group would be all LTC RAP sponsoring firms, and the comparison group would be non-LTC RAP sponsoring long-term care firms matched to the treatment group on variables such as geographic location, establishment revenue, and number of low-wage employees using propensity score matching techniques. Propensity score matching uses a statistical model to predict the probability of being in the treatment (apprenticeship) group using a series of observable characteristics. This predicted probability is then used to construct weights for the comparison group, which make the weighted comparison group more comparable to the treatment group on observable characteristics, thereby somewhat mimicking random assignment. The impact of registered apprenticeship on employment, earnings, turnover, job growth, and worker separation outcomes can be estimated by comparing the difference in means between the treatment and comparison groups after matching or, ideally, with a difference-in-differences multivariate model, which compares changes in the outcome over time for apprentices or firms offering a LTC RAP against those for firms not administering a LTC RAP (a strategy that Mueser, Troske, and Gorislavsky (2007) find to be superior in program evaluations of job training programs using propensity score matching of administrative data).
An important advantage of this evaluation design approach is that it can use all persons who have ever been apprentices in the LTC RAP, including those who no longer work for the long-term care provider that administered the apprenticeship or even those no longer working in long-term care. Since the identification of a comparison group is completed after the identification of LTC RAP apprentices, appropriate comparison groups can be constructed for all LTC RAPs, regardless of their implementation date. Indeed, the use of RAPIDS data over a several year period and over several sponsors helps ensure that outcomes are not affected by cyclical factors, and circumstances in specific local markets. This cannot be guaranteed by an experimental design at a specific site. Exhibit 5 summarizes the proposed LEHD impact analyses.
EXHIBIT 5. Summary of LEHD LTC RAP Impact Analyses

|Research Question|Treatment Group|Comparison Group|Outcome Variable|
|---|---|---|---|
|What is the impact of registered apprenticeship on long-term care workers?|All LTC RAP apprentices, regardless of whether they completed the program or whether they currently still work for the same employer or in a long-term care setting|Low-wage workers in non-LTC RAP long-term care firms, matched with LTC RAP apprentices| |
|What is the impact of registered apprenticeship on long-term care workers?|All LTC RAP apprentices, regardless of whether they completed the program or whether they currently still work for the same employer or in a long-term care setting|Low-wage, non-apprentice workers in LTC RAP sponsoring long-term care firms, matched with LTC RAP apprentices| |
|What is the impact of registered apprenticeship on long-term care workers?|All low-wage workers employed by LTC RAP sponsors|Low-wage workers in non-LTC RAP long-term care firms, matched with all low-wage workers employed by LTC RAP sponsors| |
|What is the impact of registered apprenticeship on long-term care employers?|All LTC RAP sponsors|All non-registered apprenticeship program sponsoring long-term care firms, matched with LTC RAP sponsors| |
Treatment of Long-Term Care Occupations
Occupations are identified in the RAPIDS database with occupational codes. In contrast, the LEHD data do not identify worker occupations, but they do identify the industry of the employer. This would typically be an obstacle to identifying an appropriate comparison group, but in the case of long-term care there is a close correspondence between occupation and industry groups. The predominant LTC RAP occupations are CNAs, HHAs, HSSs and DSSs. These occupations align with the industry sectors presented in Exhibit 6.
EXHIBIT 6. Occupation-Industry Crosswalk

|Occupation|RAPIDS Occupation Code|Industry|Four-Digit NAICS Industry Code|
|---|---|---|---|
|Certified Nursing Assistants (CNA)|824, 824C, 824CB, 824A, 824R, 824D, 824G, 824M|Nursing Care Facilities|6231|
|Home Health Aides (HHA)|1086, 1086CB, 1086A, 1086B, 1086D, 1086E|Home Health Care Services|6216|
|Health Support Specialists (HSS)|1086AA|Assisted Living Facilities, and Other Residential Care|6233, 6232, 6239|
|Direct Support Specialists (DSS)|1040, 1040CB|Services for Elderly & Persons with Disabilities|6241|

NOTES: NAICS Code 6231: Nursing Care Facilities; 6216: Home Health Care Services; 6233: Continuing Care Retirement Communities and Homes for the Elderly; 6232: Residential Mental Health and Substance Abuse Facilities; 6239: Other Residential Care Facilities; 6241: Children and Youth Services, Services for the Elderly and Persons with Disabilities, and Other Individual and Family Services.
Treatment cases of a specific occupation in the RAPIDS data can be matched to corresponding comparison cases employed by a firm in the corresponding industry in the LEHD data in two different ways. First, treatment cases could be segregated by industrial sector and matched exclusively to comparison cases in the same sector, so that the propensity score matching is done separately by industry group (which is expected to correspond closely to the occupation group of interest). However, some long-term care firms provide several different services but are required by state administrative data systems to report only one industry (typically their predominant activity). For example, a nursing home operated by a hospital might be classified as a hospital rather than a nursing home in its industry codes, because the hospital is the firm's predominant activity. To account for this, a second option is to consider all treatment cases (i.e., not segregate them by industry), match them to all comparison cases, and use industry group as a matching variable only, rather than as a way of identifying industrial sub-samples.
Potential selection bias in this quasi-experimental evaluation of the LTC RAP could occur in at least two ways: the non-random selection of different providers into the LTC RAP initiative or the non-random selection of different employees into the LTC RAP within the provider. One possibility is that higher quality, more financially secure long-term care providers are more likely to start a LTC RAP. On the other hand, those long-term care providers already doing well and satisfied with their training programs may be least likely to use the LTC RAP. From this perspective, bias could run in either direction, especially if the comparison is based on levels and not on changes for each group.
Also, site visits indicate that higher-quality employees are generally chosen to participate in the LTC RAPs as apprentices. If so, estimates based on simple comparisons of participants and non-participants might overstate the impact of the LTC RAP. To address this problem, an evaluation can use propensity score matching on pre-apprenticeship enrollment earnings (which should help to capture unobservable human capital that contributes to on-the-job productivity), age, gender, job tenure, firm size, and industry to identify a comparison sample. Pre-program earnings and employment records may be a good matching indicator, one likely to capture individual differences in unmeasured pre-program characteristics related to performance in the job market. While variation in wage rates is modest for long-term care workers, there is considerable variation in hours worked and in job tenure. Since the LEHD only provides information on quarterly earnings (i.e., hourly wages multiplied by hours worked), there should be more variation in earnings than in wages. This method is widely used by evaluators of other programs targeted at low-wage workers (Mueser, Troske, and Gorislavsky, 2007).
Access to the universe of long-term care providers and LTC RAP sponsors in state Unemployment Insurance administrative data systems through the LEHD allows for the construction of a variety of comparison and treatment groups. Testing between multiple treatment and comparison groups enables evaluators to address biases that may be present in some specifications, but not others. For example, potential selection bias may produce a regression estimate either above or below the true effect. Using two different comparison groups for each outcome will allow generation of these upper and lower bound effects so that the true effect can be bounded between the two estimates. This strategy provides some assurance of the range of the treatment effect.
Multiple treatment/comparison group pairs are amenable to the propensity score matching approach, and each pair has advantages and disadvantages associated with it. These are summarized in Exhibit 7. If the observable characteristics used in propensity score matching are correlated with the unobservable characteristics of apprentices and LTC RAP sponsors, then matching on observables also helps account for unobservables, and selection biases can be minimized.
EXHIBIT 7. Advantages and Disadvantages of LEHD Impact Analyses

|Treatment Group|Comparison Group (to be matched to the Treatment Group)|
|---|---|
|All RAPIDS apprentices|All low-wage workers in non-LTC RAP long-term care firms|
|All RAPIDS apprentices|All low-wage workers in LTC RAP sponsoring firms|
|All low-wage workers employed by LTC RAP sponsors|All low-wage workers in long-term care firms|
|All LTC RAP sponsors|All non-LTC RAP sponsoring long-term care firms|
1. Treatment Groups
The treatment groups used in the evaluation, presented in Exhibit 8, will be drawn from employers and apprentices in the RAPIDS data who are identifiable in the LEHD data. The LEHD includes all workers covered by state Unemployment Insurance programs (which should be the entire RAPIDS universe). A total of over 4,300 unique LTC RAP participants are included in the RAPIDS system between January 2005 and May 2011, representing 119 programs. This treatment group may expand if more LTC RAPs are implemented between May 2011 and any evaluation. If sample sizes were adequate, separate analyses could be conducted of workers who completed the apprenticeship program, people currently in the program, and people who dropped out prior to completing the program. It is not uncommon for social programs to have high dropout rates yet be effective for those who complete the program.
EXHIBIT 8. Treatment and Comparison Group for LEHD Analyses

|Treatment Group|Comparison Group|
|---|---|
|All RAPIDS apprentices (N~3,750)|All low-wage workers in long-term care firms (N~2,000,000)|
|All RAPIDS apprentices (N~3,750)|All low-wage workers in LTC RAP sponsoring firms (N~5,000)|
|All low-wage workers employed by registered apprenticeship program sponsors (N~5,000)|Low-wage workers in long-term care firms (N~2,000,000)|
|All registered apprenticeship program sponsors (N~119)|All non-registered apprenticeship program sponsoring long-term care firms (N~100,000)|
2. Comparison Groups
If all LTC RAPs were implemented simultaneously, a single comparison group could be chosen for all programs. However, this is not the case. To ensure that pre-apprenticeship characteristics of the treatment group are matched to characteristics of the comparison group during the same time frame, propensity score matching must be conducted separately for each quarterly wave of LTC RAP registration. Thus, apprentices who register with a LTC RAP at varying points in time will be matched to comparison cases from the LEHD data on the basis of quarterly earnings occurring before the LTC RAP registration.
The propensity score matching to determine the appropriate weights for the comparison group would follow Rubin (2001) and Mueser, Troske, and Gorislavsky (2007), and consist of estimating the predicted probability of being in the treatment group as a logit function of eight quarters of earnings data and a set of additional variables, including industry/occupation group and geographic location, for each quarter of the LEHD data available.
The propensity score matching approach is applied in two different ways to obtain the different comparison groups desired. In the first, propensity score weighting is used to construct a comparison group of roughly the same size as the treatment group (e.g., apprentices). In the second, weights are applied to the population at large from which the comparison group is drawn (e.g., all long-term care employers). Apprentices or employer sponsors with a close match receive a high weight, and those without a close match receive a low weight.
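As a minimal sketch of the second weighting approach, assuming propensity scores have already been estimated from a logit model as described above (the scores here are hypothetical):

```python
# Odds-ratio weighting sketch: convert estimated propensity scores
# p = P(treatment | X) into comparison-group weights p / (1 - p).
# Comparison cases resembling the treatment group (high p) receive
# large weights; poor matches receive weights near zero.

def odds_weights(p_scores):
    """Weight each comparison case by the odds of treatment."""
    return [p / (1.0 - p) for p in p_scores]

comparison_scores = [0.05, 0.40, 0.75]  # hypothetical fitted scores
print(odds_weights(comparison_scores))  # ~[0.053, 0.667, 3.0]
```

Under this weighting, a comparison case with a fitted score of 0.75 counts roughly 57 times as much as one with a score of 0.05.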
Estimated Statistical Power
Evaluators develop statistical power calculations to determine the minimum sample size necessary to have confidence in the study's ability to detect a policy-relevant impact. Interest in identifying the minimum sample size needed is common because of the cost of obtaining a larger sample, for example, by increasing the number of people required to complete surveys. For this particular evaluation, the evaluator would have access to a large sample at very modest cost. To be conservative, for the apprentice-level calculations, we used the approximate number of apprentices (3,750) in the RAPIDS data at the end of 2009 for the size of each of the treatment and control groups in the comparisons to be made. For the employer sponsor calculation, we assumed 150 employer sponsors would have a LTC RAP by the time a potential evaluation was fielded; in other words, we assumed that additional programs would be added to the current 119 employer sponsors.
In estimating power calculations, the issue is: How small a difference in each outcome measure (i.e., the impact) can be detected at p<0.05 with approximately 80% power, given the number of apprentices (or employer sponsors) in the analysis and the variation (expressed as the standard deviation) in the outcome measure? Statistical power is the probability of correctly rejecting the hypothesis that there is no impact when an impact in fact exists. Conventionally, statisticians suggest that power of 80% is satisfactory.
We estimated statistical power for three different outcomes -- apprentice annual earnings and job tenure, and employer sponsor-level turnover. For the outcome measure of apprentice annual earnings, the analysis could detect a difference as small as $300 in annual earnings using mean annual earnings of $21,000 and a standard deviation of $5,000, and assuming 3,750 employees each in the apprentice and comparison groups. For the outcome measure of apprentice job tenure, the analysis could detect a difference in job tenure as small as 0.7 months using mean tenure of 30 months and a standard deviation of 12 months, assuming 3,750 employees each in the apprentice and comparison groups. For the outcome measure of annual employer-sponsor turnover (where turnover is expressed in percentage points), the analysis could detect a difference as small as 5 percentage points using a mean turnover rate of 55 percentage points and a standard deviation of 25 percentage points, assuming 150 employers in the LTC RAP employer sponsor group and almost 99,850 employers in the comparison group. In each case, the evaluator would be able to detect even relatively low impacts of the LTC RAP.
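These minimum detectable differences follow from the standard two-sample power approximation; the following sketch reproduces the figures above using the conventional critical values for p<0.05 (two-sided) and 80% power:

```python
import math

def mde(sd, n1, n2):
    """Minimum detectable effect for a two-sample difference of means
    at p<0.05 (two-sided) with 80% power:
    MDE = (z_0.975 + z_0.80) * SD * sqrt(1/n1 + 1/n2)."""
    z_alpha, z_power = 1.96, 0.84
    return (z_alpha + z_power) * sd * math.sqrt(1.0 / n1 + 1.0 / n2)

print(round(mde(5000, 3750, 3750)))   # annual earnings: ~$323
print(round(mde(12, 3750, 3750), 2))  # job tenure: ~0.78 months
print(round(mde(25, 150, 99850), 1))  # turnover: ~5.7 percentage points
```

The computed values sit slightly above the rounded figures quoted in the text ($300, 0.7 months, 5 percentage points), consistent with the text's approximations.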
Domains on Which Information Will Be Gathered
Information will be collected on treatment and comparison cases using data from the LEHD. An advantage of using this data is that it ensures that information is collected consistently across all programs, and between treatment and comparison groups. The primary domains on which information will be gathered are:
- Quarterly earnings: Earnings recorded in state Unemployment Insurance data systems are reported in the LEHD for each job held in a quarter.
- Quarterly employment: Cases will be considered employed in a quarter if they have positive earnings during that quarter. Quarterly employment information can also be used to construct a job tenure variable, which can be used for matching.
- Industry: The LEHD records four-digit NAICS industry codes, which will be mapped onto occupational codes in the RAPIDS data (see Exhibit 6).
- Employer: Employer Identification Numbers are also provided in the LEHD data, so that in addition to assessing the impact of the LTC RAP on employment and earnings in general, attachment to the RAP sponsoring firm and firm turnover can also be determined.
- Geographic location: The geographic location of long-term care providers will be an important matching variable, ensuring that long-term care providers are compared to cases operating in comparable long-term care markets.
- Demographic characteristics: Age and gender are available for employees in the LEHD and can also be used to match. Education level and race/ethnicity are not available.
- Firm revenue: Gross revenues collected in economic censuses and linked to the LEHD can be used for matching to ensure that treatment cases are compared to comparison cases from similar-sized firms.
In addition to this primary information, other firm-level data available in the LEHD may be used to improve the quality of the match between treatment and comparison groups. For example, information on firm age, if complete, could contribute to the quality of the match.
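As an illustration of how a job tenure variable can be derived from the quarterly employment domain above (the earnings values here are hypothetical):

```python
# Tenure sketch: a worker is "employed" in a quarter if quarterly
# earnings are positive; tenure is the length of the most recent
# unbroken run of employed quarters at a given employer.

def tenure_in_quarters(quarterly_earnings):
    run = 0
    for earnings in quarterly_earnings:
        run = run + 1 if earnings > 0 else 0
    return run

# Eight quarters at one employer; the zero in quarter 4 resets tenure.
print(tenure_in_quarters([4200, 4800, 5100, 0, 4900, 5000, 5200, 5300]))  # 4
```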
Data Collection Process
While no primary data collection will be necessary for this evaluation design, data will be obtained from the LEHD, which is maintained by the Census Bureau. Researchers must apply to the Census Bureau to use the LEHD data for specific projects. This application process can be time-consuming and should be initiated very early in the evaluation process. Evaluators will not be able to identify data linked to specific Social Security numbers or Employer Identification Numbers, but they will be able to submit these identifiers to the Census Bureau so that cases can be extracted and assigned an identification number for the analysis.
Social Security numbers and Employer Identification Numbers for LTC RAP sponsors and participants will be drawn from the RAPIDS data for submission to the Census Bureau. After the evaluator applies for use of the LEHD data and signs a data use agreement, the Census Bureau will provide:
- All LEHD cases with Social Security numbers that match those of LTC RAP participants. These Social Security numbers will be replaced with personal identification keys and an indicator variable identifying the cases as apprentices.
- All LEHD cases whose Social Security numbers do not match those of registered apprenticeship program participants, but who have been employed by firms that have sponsored LTC RAPs. These firms will be identified by an Employer Identification Number submitted to the Census Bureau. Social Security numbers for these cases will be replaced with personal identification keys and an indicator variable identifying the cases as non-apprentices.
- All LEHD cases who have been employed by firms that have not sponsored LTC RAPs but that have reported NAICS industry codes associated with the long-term care industry (Codes 6231, 6216, 6233, 6232, 6239, and 6241).
Time Frame to Collect and Analyze Data
We anticipate that the LEHD analysis option would take approximately 27 months to complete. The activities would include initial planning and data acquisition, including obtaining Social Security numbers for matching RAPIDS to LEHD data (15 months), data cleaning and analysis (6 months), and report development (6 months). According to DOL officials, the application process for obtaining personal data, such as Social Security numbers, takes about a year.
We estimate costs for this option to be approximately $285,000.
Statistical Methods for Analyzing the Data
1. Propensity Score Matching
Propensity score matching methods will be used to generate an appropriate comparison group for the quasi-experimental evaluation design. This technique generates weights to be applied to the comparison group so that it more closely resembles the treatment group on observable variables. The match will be conducted by producing a predicted probability of being in the treatment group using a logit model of treatment group status as a function of earnings history, employment history, geographic location, industry, and other matching variables.
Ideally, matching on these observable characteristics should help to control for other unobservable characteristics as well. Certain apprentice characteristics may be correlated with the earnings of apprentices, although these characteristics are not measured in the LEHD data. While wages for direct care workers do not vary greatly (Khatutsky, Wiener, Anderson et al., 2011), the number of hours worked does, so some of the variation in earnings may reflect the unmeasured characteristics of workers selected to become apprentices.
Once a predicted probability of being in the treatment group is produced for all members of the comparison group, a variety of matching strategies can use that predicted probability to weight the comparison group. These include the nearest neighbor method, the odds ratio method, and the kernel density method. The nearest neighbor method pairs each treatment case with the comparison case that has the closest propensity score to it, generating a one-to-one match between the treatment and comparison groups. The odds ratio method and the kernel density method generate a weight for all comparison cases using the propensity score. Comparison cases with high propensity scores are given high weights, and those with low propensity scores are given low weights. Mueser, Troske, and Gorislavsky (2007) find that impact estimates for job training programs are not especially sensitive to the choice of matching method. To confirm the robustness of any evaluation of the LTC RAP initiative, multiple matching methods should be used. After implementing these matching strategies, Rubin (2001) suggests several balancing tests to confirm the strength of the match. The balancing tests are various versions of a difference of means test on the matching variables. A strong match should reduce statistically significant differences between the treatment and comparison groups on the matching variables.
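The nearest neighbor method and a simple balancing check can be sketched as follows; the propensity scores and earnings values here are hypothetical:

```python
# Nearest neighbor matching sketch: pair each treatment case with the
# comparison case whose propensity score is closest (one-to-one,
# with replacement).

def nearest_neighbor(treat_scores, comp_scores):
    return [min(range(len(comp_scores)),
                key=lambda j: abs(comp_scores[j] - t))
            for t in treat_scores]

treat = [0.72, 0.38, 0.55]
comp = [0.05, 0.40, 0.75, 0.10, 0.58]
pairs = nearest_neighbor(treat, comp)
print(pairs)  # [2, 1, 4]

# Balancing check: after matching, mean pre-program earnings of the
# matched comparison cases should be close to the treatment-group mean.
treat_earnings = [5200, 4900, 5100]
comp_earnings = [3000, 4800, 5300, 3200, 5000]
matched_mean = sum(comp_earnings[j] for j in pairs) / len(pairs)
print(sum(treat_earnings) / len(treat_earnings), matched_mean)
```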
2. Impact Estimation
Once the propensity score matching method produces a viable comparison group, several estimation strategies can be used to produce an estimate of the impact of the LTC RAP program, including a difference of means test of post-registration earnings with and without regression adjustment, and a difference-in-differences test of earnings with and without regression adjustment. Mueser, Troske, and Gorislavsky (2007) find that the difference-in-differences estimator is more faithful to random assignment results, although multiple approaches should be attempted and compared. The estimated differences can be examined by the level of the propensity score; thus, one can observe changes in earnings for those most likely selected for the program as compared with changes in earnings for those least likely to be selected. Sample size constraints may limit the principal subgroup analysis to the CNA and DSS occupational categories, although other characteristics of workers (e.g., age or race) may also be of interest.
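The difference-in-differences comparison described above can be sketched with hypothetical earnings values:

```python
# Difference-in-differences sketch: the impact estimate is the change
# in mean earnings for apprentices minus the change for the matched
# comparison group, netting out common trends and pre-existing levels.

def diff_in_diff(treat_pre, treat_post, comp_pre, comp_post):
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treat_post) - mean(treat_pre)) - (mean(comp_post) - mean(comp_pre))

# Quarterly earnings before and after LTC RAP registration (hypothetical):
print(diff_in_diff([4700, 4900, 5100], [5600, 5900, 6200],
                   [4900, 5100, 5000], [5200, 5300, 5400]))  # 700.0
```

Here apprentices gain $1,000 on average while comparison workers gain $300, so the estimated impact is $700; a simple post-period difference of means would instead show $600, illustrating how the estimator adjusts for differences in pre-program levels.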
3. Alternative Versions of the Administrative Data Design
If selection bias is considered to be a major obstacle to evaluation of the LTC RAP, alternative non-experimental strategies can be considered using LEHD administrative data. One alternative strategy is a regression discontinuity design, which uses pre-determined cut-offs in the assignment of treatment to identify the impact of the treatment. For example, all CNAs at Agape Senior are ranked, and apprentices are chosen from among the top 20% of employees. Since there is a sharp cut-off in the assignment of treatment, cases immediately above and immediately below the cut-off are expected to be very similar on all of their characteristics except admission to apprenticeship, generating a type of natural experiment. Regression discontinuity designs might be appropriate for LTC RAPs that use some sort of test or evaluation to assign employees to the apprenticeship. Only a minimal difference is expected in the performance of employees in the 79th percentile compared to the 80th percentile, but there is a large difference in their likelihood of becoming an apprentice. The change in the outcome variable at this point of discontinuity provides a reasonable estimate of the impact of the treatment. Although this approach is not common among LTC RAPs, Agape Senior is probably not the only LTC RAP program that uses an objective employee performance measure to decide who will participate (or at least who will be offered the opportunity to participate) in the apprenticeship program. If enough programs use this approach, it may be possible to use this analytic method. Sample sizes and the infrequency of this method of selecting apprentices may, however, limit its feasibility, especially for a national evaluation.
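The discontinuity logic can be sketched with hypothetical performance rankings and outcomes; the cutoff and bandwidth here are illustrative, not drawn from any actual LTC RAP:

```python
# Regression discontinuity sketch: estimate the impact as the jump in
# mean outcomes at the selection cutoff, comparing cases just above
# vs. just below it within a narrow bandwidth.

def rd_estimate(scores, outcomes, cutoff, bandwidth):
    above = [y for s, y in zip(scores, outcomes)
             if cutoff <= s < cutoff + bandwidth]
    below = [y for s, y in zip(scores, outcomes)
             if cutoff - bandwidth <= s < cutoff]
    return sum(above) / len(above) - sum(below) / len(below)

# Percentile rank on the selection test and later annual earnings:
ranks = [76, 78, 79, 80, 81, 83]
earnings = [20500, 20800, 20700, 21900, 22100, 22000]
print(rd_estimate(ranks, earnings, cutoff=80, bandwidth=3))  # 1250.0
```

In practice an evaluator would fit local regressions on each side of the cutoff rather than simple means, but the identifying comparison is the same.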