The Arizona Evaluation Project on Homelessness was designed to address the need to improve the measurement of program impacts at the client level. It was also intended to use aggregate impact measures to assess the effectiveness of particular agencies as well as the overall effectiveness of the various continuums of care in the state. The Project commenced in 2002 and comprised several stages: an assessment of best practices in outcome measurement, psychometric testing of various instruments, the creation and deployment of a standardized instrument, establishment of a reporting and analysis system, and the creation of a feedback process with the providers.
The first stage brought together service providers to determine what, if any, evaluation tools were being employed by their agencies. Each agency that provided an instrument was also asked to provide raw data on at least 150 homeless clients. The intent was to analyze the psychometric properties of the existing instruments to determine which, if any, met sufficient standards for reliability and validity. Approximately ten instruments were provided, half of which were called “Self-Sufficiency Matrix.” Despite the common name and some obvious similarities across the instruments, the various self-sufficiency matrices had striking differences and appeared to reflect different evolutions at each agency of a long-lost progenitor tool.
Reliability is measured in many ways and is often narrowly defined as the extent to which two measurements yield consistent results in a short period of time (test-retest reliability). This is a specific type of reliability, but the concept of reliability is broader; it also refers to the amount of error in a given set of measurements. The type of reliability most often studied by psychometricians is internal reliability, which measures the level of error and hence the quality of a given instrument. The internal reliability of each assessment tool provided to the project team was assessed using the archived data set accompanying the tool.
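The report does not name the statistic used to assess internal reliability. One widely used measure of internal consistency is Cronbach's alpha, and the minimal sketch below assumes that choice purely for illustration. The function name `cronbach_alpha` and the small ratings table are hypothetical and are not project data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of client rows, each a list of item ratings.

    Assumption: alpha stands in for the unnamed internal reliability
    statistic discussed in the text.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two clients")
    item_columns = list(zip(*responses))  # one tuple of ratings per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical ratings: five clients scored on three 1-5 items.
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items vary together and that less of the total score is measurement error, which is the sense of "quality" used above.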
An instrument can be reliable and still not be useful. To help assess the potential utility of each of the assessment tools provided, construct validity was also examined. To determine the extent to which the instruments were capturing one or more underlying constructs, a factor analysis was conducted for each instrument. Factor analysis is a multivariate statistical technique that determines the extent to which items on a test “clump” together to form subsets of questions that measure particular scales. Identifying such underlying scales can help establish client typologies for program targeting as well as for program performance assessment.
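The “clumping” idea can be made concrete by looking at inter-item correlations, which are the raw material a factor analysis works from. The sketch below is not a factor analysis and does not reproduce the project's method. It simulates four hypothetical items (`a1`, `a2`, `b1`, `b2`) driven by two made-up underlying traits and prints their correlation matrix. Every name and number in it is an assumption for illustration.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Simulated data (NOT project data): two hidden traits drive four items.
rng = random.Random(0)
names = ["a1", "a2", "b1", "b2"]
scores = {name: [] for name in names}
for _ in range(300):
    trait_a, trait_b = rng.gauss(0, 1), rng.gauss(0, 1)
    scores["a1"].append(trait_a + 0.5 * rng.gauss(0, 1))
    scores["a2"].append(trait_a + 0.5 * rng.gauss(0, 1))
    scores["b1"].append(trait_b + 0.5 * rng.gauss(0, 1))
    scores["b2"].append(trait_b + 0.5 * rng.gauss(0, 1))

# Full correlation matrix, keyed by (row item, column item).
corr = {(p, q): pearson(scores[p], scores[q]) for p in names for q in names}

for p in names:
    print(p, " ".join(f"{corr[p, q]:6.2f}" for q in names))
```

Items that share a trait correlate strongly with each other and only weakly with the other pair. A factor analysis formalizes that block pattern into factors and loadings.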
Upon review of the ten instruments that were submitted along with archived data, only one instrument met acceptable reliability and validity standards. This tool was one of the versions of the “self-sufficiency matrix”; it was far superior not only to the other types of instruments but also to the other versions of the self-sufficiency matrix. Since this instrument showed some promise, it was further piloted by a number of local agencies for six months. The agencies submitted all of their data for further psychometric testing. One large agency used the tool as a client self-report measure, while the others used it as a case manager reporting tool. Results from the pilot indicated that it was an inappropriate tool for self-report with the homeless population, but that it was much more reliable and valid as a case manager reporting tool. The factor analysis yielded two robust factors: the extent of client dysfunction/functioning and the extent of independent life skills. An overall combined score for self-sufficiency is the sum of these two factors. The two factors and the overall score all demonstrated good reliability (internal reliability: client dysfunction = .79, independent life skills = .78, overall self-sufficiency = .81). The final instrument produced is provided in Exhibit 1.
|Income||No income.||Inadequate income and/or spontaneous or inappropriate spending.||Can meet basic needs with subsidy; appropriate spending.||Can meet basic needs and manage debt without assistance.||Income is sufficient, well managed; has discretionary income and is able to save.|
|Employment||No job.||Temporary, part-time or seasonal; inadequate pay, no benefits.||Employed full time; inadequate pay; few or no benefits.||Employed full time with adequate pay and benefits.||Maintains permanent employment with adequate income and benefits.|
|Housing||Homeless or threatened with eviction.||In transitional, temporary or substandard housing; and/or current rent/mortgage payment is unaffordable (over 30% of income).||In stable housing that is safe but only marginally adequate.||Household is in safe, adequate subsidized housing.||Household is in safe, adequate, unsubsidized housing.|
|Food||No food or means to prepare it. Relies to a significant degree on other sources of free or low-cost food.||Household is on food stamps.||Can meet basic food needs, but requires occasional assistance.||Can meet basic food needs without assistance.||Can choose to purchase any food household desires.|
|Childcare||Needs childcare, but none is available/accessible and/or child is not eligible.||Childcare is unreliable or unaffordable; inadequate supervision is a problem for what childcare is available.||Affordable subsidized childcare is available, but limited.||Reliable, affordable childcare is available; no need for subsidies.||Able to select quality childcare of choice.|
|Children’s Education||One or more school-aged children not enrolled in school.||One or more school-aged children enrolled in school, but not attending classes.||Enrolled in school, but one or more children only occasionally attending classes.||Enrolled in school and attending classes most of the time.||All school-aged children enrolled and attending on a regular basis.|
|Adult Education||Literacy problems and/or no high school diploma/GED are serious barriers to employment.||Enrolled in literacy and/or GED program and/or has sufficient command of English to where language is not a barrier to employment.||Has high school diploma/GED.||Needs additional education/training to improve employment situation and/or to resolve literacy problems to where they are able to function effectively in society.||Has completed education/training needed to become employable. No literacy problems.|
|Legal||Current outstanding tickets or warrants.||Current charges/trial pending, noncompliance with probation/parole.||Fully compliant with probation/parole terms.||Has successfully completed probation/parole within past 12 months, no new charges filed.||No active criminal justice involvement in more than 12 months and/or no felony criminal history.|
|Health Care||No medical coverage with immediate need.||No medical coverage and great difficulty accessing medical care when needed. Some household members may be in poor health.||Some members (e.g., children) on AHCCCS.||All members can get medical care when needed, but may strain budget.||All members are covered by affordable, adequate health insurance.|
|Life Skills||Unable to meet basic needs such as hygiene, food, activities of daily living.||Can meet a few but not all needs of daily living without assistance.||Can meet most but not all daily living needs without assistance.||Able to meet all basic needs of daily living without assistance.||Able to provide beyond basic needs of daily living for self and family.|
|Mental Health||Danger to self or others; recurring suicidal ideation; experiencing severe difficulty in day-to-day life due to psychological problems.||Recurrent mental health symptoms that may affect behavior, but not a danger to self/others; persistent problems with functioning due to mental health symptoms.||Mild symptoms may be present but are transient; only moderate difficulty in functioning due to mental health problems.||Minimal symptoms that are expectable responses to life stressors; only slight impairment in functioning.||Symptoms are absent or rare; good or superior functioning in wide range of activities; no more than every day problems or concerns.|
|Substance Abuse||Meets criteria for severe abuse/dependence; resulting problems so severe that institutional living or hospitalization may be necessary.||Meets criteria for dependence; preoccupation with use and/or obtaining drugs/alcohol; withdrawal or withdrawal avoidance behaviors evident; use results in avoidance or neglect of essential life activities.||Use within last 6 months; evidence of persistent or recurrent social, occupational, emotional or physical problems related to use (such as disruptive behavior or housing problems); problems have persisted for at least one month.||Client has used during last 6 months, but no evidence of persistent or recurrent social, occupational, emotional, or physical problems related to use; no evidence of recurrent dangerous use.||No drug use/alcohol abuse in last 6 months.|
|Family Relations||Lack of necessary support from family or friends; abuse (DV, child) is present or there is child neglect.||Family/friends may be supportive, but lack ability or resources to help; family members do not relate well with one another; potential for abuse or neglect.||Some support from family/friends; family members acknowledge and seek to change negative behaviors; are learning to communicate and support.||Strong support from family or friends. Household members support each other’s efforts.||Has healthy/expanding support network; household is stable and communication is consistently open.|
|Mobility||No access to transportation, public or private; may have car that is inoperable.||Transportation is available, but unreliable, unpredictable, unaffordable; may have car but no insurance, license, etc.||Transportation is available and reliable, but limited and/or inconvenient; drivers are licensed and minimally insured.||Transportation is generally accessible to meet basic travel needs.||Transportation is readily available and affordable; car is adequately insured.|
|Community Involvement||Not applicable due to crisis situation; in “survival” mode.||Socially isolated and/or no social skills and/or lacks motivation to become involved.||Lacks knowledge of ways to become involved.||Some community involvement (advisory group, support group), but has barriers such as transportation, childcare issues.||Actively involved in community.|
|Safety||Home or residence is not safe; immediate level of lethality is extremely high; possible CPS involvement||Safety is threatened/temporary protection is available; level of lethality is high||Current level of safety is minimally adequate; ongoing safety planning is essential||Environment is safe, however, future of such is uncertain; safety planning is important||Environment is apparently safe and stable|
|Parenting Skills||There are safety concerns regarding parenting skills||Parenting skills are minimal||Parenting skills are apparent but not adequate||Parenting skills are adequate||Parenting skills are well developed|
The client assessment tool was then used for predictive mathematical modeling. The project staff feared that building expectations and incentives for demonstrating client improvement alone could produce an unintended consequence: agencies would gravitate toward the “low hanging fruit,” i.e., relatively easy clients who require less investment of staff time to produce results. An assessment system that included disincentives to serve a particular client group would be counterproductive. The predictive modeling was an attempt to avoid this dilemma. Using HMIS data fields, including supplementary client history fields and baseline scores on the self-sufficiency matrix, equations are generated to identify the predictors of change during a stay in a homeless assistance program for level of dysfunction, independent life skills, and overall self-sufficiency. These equations are then used to predict the amount of change that would be expected in each client if randomly assigned to a homeless assistance program. Each predicted change is determined by the client’s individual characteristics and constitutes the expected change for that client. The expected change is then compared to the actual change at the time of program exit. Agencies whose clients typically do better than expected are the most successful, and those whose clients typically perform below expectations are in need of programmatic improvements (see sample feedback form, Exhibit 2).
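The expected-versus-actual comparison can be sketched in a few lines. The text does not specify the form of the project's equations, so the example below makes loud simplifying assumptions: a single predictor (the baseline matrix score), an ordinary least squares line fitted across all clients, and invented data for two hypothetical agencies, "A" and "B". The real model draws on many HMIS fields.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor does not vary")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


# Hypothetical records: (agency, baseline score, actual change at exit).
clients = [
    ("A", 2.0, 6.0), ("A", 4.0, 5.0), ("A", 6.0, 4.0),
    ("B", 2.0, 4.0), ("B", 4.0, 3.0), ("B", 6.0, 2.0),
]

# Fit one equation on the pooled clients of all agencies.
slope, intercept = fit_line([c[1] for c in clients], [c[2] for c in clients])

# Compare each client's actual change with the change the equation expects.
residuals = defaultdict(list)
for agency, baseline, actual_change in clients:
    expected_change = intercept + slope * baseline
    residuals[agency].append(actual_change - expected_change)

# Positive mean difference: clients did better than expected on average.
agency_difference = {a: mean(r) for a, r in residuals.items()}
print(agency_difference)
```

Because each agency is judged against what its own clients were expected to achieve, an agency serving harder-to-serve clients is not penalized for lower raw gains. This is the disincentive the modeling was meant to remove.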
Exhibit 2 Arizona Homeless Evaluation Project Progress Report
(Based on clients who have exited the program; N=129)
I. PROGRAM: Demo Shelter
Type: Emergency Shelter
Continuum: MAG Continuum of Care Regional Committee on Homelessness
Date: June 19, 2006
We have compared characteristics of Demo Shelter clients to clients from other agencies with like program types within the Maricopa Continuum of Care who entered and exited programs during the same time period (October 2005 through March 2006). In terms of these demographic variables, Demo Shelter clients tend to be mildly older, mildly less likely to be female, mildly more likely to be Black, and mildly less likely to be Hispanic. Overall, however, the differences in demographic characteristics are not great.
|DEMOGRAPHICS||Demo Shelter||Other Emergency Shelters|
|Gender (% female) 1||53 %||75 %|
|White||64 %||62 %|
|Black||24 %||16 %|
|Asian||1 %||2 %|
|Native American||11 %||11 %|
|Hispanic||14 %||21 %|
|Other||0 %||8 %|
|DV clients||26 %||26 %|
|Extent of homelessness|
|First time||40 %||43 %|
|1-2 times in past||46 %||41 %|
|Long-term||6 %||6 %|
|Chronic||8 %||10 %|
|1 Arizona HMIS systems contain a high percentage of McKinney-Vento funded participants as well as those served under Arizona Department of Economic Security contracts. Other homeless clients are less well represented within HMIS. This produces a higher percentage of female clients than is believed to be represented in the general homeless population.|
|MATRIX SCORES UPON ENTRY|
|Dysfunction Score||Demo clients moderately less dysfunctional|
|Independent Life Skills Score||Demo clients mildly greater life skills|
|Total Self-Sufficiency Score||Demo clients mildly less challenging|
|DEMO CLIENT OUTCOMES||Expected||Actual||Difference|
|Independent Life Skill Scores||6.9||7.3||+0.4|
|Overall Self-Sufficiency Scores||8.3||8.8||+0.5|
The predictive model determines the most likely change each client would make if randomly assigned to a homeless assistance program. This expected change is then compared to the actual change clients make in the program. If the difference is positive, the program is performing above expectations; if the difference is negative, the program is performing below expectations.
Overall, Demo Shelter is mildly better than other programs in decreasing dysfunction and moderately better in increasing independent life skills and overall self-sufficiency. Demo Shelter has its greatest success with homeless individuals recently released from jail/prison. An area of challenge for Demo Shelter is the program’s difficulty in having significant impact with its Hispanic clients.
No agency excels with all clients, and the predictive model allows each agency to objectively explore whether there are systematic differences between the types of clients with whom they experience the most success and those who are most challenging. Each agency receives a written feedback report on a quarterly basis detailing how, if at all, their clients differ from those served by other agencies, the extent to which agency outcomes differ from those expected from the predictive model, and the relative strengths and weaknesses of client successes within each agency. For example, one agency serving disabled and older homeless men and women was able to determine that it was far more effective with the older subpopulation than with people with disabilities. Further analyses showed that the frequency of “acting out” behavior among the people with disabilities was determinative of agency effectiveness, with a greater frequency of “acting out” associated with less successful client progress. This agency is now exploring what practices and techniques can increase its effectiveness with such clients. Another agency was able to identify that despite stronger outcomes than expected overall, it was much less successful with Hispanic clients. As a result, the agency is working with agencies that are more successful with Hispanics to help identify what changes might increase its effectiveness with this subpopulation.
Such feedback systems can also allow agencies to rethink their target populations. If an agency learns that it is effective with people who have a mental illness or a substance use disorder, but is ineffective when these conditions are co-occurring, that knowledge is valuable both for the program and for the local continuum of care. For example, if another agency is highly effective with clients who have co-occurring disorders, the initial agency can either choose to learn from that agency and strengthen outcomes with this group, or it can decide to accept clients with whom it is likely to be effective and refer those clients with whom it is less likely to be successful to programs more likely to benefit them.
The initial expectation of the project was that agencies would naturally discuss and learn from each other in this feedback process. However, it became apparent that the various continuums of care (CoCs) could play a convening role by structuring activities that brought both leadership and line staff from the agencies together to learn from each other in “evaluative learning circles.” These are regularly scheduled meetings in which homeless agencies from similar locations and with similar missions learn about each other’s relative strengths and weaknesses and about how they can cooperate to produce better client outcomes.
Beyond aiding individual clients or individual agencies, the evaluation system has been helpful in identifying patterns that are valuable for policy considerations for the CoCs as a whole. One finding has been that the distinction between emergency and transitional programs in actual practice in Arizona appears to be an arbitrary one. There is no difference locally between the two types of programs in who they serve, the types and extent of problems their clients exhibit, or the expected change from each program. Another finding in data analyzed thus far suggests that, across all agencies, there is a window of between three and seven weeks when programs are likely to have their greatest impact. Shorter term stays are typically inadequate to effect change, and stays longer than seven weeks tend to cause individuals (but not families) to regress. This suggests that, for homeless individuals, a period of training and stabilization of three to seven weeks followed by placement in long-term housing is likely to maximize client impact. It is also hoped that the predictive model will assist in the rating and ranking process for the McKinney-Vento Assistance application by making quality assessments more objective and rigorous.
The findings related to duration of treatment and lack of distinction between emergency and transitional programming were included to demonstrate the types of findings the model is capable of yielding. However, these results should be regarded with some caution. They are accurate for the sample of homeless clients we have studied, but that sample is not yet representative of the broader homeless community, and a sizable number of clients in transitional housing are still in the pipeline without an exit matrix. We are eager to see whether these findings persist when the dataset becomes more representative of the entire state homeless population.
This case study provides one example of how a jurisdiction can use program and outcomes data to develop benchmarking and performance standards, as well as to develop a process for engaging providers in discussions about strategies for improving their performance. The self-sufficiency matrix was an important tool in that process, as was the creation of learning communities. Other approaches are also possible. In the next section, a case study from Columbus, Ohio, is presented, with particular attention to some of the challenges that community faced in bringing performance measurement to its system.