Policy Research for Front of Package Nutrition Labeling: Developing and Testing a Summary System Algorithm

5.1 Summary of Findings


RTI developed and tested a nutrient density-based algorithm that assigned positive scores to nutrients that should be encouraged and negative scores to nutrients that should be limited in the diet. We scored a set of foods using the algorithm and compared the average scores of food groupings. As expected, nutrient-dense foods (i.e., foods with substantial amounts of vitamins and minerals and few calories) scored high, and foods that are low in nutrient density (i.e., that supply calories but relatively small amounts of micronutrients) scored low. Fruits, vegetables, and legumes and nuts had the highest group scores, and the lowest group scores were seen with fats and oils and sweets and beverages (e.g., coffee, tea, soft drinks, and fruit drinks). We then made a series of modifications to the algorithm, adding and removing various nutrients and food components, and assessed the effects on food scores and on the algorithm's ability to predict overall dietary quality.

The Healthy Eating Index (HEI) scores the total diet and is therefore inherently different from a food-scoring algorithm; nevertheless, it provides a mechanism to evaluate how well the scores of the individual foods in a person's diet relate to overall diet quality. In this report, we demonstrate that a final algorithm whose nutrient weighting factors were derived from statistical analyses of nutrient intakes of the U.S. population predicted dietary quality better than existing nutrient algorithms that have been tested similarly. Our final algorithm explained about two-thirds of the variation in HEI scores, compared with one-third to one-half for other nutrient density algorithms. Our algorithm applied positive weighting factors to protein, unsaturated fat, fiber, calcium, and vitamin C and negative weighting factors to saturated fat, sodium, and added sugars. Using nutrient values per 100 kcal was slightly better at predicting overall dietary quality than using nutrient values per reference serving size (Reference Amount Customarily Consumed, RACC). Among the top-scoring foods were raw and leafy green vegetables on a per 100 kcal basis and avocado, almonds, oranges, and strawberries on a per RACC basis. The algorithm predicted dietary quality well across population groups defined by age, ethnicity, socioeconomic status, and weight status.
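The structure of such a score can be illustrated with a minimal sketch. It assumes that each nutrient amount has already been expressed on a common basis (for example, per 100 kcal) and scaled to comparable units (for example, as a percentage of a daily reference value). The signs of the weights follow the positive and negative components listed above, but the magnitudes are placeholders set to 1.0; they are not the weighting factors derived in this report, and the names PLACEHOLDER_WEIGHTS and food_score are hypothetical.

```python
# Signs follow the components named in the text; the magnitudes (1.0) are
# placeholders for illustration only, NOT the report's weighting factors.
PLACEHOLDER_WEIGHTS = {
    "protein": 1.0,
    "unsaturated_fat": 1.0,
    "fiber": 1.0,
    "calcium": 1.0,
    "vitamin_c": 1.0,
    "saturated_fat": -1.0,
    "sodium": -1.0,
    "added_sugars": -1.0,
}


def food_score(nutrients, weights=PLACEHOLDER_WEIGHTS):
    """Return the weighted sum of nutrient amounts for one food.

    `nutrients` maps nutrient names to amounts on a common, pre-scaled
    basis (an assumption of this sketch). Nutrients missing from the
    mapping are treated as 0.
    """
    return sum(w * nutrients.get(name, 0.0) for name, w in weights.items())
```

With these placeholder weights, a food is rewarded for each unit of an encouraged nutrient and penalized equally for each unit of a nutrient to limit; replacing the placeholders with estimated weighting factors changes only the dictionary, not the scoring function.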

Numerous criteria and considerations must be accounted for when developing a nutrient scoring system for foods. The selection of nutrients or food components is probably the most critical part of the process. We began by selecting nutrients that were deemed important and were limited in the diets of Americans. However, nutrients coexist in foods, and adding nutrients to a scoring system does not necessarily improve prediction of overall dietary quality. The unit basis for the nutrient data is important because it determines the amount of each nutrient that enters the calculation of the score. When examining scores of foods, we found that the algorithms based on RACC servings seemed to reduce extreme values for some foods, such as low-calorie vegetables. A scoring system based on RACC servings is intuitively appealing because it accounts for serving size differences among various types of foods. However, the algorithms based on RACC servings were slightly less predictive of overall dietary quality. The more extreme scores produced by algorithms based on 100 kcal servings, particularly for fruits and vegetables, may have driven the higher R² values of models predicting HEI. The fact that HEI is based on nutrient standards on a per 1,000 kcal basis could also contribute to better prediction by an algorithm scored on a per 100 kcal basis.
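The effect of the unit basis can be seen in a short sketch that converts a nutrient amount from a per 100 g basis to a per 100 kcal basis and to a per RACC basis. The function names and the input values in the usage note are hypothetical and are not taken from the data used in this report.

```python
def per_100_kcal(amount_per_100g, kcal_per_100g):
    """Convert a nutrient amount per 100 g of food to an amount per 100 kcal."""
    if kcal_per_100g <= 0:
        raise ValueError("energy density must be positive")
    return amount_per_100g * 100.0 / kcal_per_100g


def per_racc(amount_per_100g, racc_grams):
    """Convert a nutrient amount per 100 g of food to an amount per RACC serving."""
    return amount_per_100g * racc_grams / 100.0
```

For a hypothetical low-calorie vegetable with 2 g of fiber and 20 kcal per 100 g and an assumed RACC of 85 g, per_100_kcal(2.0, 20.0) gives 10 g of fiber per 100 kcal, whereas per_racc(2.0, 85.0) gives about 1.7 g per serving. Dividing by a small energy value inflates the per 100 kcal amount, which is consistent with the more extreme scores noted above for low-calorie foods.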

Our original baseline algorithm and modifications that weighted positive and negative nutrients equally performed reasonably well in predicting HEI scores. Approximately 50% of the variance in HEI score was explained by the respective algorithms. Weighting of nutrients has been used in very few existing nutrient scoring systems because there is a lack of concrete scientific evidence to support specific weighting factors to apply to nutrients. Our new approach, developed by Nutrition Impact, LLC, used weighting factors obtained from beta coefficients of nutrient intake variables in regression models predicting HEI scores. The final nutrient density algorithm predicted dietary quality, as assessed by HEI score, better than our baseline algorithm and its modifications. The final algorithms explained approximately two-thirds of the variance in HEI scores (R² of 65% for the 100 kcal-based and 60% for the RACC-based algorithm) compared with about one-half of the variance explained by the modified algorithm with vitamin C and whole grains (R² of 52% for the 100 kcal-based and 44% for the RACC-based algorithm). Previously published validation studies of nutrient density indexes reported R² values of 45% with the NRFI (Fulgoni et al., 2009) and 29% with the ONQI associated with the NuVal shelf-labeling system (Katz et al., 2010). Additional regression models with our final algorithm using various subpopulations demonstrated R² values comparable to those for the overall population. Further analyses of the top 10 models with eight nutrients or food components showed that their R² values were extremely close to each other, suggesting that a number of potential eight-component algorithms would be roughly equivalent in predicting dietary quality based on the HEI.
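The general idea of taking weighting factors from regression coefficients can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for this report: it fits an ordinary least squares model with an intercept by solving the normal equations, returns unstandardized coefficients, and reports R² as the share of variance explained. The function names solve and fit_weights are hypothetical, and whether the report's coefficients were standardized is not assumed here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_weights(rows, y):
    """Fit y on the predictor rows by ordinary least squares.

    Returns (intercept, betas, r_squared). Each row holds one person's
    nutrient intakes; y holds the matching diet quality scores.
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef[0], coef[1:], 1.0 - ss_res / ss_tot
```

In this sketch, the returned betas would play the role of the weighting factors, with positive coefficients rewarding a nutrient and negative coefficients penalizing it. Solving the normal equations directly is adequate for a handful of predictors; a production analysis would more likely rely on an established statistics package.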

Summary systems can be simplified for the consumer by categorizing a score into different levels. For example, categories used by others have included three levels, such as traffic light colors or text signifying "low, medium, or high." Using the final algorithm, we assessed both three- and five-point categorizations of scores, and both produced reasonable rankings of foods. The three-category system performed as well as the five-category system in distinguishing between common foods (e.g., whole grain vs. white bread, nonfat vs. whole-milk yogurt). Such three- or five-category rankings of foods may be more helpful to consumers than a continuous score, although this needs to be tested with consumers. Categorical rankings of foods using this algorithm could also be compared with other existing ranking systems.
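One way to turn a continuous score into three or five levels is sketched below. The cutoffs here are drawn from quantiles of the scored foods (tertiles for three levels, quintiles for five); this is an assumption made for illustration, since the rule used to set category boundaries is not described in this section. The function names are hypothetical.

```python
from bisect import bisect_right


def quantile_cutoffs(scores, n_levels):
    """Return n_levels - 1 cutoffs that split the scores into roughly equal groups."""
    ordered = sorted(scores)
    return [ordered[(len(ordered) * i) // n_levels] for i in range(1, n_levels)]


def categorize(score, cutoffs):
    """Map a score to a level from 1 (lowest) to len(cutoffs) + 1 (highest).

    A score equal to a cutoff is placed in the higher level.
    """
    return bisect_right(cutoffs, score) + 1
```

Calling quantile_cutoffs with n_levels set to 3 or 5 yields the boundaries for a three- or five-point rating, and categorize then assigns each food its level. Fixed, externally defined cutoffs could be passed to categorize in the same way if quantile-based boundaries are not appropriate.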