
4.3.1 Results of Regression Models Predicting HEI Scores

The performance of the algorithms in predicting overall dietary quality was assessed by calculating algorithm scores for foods consumed by the 15,576 individuals in NHANES 2005-2008. The results of regressing the HEI diet quality scores for the participant diets on the composite algorithm scores for those diets are presented in Table 4-1. The models included covariates for age, gender, and ethnicity. The details of the methodology are described at the end of Section 4.1.
In all of the linear regression models, the weighted mean scores were significantly associated with HEI scores (p < 0.0001). The weighted mean scores of foods consumed by individuals were slightly higher on a per 100 kcal basis than on a per RACC basis. For the baseline algorithm, the model with scores calculated on a per 100 kcal basis (NDS1KCAL) explained 46.2% of the variation in HEI scores, compared with 39.0% for the per RACC scores (NDS1RACC). For each modification to the baseline algorithm, the variation in HEI scores was better explained by the algorithm calculated on a per 100 kcal basis than on a per RACC basis. The model with the highest R² was the modification with vitamin C and whole grains on a per 100 kcal basis (NDS6), which explained 51.6% of the variance in HEI scores. The algorithm with vitamin C had the second highest R² (49.6%).
One possible reason that scores calculated on a per 100 kcal basis explained HEI scores better than scores calculated on a per RACC basis is that the range of scores on a per 100 kcal basis is more extreme. In addition, the HEI is based on nutrient standards expressed per 1,000 kcal, which could also contribute to better prediction by an algorithm scored on a per 100 kcal basis.
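The comparison of explained variance described above can be sketched in code. The example below is a minimal illustration, not the analysis used in this report: it fits ordinary least squares models with NumPy on synthetic data, ignores the NHANES survey design, and uses hypothetical names throughout (`r_squared`, `diet_score`, `hei`, `group`, and the simulated effect sizes are all assumptions). It shows only how the R² of a covariates-only model can be compared with the R² of a model that adds an algorithm score.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an ordinary least squares fit of y on X (an intercept is added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data: NOT NHANES values, purely illustrative.
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(2, 80, n)
female = rng.integers(0, 2, n).astype(float)
group = rng.integers(0, 4, n)               # hypothetical 4-level ethnicity code
group_dummies = np.eye(4)[group][:, 1:]     # indicator columns, first level as reference
covariates = np.column_stack([age, female, group_dummies])

# Hypothetical per-person weighted mean algorithm score and a simulated HEI score.
diet_score = rng.normal(0.0, 1.0, n)
hei = 50 + 0.05 * age + 1.0 * female + 8.0 * diet_score + rng.normal(0.0, 9.0, n)

r2_covariates = r_squared(hei, covariates)
r2_full = r_squared(hei, np.column_stack([covariates, diet_score]))

print(f"R^2, covariates only:          {r2_covariates:.3f}")
print(f"R^2, covariates + diet score:  {r2_full:.3f}")
```

Because the covariates-only model is nested in the full model, its R² can never exceed that of the full model; the size of the gap is what indicates how much the algorithm score adds. Comparing two scoring bases would amount to fitting the full model once with each score and comparing the two R² values.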


4.3.2 Plots of Predicted HEI Scores

Figure 4-4 illustrates the prediction of HEI scores by the covariates only (age, gender, and race), which explained 4.17% of the variance in HEI scores. To illustrate good prediction of the algorithm score at the high and low ends of actual HEI scores, Figure 4-5 displays HEI scores predicted by the modified algorithm on a per 100 kcal basis with vitamin C added.
Figure 4-4. Plot of Predicted HEI Scores with Covariates Only (Age, Gender, Race)
The figure shows a scatterplot of the predicted HEI scores (on the vertical axis) from a regression model with only the covariates (age, gender, and race) as independent variables. The actual HEI scores are on the horizontal axis. The diagonal line shows theoretical perfect prediction of the HEI. The model showed that the covariates alone explained only 4.17% of the variation in the HEI. Data are from 16,587 participants in NHANES 2005-2008.
Figure 4-5. Plot of Predicted HEI Scores with Modified Algorithm on per 100 kcal Basis with Vitamin C Added
The figure shows a scatterplot of the predicted HEI scores (on the vertical axis) from a regression model for a modified algorithm on a per 100 kcal basis with vitamin C added (NDS4CKCAL). The actual HEI scores are on the horizontal axis. The diagonal line shows theoretical perfect prediction of the HEI. The model showed that the algorithm accounted for 45.59% of the variation in the HEI. Agreement was reasonably good at high and low HEI values, as shown by points lying near the diagonal line. Data are from 16,587 participants in NHANES 2005-2008.
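Agreement at the high and low ends of the actual HEI range, which the figures assess visually against the diagonal line, can also be summarized numerically. The sketch below is one possible way to do so and is not a method used in this report: the function name `tail_bias`, the 20% tail cutoff, and the synthetic data are all assumptions. It computes the mean of predicted minus actual scores among participants in the lowest and highest fifths of actual scores, for a weak and a stronger hypothetical predictor.

```python
import numpy as np

def tail_bias(actual, predicted, q=0.2):
    """Mean (predicted - actual) in the bottom and top q fraction of actual scores."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    low_cut, high_cut = np.quantile(actual, [q, 1.0 - q])
    diff = predicted - actual
    return float(diff[actual <= low_cut].mean()), float(diff[actual >= high_cut].mean())

def fit_and_predict(x, y):
    """Simple linear regression of y on x; returns the fitted values."""
    slope_intercept = np.polyfit(x, y, 1)
    return np.polyval(slope_intercept, x)

# Synthetic stand-in data: NOT NHANES values, purely illustrative.
rng = np.random.default_rng(1)
n = 5000
signal = rng.normal(0.0, 1.0, n)
actual = 50 + 10 * signal + rng.normal(0.0, 3.0, n)    # simulated "actual" scores

strong_x = signal                                      # informative predictor
weak_x = signal + rng.normal(0.0, 3.0, n)              # mostly noise

strong_low, strong_high = tail_bias(actual, fit_and_predict(strong_x, actual))
weak_low, weak_high = tail_bias(actual, fit_and_predict(weak_x, actual))

print(f"strong model: low-tail bias {strong_low:+.2f}, high-tail bias {strong_high:+.2f}")
print(f"weak model:   low-tail bias {weak_low:+.2f}, high-tail bias {weak_high:+.2f}")
```

A model that explains little variance predicts values close to the overall mean for everyone, so it overpredicts at the low end and underpredicts at the high end. Tail biases closer to zero correspond to points lying nearer the diagonal in a predicted-versus-actual plot.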
