In this study, we introduced a new HLI that uses data-driven weighting to account for the magnitude of the relationship between individual lifestyle factors and specific disease outcomes. We extensively compared standard and outcome-specific versions of the HLI by estimating HR, C-index, and PAF under a range of scenarios including risk of cancer, T2D, CVD, and premature death. Two strategies for operationalizing the HLI were also explored, using binary indicators or categorical scores of the five factors in turn.
In our study, models based on outcome-specific HLIs consistently had higher discriminatory power than models based on standard HLIs, and in some cases, such as for type 2 diabetes, the difference was large. The reason for the lower discriminatory power of the standard HLI is clearly shown in Figure 2 when considering binary indicators. Because the standard HLI assumes that all lifestyle factors are equally associated with disease risk, different lifestyle patterns with the same number of unhealthy factors necessarily lead to the same predicted disease hazard rate in analyses based on it. Conversely, our analyses using outcome-specific HLIs reflected the heterogeneity of disease hazard rates among lifestyle patterns with the same number of unhealthy factors. This limitation of the standard HLI in terms of discriminatory power suggests that it may not be the best analytical choice for risk stratification or risk prediction, especially when certain lifestyle factors are strongly associated with the outcome under consideration, such as BMI in the case of type 2 diabetes.
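The contrast between the two scoring strategies can be illustrated with a minimal sketch. It assumes binary indicators (1 = healthy) for the five factors and uses purely hypothetical log hazard ratios as weights; these values, the function names, and the rescaling of the weighted score to the 0 to 5 range are illustrative assumptions, not the estimates or the exact construction used in this study.

```python
import math

FACTORS = ("smoking", "alcohol", "diet", "physical_activity", "bmi")

# Hypothetical log hazard ratios per healthy factor (illustrative only,
# NOT estimates from the study); BMI is deliberately given the largest weight.
HYPOTHETICAL_LOG_HR = {
    "smoking": -0.10,
    "alcohol": -0.05,
    "diet": -0.10,
    "physical_activity": -0.15,
    "bmi": -0.90,
}


def standard_hli(profile):
    """Unweighted count of healthy factors (0 to 5)."""
    return sum(profile[f] for f in FACTORS)


def outcome_specific_hli(profile, log_hr=HYPOTHETICAL_LOG_HR):
    """Weighted score, rescaled (by assumption) to the 0-5 range."""
    total = sum(log_hr[f] for f in FACTORS)
    score = sum(log_hr[f] * profile[f] for f in FACTORS)
    return 5 * score / total


def relative_hazard(profile, log_hr=HYPOTHETICAL_LOG_HR):
    """Hazard relative to a profile with no healthy factor."""
    return math.exp(sum(log_hr[f] * profile[f] for f in FACTORS))


# Two profiles with four healthy factors each; only the unhealthy factor differs.
unhealthy_bmi = {"smoking": 1, "alcohol": 1, "diet": 1,
                 "physical_activity": 1, "bmi": 0}
unhealthy_alcohol = {"smoking": 1, "alcohol": 0, "diet": 1,
                     "physical_activity": 1, "bmi": 1}

for name, profile in (("unhealthy BMI", unhealthy_bmi),
                      ("unhealthy alcohol", unhealthy_alcohol)):
    print(f"{name}: standard={standard_hli(profile)}, "
          f"outcome-specific={outcome_specific_hli(profile):.2f}, "
          f"relative hazard={relative_hazard(profile):.2f}")
```

Under these assumed weights, both profiles receive the same standard HLI, whereas the weighted score and the implied relative hazard separate them, which is the mechanism behind the difference in discriminatory power.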
Most previous studies on HLIs used standard HLIs to address etiological questions, specifically estimating disease-specific HRs to quantify the impact of adherence to healthy lifestyle habits and disease-specific PAFs to measure the public health burden attributable to unhealthy lifestyle habits. In our study, we observed that for T2D, HR estimates for standard HLIs were consistently weaker, and sometimes significantly weaker, than those for outcome-specific HLIs. These results suggest that analyses using outcome-specific HLIs are more likely to detect associations, especially for diseases with weaker associations with lifestyle habits. Conversely, PAF estimates were consistently larger when standard HLIs were used. Although it may seem contradictory to obtain weaker HRs but larger PAFs with standard HLIs than with outcome-specific HLIs, our theoretical study of linear causal models and our examination of the empirical distributions of standard and outcome-specific HLIs, shown in Figure S1, may help clarify this apparent contradiction. According to the binary version of the standard HLI, 59% of the EPIC study population had a standard HLI of 3 units or less, i.e., more than 2 standard deviations below the maximum HLI of 5 units. As a result, a large proportion of participants would have gained health benefits had they adhered to the healthiest possible lifestyle, leading to larger PAF estimates. On the other hand, according to the mortality-specific HLI, for example, 65% of the study population had HLI values within 1 standard deviation of the maximum HLI. As a result, the reduction in premature mortality from adhering to the healthiest possible lifestyle would have been less pronounced, which explains the lower PAF estimates. In essence, the analysis of outcome-specific HLIs closely mimics the analytical strategy in which individual lifestyle factors are jointly evaluated within the same model, thus resulting in similar PAF estimates.
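The interplay between the HR and the distribution of the index can be sketched numerically. The example below uses the standard multi-category form of Levin's formula, PAF = 1 - 1 / Σ p_k RR_k, with the maximum score as reference, and treats the HR per unit of HLI as an approximation of the relative risk. The two score distributions and the two HRs are hypothetical values chosen for illustration (the first distribution is only constructed so that 59% of the population scores 3 or less), and binning a weighted index into integer scores is a simplification; none of these numbers are results from this study.

```python
def paf_from_distribution(proportions, hr_per_unit, max_score=5):
    """PAF if everyone moved to the maximum score.

    proportions: mapping score -> share of the population (must sum to 1).
    hr_per_unit: hazard ratio per one-unit increase of the index (< 1 if
                 a higher score is protective), used here as a relative risk.
    """
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    # Relative risk of score k versus the maximum score under a log-linear
    # effect of the index: RR_k = hr_per_unit ** (k - max_score).
    mean_rr = sum(p * hr_per_unit ** (k - max_score)
                  for k, p in proportions.items())
    return 1.0 - 1.0 / mean_rr


# Hypothetical distribution spread far from the maximum (59% at 3 or below).
standard_like = {0: 0.01, 1: 0.06, 2: 0.20, 3: 0.32, 4: 0.30, 5: 0.11}
# Hypothetical distribution concentrated near the maximum.
specific_like = {0: 0.00, 1: 0.01, 2: 0.04, 3: 0.10, 4: 0.35, 5: 0.50}

# Hypothetical HRs: weaker per-unit association for the standard index.
paf_standard = paf_from_distribution(standard_like, hr_per_unit=0.85)
paf_specific = paf_from_distribution(specific_like, hr_per_unit=0.75)

print(f"standard-like index: PAF = {paf_standard:.3f}")
print(f"outcome-specific-like index: PAF = {paf_specific:.3f}")
```

With these assumed inputs, the index with the weaker per-unit HR still yields the larger PAF, because a larger share of the population sits far from the reference category.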
Our results therefore highlight that analyses based on standard HLIs may bias the assessment of the public health burden attributable to unhealthy lifestyles. As shown in our theoretical study of linear causal models, the standard HLI may provide an approximately valid estimate of the PAF of a latent variable, for example one reflecting health consciousness. However, the validity of this approach, and in particular whether the standard HLI is a better proxy for this latent variable than the weighted HLI, requires further evaluation.
The etiology of chronic diseases is complex, and some simplification through summary measures is welcome in epidemiological studies. To paraphrase Box's maxim, “All summaries are wrong, but some are useful.”37 For a summary to be useful, it must produce results that are approximately valid. The validity of the results of an analysis based on a standard HLI can be assessed by comparing them with results based on outcome-specific HLIs or on individual lifestyle factors. If the results are similar, the standard HLI may be appropriate because it does not rely on data-driven weighting and facilitates the comparison of findings across studies and across health-related outcomes. However, the claim that the standard HLI facilitates comparison across studies should be tempered given the myriad versions of the standard HLI that have been proposed in the literature5,6,9,10,17,18,38.
Multiple lifestyle factors influence an individual's health, some more strongly than others, and this should be reflected in public health recommendations. Toward this end, the “healthiest” lifestyle profile can be defined as the combination of individual lifestyle behaviors associated with the lowest disease risk, the highest life expectancy, or the highest chronic disease-free life expectancy.9 Developing and validating HLIs using weights derived from meta-analyzed associations between lifestyle factors and disease risk, mortality, or composite outcomes reflecting mortality and common chronic diseases can help characterize and promote these healthiest profiles.
Like previous versions of the standard HLI8, the HLI considered in this study is based on five separate components: smoking habits, alcohol intake, diet, physical activity, and obesity. Individual components could be combined into outcome-specific HLIs using more sophisticated statistical methods such as splines. In addition, working with a more refined classification of these five components, for example with more detailed information on smoking intensity and obesity and a broader range of dietary exposures, or including information on other lifestyle factors such as sleep quality39,40 and stress, may allow a more accurate assessment of the relationship between lifestyle and health-related outcomes. The evaluation performed in this study relied on the EPIC cohort, whose study populations in the different countries were generally more health conscious than their source populations. The lack of information in EPIC on other major chronic diseases did not allow us to take into account conditions that may affect the observed associations with the outcomes of interest. For example, chronic obstructive pulmonary disease (COPD) frequently co-occurs with CVD and shares smoking as a major risk factor41. Although we acknowledge these potential limitations, they are unlikely to affect the main conclusions of the study, which are supported by theoretical results from a simple linear causal model.