In a recent study published in Nature Medicine, researchers developed a proteomic age clock that uses plasma proteins to predict biological age and associated health risks. The researchers found that the clock accurately predicted age and was associated with risk of major chronic diseases, multimorbidity, and mortality in a range of populations.
Study: Proteomic aging clock predicts mortality and risk of common age-related diseases in different populations.
Background
Aging is a major factor in the development of chronic diseases such as heart disease, stroke, diabetes, and cancer, but its timing and severity vary from person to person. Biological aging influences the risk of chronic disease, disability, and healthcare demand. Chronological age is often used to estimate biological aging, but it may not be an accurate proxy. Omics data that reflect an individual's biological function allow for more accurate estimates. Until now, deoxyribonucleic acid methylation (DNAm) clocks have been used to measure biological age, but protein levels may provide more direct insight into the mechanisms of aging. Previous studies have developed proteomic age clocks to predict disease risk and mortality, but none have done so in large, diverse populations. The researchers in this study addressed this gap by developing and validating a proteomic age clock across several populations and evaluating how well it predicts chronic disease risk, mortality, and aging-related traits. The study is therefore the first to validate a proteomic age clock in a large, diverse population, providing a powerful tool to predict age-related disease and mortality.
About the Research
The study drew data from three large biobank cohorts: the UK Biobank (UKB), the China Kadoorie Biobank (CKB), and FinnGen. The researchers developed and validated a proteomic age clock using plasma protein measurements from the Olink Explore 3072 platform. The clock predicts a person's biological age from the levels of certain plasma proteins, and this predicted age may differ from the person's chronological age. The researchers termed the difference between protein-predicted age and chronological age “ProtAgeGap” and investigated its relationship with aging, frailty, and disease.
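As a minimal sketch of the ProtAgeGap definition above (protein-predicted age minus chronological age), the following Python snippet computes the gap for a few participants. The function name `prot_age_gap` and the input values are hypothetical and chosen only for illustration; they are not study data or the authors' code.

```python
def prot_age_gap(predicted_ages, chronological_ages):
    """Return ProtAgeGap (predicted minus chronological age) per participant."""
    if len(predicted_ages) != len(chronological_ages):
        raise ValueError("inputs must have the same length")
    return [p - c for p, c in zip(predicted_ages, chronological_ages)]


# Hypothetical values, for illustration only (not study data).
predicted = [62.0, 48.5, 55.0]
chronological = [58.0, 50.0, 55.0]

# A positive gap means the proteome looks older than the calendar age.
print(prot_age_gap(predicted, chronological))
```

Under this sign convention, a positive gap marks a person whose protein profile resembles that of someone older, and a negative gap someone younger.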
A total of 45,441 participants from UKB (age 39-71 years, 54% female), 3,977 from CKB (age 30-78 years, 54% female), and 1,990 from FinnGen (age 19-78 years, 52% female) were included. Proteomic data were processed and normalized across cohorts, and 2,897 proteins were retained for analysis after quality control. A gradient boosting model (LightGBM) was employed because it outperformed other machine learning models in predicting chronological age. Recursive feature elimination identified the 20 most important proteins, which formed a minimal predictive model (ProtAge20) that maintained high accuracy. The model was trained and validated using 5-fold cross-validation in UKB and then applied to the CKB and FinnGen cohorts to calculate ProtAgeGap. Statistical analyses included linear and logistic regression, Cox proportional hazards models, functional enrichment analysis, SHapley Additive exPlanations (SHAP) interaction analysis, Kaplan-Meier survival analysis, and protein-protein interaction (PPI) network visualization.
a, UKB participants were split into training and test sets in a 70:30 ratio. In the training set, a LightGBM model was trained to predict chronological age using 2,897 plasma proteins and 5-fold cross-validation. We identified 204 proteins relevant for predicting chronological age using the Boruta feature selection algorithm, retrained a refined LightGBM model using these 204 proteins, and evaluated it on the UKB test set. b, We further validated the proteomic age clock model using independent data from CKB and FinnGen. c, Protein-predicted age (ProtAge) was calculated in the full UKB sample using 5-fold cross-validation and LightGBM. ProtAgeGap was calculated as the difference between ProtAge and chronological age. Linear and logistic regression were used to test the association of ProtAgeGap with a comprehensive panel of biological aging markers and measures of frailty and physical/cognitive status. Additionally, we used Cox proportional hazards models to test the association between ProtAgeGap and mortality, 14 common diseases, and 12 cancers. Due to the small sample size of CKB and the lack of disease cases in FinnGen, most association analyses were performed only in UKB. Figures were created with BioRender.com.
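The cross-validated prediction step described above can be sketched as follows: each participant's age is predicted by a model that never saw that participant during training, so the resulting gaps are not inflated by overfitting. This is an illustrative sketch only. To stay dependency-free it swaps LightGBM for a one-feature least-squares line, and the data are synthetic; the helper names `fit_line` and `out_of_fold_predictions` are hypothetical and not taken from the study.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def out_of_fold_predictions(xs, ys, k=5, seed=0):
    """Predict each sample with a model trained on the other k-1 folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    preds = [None] * len(xs)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Stand-in model: the study used LightGBM on thousands of proteins.
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in held_out:
            preds[i] = slope * xs[i] + intercept
    return preds


# Synthetic data: one made-up "protein level" that rises with age, plus noise.
rng = random.Random(42)
ages = [rng.uniform(40, 70) for _ in range(200)]
protein = [0.1 * a + rng.gauss(0, 0.3) for a in ages]

protage = out_of_fold_predictions(protein, ages, k=5)
gaps = [p - a for p, a in zip(protage, ages)]
mae = sum(abs(g) for g in gaps) / len(gaps)
print(f"out-of-fold mean absolute error: {mae:.2f} years")
```

In a setting closer to the one described, the stand-in line would be replaced by a gradient boosting regressor trained on the full protein panel, while the fold logic would stay the same.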
Results and Discussion
During the 11-16 years of follow-up, deaths occurred in 10.6%, 36%, and 1% of participants in the CKB, UKB, and FinnGen cohorts, respectively. A total of 204 aging-associated proteins were identified, and the association of age with these proteins was found to be stable over time.
ProtAgeGap correlated with biological aging markers and clinical outcomes. It was a strong predictor of multimorbidity, all-cause mortality (hazard ratio (HR) = 1.15 per year of ProtAgeGap), and risk of 14 non-cancer diseases, including Alzheimer's disease (HR = 1.11), chronic kidney disease (HR = 1.14), and type 2 diabetes (HR = 1.13). ProtAgeGap was also associated with cancer risk, including breast cancer (HR = 1.12), lung cancer (HR = 1.09), and prostate cancer (HR = 1.08), as well as with various biological aging markers (e.g., telomere length, insulin-like growth factor 1) and indices of cognitive and physical function. Sensitivity analyses in non-smokers and normal-weight individuals confirmed these associations.
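To give a feel for what a per-year hazard ratio means, the sketch below scales it to larger gaps. It assumes the reported hazard ratios come from Cox models in which ProtAgeGap enters as a continuous, linear term, so that the hazard ratio for a gap of g years is the per-year value raised to the power g. That extrapolation is an assumption made here for illustration, not a result stated in the article, and the function name `hazard_ratio_for_gap` is hypothetical.

```python
import math


def hazard_ratio_for_gap(hr_per_year, gap_years):
    """Scale a per-year hazard ratio to a gap of `gap_years` years.

    Assumes the Cox model's log-linear form: log(HR) is proportional
    to the size of the gap.
    """
    return math.exp(math.log(hr_per_year) * gap_years)


# 1.15 per year of ProtAgeGap is the all-cause mortality figure reported above.
for gap in (1, 5, -5):
    print(f"gap {gap:+d} years -> HR {hazard_ratio_for_gap(1.15, gap):.2f}")
```

Under this assumption, a proteomic age five years above chronological age corresponds to roughly double the mortality hazard, and a gap of five years below to roughly half.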
The study showed that the proteomic age clock is strongly influenced by proteins involved in diverse biological functions, including cell-extracellular matrix interactions, immune response and inflammation, hormone regulation, reproduction, neurodevelopment, and differentiation. The proteomic clock has limited overlap with DNAm clocks, highlighting novel aging-associated proteins and providing additional insights into aging biomarkers. A strength of the study is its use of a gradient boosting model, which allows for nonlinear associations and interactions between proteins and improved generalizability compared to other models. However, the study is limited by the protein coverage of the single platform used, Olink Explore 3072, and by the lack of DNAm data for direct comparison with DNAm age clocks.
Conclusion
In conclusion, the proteomic age clock developed in this study offers a robust predictor of biological aging and provides insight into the mechanisms of age-related disease, frailty, and mortality. The study suggests that plasma proteomics is a reliable method for measuring biological age, which may help identify drug targets, novel interventions, or lifestyle modifications to reduce premature mortality and delay the onset of major age-related health conditions.