Associations between Differential Aging and Lifestyle, Environment, Current, and Future Health Conditions: Findings from Canadian Longitudinal Study on Aging

Introduction: An aging population will bring a pressing challenge for the healthcare system. Insights into promoting healthy longevity can be gained by quantifying the biological aging process and understanding the roles of modifiable lifestyle and environmental factors and of chronic disease conditions. Methods: We developed a biological age (BioAge) index by applying multiple state-of-the-art machine learning models to easily accessible blood test data from the Canadian Longitudinal Study on Aging (CLSA). The BioAge gap, the difference between the BioAge index and chronological age, was used to quantify the differential aging of the CLSA participants. We further investigated the associations between the BioAge gap and lifestyle, environmental factors, and current and future health conditions. Results: The BioAge gap had strong associations with existing adverse health conditions (e.g., cancers, cardiovascular diseases, diabetes, and kidney diseases) and future disease onset (e.g., Parkinson’s disease, diabetes, and kidney diseases). Frequent consumption of processed meat, pork, beef, and chicken, poor outcomes in nutritional risk screening, cigarette smoking, and exposure to passive smoking were associated with a positive BioAge gap (“older” BioAge than expected). We also identified several modifiable factors, including eating fruits, legumes, and vegetables, that were related to a negative BioAge gap (“younger” BioAge than expected). Conclusions: Our study shows that a BioAge index based on easily accessible blood tests has the potential to quantify the differential biological aging process that can be associated with current and future adverse health events. The identified risk and protective factors for differential aging indicated by the BioAge gap are informative for future research and guidelines to promote healthy longevity.


Introduction
The global population is aging rapidly. In 2015, people aged 65 and over accounted for 8.5% of the global population, and this percentage is projected to increase to 16.7% in 2050 [1]. For developed countries, the proportions are higher; for example, in Canada, 16.2% were over 65 years in 2018, and this is projected to reach 23.4% in 2040 [2]. An aging population places strain on healthcare systems as rates of chronic disease increase with age. For example, in Canada, older adults aged 65 and over, although representing only 16.2% of the general population, accounted for 45.7% of the total health expenditures in 2019 [2]. Understanding aging, and its interactions with lifestyle, environmental factors, and chronic disease, is a prerequisite for implementing effective interventions to promote healthy longevity. Chronological age is a common choice for representing and quantifying aging processes. However, the full complexity of biological aging cannot be encapsulated solely by chronological age, as people with the same chronological age can have significantly different health status.
Currently, researchers are searching for biomarkers to help gauge the full complexity of the aging process [3]. Among them, various biological age (BioAge) metrics, also called BioAge indices, have been developed [3][4][5][6]. Aging leaves many footprints on the biological system, including telomere shortening, gradual changes in gene expression, and DNA methylation levels [4]. A commonly used approach to developing a BioAge index is to construct machine learning (ML) models that predict chronological age based on biological measurements collected from a large population. Then, given the unique biological measurements of an individual, the model can estimate his or her BioAge [4]. Such a method may capture differences in BioAge for people of the same chronological age. If trained on a representative population, the model estimates of BioAge should reflect typical (i.e., expected) aging. The BioAge gap, defined as the difference between the estimated BioAge index and chronological age, is a measure of differential aging [4,7]. A positive BioAge gap indicates that an individual is biologically "older" than expected given their chronological age, which may result from accelerated accumulation of multiple morbidities in aging. A negative BioAge gap suggests that an individual is biologically "younger" than expected given their chronological age, which may result from fewer morbidities or delayed biological aging.
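In symbols, the definition above can be written as follows. The notation is ours and is only an illustrative assumption: f denotes a model trained to predict chronological age, and x_i denotes the biological measurements of individual i.

```latex
\widehat{\mathrm{BioAge}}_i = f(\mathbf{x}_i), \qquad
\mathrm{Gap}_i = \widehat{\mathrm{BioAge}}_i - \mathrm{Age}_i
```

Under this convention, Gap_i > 0 corresponds to a biologically "older" than expected individual and Gap_i < 0 to a biologically "younger" than expected one.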
Many types of biological measures have been used to quantify aging, including metabolomics, proteomics, epigenomics, and transcriptomics [3,8]. Several recent studies have also used easily accessible clinical laboratory biochemical and hematology tests to develop BioAge indices in humans. Putin et al. [9] created a BioAge index using an ensemble of 21 deep neural networks on 46 clinical blood test measures. Mamoshina et al. [10] developed another aging index based on 20 selected laboratory biomarkers using a deep neural network, and the model was used to quantify the accelerated aging effects among smokers. At present, no study has fully explored the utility of laboratory-measure-based BioAge indices in characterizing the interactions between differential aging and lifestyles, environmental factors, current disease, and prospective disease. Understanding these associations and identifying risk factors for differential aging could guide effective public health recommendations to promote healthy longevity. Furthermore, a BioAge gap could theoretically be used to improve future health outcomes through addressing the modifiable risk factors and promoting the modifiable protective factors.
The Canadian Longitudinal Study on Aging (CLSA) is one of the largest aging research projects in the world [11], collecting comprehensive information on lifestyle, home, and environmental factors, disease status, and standard clinical laboratory biochemistry and hematology measures for about 30,000 participants aged 45 to 87 years. We aimed to develop a BioAge index using the clinical laboratory measures in CLSA data [11] to quantify biological aging at the individual level. We also aimed to investigate the relationships between differential aging indicated by the BioAge gap and lifestyle, home, and environmental factors, and current and prospective chronic diseases based on the comprehensive measures in CLSA data.

Data
CLSA is a Canada-wide longitudinal study on adult development and aging [11,12]. Over 50,000 Canadians aged 45 and over participated in the study with a planned follow-up of at least 20 years. Residents living on First Nations reserves or in long-term care institutions, full-time members of the Canadian Armed Forces, and those who are cognitively impaired or cannot respond in English or French are excluded from the CLSA study [11]. For this study, we used the Comprehensive Cohort of the CLSA study, which comprises 30,097 participants for whom data were collected through both in-home interviews and CLSA data collection site visits. A total of 4,673 subjects without laboratory measures at the baseline were excluded from the analysis. Currently, the baseline data collected from 2012 to 2015 and the follow-up phase data collected from 2015 to 2018 are available. We retrieved the chronological age, clinical laboratory measures (detailed information can be found in online suppl. Table S1; for all online suppl. material, see https://doi.org/10.1159/000534015), socioeconomic status, lifestyle, home and neighborhood environmental factors, and disease conditions at the baseline, and 10 primary chronic disease conditions (including all the neurological and psychological disorders and some common chronic diseases, e.g., cancer, heart disease, and kidney disease) at the follow-up phase.

Aging Index Development
To construct the BioAge index, the 31 laboratory measures present in the CLSA data and sex were used as the predictive variables, while the chronological age at the time of blood sample collection was used as the target outcome (see online supplementary Table S1 for the summary statistics of these measures and chronological age). The data were initially split into a training set (80% of the data) and a test set (20%). Various ML models were chosen and trained on the training set. The model that exhibited the highest performance in terms of R-squared on the test set was then selected to generate a biological aging index through the use of 10-fold cross-validation (CV) on the complete dataset. Details about the data preprocessing and modeling process can be found in the supplement.
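The cross-validated prediction step can be illustrated with a minimal sketch. It is not the authors' pipeline: ordinary least squares stands in for the selected ML model, the data are synthetic, and all names (`out_of_fold_predictions`, `r_squared`, `labs`) are hypothetical. The point it shows is that each participant's BioAge is predicted by a model that never saw that participant during fitting.

```python
import numpy as np

def out_of_fold_predictions(X, y, n_folds=10, seed=0):
    """Predict each sample with a model fitted on the other folds only."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    folds = np.array_split(order, n_folds)
    X1 = np.column_stack([np.ones(len(y)), X])  # add an intercept column
    preds = np.empty(len(y), dtype=float)
    for held_out in folds:
        train = np.setdiff1d(order, held_out)
        # Least squares is an assumed stand-in for the selected ML model.
        coef, *_ = np.linalg.lstsq(X1[train], y[train], rcond=None)
        preds[held_out] = X1[held_out] @ coef
    return preds

def r_squared(y_true, y_pred):
    """Fraction of the variance in y_true explained by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data: three "laboratory measures", two related to age.
rng = np.random.default_rng(42)
n = 500
age = rng.uniform(45, 87, size=n)
labs = np.column_stack([
    0.05 * age + rng.normal(size=n),
    -0.03 * age + rng.normal(size=n),
    rng.normal(size=n),
])

bioage = out_of_fold_predictions(labs, age)  # cross-validated BioAge index
bioage_gap = bioage - age                    # BioAge gap per individual
```

Using out-of-fold predictions keeps the BioAge gap from being artificially shrunk by a model that has memorized the training ages.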

Post hoc Analysis of Important Variables for Building Aging Index
After the models were selected and evaluated, the importance of each predictive variable was derived to assess its contribution to estimating the BioAge. Almost all the available variables in the model had nonzero feature importance measures. However, we wanted to focus on interpreting the most important ones rather than all the variables. We implemented a permutation-based method to heuristically select the variables with higher than chance contributions in estimating BioAge for interpretation (see online supplementary Figure S1 for details).
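As a rough illustration of the idea behind permutation-based variable assessment, the sketch below computes a generic permutation importance: the drop in held-out R-squared when one predictor is shuffled. This is an assumption about the general technique, not the procedure described in the supplement; the linear model, the synthetic data, and all function names are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Fit a linear model with intercept (stand-in for the ML model)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

def r2(y, pred):
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

def permutation_importance(coef, X, y, n_repeats=20, seed=0):
    """Mean drop in R-squared after shuffling each column in turn."""
    rng = np.random.default_rng(seed)
    baseline = r2(y, predict_ols(coef, X))
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j] += baseline - r2(y, predict_ols(coef, X_perm))
    return drops / n_repeats

# Synthetic data: column 0 carries age information, column 1 is pure noise.
rng = np.random.default_rng(1)
n = 600
age = rng.uniform(45, 87, size=n)
X = np.column_stack([0.1 * age + rng.normal(size=n), rng.normal(size=n)])

coef = fit_ols(X[:400], age[:400])                        # fit on training rows
importance = permutation_importance(coef, X[400:], age[400:])  # score on held-out rows
```

A variable whose shuffled version degrades performance no more than a noise column would be treated as contributing at chance level.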

Post hoc Association Analysis between Variables and BioAge Gap
Having estimated the BioAge index for our CLSA cohort, we computed the BioAge gap as the difference between an individual's estimated BioAge index based on CV and their chronological age. Details about the calculation of the BioAge gap (selecting the best-performing ML models in estimating BioAge, predicting the BioAge in a CV framework to avoid overfitting) can be found in the supplement. After that, we used a linear model to explore the association between the estimated BioAge gap and a phenotype of interest (e.g., diabetes) while controlling for common confounding factors, including the subject's chronological age, sex, and socioeconomic status factors (education level and household income; see supplement for details of these socioeconomic status factors). As shown by Lange et al. [13], the dependence of the BioAge gap on the chronological age could be corrected by including chronological age as a covariate. In the above linear model, the coefficient of the phenotype represents its partial association with the outcome after adjusting for the effect of the confounding factors. For a categorical phenotype variable, one level was taken as the reference level, while for a numerical phenotype variable, its raw value was scaled to have a mean of 0 and a standard deviation of 1. In addition, for rare levels of categorical variables (frequency less than 1%), the corresponding samples were removed to avoid false discoveries in the related association analysis. The BioAge gap is an estimated value and therefore carries uncertainty. Without accounting for this uncertainty, the association analysis will underestimate the variation, thus leading to false discoveries. We applied bootstrapping to account for the uncertainty in the BioAge gap (see supplement for details). In addition, as chronological age was not measured with precision (for example, both 70.9 years and 70.1 years were taken as 70 years old), we only interpret the significant variables with an effect size larger than or close to 1 year.
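The adjusted association model can be sketched as below, under stated assumptions. The BioAge gap is regressed on a phenotype together with covariates, and a simple nonparametric bootstrap over participants gives a standard error for the phenotype coefficient. This is a simplification: the bootstrap in the supplement is designed to propagate the uncertainty of the estimated BioAge gap itself, which this sketch does not do. The data are synthetic, socioeconomic covariates are omitted for brevity, and all names are hypothetical.

```python
import numpy as np

def phenotype_effect(gap, phenotype, covariates):
    """Coefficient of the phenotype in gap ~ 1 + phenotype + covariates."""
    design = np.column_stack([np.ones(len(gap)), phenotype, covariates])
    coef, *_ = np.linalg.lstsq(design, gap, rcond=None)
    return coef[1]

def bootstrap_effect(gap, phenotype, covariates, n_boot=500, seed=0):
    """Bootstrap mean and standard error of the phenotype coefficient."""
    rng = np.random.default_rng(seed)
    n = len(gap)
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample participants with replacement
        draws[b] = phenotype_effect(gap[idx], phenotype[idx], covariates[idx])
    return draws.mean(), draws.std(ddof=1)

# Synthetic cohort: the gap depends on age (as noted by Lange et al. [13])
# and is shifted by 1.5 years for an assumed binary phenotype ("smoker").
rng = np.random.default_rng(7)
n = 2000
age = rng.uniform(45, 87, size=n)
sex = rng.integers(0, 2, size=n).astype(float)
smoker = (rng.random(n) < 0.2).astype(float)
gap = 1.5 * smoker - 0.1 * (age - age.mean()) + rng.normal(scale=3.0, size=n)

covariates = np.column_stack([age, sex])
estimate, std_error = bootstrap_effect(gap, smoker, covariates)
```

Including chronological age in `covariates` is what removes the age dependence of the gap from the phenotype coefficient.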

Results
The best-performing model explained 34.6% of the variation in the chronological age of the hold-out test set (online suppl. Table S2; Fig. S2).

Important Variables for Building the Biological Aging Index
The contribution of each variable to building the BioAge index is shown in online supplementary Figure S3. The post hoc analysis identified lymphocytes, red blood cells, mean corpuscular volume, red blood cell distribution width, hemoglobin A1C, 25-hydroxyvitamin D, albumin, alanine aminotransferase, ferritin, and low-density lipoprotein (LDL) as the most important features contributing to the BioAge index. How these variables affect the BioAge index in the ML model is shown in online supplementary Figure S4.

Lifestyle Factors Associations with BioAge Gap
After FDR correction, 90 of the 409 lifestyle factors in CLSA data were identified to be significantly associated with the BioAge gap. These significant lifestyle factors, the associated effect sizes, and p values are shown in online supplementary Table S3. For presentation, the lifestyle factors are grouped into dietary and other lifestyle factors.
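The text does not state which FDR procedure was applied. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up procedure, a common choice; treating it as the method used here is an assumption, and the p values are made up.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # Compare the k-th smallest p value with alpha * k / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: reject everything up to the largest passing rank.
        k = np.max(np.nonzero(below)[0])
        rejected[order[:k + 1]] = True
    return rejected

# Made-up p values for ten hypothetical factors.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p_vals)
```

With these made-up values only the two smallest p values survive, although five are below 0.05 before correction.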

Dietary Lifestyle Factors
The detected associations between dietary lifestyle factors and the BioAge gap can be summarized as follows. (1) Frequent consumption of red meat and chicken was associated with a positive BioAge gap (Fig. 1a, b). (2) Daily consumption of processed meat, i.e., sausages, hot dogs, ham, smoked meat, and bacon, was associated with a positive BioAge gap (Fig. 1c).

Other Lifestyle Factors
The other lifestyle factors associated with the differential aging indicated by the BioAge gap can be summarized as follows. (1) Cigarette smoking status was related to a positive BioAge gap (Fig. 2a). Current smokers, on average, were 1.479 ± 0.133 years older (p value <0.001) as measured by the BioAge index than nonsmokers, and previous smoking status was also significantly associated with a positive BioAge gap. However, the effect size of previous smoking status (0.236 ± 0.076 years, p value = 0.003) is not large enough for accurate interpretation. Many other smoking-related variables, for example, number of cigarettes smoked per day and type of smoker, were significant and had large effect sizes. In addition, to compare the smoking results with previous research [10,14], we also explored the association among different age groups. As shown in online supplementary Figure S5, compared to nonsmokers, current smoking status is significantly associated with an over 1-year positive BioAge gap in all the age groups except for those over 80 years old. Furthermore, the association between previous smoking status and BioAge gap is not significant among all the age groups. Alcohol consumption was also associated with a positive BioAge gap, but the effect size was less than 1 year. (2) People participating in strenuous activities, such as paddleball and tennis, were predicted to be 1.40 ± 0.451 years younger (p value = 0.002) than people not participating in such activities (Fig. 2c). However, all other 24 strenuous sports included in the CLSA data, for example, basketball, bicycling, and hiking, were not significant. In addition, participating in light activities, for example, badminton, billiards, boating, and fishing, was not significant. (3) People who used a wheelchair or motorized cart for transportation were at risk of a 1-year positive BioAge gap (Fig. 2d).

Association of the Home and the Neighborhood Environmental Factors with the BioAge Gap
After FDR correction, 6 of the 28 home and neighborhood environmental factors were significantly associated with the BioAge gap. These significant environmental factors, the associated effect sizes, and p values are shown in online supplementary Table S4. The main finding is that passive smoking exposure at home was associated with a positive BioAge gap (Fig. 2b).

Association between the BioAge Gap and Current Disease Conditions
After FDR correction, 79 of the 379 disease-related variables were significantly associated with the BioAge gap. These significant disease-related variables, the associated effect sizes, and p values are shown in online supplementary Table S5. Some of the features with significant associations and large effect sizes can be summarized as follows. (1) Reported chronic conditions were associated with a positive BioAge gap (Fig. 3a). (2) Cardiovascular diseases, i.e., congestive heart failure, aneurysm, peripheral vascular disease, angina, stroke, heart attack, or myocardial infarction, were associated with a positive BioAge gap (Fig. 3b). (3) Various cancers, i.e., kidney cancer, other lymphoid, hematopoietic and related tissue cancer, lung cancer, and non-Hodgkin lymphoma, were associated with a positive BioAge gap (Fig. 3c). (4) Other diseases and conditions, for example, kidney disease or failure, multiple sclerosis, diabetes, emphysema, chronic bronchitis, chronic obstructive pulmonary disease, or chronic changes in lungs due to smoking, underactive thyroid gland, pneumonia in the last year, shuffling gait, and poor balance, were also associated with a positive BioAge gap (Fig. 3d-f).

Association between BioAge Gap and Prospective Chronic Disease Conditions
We selected several major chronic physical and mental disorders to see if the onset of these diseases in the follow-up phase was associated with the BioAge gap at baseline. The association analysis results of all the selected disease conditions are summarized in online supplementary Table S6. After FDR correction, 5 out of 10 future disease conditions were significant. The most significant ones are as follows. Parkinson's disease onset at the follow-up phase was associated with a 2.21 ± 0.782 years (p value = 0.0054) positive BioAge gap at baseline. The onset of a memory problem reported by a doctor at follow-up was associated with a 0.789 ± 0.394 years (p value = 0.0475) positive BioAge gap at baseline, and the onset of kidney disease at the follow-up phase was associated with a 1.77 ± 0.370 years (p value <0.001) positive BioAge gap at baseline. Also, the onset of diabetes at the follow-up phase was related to a 1.10 ± 0.184 years (p value <0.001) positive BioAge gap at baseline.

Modifiable Risk Factors
Modifiable risk factors, i.e., factors that can be changed through intervention or prevention, are crucial for designing interventions. The identified modifiable risk factors with large effect sizes are listed in Table 1.

Discussion
In this study, we showed how biological aging can be quantified with a BioAge index estimated from commonly available laboratory blood measures, and we examined its associations with lifestyle, home and neighborhood environmental factors, and current and future disease conditions. We identified a series of modifiable risk and protective lifestyle and environmental factors for the differential aging gauged by the BioAge gap. Furthermore, we demonstrated the utility of the BioAge gap in informing adverse health status by its strong association with various current and future disease conditions. To the best of our knowledge, this is the first study to fully explore the associations between human aging, quantified by a laboratory-measure-based BioAge index, and comprehensive lifestyle, environmental, and disease conditions. The variables contributing to the BioAge index are largely consistent with previous studies. For example, lymphocytes [15], red blood cell counts [16], alanine aminotransferase [17], and ferritin [18] have been identified to be associated with aging. Also, 25-hydroxyvitamin D [19] and albumin [20] were shown to be related to aging and to poor health outcomes. Hemoglobin A1c, a biomarker for diagnosing prediabetes and diabetes, was observed to have a higher level among older nondiabetic individuals [21]. In addition, the increase in LDL level with age has been attributed to reduced LDL clearance in the aging population [22].
We identified multiple medical conditions associated with a positive BioAge gap ("older" than the expected BioAge) in the aging population (Table 1). An array of existing chronic diseases, i.e., cancers, cardiovascular diseases, diabetes, kidney diseases, chronic respiratory diseases, and underactive thyroid gland, were identified to be associated with a positive BioAge gap, indicating that the subject was potentially biologically "older" than expected based on her/his chronological age. Previous research has identified that positive BioAge gaps gauged by methylation-based aging indices are risk factors for the incidence of cardiovascular conditions, i.e., fatal coronary heart disease, peripheral arterial disease, and heart failure [23]. Our results provide more evidence to support the association between a positive BioAge gap and cardiovascular diseases. The associations observed between a positive BioAge gap and various cancers also support previous findings that cancer and chemotherapy are potential aging accelerators [24]. In addition, the identified associations between a positive BioAge gap, kidney diseases, and diabetes are also supported by previous research. Jeroen et al. [25] summarized the underlying mechanisms of chronic kidney disease and premature aging. Type 2 diabetes has been associated with aging acceleration, increased telomere shortening, and mitochondrial DNA depletion [26]. Increased prevalence of thyroid disorders and decreased thyroid function have been observed in the elderly population [27].
Our finding of the association between the underactive thyroid gland and a positive BioAge gap suggests that aging acceleration may exist in those with hypothyroidism. The associations between a positive BioAge gap, recent pneumonia, and chronic lung diseases suggest that aging acceleration patterns may exist in those with lung disease. These results are partly supported by previous research in which chronic obstructive pulmonary disease, a chronic lung disease, has been considered a condition of accelerated lung aging [28]. We also found that positive differential aging at baseline, i.e., a positive BioAge gap, was associated with the onset of kidney disease, diabetes, and Parkinson's disease about 3 years into the future, suggesting that the biological aging index might be predictive of future diseases. The association with Parkinson's disease is particularly notable because Parkinson's disease is not typically detectable in routine blood tests [29], which are the basis for our aging index. One potential explanation for the strong association of our BioAge gap with Parkinson's disease is that dopamine neurons degenerate more quickly in Parkinson's disease, leading to accelerated or exaggerated aging that could be captured by the aging pattern of multiple blood markers [30].
Beyond the medical outcomes, we also identified multiple modifiable risk factors that are associated with differential aging and are amenable to intervention (Table 1). We found that current and previous smoking behaviors were associated with positive differential aging. However, the effect size of current smoking is about 1.480 years, while the effect size of previous smoking is only moderate. Previous research on the association between smoking and aging acceleration has been inconsistent. For example, Linli et al. [31] identified that active regular smoking behavior was associated with 1.190 years of brain age acceleration. Mamoshina et al. [10] found that active smokers had a higher aging rate based on laboratory measures. Associations between smoking status and methylation-based BioAge gaps were found by Horvath et al. [32]. Our results support these findings, showing that smoking may contribute to positive differential aging. In addition, unlike the results in Mamoshina et al. [10], which showed that only subjects younger than 40 years had signs of accelerated aging, we identified associations between current smoking and a positive BioAge gap across all the age groups except for those over 80 years (online suppl. Fig. S5). For smokers over 80 years old, previous research has shown no increase in mortality risk compared to smokers from other age groups [33]. Notably, we found an association between second-hand smoking exposure and an increased BioAge gap. Overall, our results imply that quitting smoking and reducing second-hand smoking exposure may reduce the risk of increased biological aging related to smoking.
Previous research has shown that accelerated aging, as indicated by telomere length and DNA methylation content, is associated with dietary phosphate intake from red meat consumption. Our results support this finding. We further identified that frequent consumption of chicken or processed meat was related to an "older" than expected BioAge. Healthy eating, on the other hand, for example, frequent consumption of legumes, vegetables, and fruits, is known to lower the risk of cardiovascular disease and cancer and is recommended in dietary guidelines [34]. We find support for these dietary recommendations in their associations with a "younger" than expected BioAge. Playing tennis in particular was associated with a negative BioAge gap, supporting the idea that physical activities promote health status in older adults [35]. Overall, our results suggest that healthy eating and vigorous sport may benefit the elderly by slowing down their biological aging.
Our study has several limitations. First, the positive BioAge gap only indicates that the subject's BioAge may be "older" than expected. To validate this aging acceleration effect, we need to show that the positive BioAge gap increases over time in a longitudinal study. However, while CLSA is a longitudinal study, the laboratory measures are currently only available at baseline. Therefore, it is not possible to derive the BioAge gap at follow-up. Second, the observed associations cannot be taken as causal relationships, and our results should be interpreted with caution.
In summary, a BioAge index, estimated with standard laboratory blood tests, shows potential as a marker for biological aging. The strong associations between the BioAge gap and many current and future diseases may lead to potential clinical utilization of this BioAge as a measure of differential aging. We identify several actionable risk factors (current smoking, regular consumption of red meat, chicken, and processed meat, and poor outcomes in nutritional risk screening) and protective factors (playing tennis, regular consumption of legumes, fruits, and vegetables) that might inform public health guidelines to promote healthy longevity.

Fig. 1. Dietary factors associated with differential aging indicated by the biological age gap. The level with zero height was the reference level in the association analysis. Light gray indicates significant risk factors, while dark gray indicates significant protective factors. a Red meat: the frequency of beef, pork consumption. b Chicken: the frequency of chicken or turkey consumption. c Sausage: the frequency of sausages, hot dogs, ham, smoked meat, bacon consumption. d Legumes: the frequency of legume consumption, including beans, peas, lentils. e Fruit: fresh, frozen, or canned fruit. f Vegetable: vegetables except for carrots, potatoes, or salad. * indicates the p value is <0.05, while ** indicates the p value is <0.01.

Fig. 2. Lifestyle factors associated with the differential aging indicated by the biological age gap. a Smoking: smoking status. b Passive smoking: frequency of passive smoking exposure at home. c Tennis: participated in tennis. d Wheelchair: used a wheelchair or motorized cart in the last month.

Fig. 3. Current disease conditions associated with the differential aging indicated by the biological age gap. Some diseases are combined to save space. a Chronic conditions: reported chronic conditions. b Heart: heart disease (including congestive heart failure); vascular: peripheral vascular disease or poor circulation in limbs; aneurysm: thoracic, abdominal, or cerebral aneurysm; angina: angina; stroke: stroke or cerebrovascular accident; heart attack: heart attack or myocardial infarction; coronary artery: coronary artery bypass surgery. c Cancer: kidney cancer; lung: lung cancer; lymphoid: lymphoid, hematopoietic and related tissue; non-Hodgkin: non-Hodgkin lymphoma cancer. d Lung: emphysema, chronic bronchitis, chronic obstructive pulmonary disease, or chronic changes in lungs due to smoking; pneumonia: pneumonia in the last year. e Shuffling gait/poor balancing: shuffling gait or poor balance. f Other: diabetes (borderline diabetes or high blood sugar); underactive thyroid gland; kidney disease (kidney disease or failure).

Table 1. Modifiable risk factors for differential aging