Multimodal Machine Learning for 10-Year Dementia Risk Prediction: The Framingham Heart Study
Article type: Research Article
Authors: Ding, Huitong [a; b] | Mandapati, Amiya [c; d] | Hamel, Alexander P. [e] | Karjadi, Cody [a; b] | Ang, Ting F.A. [a; b; f; g] | Xia, Weiming [h; i; j] | Au, Rhoda [a; b; f; g; k] | Lin, Honghuang [e; *]
Affiliations: [a] Department of Anatomy and Neurobiology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA | [b] The Framingham Heart Study, Boston University School of Medicine, Boston, MA, USA | [c] Department of Religious Studies, Brown University, Providence, RI, USA | [d] The Warren Alpert Medical School, Brown University, Providence, RI, USA | [e] Department of Medicine, University of Massachusetts Chan Medical School, Worcester, MA, USA | [f] Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA | [g] Slone Epidemiology Center, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA | [h] Geriatric Research Education and Clinical Center, VA Bedford Healthcare System, Bedford, MA, USA | [i] Department of Pharmacology and Experimental Therapeutics, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA | [j] Department of Biological Science, Kennedy College of Sciences, University of Massachusetts Lowell, Lowell, MA, USA | [k] Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
Correspondence: [*] Correspondence to: Honghuang Lin, PhD, Department of Medicine, University of Massachusetts Chan Medical School, Lake Ave North, S6-755, Worcester, MA 01655, USA. Tel.: +1 774 455 4881; E-mail: Honghuang.Lin@umassmed.edu.
Abstract: Background: Early prediction of dementia risk is crucial for effective interventions. Given the known etiologic heterogeneity of dementia, machine learning methods that leverage multimodal data, such as clinical manifestations, neuroimaging biomarkers, and well-documented risk factors, could predict dementia more accurately than methods that rely on a single data modality. Objective: This study aims to develop machine learning models that capitalize on neuropsychological (NP) tests, magnetic resonance imaging (MRI) measures, and clinical risk factors for 10-year dementia prediction. Methods: This study included participants from the Framingham Heart Study, for whom multiple data modalities were collected, including NP tests, MRI measures, and demographic variables. CatBoost with Optuna hyperparameter optimization was used to build prediction models for 10-year dementia risk from different combinations of data modalities. The contribution of each modality and each feature to the prediction task was quantified using Shapley values. Results: This study included 1,031 participants with normal cognitive status at baseline (age 75±5 years, 55.3% women), of whom 205 were diagnosed with dementia during the 10-year follow-up. The model built on all three modalities demonstrated the best dementia prediction performance (AUC 0.90±0.01) compared with single-modality models (AUC range: 0.82–0.84). MRI measures contributed most to dementia prediction (mean absolute Shapley value: 3.19), suggesting the necessity of multimodal inputs. Conclusion: This study shows that a multimodal machine learning framework achieved superior performance for 10-year dementia risk prediction. The model can be used to increase vigilance for cognitive deterioration and to select high-risk individuals for early intervention and risk management.
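Illustrative note: the abstract reports a mean absolute Shapley value per modality but does not say how feature-level values were combined. The sketch below shows one plausible reading, assuming each feature's mean absolute Shapley value is averaged over participants and then summed within its modality. The feature names, the toy Shapley values, and the function `modality_importance` are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def modality_importance(shap_rows, feature_modality):
    """Aggregate per-participant Shapley values to feature and modality level.

    shap_rows: list of dicts mapping feature name -> Shapley value,
               one dict per participant (toy input, assumed format).
    feature_modality: dict mapping feature name -> modality label.
    Returns (per_feature, per_modality) dicts of mean absolute values.
    """
    n = len(shap_rows)
    per_feature = defaultdict(float)
    for row in shap_rows:
        for feature, value in row.items():
            # Mean absolute Shapley value across participants.
            per_feature[feature] += abs(value) / n

    per_modality = defaultdict(float)
    for feature, mean_abs in per_feature.items():
        # Assumption: a modality's contribution is the sum over its features.
        per_modality[feature_modality[feature]] += mean_abs
    return dict(per_feature), dict(per_modality)


# Toy values for two hypothetical participants (not study data).
shap_rows = [
    {"np_test_a": 0.5, "mri_measure_a": -1.0, "age": 0.25},
    {"np_test_a": -0.5, "mri_measure_a": 2.0, "age": -0.75},
]
feature_modality = {
    "np_test_a": "NP",
    "mri_measure_a": "MRI",
    "age": "demographic",
}

per_feature, per_modality = modality_importance(shap_rows, feature_modality)
top_modality = max(per_modality, key=per_modality.get)
print(per_modality, top_modality)
```

In practice the per-participant Shapley values would come from the fitted model rather than a hand-written list. Summing within a modality favors modalities with many features, so averaging within a modality is an equally defensible choice.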
Keywords: Alzheimer’s disease, dementia risk prediction, machine learning, magnetic resonance imaging, multimodal data, neuropsychological test
DOI: 10.3233/JAD-230496
Journal: Journal of Alzheimer's Disease, vol. 96, no. 1, pp. 277-286, 2023