2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]
Modern clinical decision support requires models that are both accurate and mechanistically interpretable. DNA methylation tracks the cumulative influence of development, lifestyle, and environment on gene regulation, but its dimensionality and tissue specificity complicate analysis and clinical application. This thesis develops explainable deep learning methods that learn coherent biological signals from genome-wide methylation data, aiming to derive reliable biomarkers of aging, disease risk and severity, and system-level health. Central to our approach are deep autoencoders, unsupervised multi-layered neural networks that efficiently compress DNA methylation data into low-dimensional embeddings that preserve relevant biology, paired with interpretability techniques that expose feature contributions and model reasoning, such as perturbation-based latent activation.
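The idea of perturbing a latent unit and reading the effect off the decoder output can be illustrated with a minimal sketch. It is not the thesis implementation: the one-layer sigmoid decoder, the weights `W` and `b`, and the helper names `decode` and `latent_perturbation_effect` are all hypothetical and chosen only to show the mechanics.

```python
import math

def decode(z, W, b):
    """Toy decoder (assumed): one sigmoid layer mapping a latent vector z
    to methylation-like values in (0, 1)."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, z)) + bias)))
        for row, bias in zip(W, b)
    ]

def latent_perturbation_effect(z, unit, delta, W, b):
    """Shift one latent unit by delta and return the per-feature change
    in the decoded output relative to the unperturbed baseline."""
    baseline = decode(z, W, b)
    z_perturbed = list(z)
    z_perturbed[unit] += delta
    perturbed = decode(z_perturbed, W, b)
    return [p - q for p, q in zip(perturbed, baseline)]

# Hypothetical decoder: 3 output features (CpG sites), 2 latent units.
W = [[2.0, 0.0],
     [0.0, 2.0],
     [-1.0, 0.0]]
b = [0.0, 0.0, 0.0]

# Activate latent unit 0 from a neutral latent state.
effect = latent_perturbation_effect([0.0, 0.0], unit=0, delta=1.0, W=W, b=b)

# Rank features by how strongly they respond to the perturbation.
ranked = sorted(range(len(effect)), key=lambda i: abs(effect[i]), reverse=True)
```

In this toy setting, features 0 and 2 respond to unit 0 in opposite directions while feature 1 is unaffected. A ranking of this kind is what would then be passed to pathway enrichment.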
By training on large multi-tissue compendia of human DNA methylation samples, we observed that the autoencoders self-organized their latent spaces, recapitulating protein-protein interaction (PPI) modules. Interpreting these structured embeddings yielded pathway-enriched epigenomic signatures that supported accurate epigenetic age estimation and robust classification of disease status and smoking. Building on these findings, we introduced a PPI-guided autoencoder that incorporates a graph-regularized protein interaction prior, encouraging each latent unit to be functionally specific and colocalized within the human interactome. We showed that this soft guidance improved the mechanistic interpretability of downstream models, in this case supervised translators that map between omics modalities (transcriptomics, DNA methylation, genomics).
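One common way to express a graph-regularized prior is a Laplacian smoothness penalty, which is small when a latent unit's weights are concentrated on genes that are neighbors in the interaction network. The sketch below assumes that form purely for illustration. The abstract does not specify the regularizer, and the function name `laplacian_penalty`, the toy edge list, and the weight vectors are hypothetical.

```python
def laplacian_penalty(weights, edges):
    """Sum of squared weight differences across interaction edges.
    Equivalent to w^T L w, where L is the graph Laplacian of the network."""
    return sum((weights[i] - weights[j]) ** 2 for i, j in edges)

# Hypothetical toy interactome: four genes connected in a chain 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]

# A latent unit whose weights sit on neighboring genes (colocalized) ...
colocalized = [1.0, 1.0, 0.0, 0.0]
# ... versus one whose weights are scattered across the network.
scattered = [1.0, 0.0, 1.0, 0.0]

penalty_colocalized = laplacian_penalty(colocalized, edges)
penalty_scattered = laplacian_penalty(scattered, edges)
```

Under this assumed form, the penalty would be added to the reconstruction loss with a tunable weight, so the network is nudged toward colocalized units rather than forced into them, in the spirit of the "soft guidance" described above.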
In parallel, we combined autoencoder embeddings with established aging markers to train explainable neural-network age clocks that achieved state-of-the-art cross-tissue precision, while also capturing fine-grained developmental, immune, and metabolic signatures. Finally, we operationalized these representations in a clinical decision-support pipeline that predicts respiratory, cardiovascular, and metabolic system-level health scores from blood methylation, with supervised deep learning models that highlight biological processes associated with each physiological system. Collectively, this work provides a scalable and auditable framework that converts methylomes into interpretable feature sets and actionable indicators for clinical use, enabling early risk assessment, monitoring of treatment responses and lifestyle changes, and informed therapeutic target prioritization.
Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2025. p. 105
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2490
Keywords
Deep learning, Autoencoders, DNA methylation, Aging, Health
National Category
Medical Genetics and Genomics
Identifiers
urn:nbn:se:liu:diva-219551 (URN), 10.3384/9789181183320 (DOI), 9789181183313 (ISBN), 9789181183320 (ISBN)
Public defence
2025-12-18, C1, C-building, Campus Valla, Linköping, 09:00 (English)
Note
Funding Agencies: Swedish Heart-Lung Foundation
2025-11-17; 2025-11-17; 2025-11-17. Bibliographically approved